Investigate the possibility of creating an RQL parser
What we want
We want a TypeScript library providing an RQL AST (Abstract Syntax Tree) and parser usable when developing in JS/TS, for example when writing Node scripts or React/Angular applications. It would be nice to find a solution that could be reused to write an AST/parser in Python without having to maintain two grammar files or duplicate logic between JS/TS and Python. Our current Python parser and AST are based on YAPPS, which is an unmaintained Python library.
Solutions to explore:
- Tree-Sitter: seems to be able to generate parsers usable from many languages
- ANTLR
- Handmade recursive descent parser
- ...
Tree-Sitter
Repo to a simple RQL grammar made with tree-sitter: https://forge.extranet.logilab.fr/fbessou/tree-sitter-rql
Pros:
- it is really easy to write the grammar for a parser;
- the AST can be easily manipulated to build tools like formatters or linters;
- it is easy to write highlighting rules based on an AST;
- the parser is resilient to syntax errors;
- the parser is a C library with bindings for Rust, JS, Python, etc. (not tested).
Cons:
- it is hard to get meaningful errors; it seems that the team is working on it;
- the generated AST is only a big data structure without "domain specific" methods, which means that if we want a rich object-oriented AST like the one we have in Python, we would have to write an algorithm to convert from/to a more interesting abstraction (a comment which shares the same conclusion);
- the fact that the parser is a C library means that embedding it in a web application would require WebAssembly, which can be more performant but also harder to debug.
Conclusion: Writing a subset of the RQL grammar was rather easy but it seems that the generated AST is not really practical for use as a basis for an RQL interpreter, RQL validator or RQL AST.
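To illustrate the conversion step mentioned in the cons above, here is a minimal sketch of turning a generic syntax tree into a richer, object-oriented AST. Everything in it is an assumption made for illustration: the GenericNode shape is a simplified stand-in and not the actual Tree-Sitter binding API, the node type names ("select", "variable", "relation") are not taken from the tree-sitter-rql grammar, and the Variable/Relation/Select classes are hypothetical.

```typescript
// Simplified stand-in for a generic syntax node (hypothetical shape,
// NOT the real Tree-Sitter binding API).
interface GenericNode {
  type: string;
  text: string;
  children: GenericNode[];
}

// Hypothetical domain-specific AST for a tiny RQL subset.
class Variable {
  readonly name: string;
  constructor(name: string) {
    this.name = name;
  }
}

class Relation {
  readonly subject: Variable;
  readonly rtype: string;
  readonly object: Variable | string;
  constructor(subject: Variable, rtype: string, object: Variable | string) {
    this.subject = subject;
    this.rtype = rtype;
    this.object = object;
  }
}

class Select {
  readonly selected: Variable[];
  readonly restrictions: Relation[];
  constructor(selected: Variable[], restrictions: Relation[]) {
    this.selected = selected;
    this.restrictions = restrictions;
  }

  // Example of a "domain specific" method that a generic tree does not offer:
  // names of selected variables that no restriction mentions.
  unconstrainedVariables(): string[] {
    const constrained = new Set<string>();
    for (const relation of this.restrictions) {
      constrained.add(relation.subject.name);
      if (relation.object instanceof Variable) {
        constrained.add(relation.object.name);
      }
    }
    return this.selected
      .map((variable) => variable.name)
      .filter((name) => !constrained.has(name));
  }
}

function childrenOfType(node: GenericNode, type: string): GenericNode[] {
  return node.children.filter((child) => child.type === type);
}

// Assumes a relation node has exactly three children: subject, rtype, object.
function toRelation(node: GenericNode): Relation {
  const [subject, rtype, object] = node.children;
  if (!subject || !rtype || !object) {
    throw new Error(`Malformed relation: ${node.text}`);
  }
  return new Relation(
    new Variable(subject.text),
    rtype.text,
    object.type === "variable" ? new Variable(object.text) : object.text,
  );
}

function toSelect(node: GenericNode): Select {
  if (node.type !== "select") {
    throw new Error(`Expected a select node, got ${node.type}`);
  }
  const selected = childrenOfType(node, "variable").map((child) => new Variable(child.text));
  const restrictions = childrenOfType(node, "relation").map(toRelation);
  return new Select(selected, restrictions);
}

// Usage with a hand-built tree standing for: Any X, Y WHERE X name 'Alice'
const leaf = (type: string, text: string): GenericNode => ({ type, text, children: [] });
const tree: GenericNode = {
  type: "select",
  text: "Any X, Y WHERE X name 'Alice'",
  children: [
    leaf("variable", "X"),
    leaf("variable", "Y"),
    {
      type: "relation",
      text: "X name 'Alice'",
      children: [leaf("variable", "X"), leaf("rtype", "name"), leaf("string", "'Alice'")],
    },
  ],
};
const select = toSelect(tree);
console.log(select.unconstrainedVariables()); // prints [ 'Y' ]
```

The point of the sketch is the extra layer: every node type of the grammar needs a conversion function like toRelation, and this layer has to be kept in sync with the grammar by hand.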
Handmade Recursive Descent Parser in TypeScript
FIXME: Continue describing the RDP pros/cons.
A TypeScript implementation of an RQL parser has been started in https://forge.extranet.logilab.fr/open-source/rqljs.
Pros:
- Error handling can be as rich as we want;
Cons:
- The grammar is not explicit;
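Both points above can be seen in a minimal sketch of a handmade recursive descent parser. It is not the rqljs implementation. It covers a hypothetical subset of RQL ("Any" followed by variables, then an optional "WHERE" with comma-separated relations), and all names in it (RqlSyntaxError, tokenize, Parser, parseRql) are assumptions made for illustration. Keywords are not reserved and only double-quoted strings are recognised.

```typescript
// Error carrying the source offset and what the parser expected there.
class RqlSyntaxError extends Error {
  readonly offset: number;
  readonly expected: string;
  constructor(offset: number, expected: string, found: string) {
    super(`Expected ${expected} at offset ${offset}, found ${found}`);
    this.name = "RqlSyntaxError";
    this.offset = offset;
    this.expected = expected;
  }
}

interface Token {
  kind: "word" | "string" | "comma" | "eof";
  text: string;
  offset: number;
}

function tokenize(source: string): Token[] {
  const tokens: Token[] = [];
  const pattern = /([A-Za-z_]\w*)|("[^"]*")|(,)/y; // sticky: matches at lastIndex only
  let pos = 0;
  while (pos < source.length) {
    if (/\s/.test(source.charAt(pos))) {
      pos++;
      continue;
    }
    pattern.lastIndex = pos;
    const match = pattern.exec(source);
    if (match === null) {
      throw new RqlSyntaxError(pos, "a word, a string or ','", JSON.stringify(source.charAt(pos)));
    }
    const kind = match[1] !== undefined ? "word" : match[2] !== undefined ? "string" : "comma";
    tokens.push({ kind, text: match[0], offset: pos });
    pos = pattern.lastIndex;
  }
  tokens.push({ kind: "eof", text: "<end of input>", offset: source.length });
  return tokens;
}

interface Relation { subject: string; rtype: string; object: string; }
interface Select { selected: string[]; restrictions: Relation[]; }

class Parser {
  private readonly tokens: Token[];
  private index = 0;
  constructor(source: string) {
    this.tokens = tokenize(source);
  }

  private peek(): Token {
    return this.tokens[this.index];
  }

  // Consume a token of the given kind (and text, if given) or fail with context.
  private expect(kind: Token["kind"], description: string, text?: string): Token {
    const token = this.peek();
    if (token.kind !== kind || (text !== undefined && token.text !== text)) {
      throw new RqlSyntaxError(token.offset, description, JSON.stringify(token.text));
    }
    this.index++;
    return token;
  }

  // list := item ("," item)*
  private parseList<T>(parseItem: () => T): T[] {
    const items = [parseItem()];
    while (this.peek().kind === "comma") {
      this.index++;
      items.push(parseItem());
    }
    return items;
  }

  // relation := variable rtype (variable | string)
  private parseRelation(): Relation {
    const subject = this.expect("word", "a subject variable").text;
    const rtype = this.expect("word", "a relation type").text;
    const token = this.peek();
    if (token.kind !== "word" && token.kind !== "string") {
      throw new RqlSyntaxError(token.offset, "a variable or a string", JSON.stringify(token.text));
    }
    this.index++;
    return { subject, rtype, object: token.text };
  }

  // select := "Any" list(variable) [ "WHERE" list(relation) ] end of input
  parseSelect(): Select {
    this.expect("word", "keyword 'Any'", "Any");
    const selected = this.parseList(() => this.expect("word", "a variable name").text);
    let restrictions: Relation[] = [];
    if (this.peek().kind === "word" && this.peek().text === "WHERE") {
      this.index++;
      restrictions = this.parseList(() => this.parseRelation());
    }
    this.expect("eof", "end of query");
    return { selected, restrictions };
  }
}

function parseRql(source: string): Select {
  return new Parser(source).parseSelect();
}

const query = parseRql('Any X, Y WHERE X name "Alice", X knows Y');
console.log(query.selected, query.restrictions.length);
```

Each parse method can say exactly what it expected and where, which is what makes the error handling rich. On the other hand, the grammar only exists as comments next to the methods, so nothing checks that the two stay consistent.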