(Back to Shell Autocompletion)
From Zulip:
NOTE: Speed isn't the most important thing when considering a model; I think the more important issues are expressiveness (what can it parse), readability, and debuggability.
O(n^3). There are like 10 different algorithms to recognize CFGs, with complex sets of advantages and disadvantages.
LR family
LALR(1) grammar (yacc): O(n), accepting all the limitations with shift-reduce conflicts and suchLL family
LL(1) which can be matched in O(n) time.LL(k) which I believe is also O(n), but ANTLR v4 introduced a more powerful model ALL(*) (“all-star”).O(n^2) loop or are doing exponential backtracking.sed: uses arbitrary code.awk: LALR(1) parser with yacc [1].pgen [2].One of the major motivations for OSH it to test this theory that Chet Ramey wrote about [3], after 20+ years maintaining bash:
One thing I've considered multiple times, but never done, is rewriting the bash parser using straight recursive-descent rather than using bison. [...] Were I starting bash from scratch, I probably would have written a parser by hand. It certainly would have made some things easier.
Andy: In my opinion, this experiment was a big success. The OSH parser is more maintainable and less buggy than bash's parser (although it's admittedly slower, being in Python). bash is 20+ years old and they are still fixing corner cases involving matching { } and ( ).
It's because their code s a messy mix of yacc and C code, and it would be better off as well-structured code (in C, or a higher level language). The interface between the two models messy and ill-defined (and filled with global variables).
Looking at parse.y in bash, there's much more C code there than there is yacc grammar. The grammar solves maybe 25% of the problems you have. And subst.c has a ton of ad hoc parsing code outside the grammar.
$ wc -l parse.y
>6513 parse.y
[1] I forked Kernighan's original Awk and found a couple minor bugs in it. https://github.com/andychu/bwk
[2] The "update" here is due to a private e-mail discussion I had with Guido on pgen's design. http://python-history.blogspot.com/2018/05/the-origins-of-pgen.html
[3] http://aosabook.org/en/bash.html