(Back to Shell Autocompletion)

From Zulip:

Parsing Models and Their Computational Complexity

NOTE: Speed isn't the most important thing when considering a model; I think the more important issues are expressiveness (what can it parse), readability, and debuggability.

Case Studies

One of the major motivations for OSH it to test this theory that Chet Ramey wrote about [3], after 20+ years maintaining bash:

One thing I've considered multiple times, but never done, is rewriting the bash parser using straight recursive-descent rather than using bison. [...] Were I starting bash from scratch, I probably would have written a parser by hand. It certainly would have made some things easier.

Andy: In my opinion, this experiment was a big success. The OSH parser is more maintainable and less buggy than bash's parser (although it's admittedly slower, being in Python). bash is 20+ years old and they are still fixing corner cases involving matching { } and ( ).

It's because their code s a messy mix of yacc and C code, and it would be better off as well-structured code (in C, or a higher level language). The interface between the two models messy and ill-defined (and filled with global variables).

Looking at parse.y in bash, there's much more C code there than there is yacc grammar. The grammar solves maybe 25% of the problems you have. And subst.c has a ton of ad hoc parsing code outside the grammar.

$ wc -l parse.y
>6513 parse.y

[1] I forked Kernighan's original Awk and found a couple minor bugs in it. https://github.com/andychu/bwk

[2] The "update" here is due to a private e-mail discussion I had with Guido on pgen's design. http://python-history.blogspot.com/2018/05/the-origins-of-pgen.html

[3] http://aosabook.org/en/bash.html