Old Notes for Compiler Engineer Job

More Background

However, mycpp needs a rewrite. It's using the internals of MyPy in a hacky way.

And I don't like the "inverted" visitor style. The current idea is to rewrite the type checker with Python 3.10's match statement (only released in October 2021!). We can reuse the ast library, which in recent versions of Python supports type comments.

However, most of the time I spent on this issue was in the relatively small C++ runtime. The garbage collector is a Cheney semi-space collector, and it took me a long time to debug! It's still not done.
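For readers who haven't seen one, the core of a Cheney collector can be sketched as a toy model. This is not the Oil runtime, which is C++ and works on raw memory. Here a "space" is a Python list, an object is a list of fields, and `Ref`, `Forward`, and `collect` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A pointer field: an index into the current space."""
    index: int


@dataclass
class Forward:
    """Forwarding marker left in from-space after an object has moved."""
    to: int


def collect(from_space: list, roots: list[Ref]) -> tuple[list, list[Ref]]:
    """Copy everything reachable from roots into a fresh to-space."""
    to_space: list = []

    def copy(ref: Ref) -> Ref:
        obj = from_space[ref.index]
        if isinstance(obj, Forward):
            # Already moved: reuse the new location (this handles cycles).
            return Ref(obj.to)
        new_index = len(to_space)
        to_space.append(list(obj))  # fields still point into from-space
        from_space[ref.index] = Forward(new_index)
        return Ref(new_index)

    new_roots = [copy(r) for r in roots]

    # The scan index chases the end of to_space. No recursion and no
    # explicit stack: to_space itself serves as the work queue.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        for i, field in enumerate(obj):
            if isinstance(field, Ref):
                obj[i] = copy(field)
        scan += 1
    return to_space, new_roots


if __name__ == "__main__":
    # Objects 0 and 1 form a cycle; objects 2 and 3 are unreachable garbage.
    heap = [["a", Ref(1)], ["b", Ref(0)], ["garbage"], ["c"]]
    live, roots = collect(heap, [Ref(1)])
    print(live, roots)
```

The algorithm itself is short. In a real runtime the difficulty tends to be elsewhere: every root has to be found, and every pointer held across an allocation has to be updated after objects move.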

High Level Idea / Work Estimates

You should look at this problem and think, "I can do this whole thing!" (I believe there are many compiler engineers out there with this skillset.)

Again, we want to translate 40K lines of typed Python to a similar amount of C++. And the result has to be debugged and work!

So the whole job should be approximately 10K to 15K lines of code?

However, note that we already have a working prototype, which was started in 2019. It passes over half the tests (though the garbage collector is not turned on; it only works on small examples). So I consider this project "low risk" in some sense.

Compensation / Starting Date

Location: can be anywhere in the world. You will probably be video conferencing with me in the US eastern time zone.

Other details

Code Requirements

Questions

Who will I be working with?

Mainly me :) I will be working on the language, documentation, and improving the (very large) build system. This mostly shouldn't affect what you're doing, although the code will be evolving. (We catch any regressions in type checking or tests on every commit.)

Why not do this yourself?

Basically because it's going too slowly, and potential users like Nix need a fast version of Oil.

Why not recruit open source contributors to do it?

I think this problem can be finished with a big block of time to concentrate on it. I'm spread too thin, and other contributors have jobs.

Note that there have been ~47 contributors to Oil's codebase, although none who have contributed consistently over many months.

Why C++?

Why not plain C?

C++ makes the translator easier to write, and the code should be easier to read and step through with GDB.

Why didn't you write the whole thing in C++ to begin with?

bash is at least 140K lines of C code. We implement most of it, and "engulf" it in the much richer Oil language. So you're basically asking why I didn't write 200K or 300K lines of C or C++ by hand :)

Our code is also memory safe by construction, since the metalanguage can't express anything unsafe. So we aim to have 5-10K lines of hand-written native code, rather than 200K or 300K lines.

Why Work on this? / What You Get For Free

TODO

Milestones

  1. First make more tests pass with the existing translator and old runtime. (2 weeks)
  2. Make Python's configure run (the OSH 0.0 milestone!).
  3. Make the existing ~1100 OSH tests pass.
  4. Make all the OSH tests pass.
  5. Make it fast.
  6. Later: Translate the Oil language.

TODO: When is it considered "done"? There are corner cases like the ./configure script and extended globs, etc.

Similar Work

Prerequisite Work

To Discuss

Fun Stuff in the Future

This is definitely out of the scope of the project.

But the person who is a good fit for this job might be excited by or interested in future work.

TODO: Tea language, bootstrapping, etc.