How-to-design-a-new-programming-language.md (5106B)
- ---
- title: How to design a new programming language from scratch
- date: 2020-12-25
- outputs: [html, gemtext]
- ---
- There is a long, difficult road from vague, pie-in-the-sky ideas about what
- would be cool to have in a new programming language, to a robust,
- self-consistent, practical implementation of those ideas. Designing and
- implementing a new programming language from scratch is one of the most
- challenging tasks a programmer can undertake.
- Note: this post is targeted at motivated programmers who want to make a
- serious attempt at designing a useful programming language. If you just want to
- make a language as a fun side project, then you can totally just wing it. Taking
- on an unserious project of that nature is also a good way to develop some
- expertise which will be useful for a serious project later on.
- Let's set the scene. You already know a few programming languages, and you know
- what you like and dislike about them — these are your influences. You have
- some cool novel language design ideas as well. A good first step from here is to
- dream up some pseudocode, putting some of your ideas to paper, so you can get an
- idea of what it would actually feel like to write or read code in this
- hypothetical language. Perhaps a short write-up or a list of goals and ideas is
- also in order. Circulate these among your confidants for discussion and
- feedback.
- Ideas need to be proven in the forge of implementations, and the next step is to
- write a compiler (or interpreter — everything in this article applies
- equally to them). We'll call this the sacrificial implementation, because you
- should be prepared to throw it away later. Its purpose is to prove that your
- design ideas work and can be implemented efficiently, but *not* to be the
- production-ready implementation of your new language. It's a tool to help you
- refine your language design.
- To this end, I would suggest using a parser generator like yacc to create your
- parser, even if you'd prefer to ultimately use a different design (e.g.
- recursive descent). The ability to quickly make changes to your grammar, and the
- side-effect of having a formal grammar written as you work, are both valuable to
- have at this stage of development. Being prepared to throw out the rest of the
- compiler is helpful because, due to the inherent difficulty of designing and
- implementing a programming language at the same time, your first implementation
- will probably be shit. You don't know what the language will look like, you'll
- make assumptions that you have to undo later, and it'll undergo dozens of
- refactorings. It's gonna suck.
- However, shit as it may be, it will have done important work in validating your
- ideas and refining your design. I would recommend that your next step is to
- start working on a formal specification of the language (something that I
- believe all languages should have). You've proven what works, and writing it up
- formally is a good way to finalize the ideas and address the edge cases. Gather
- a group of interested early adopters, contributors, and subject matter experts
- (e.g. compiler experts who work with similar languages), and hold discussions on
- the specification as you work.
- This is also a good time to start working on your second implementation. At this
- point, you will have a good grasp on the overall compiler design, the flaws from
- your original implementation, and better skills as a compiler programmer.
- Working on your second compiler and your specification at the same time can help
- as both endeavours inform the others — a particularly difficult detail to
- implement could lead to a simplification in the spec, and an under-specified
- detail getting shored up could lead to a more robust implementation.
- Don't get carried away — keep this new compiler simple and small. Don't go
- crazy on nice-to-have features like linters and formatters, an exhaustive test
- suite, detailed error messages, a sophisticated optimizer, and so on. You want
- it to implement the specification as simply as possible, so that you can use it
- for the next step: the hosted compiler. You need to write a third
- implementation, using your own language to compile itself.
- The second compiler, which I hope you wrote in C, is now the bootstrap compiler.
- I recommend keeping it up-to-date with the specification and maintaining it
- perpetually as a convenient path to bootstrap your toolchain from scratch
- (looking at you, Rust). But it's not going to be the final implementation: any
- self-respecting general-purpose programming language is implemented in itself.
- The next, and final step, is to implement your language for a third time.
- At this point, you will have refined and proven your language design. You will
- have developed and applied compiler programming skills. You will have a robust
- implementation for a complete and self-consistent programming language,
- developed carefully and with the benefit of hindsight. Your future community
- will thank you for the care and time you put into this work, as your language
- design and implementation sets the ceiling on the quality of programs written in
- it.