
drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git



---
title: How to design a new programming language from scratch
date: 2020-12-25
outputs: [html, gemtext]
---

There is a long, difficult road from vague, pie-in-the-sky ideas about what
would be cool to have in a new programming language, to a robust,
self-consistent, practical implementation of those ideas. Designing and
implementing a new programming language from scratch is one of the most
challenging tasks a programmer can undertake.

Note: this post is targeted at motivated programmers who want to make a
serious attempt at designing a useful programming language. If you just want to
make a language as a fun side project, then you can totally just wing it. Taking
on an unserious project of that nature is also a good way to develop some
expertise which will be useful for a serious project later on.

Let's set the scene. You already know a few programming languages, and you know
what you like and dislike about them; these are your influences. You have
some cool novel language design ideas as well. A good first step from here is to
dream up some pseudocode, putting some of your ideas to paper, so you can get an
idea of what it would actually feel like to write or read code in this
hypothetical language. Perhaps a short write-up or a list of goals and ideas is
also in order. Circulate these among your confidants for discussion and
feedback.

Ideas need to be proven in the forge of implementations, and the next step is to
write a compiler (or interpreter; everything in this article applies
equally to them). We'll call this the sacrificial implementation, because you
should be prepared to throw it away later. Its purpose is to prove that your
design ideas work and can be implemented efficiently, but *not* to be the
production-ready implementation of your new language. It's a tool to help you
refine your language design.

To this end, I would suggest using a parser generator like yacc to create your
parser, even if you'd prefer to ultimately use a different design (e.g.
recursive descent). The ability to quickly make changes to your grammar, and the
side-effect of having a formal grammar written as you work, are both valuable at
this stage of development. Being prepared to throw out the rest of the
compiler is helpful because, due to the inherent difficulty of designing and
implementing a programming language at the same time, your first implementation
will probably be shit. You don't know what the language will look like, you'll
make assumptions that you have to undo later, and it'll undergo dozens of
refactorings. It's gonna suck.
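
To make the point about a quickly editable, written-down grammar concrete, here
is a minimal sketch in Python. It is not yacc and not a real parser generator:
it is a tiny stand-in that interprets a grammar stored as data, using ordered
choice with naive backtracking. The grammar itself, the `NUMBER` token class,
and the names `GRAMMAR`, `tokenize`, `parse`, and `parse_program` are all
hypothetical and chosen only for illustration.

```python
# Hypothetical toy grammar, kept as data so that rules are cheap to change.
# Each rule maps to a list of alternatives, tried in order; each alternative
# is a sequence of rule names, literal tokens, or the NUMBER token class.
GRAMMAR = {
    "expr":   [["term", "+", "expr"], ["term"]],
    "term":   [["factor", "*", "term"], ["factor"]],
    "factor": [["(", "expr", ")"], ["NUMBER"]],
}


def tokenize(text):
    """Split text into integer tokens and single-character tokens."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            tokens.append(ch)
            i += 1
    return tokens


def parse(rule, tokens, pos=0):
    """Return (tree, next_pos) for the first alternative that matches, else None."""
    for alternative in GRAMMAR[rule]:
        children, cur = [], pos
        for symbol in alternative:
            if symbol in GRAMMAR:
                # Nonterminal: recurse into the named rule.
                result = parse(symbol, tokens, cur)
                if result is None:
                    break
                subtree, cur = result
                children.append(subtree)
            elif cur < len(tokens) and (
                tokens[cur] == symbol
                or (symbol == "NUMBER" and tokens[cur].isdigit())
            ):
                # Terminal: consume one matching token.
                children.append(tokens[cur])
                cur += 1
            else:
                break
        else:
            # Every symbol in this alternative matched.
            return (rule, children), cur
    return None


def parse_program(text):
    """Parse a whole input string as an `expr`, rejecting trailing tokens."""
    tokens = tokenize(text)
    result = parse("expr", tokens)
    if result is None or result[1] != len(tokens):
        raise SyntaxError(f"cannot parse: {text!r}")
    return result[0]
```

With this shape, changing the language means editing `GRAMMAR` and nothing
else. For example, adding the alternative `["factor", "/", "term"]` to `term`
would introduce division without touching the parsing code, which is the kind
of turnaround a parser generator gives you during the sacrificial stage.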

However, shit as it may be, it will have done important work in validating your
ideas and refining your design. I would recommend that your next step be to
start working on a formal specification of the language (something that I
believe all languages should have). You've proven what works, and writing it up
formally is a good way to finalize the ideas and address the edge cases. Gather
a group of interested early adopters, contributors, and subject matter experts
(e.g. compiler experts who work with similar languages), and hold discussions on
the specification as you work.

This is also a good time to start working on your second implementation. At this
point, you will have a good grasp of the overall compiler design, the flaws in
your original implementation, and better skills as a compiler programmer.
Working on your second compiler and your specification at the same time can help
as both endeavours inform each other: a particularly difficult detail to
implement could lead to a simplification in the spec, and an under-specified
detail getting shored up could lead to a more robust implementation.

Don't get carried away: keep this new compiler simple and small. Don't go
crazy on nice-to-have features like linters and formatters, an exhaustive test
suite, detailed error messages, a sophisticated optimizer, and so on. You want
it to implement the specification as simply as possible, so that you can use it
for the next step: the self-hosted compiler. You need to write a third
implementation, using your own language to compile itself.

The second compiler, which I hope you wrote in C, is now the bootstrap compiler.
I recommend keeping it up-to-date with the specification and maintaining it
perpetually as a convenient path to bootstrap your toolchain from scratch
(looking at you, Rust). But it's not going to be the final implementation: any
self-respecting general-purpose programming language is implemented in itself.
The next and final step is to implement your language for a third time, this
time in the language itself.
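
As a rough illustration of what bootstrapping the toolchain from scratch can
look like, here is a small Python sketch that lays out the build stages as
commands. Everything specific in it is an assumption: the file names, the
`stage0` to `stage3` binaries, and the `-o` flag of the new language's compiler
are hypothetical. The final comparison step is borrowed from common practice in
other self-hosted toolchains and is not something this post prescribes.

```python
import subprocess


def bootstrap_plan(c_compiler="cc",
                   bootstrap_src="bootstrap/main.c",
                   hosted_src="compiler/main.lang"):
    """Return the bootstrap commands, in order, as argument lists.

    All paths and the new compiler's command-line interface are hypothetical.
    """
    return [
        # Stage 0: a C compiler builds the bootstrap compiler (written in C).
        [c_compiler, "-o", "stage0", bootstrap_src],
        # Stage 1: the bootstrap compiler builds the self-hosted compiler.
        ["./stage0", "-o", "stage1", hosted_src],
        # Stage 2: the self-hosted compiler rebuilds itself.
        ["./stage1", "-o", "stage2", hosted_src],
        # Stage 3: rebuild once more with the stage 2 binary.
        ["./stage2", "-o", "stage3", hosted_src],
        # If the compiler is deterministic, stage 2 and stage 3 should match.
        ["cmp", "stage2", "stage3"],
    ]


def run_plan(plan):
    """Run each command in order, stopping at the first failure."""
    for command in plan:
        subprocess.run(command, check=True)
```

The reason to keep stage 0 alive is visible in the first command: the only
prerequisite for the whole chain is a C compiler, so nobody needs an existing
binary of your language to build your language.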

At this point, you will have refined and proven your language design. You will
have developed and applied compiler programming skills. You will have a robust
implementation for a complete and self-consistent programming language,
developed carefully and with the benefit of hindsight. Your future community
will thank you for the care and time you put into this work, as your language
design and implementation set the ceiling on the quality of programs written in
it.