
drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git

cozy-devnotes-machine-specs.md (11247B)


  1. ---
  2. date: 2017-02-22
  3. # vim: tw=80 spell :
  4. title: "Compiler devnotes: Machine specs"
  5. layout: post
  6. tags: [C, language design]
  7. ---
  8. I have a number of long-term projects that I plan for on long timelines, on the
  9. order of decades or more. One of these projects is cozy, a C toolchain. I
  10. haven't talked about this project in public before, so I'll start by introducing
  11. you to the project. The main C toolchains in the "actually usable" category are
  12. GNU and LLVM, but I'm satisfied with neither and I want to build my own
  13. toolchain. I see no reason why compilers should be deep magic. Here are my goals
  14. for cozy:
  15. - Self hosting and written in C
  16. - An easy to grok codebase and internal design
  17. - Focused on C. No built-in support for other languages
  18. - Adding new target architectures and ports should be straightforward
  19. - Modular build pipeline with lots of opportunities for external integrations
  20. - Trivially cross-compiles without building another version of the toolchain
  21. - Includes a decent optimizer
  22. Some other plans include opinionated warnings about code and minimal support for
  23. language extensions. Ambitious goals, right? That's why this project is on my
  24. long-term schedule. I've found that large projects are entirely feasible, so
  25. long as you (1) start them and (2) keep working on them for a long time. I don't
  26. need to rush this - gcc and clang may not be ideal, but they work today. In
  27. support of these goals, I'll be writing these dev notes to explain my design
  28. choices and gather feedback — please [email me](mailto:sir@cmpwn.com) if
  29. you have some!
  30. Since I want to place an emphasis on portability and retargetability, I'm
  31. starting by designing the machine spec and its support code, which is used to
  32. add support for new architectures. I don't like gcc's lisp specs, and I *really*
  33. don't like LLVM's "huge pile of C++" approach. I think a really good machine
  34. spec meets these goals:
  35. - Easy to write and human friendly
  36. - More about data than code, but
  37. - Easily extended with C to support architecture-specific nuances
  38. - Provides loads of useful metadata about the target architecture
  39. - Exposes information about the speed and side-effects of each instruction
  40. - Can also be used to generate an assembler and disassembler
  41. - Easily reused to create derivative architectures
  42. Adding a new architecture should be a weekend project, and when you're done the
  43. entire toolchain should both support and run on your new architecture. I set out
  44. to come up with a new syntax that could potentially meet these goals. I started
  45. with the Z80 architecture in mind because it's simple, I'm intimately familiar
  46. with it, and I want cozy to be able to target 8-bit machines just as easily as
  47. 32- or 64-bit ones.
  48. For reference, here are the gcc and LLVM guides on adding new targets:
  49. - [gcc - Anatomy of a Target Back End](https://gcc.gnu.org/onlinedocs/gccint/Back-End.html)
  50. - [Writing an LLVM Backend](http://llvm.org/docs/WritingAnLLVMBackend.html)
  51. The cozy machine spec is a cross between ini files, yaml, and a custom syntax.
  52. The format is somewhat complex, but once understood is intuitive and flexible.
  53. At the top level, it looks like an ini file:
  54. ```yaml
  55. [metadata]
  56. # ...
  57. [registers]
  58. # ...
  59. [macros]
  60. # ...
  61. [instructions]
  62. # ...
  63. ```
  64. ### Metadata
  65. The **metadata** section contains some high-level information about the
  66. architecture design, and is the simplest section to understand. It currently
  67. looks like this for z80:
  68. ```yaml
  69. [metadata]
  70. name: z80
  71. bits: 8
  72. endianness: little
  73. signedness: twos-complement
  74. cache: none
  75. pipeline: none
  76. ```
  77. This isn't comprehensive, and I'll be adding more metadata as it becomes
  78. necessary. On LLVM, this sort of information is encoded into a string that looks
  79. something like this: `"e-p:16:8:8-i8:8:8-i16:8:8-n8:16"`. This string is passed
  80. into the `LLVMTargetMachine` base constructor in C++. I think we can do a hell
  81. of a lot better than that!
  82. ### Registers
  83. The **registers** section describes the registers on this architecture.
  84. ```yaml
  85. [registers]
  86. BC: 16
  87.     B: 8
  88.     C: 8; offset=8
  89. DE: 16
  90.     D: 8
  91.     E: 8; offset=8
  92. HL: 16
  93.     H: 8
  94.     L: 8; offset=8
  95. SP: 16; stack
  96. PC: 16; program
  97. ```
  98. Here we can start to see some interesting syntax and get an idea of the design
  99. of cozy machine specs. The contents of each section are keys, which have values,
  100. attributes, and children. The format looks like this:
  101. ```yaml
  102. key: value; attributes, ...
  103.     children...
  104. ```
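As a rough illustration of how simple this line format is to consume, here's a minimal C sketch that splits one `key: value; attributes, ...` line into its parts. Everything in it (`struct spec_entry`, `parse_entry`, the fixed buffer sizes) is a hypothetical name chosen for the example, not part of cozy. It only handles the flat form shown above: it does not track children, the `;;` inherited attributes, or bracketed attributes that contain commas.

```c
#include <ctype.h>
#include <string.h>

#define MAX_ATTRS 8

/* Hypothetical parsed form of one spec line; not part of cozy itself. */
struct spec_entry {
    char key[32];
    char value[32];
    char attrs[MAX_ATTRS][32];
    int nattrs;
};

/* Copy src[0..len) into dst (capacity cap), trimming surrounding whitespace. */
static void copy_trimmed(char *dst, size_t cap, const char *src, size_t len)
{
    while (len > 0 && isspace((unsigned char)*src)) { src++; len--; }
    while (len > 0 && isspace((unsigned char)src[len - 1])) len--;
    if (len >= cap) len = cap - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Parse "key: value; attr, attr". Returns 0 on success, -1 if no ':' found. */
int parse_entry(const char *line, struct spec_entry *out)
{
    const char *colon = strchr(line, ':');
    if (!colon) return -1;
    copy_trimmed(out->key, sizeof out->key, line, (size_t)(colon - line));

    /* The value runs up to the first ';' (or to the end of the line). */
    const char *rest = colon + 1;
    const char *semi = strchr(rest, ';');
    size_t vlen = semi ? (size_t)(semi - rest) : strlen(rest);
    copy_trimmed(out->value, sizeof out->value, rest, vlen);

    /* Attributes are a comma-separated list after the ';'. */
    out->nattrs = 0;
    if (!semi) return 0;
    const char *p = semi + 1;
    while (*p && out->nattrs < MAX_ATTRS) {
        const char *comma = strchr(p, ',');
        size_t alen = comma ? (size_t)(comma - p) : strlen(p);
        copy_trimmed(out->attrs[out->nattrs], sizeof out->attrs[0], p, alen);
        if (out->attrs[out->nattrs][0] != '\0') out->nattrs++;
        if (!comma) break;
        p = comma + 1;
    }
    return 0;
}
```

Feeding it `SP: 16; stack` gives the key `SP`, the value `16`, and a single attribute `stack`. A real parser would also need to use indentation to attach children to their parent key.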
  105. In this example, we've defined the BC, DE, HL, SP, and PC registers. HL, DE, and
  106. BC are general purpose 16-bit registers, and each can also be used as two
  107. separate 8-bit registers. The attributes for these sub-registers indicate their
  108. offsets in the parent register. We also define the stack and program registers,
  109. SP and PC, which use the stack and program attributes to indicate their special
  110. purposes.
  111. We can also describe CPU flags in this section:
  112. ```yaml
  113. [registers]
  114. AF: 16; special
  115.     A: 8; accumulator
  116.     F: 8; flags, offset 8;; flag
  117.         _C: 1
  118.         _N: 1; offset 1
  119.         _PV: 1; offset 2
  120.         _3: 1; offset 3, undocumented
  121.         _H: 1; offset 4
  122.         _5: 1; offset 5, undocumented
  123.         _Z: 1; offset 6
  124.         _S: 1; offset 7
  125. ```
  126. Here we introduce another feature of cozy specs with `F: 8; flags, offset 8;;
  127. flag`. Using `;;` adds those attributes to all children of this key, so each of
  128. \_C, \_N, etc. has the `flag` attribute.
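To make the flag description a bit more concrete, here's a small C sketch of what a tool could do with it once loaded. The `struct flag_def` table and the `find_flag` and `flag_is_set` helpers are made up for this example. I'm assuming a flag's offset is its bit position within F counted from the least significant bit, which matches the z80's real F register layout (carry is bit 0, sign is bit 7).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of one flag child of F. */
struct flag_def {
    const char *name;
    unsigned offset;    /* bit position in F, counted from the LSB (assumption) */
    int undocumented;   /* 1 if the spec marks this flag as undocumented */
};

/* The children of F from the spec above; each implicitly carries `flag`. */
static const struct flag_def z80_flags[] = {
    { "_C",  0, 0 }, { "_N", 1, 0 }, { "_PV", 2, 0 }, { "_3", 3, 1 },
    { "_H",  4, 0 }, { "_5", 5, 1 }, { "_Z",  6, 0 }, { "_S", 7, 0 },
};

/* Look a flag up by name; returns NULL if the spec has no such flag. */
const struct flag_def *find_flag(const char *name)
{
    for (size_t i = 0; i < sizeof z80_flags / sizeof z80_flags[0]; i++) {
        if (strcmp(z80_flags[i].name, name) == 0)
            return &z80_flags[i];
    }
    return NULL;
}

/* Returns 1 or 0 for the named flag in f, or -1 if the flag is unknown. */
int flag_is_set(uint8_t f, const char *name)
{
    const struct flag_def *def = find_flag(name);
    if (!def) return -1;
    return (f >> def->offset) & 1;
}
```

A disassembler or debugger could use a table like this to pretty-print F, and could hide the `undocumented` entries unless asked for them.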
  129. Take note of the "undocumented" attribute here. Some of the metadata included
  130. in a spec can be applied to cozy tools. Some of it, however, is there for other
  131. tools to utilize. We have a good opportunity to make a machine-readable
  132. description of the architecture, so I've opted to include a lot of extra details
  133. in machine specs that third parties could utilize (though there might be a
  134. -fno-undocumented compiler flag some day, I guess).
  135. ### Macros
  136. The **macros** section is heavily tied to the instructions section. Most instruction
  137. sets are quite large, and I don't want to burden spec authors with writing out
  138. the entire thing. We can speed up their work by providing macros.
  139. z80 instructions have a few sets of common patterns in their encodings. Register
  140. groups are often represented by the same set of bits, and we can make our
  141. instruction set specification more concise by taking advantage of this. For
  142. example, here's a macro that we can use for instructions that can use either the
  143. BC, DE, HL, or SP registers:
  144. ```yaml
  145. [macros]
  146. reg_BCDEHLSP:
  147.     BC: 00
  148.     DE: 01
  149.     HL: 10
  150.     SP: 11
  151. ```
  152. We have the name of the macro as the top-level key, in this case `reg_BCDEHLSP`.
  153. We can later refer to this macro with `@reg_BCDEHLSP`. Then, we have each of the
  154. cases it can match on, and the binary values these correspond to when encoded in
  155. an instruction.
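Here's a minimal C sketch of what matching against such a macro could look like. The `struct macro_case` layout and the `macro_match` function are hypothetical names for illustration only. I'm also assuming operands should match case-insensitively, so that both `HL` and `hl` work.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical compiled form of one case in a macro. */
struct macro_case {
    const char *operand;  /* text the macro matches, e.g. "HL" */
    unsigned bits;        /* value substituted into the encoding */
    unsigned width;       /* number of bits that value occupies */
};

/* The reg_BCDEHLSP macro from the spec above. */
static const struct macro_case reg_BCDEHLSP[] = {
    { "BC", 0x0, 2 }, { "DE", 0x1, 2 }, { "HL", 0x2, 2 }, { "SP", 0x3, 2 },
};

/* Case-insensitive string equality (assumed matching rule). */
static int eq_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Returns the bit pattern for operand, or -1 if no case matches. */
int macro_match(const struct macro_case *cases, size_t n, const char *operand)
{
    for (size_t i = 0; i < n; i++) {
        if (eq_nocase(cases[i].operand, operand))
            return (int)cases[i].bits;
    }
    return -1;
}
```

With this table, `macro_match(reg_BCDEHLSP, 4, "hl")` yields 2, which is the `10` from the spec, and an operand the macro does not list (such as `AF`) yields -1.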
  156. ### Instructions
  157. The **instructions** section brings everything together and defines the actual
  158. instructions available on this architecture. Instructions can be organized into
  159. groups at the spec author's pleasure, which can be referenced by derivative
  160. architectures. Here we can take a look at the "load" group:
  161. ```yaml
  162. [instructions]
  163. .load:
  164.     ld:
  165.         @reg_BCDEHLSP, @imm[16]: 00 $1 0001 $2
  166. ```
  167. On z80, the `ld` instruction is similar to the `mov` instruction on Intel
  168. architectures. It assigns the second argument to the first. This could be used
  169. to assign registers to each other (e.g. `ld a, b` to set A = B), to set
  170. registers to constants, and so on. Our example here uses our macro from earlier
  171. to match instructions like this:
  172.     ld hl, 0x1234
  173. The value for this key may reference the arguments with variables. $1 here
  174. equals `10`, from the macro. The `imm` built-in is implemented in C to match
  175. constants and provides $2. An assembler could use this information to assemble
  176. our example instruction into this machine code:
  177.     00100001 00110100 00010010
  178. Which will load HL with the value 0x1234 when executed.
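The encoding step above is small enough to write out by hand. Here's a C sketch of it, assuming a hypothetical `encode_ld_reg16_imm16` helper (not a cozy API) that takes the 2-bit value from `@reg_BCDEHLSP` as `$1` and the 16-bit immediate as `$2`, and emits the immediate low byte first, as the little-endian metadata and the example bytes suggest.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode "ld <reg16>, <imm16>" from the pattern `00 $1 0001 $2`.
 * reg_bits is the 2-bit macro value ($1); imm is the constant ($2).
 * Returns the number of bytes written (3), or 0 if reg_bits is out of
 * range or the output buffer is too small.
 */
size_t encode_ld_reg16_imm16(unsigned reg_bits, uint16_t imm,
                             uint8_t *out, size_t cap)
{
    if (cap < 3 || reg_bits > 3)
        return 0;
    /* 00 rr 0001: the register bits land in bits 5-4 of the opcode. */
    out[0] = (uint8_t)((reg_bits << 4) | 0x01);
    /* 16-bit immediate, least significant byte first. */
    out[1] = (uint8_t)(imm & 0xFF);
    out[2] = (uint8_t)(imm >> 8);
    return 3;
}
```

Calling it with `reg_bits = 2` (HL) and `imm = 0x1234` produces the bytes 0x21, 0x34, 0x12, which are the three binary octets shown above. A generated assembler would derive this logic from the spec rather than having it written per instruction.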
  179. ### Lots more metadata
  180. Now that we have the basics down, let's dive into some deeper details. Cozy
  181. specs are designed to provide most of the information the *entire toolchain*
  182. needs to support an architecture. The information we have so far could be used
  183. to generate assemblers and disassemblers, but I want this file to be able to
  184. generate things like optimizers as well. You can add the necessary metadata to
  185. each instruction by utilizing attributes.
  186. Consider the z80 instruction LDIR, which stands for
  187. "load/decrement/increment/repeat". This instruction is used for memcpy
  188. operations. To use it, you set the HL register to a source address, the DE
  189. register to a destination address, and BC to a length. This instruction looks
  190. like this in the spec:
  191. ```yaml
  192. ldir: 11101101 10110000; uses[HL, DE, BC], \
  193. affects[HL[+BC], DE[+BC], BC[0]], \
  194. flags[_H:0,_N:0,_PV:0], cycles[BC * 21 - 5]
  195. ```
  196. That's a lot of attributes! The purpose of these attributes is to give the
  197. toolchain insights into the registers this instruction uses, its side effects,
  198. and how fast it is. These attributes can help us compare the efficiency of
  199. different approaches and understand how the state of registers evolves
  200. during a function, which leads to all sorts of useful optimizations.
  201. The `affects` attribute, for example, tells us how each register is affected by
  202. this instruction. We can see that after this instruction, HL and DE will have
  203. had BC added to them, and BC will have been set to 0. We can make all sorts of
  204. optimizations based on this knowledge. Here are some examples:
  205. ```c
  206. char *dest, *src;
  207. int len = 10;
  208. memcpy(dest, src, len);
  209. src += len;
  210. ```
  211. The compiler can assign `src` to HL, `dest` to DE, and `len` to BC. We can then
  212. optimize out the final statement entirely because we know that the LDIR
  213. instruction will have already added BC to HL for us.
  214. ```c
  215. char *dest, *src;
  216. int len = 10;
  217. memcpy(dest, src, len);
  218. int foobar = 0;
  219. ```
  220. In this case, the register allocator can just assign BC to `foobar` and avoid
  221. initializing it because we know it's already going to be zero. Many other
  222. optimizations are made possible when we are keeping track of the side effects of
  223. each instruction.
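Here's a rough C sketch of the bookkeeping behind the second example. The types and functions (`struct reg_state`, `apply_ldir`, `load_is_redundant`) are hypothetical and deliberately simplified: they only track registers whose values are known constants. The first example, where HL holds an unknown pointer plus BC, would need symbolic tracking that this sketch does not attempt.

```c
#include <stdbool.h>
#include <stdint.h>

/* What the compiler knows about one 16-bit register at a program point. */
struct reg_state {
    bool known;      /* true if the register holds a known constant */
    uint16_t value;  /* that constant; only meaningful when known is true */
};

struct z80_state {
    struct reg_state hl, de, bc;
};

/* Apply the side effects from affects[HL[+BC], DE[+BC], BC[0]]. */
void apply_ldir(struct z80_state *s)
{
    if (s->bc.known) {
        /* HL and DE advance by BC; 16-bit arithmetic wraps around. */
        if (s->hl.known)
            s->hl.value = (uint16_t)(s->hl.value + s->bc.value);
        if (s->de.known)
            s->de.value = (uint16_t)(s->de.value + s->bc.value);
    } else {
        /* Unknown length: HL and DE are no longer known constants. */
        s->hl.known = false;
        s->de.known = false;
    }
    /* Whatever BC was, the spec says it is 0 afterwards. */
    s->bc.known = true;
    s->bc.value = 0;
}

/* True if loading `constant` into this register would be redundant. */
bool load_is_redundant(const struct reg_state *r, uint16_t constant)
{
    return r->known && r->value == constant;
}
```

After `apply_ldir`, `load_is_redundant(&state.bc, 0)` is true even when BC was unknown beforehand, which is the fact that lets the register allocator skip initializing `foobar`.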
  224. ## Next steps
  225. I've iterated over this spec design for a while now, and I'm pretty happy with
  226. it. I would love to hear your feedback. Assuming that this looks good, my next
  227. step is writing more specs, and a tool that parses and compiles them to C. These
  228. C files are going to be linked into `libcozyspec`, which will provide an API to
  229. access all of this metadata from C. It will also include an instruction matcher,
  230. which will be utilized by the next step - writing the assembler.
  231. The assembler is going to take a while, because I don't want to go the gas route
  232. of making a half-baked assembler that's more useful for compiling the C
  233. compiler's output than for anything else. I want to make an assembler that
  234. assembly programmers would *want* to use.
  235. I have not yet designed an intermediate bytecode for the compiler to use, but
  236. one will have to be made. The machine spec will likely change somewhat to
  237. accommodate this. Some of the conversion from internal bytecode to target
  238. assembly can likely be inferred from metadata, but some will have to be done
  239. manually for each architecture.
  240. [Here's the entire z80 spec](https://sr.ht/7_Pe.txt) I've been working on, for
  241. your reading pleasure.