drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git

2024-05-24-Bunnix.md (10719B)


  1. ---
  2. title: Writing a Unix clone in about a month
  3. date: 2024-05-24
  4. ---
  5. I needed a bit of a break from "real work" recently, so I started a new
  6. programming project that was low-stakes and purely recreational. On April 21st,
  7. I set out to see how much of a Unix-like operating system for x86_64 targets
  8. I could put together in about a month. The result is
  9. [Bunnix](https://git.sr.ht/~sircmpwn/bunnix). Not including days I didn't work
  10. on Bunnix for one reason or another, I spent 27 days on this project.
  11. You can try it for yourself if you like:
  12. * [Bunnix 0.0.0 iso](https://cyanide.ayaya.dev/bunnix.iso)
  13. To boot this ISO with qemu:
  14. ```
  15. qemu-system-x86_64 -cdrom bunnix.iso -display sdl -serial stdio
  16. ```
  17. You can also write the iso to a USB stick and boot it on real hardware. It will
  18. probably work on most AMD64 machines -- I have tested it on a ThinkPad X220 and
  19. a Starlabs Starbook Mk IV. Legacy boot and EFI are both supported. There are
  20. some limitations to keep in mind, in particular that there is no USB support, so
  21. a PS/2 keyboard (or PS/2 emulation via the BIOS) is required. Most laptops rig
  22. up the keyboard via PS/2, and <abbr title="your mileage may vary">YMMV</abbr>
  23. with USB keyboards via PS/2 emulation.
  24. *Tip: the DOOM keybindings are weird. WASD to move, right shift to shoot, and
  25. space to open doors. Exiting the game doesn't work so just reboot when you're
  26. done playing. I confess I didn't spend much time on that port.*
  27. ## What's there?
  28. The Bunnix kernel is (mostly) written in [Hare](https://harelang.org), plus some
  29. C components, namely lwext4 for ext4 filesystem support and libvterm for the
  30. kernel video terminal.
  31. The kernel supports the following drivers:
  32. * PCI (legacy)
  33. * AHCI block devices
  34. * GPT and MBR partition tables
  35. * PS/2 keyboards
  36. * Platform serial ports
  37. * CMOS clocks
  38. * Framebuffers (configured by the bootloaders)
  39. * ext4 and memfs filesystems
  40. There are numerous supported kernel features as well:
  41. * A virtual filesystem
  42. * A /dev populated with block devices, null, zero, and full pseudo-devices,
  43. /dev/kbd and /dev/fb0, serial and video TTYs, and the /dev/tty controlling
  44. terminal.
  45. * Reasonably complete terminal emulator and somewhat passable termios support
  46. * Some 40 syscalls, including for example clock_gettime, poll, openat et al,
  47. fork, exec, pipe, dup, dup2, ioctl, etc
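To give a feel for what that syscall set enables, here is a minimal userspace sketch in C that combines fork, pipe, and dup2 to capture a child's output. It is written against POSIX rather than Bunnix specifically, and the function name `capture_child_output` is made up for illustration. A shell would call exec in the child; this sketch writes a fixed message instead so that it does not depend on any external program.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child whose stdout is redirected into a pipe, and collects what
 * the child writes. Returns the number of bytes captured, or -1 on error.
 * `len` must be at least 1; the result is always NUL-terminated. */
ssize_t capture_child_output(char *buf, size_t len)
{
	int fds[2];
	if (len == 0 || pipe(fds) < 0)
		return -1;

	pid_t pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: point stdout at the pipe's write end. A shell would
		 * call exec here; we just write a fixed message instead. */
		static const char msg[] = "hello from the child\n";
		dup2(fds[1], STDOUT_FILENO);
		close(fds[0]);
		close(fds[1]);
		if (write(STDOUT_FILENO, msg, sizeof(msg) - 1) < 0)
			_exit(1);
		_exit(0);
	}

	/* Parent: drop the write end so read() reports EOF once the child
	 * has exited. */
	close(fds[1]);
	size_t total = 0;
	while (total < len - 1) {
		ssize_t n = read(fds[0], buf + total, len - 1 - total);
		if (n <= 0)
			break;
		total += (size_t)n;
	}
	buf[total] = '\0';
	close(fds[0]);
	waitpid(pid, NULL, 0);
	return (ssize_t)total;
}
```

Calling `capture_child_output(buf, sizeof(buf))` with a 64-byte buffer should leave the child's message in `buf`. This is essentially the plumbing a shell sets up for each stage of a pipeline.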
  48. Bunnix is a single-user system and does not currently attempt to enforce Unix
  49. file modes and ownership, though it could be made multi-user relatively easily
  50. with a few more days of work.
  51. Included are two bootloaders, one for legacy boot which is multiboot-compatible
  52. and written in Hare, and another for EFI which is written in C. Both of them
  53. load the kernel as an ELF file plus an initramfs, if required. The EFI
  54. bootloader includes zlib to decompress the initramfs; multiboot-compatible
  55. bootloaders handle this decompression for us.
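As a rough illustration of the first thing an ELF-loading bootloader has to do, the following C sketch validates an ELF64 header and extracts the entry point. This is not the Bunnix bootloader's code: the struct and function names are invented here, the field layout follows the ELF specification, and reading the multi-byte fields directly assumes a little-endian host such as x86_64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF64 file header, laid out as in the ELF specification. */
struct elf64_header {
	uint8_t  ident[16];   /* magic, class, data encoding, ... */
	uint16_t type;
	uint16_t machine;
	uint32_t version;
	uint64_t entry;       /* virtual address of the entry point */
	uint64_t phoff;       /* file offset of the program headers */
	uint64_t shoff;
	uint32_t flags;
	uint16_t ehsize;
	uint16_t phentsize;
	uint16_t phnum;
	uint16_t shentsize;
	uint16_t shnum;
	uint16_t shstrndx;
};

enum {
	ELF_CLASS_64 = 2,         /* ident[4]: 64-bit objects */
	ELF_DATA_LSB = 1,         /* ident[5]: little-endian */
	ELF_MACHINE_X86_64 = 62,
};

/* Checks that `image` starts with a plausible little-endian x86_64 ELF64
 * header and, if so, stores the entry point in `*entry`. */
bool elf64_check(const void *image, size_t len, uint64_t *entry)
{
	struct elf64_header hdr;
	if (len < sizeof(hdr))
		return false;
	memcpy(&hdr, image, sizeof(hdr)); /* avoids unaligned access */

	if (memcmp(hdr.ident, "\x7f" "ELF", 4) != 0)
		return false;
	if (hdr.ident[4] != ELF_CLASS_64 || hdr.ident[5] != ELF_DATA_LSB)
		return false;
	if (hdr.machine != ELF_MACHINE_X86_64)
		return false;
	if (hdr.phentsize == 0 || hdr.phnum == 0)
		return false; /* no program headers, so nothing to load */
	*entry = hdr.entry;
	return true;
}
```

A real loader would go on to walk the program headers at `phoff` and copy each loadable segment into place before jumping to the entry point.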
  56. The userspace is largely assembled from third-party sources. The following
  57. third-party software is included:
  58. * Colossal Cave Adventure (advent)
  59. * dash (/bin/sh)
  60. * Doom
  61. * gzip
  62. * less (pager)
  63. * lok (/bin/awk)
  64. * lolcat
  65. * mandoc (man pages)
  66. * sbase (core utils)[^1]
  67. * tcc (C compiler)
  68. * Vim 5.7
  69. The libc is derived from musl libc and contains numerous modifications to suit
  70. Bunnix's needs. The curses library is based on netbsd-curses.
  71. [^1]: sbase is good software written by questionable people. I do not endorse suckless.
  72. The system works but it's pretty buggy and some parts of it are quite slapdash:
  73. your mileage will vary. Be prepared for it to crash!
  74. ## How Bunnix came together
  75. I started documenting the process on Mastodon on day 3 -- check out [the
  76. Mastodon thread](https://fosstodon.org/@drewdevault/112319697309218275) for the
  77. full story. Here's what it looked like on day 3:
  78. ![Screenshot of an early Bunnix build, which boots up, sets up available memory, and exercises an early in-memory filesystem](https://cdn.fosstodon.org/media_attachments/files/112/319/693/110/194/041/original/2c0bd7006a74aece.png)
  79. Here are some thoughts after the fact.
  80. Some of Bunnix's code stems from an earlier project,
  81. [Helios](https://sr.ht/~sircmpwn/helios). This includes portions of the kernel
  82. which are responsible for some relatively generic CPU setup (GDT, IDT, etc), and
  83. some drivers like AHCI were adapted for the Bunnix system. I admit that it would
  84. probably not have been possible to build Bunnix so quickly without prior
  85. experience through Helios.
  86. Two of the more challenging aspects were ext4 support and the virtual terminal,
  87. for which I brought in two external dependencies, lwext4 and libvterm. Both
  88. proved to be challenging integrations. I had to rewrite my filesystem layer a
  89. few times, and it's still buggy today, but getting a proper Unix filesystem
  90. design (including openat and good handling of inodes) requires digging into
  91. lwext4 internals a bit more than I'd have liked. I also learned a lot about
  92. mixing source languages into a Hare project, since the kernel links together
  93. Hare, assembly, and C sources -- it works remarkably well but there are some
  94. pain points I noticed, particularly with respect to building the ABI integration
  95. riggings. It'd be nice to automate conversion of C headers into Hare forward
  96. declaration modules. Some of this work already exists in hare-c, but has a ways
  97. to go. If I were to start again, I would probably be more careful in my design
  98. of the filesystem layer.
  99. Getting the terminal right was difficult as well. I wasn't sure that I was going
  100. to add one at all, but I eventually decided that I wanted to port vim and that
  101. was that. libvterm is a great terminal state machine library, but it's poorly
  102. documented and required a lot of fine-tuning to integrate just right. I also
  103. ended up spending a lot of time on performance to make sure that the terminal
  104. worked smoothly.
  105. Another difficult part to get right was the scheduler. Helios has a simpler
  106. scheduler than Bunnix, and while I initially based the Bunnix scheduler on
  107. Helios I had to throw out and rewrite quite a lot of it. Both Helios and Bunnix
  108. are single-CPU systems, but unlike Helios, Bunnix allows context switching
  109. within the kernel -- in fact, even preemptive task switching enters and exits
  110. via the kernel. This necessitates multiple kernel stacks and a different
  111. approach to task switching. However, the advantages are numerous: for one,
  112. blocking operations like disk reads and pipe(2) are much simpler to
  113. implement with wait queues. With a robust enough scheduler, the rest of the kernel
  114. and its drivers come together pretty easily.
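To make the wait queue idea concrete, here is a toy model in C. It is only a sketch of the general technique, not Bunnix's scheduler (which is written in Hare), and every name in it is hypothetical. A blocking operation parks the current task on a queue tied to some event, and whoever produces that event wakes the waiters.

```c
#include <assert.h>
#include <stddef.h>

enum task_state { TASK_RUNNABLE, TASK_BLOCKED };

struct task {
	int id;
	enum task_state state;
	struct task *next;   /* link used while parked on a wait queue */
};

/* FIFO list of tasks blocked on one event, e.g. "data arrived in a pipe". */
struct wait_queue {
	struct task *head, *tail;
};

/* Parks `t` on `wq`. A real kernel would then switch to another task; this
 * model only records the state change. */
void wq_sleep(struct wait_queue *wq, struct task *t)
{
	t->state = TASK_BLOCKED;
	t->next = NULL;
	if (wq->tail)
		wq->tail->next = t;
	else
		wq->head = t;
	wq->tail = t;
}

/* Makes the longest-waiting task runnable again; returns it, or NULL if
 * nobody is waiting. */
struct task *wq_wake_one(struct wait_queue *wq)
{
	struct task *t = wq->head;
	if (!t)
		return NULL;
	wq->head = t->next;
	if (!wq->head)
		wq->tail = NULL;
	t->next = NULL;
	t->state = TASK_RUNNABLE;
	return t;
}

/* Wakes every waiter (for example when a pipe's write end closes);
 * returns how many tasks were woken. */
size_t wq_wake_all(struct wait_queue *wq)
{
	size_t n = 0;
	while (wq_wake_one(wq))
		n++;
	return n;
}
```

In this model a read on an empty pipe would call `wq_sleep` on the pipe's queue, and the corresponding write would call `wq_wake_one`. The part that needs per-task kernel stacks is the context switch that a real `wq_sleep` performs while still inside the kernel.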
  115. Another source of frustration was signals, of course. Helios does not attempt to
  116. be a Unix and gets away without these, but to build a Unix, I needed to
  117. implement signals, big messy hack though they may be. The signal implementation
  118. which ended up in Bunnix is pretty bare-bones: I mostly made sure that SIGCHLD
  119. worked correctly so that I could port dash.
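For context on why SIGCHLD matters to a shell, here is a small POSIX sketch in C of the userspace side: a handler that reaps exited children, and a loop that sleeps until they have all been collected. It illustrates the standard pattern only and is not taken from dash or Bunnix; `spawn_and_reap` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped;

/* SIGCHLD handler: reap every child that has exited so far. One signal may
 * stand for several exits, hence the loop. */
static void on_sigchld(int sig)
{
	(void)sig;
	int saved = errno;
	while (waitpid(-1, NULL, WNOHANG) > 0)
		reaped++;
	errno = saved;
}

/* Forks `n` children that exit immediately, then sleeps until the handler
 * has reaped all of them. Returns the number reaped, or -1 on error. */
int spawn_and_reap(int n)
{
	struct sigaction sa = {0};
	sa.sa_handler = on_sigchld;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	if (sigaction(SIGCHLD, &sa, NULL) < 0)
		return -1;

	/* Block SIGCHLD so it can only be delivered inside sigsuspend(). */
	sigset_t block, old, wait_mask;
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	sigprocmask(SIG_BLOCK, &block, &old);
	wait_mask = old;
	sigdelset(&wait_mask, SIGCHLD);

	reaped = 0;
	for (int i = 0; i < n; i++) {
		pid_t pid = fork();
		if (pid < 0) {
			sigprocmask(SIG_SETMASK, &old, NULL);
			return -1;
		}
		if (pid == 0)
			_exit(0);
	}

	/* Atomically unblock SIGCHLD and sleep until the handler has run. */
	while (reaped < n)
		sigsuspend(&wait_mask);
	sigprocmask(SIG_SETMASK, &old, NULL);
	return reaped;
}
```

For this to work, the kernel has to deliver SIGCHLD when a child exits, run the handler on the parent's stack, and resume the interrupted code afterwards, which is a fair amount of machinery for a single signal.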
  120. Porting third-party software was relatively easy thanks to basing my libc on
  121. musl libc. I imported large swaths of musl into my own libc and adapted it to
  122. run on Bunnix, which gave me a pretty comprehensive and reliable C library
  123. pretty fast. With this in place, porting third-party software was a breeze, and
  124. most of the software that's included was built with minimal patching.
  125. ## What I learned
  126. Bunnix was an interesting project to work on. My other project, Helios, is a
  127. microkernel design that's Not Unix, while Bunnix is a monolithic kernel that is
  128. much, much closer to Unix.
  129. One thing I was surprised to learn a lot about is filesystems. Helios, as a
  130. microkernel, spreads the filesystem implementation across many drivers running
  131. in many separate processes. This works well enough, but one thing I discovered
  132. is that it's quite important to have caching in the filesystem layer, even if
  133. only to track living objects. When I revisit Helios, I will have a lot of work
  134. to do refactoring (or even rewriting) the filesystem code to this end.
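One way to read "tracking living objects" is a table that guarantees each on-disk inode has at most one in-memory object, shared by reference count. The C sketch below shows that idea under simplifying assumptions: a fixed-size table, a linear search, and invented names throughout. It is not how Bunnix or Helios implement it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ICACHE_SLOTS 16  /* hypothetical fixed capacity */

struct inode {
	uint32_t ino;    /* on-disk inode number */
	unsigned refs;   /* 0 means the slot is free */
};

struct inode_cache {
	struct inode slots[ICACHE_SLOTS];
};

/* Returns the live object for `ino`, creating it if needed, and takes a
 * reference. Returns NULL when every slot is in use. */
struct inode *icache_get(struct inode_cache *c, uint32_t ino)
{
	struct inode *free_slot = NULL;
	for (size_t i = 0; i < ICACHE_SLOTS; i++) {
		struct inode *n = &c->slots[i];
		if (n->refs > 0 && n->ino == ino) {
			n->refs++;
			return n;
		}
		if (n->refs == 0 && !free_slot)
			free_slot = n;
	}
	if (!free_slot)
		return NULL;
	/* A real filesystem would read the inode from disk here. */
	free_slot->ino = ino;
	free_slot->refs = 1;
	return free_slot;
}

/* Drops one reference; the slot becomes reusable when the last one goes. */
void icache_put(struct inode *n)
{
	assert(n->refs > 0);
	n->refs--;
}
```

With such a table, two opens of the same file resolve to the same object, so state like the file size or a pending unlink has one place to live.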
  135. The approach to drivers is also, naturally, much simpler in a monolithic kernel
  136. design, though I'm not entirely pleased with all of the stuff I heaped into ring
  137. 0. There might be room for an improved Helios scheduler design that incorporates
  138. some of the desirable control flow elements from the monolithic design into a
  139. microkernel system.
  140. I also finally learned how signals work from top to bottom, and boy is it ugly.
  141. I've always felt that this was one of the weakest points in the design of Unix
  142. and this project did nothing to disabuse me of that notion.
  143. I had also tried to avoid using a bitmap allocator in Helios, and generally
  144. memory management in Helios is a bit fussy altogether -- one of the biggest pain
  145. points with the system right now. However, Bunnix uses a simple bitmap allocator
  146. for all conventional pages on the system and I found that it works really,
  147. really well and does not have nearly as much overhead as I had feared it would.
  148. I will almost certainly take those lessons back to Helios.
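For readers who have not seen one, a bitmap page allocator can be sketched in a few lines of C. This is an illustration of the general technique rather than Bunnix's Hare implementation; the page count and all names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES 128u                  /* hypothetical number of pages */
#define BITMAP_WORDS (NPAGES / 64u)

/* One bit per page: 1 = in use, 0 = free. */
struct page_bitmap {
	uint64_t bits[BITMAP_WORDS];
};

/* Finds a free page, marks it used, and returns its index; -1 if full. */
long page_alloc(struct page_bitmap *bm)
{
	for (size_t w = 0; w < BITMAP_WORDS; w++) {
		if (bm->bits[w] == UINT64_MAX)
			continue; /* all 64 pages in this word are taken */
		for (unsigned b = 0; b < 64; b++) {
			uint64_t mask = (uint64_t)1 << b;
			if (!(bm->bits[w] & mask)) {
				bm->bits[w] |= mask;
				return (long)(w * 64 + b);
			}
		}
	}
	return -1;
}

/* Marks `page` free again. */
void page_free(struct page_bitmap *bm, size_t page)
{
	assert(page < NPAGES);
	bm->bits[page / 64] &= ~((uint64_t)1 << (page % 64));
}
```

The space overhead is one bit per page. Assuming 4 KiB pages, tracking 4 GiB of memory takes 1,048,576 bits, or 128 KiB, which is about 0.003% of the memory being managed.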
  149. Finally, I'm quite sure that putting together Bunnix in about a month is a feat
  150. which would not have been possible with a microkernel design. At the end of the
  151. day, monolithic kernels are just much simpler to implement. The advantages of a
  152. microkernel design are compelling, however -- perhaps a better answer lies in a
  153. hybrid kernel.
  154. ## What's next
  155. Bunnix was (note the past tense) a project that I wrote for the purpose of
  156. recreational programming, so its purpose is to be fun to work on. And I've had
  157. my fun! At this point I don't feel the need to invest more time and energy into
  158. it, though it would definitely benefit from some. In the future I may spend a
  159. few days on it here and there, and I would be happy to integrate improvements
  160. from the community -- send patches to my [public inbox][inbox]. But for the most
  161. part it is an art project which is now more-or-less complete.
  162. [inbox]: https://lists.sr.ht/~sircmpwn/public-inbox
  163. My next steps in OS development will be a return to Helios with a lot of lessons
  164. learned and some major redesigns in the pipeline. But I still think that Bunnix
  165. is a fun and interesting OS in its own right, in no small part due to its
  166. demonstration of Hare as a great language for kernel hacking. Some of the
  167. priorities for improvements include:
  168. * A directory cache for the filesystem and better caching generally
  169. * Ironing out ext4 bugs
  170. * procfs and top
  171. * mmapping files
  172. * More signals (e.g. SIGSEGV)
  173. * Multi-user support
  174. * NVMe block devices
  175. * IDE block devices
  176. * ATAPI and ISO 9660 support
  177. * Intel HD audio support
  178. * Network stack
  179. * Hare toolchain in the base system
  180. * Self hosting
  181. Whether it's me or one of you readers who works on these first remains
  182. to be seen.
  183. In any case, have fun playing with Bunnix!