commit: 9d8d5b8047c20be87734b0c6e290ada61c25433f
parent de2b96b866bb7d4a88048975c035da98f53fed01
Author: Drew DeVault <sir@cmpwn.com>
Date: Fri, 24 May 2024 17:14:49 +0200
Bunnix blog post
Diffstat:
1 file changed, 224 insertions(+), 0 deletions(-)
diff --git a/content/blog/2024-05-24-Bunnix.md b/content/blog/2024-05-24-Bunnix.md
@@ -0,0 +1,224 @@
+---
+title: Writing a Unix clone in about a month
+date: 2024-05-24
+---
+
+I needed a bit of a break from "real work" recently, so I started a new
+programming project that was low-stakes and purely recreational. On April 21st,
+I set out to see how much of a Unix-like operating system for x86_64 targets I
+could put together in about a month. The result is
+[Bunnix](https://git.sr.ht/~sircmpwn/bunnix). Not including days I didn't work
+on Bunnix for one reason or another, I spent 27 days on this project.
+
+Here's a little demo of the results...
+
+<iframe title="A short demo of the Bunnix operating system" width="720" height="400" src="https://spacepub.space/videos/embed/415822ac-5755-42a4-9081-b48639eed6be" frameborder="0" allowfullscreen="" sandbox="allow-same-origin allow-scripts allow-popups"></iframe>
+
+You can try it for yourself if you like:
+
+* [Bunnix 0.0.0 iso](https://cyanide.ayaya.dev/bunnix.iso)
+
+To boot this ISO with qemu:
+
+```
+qemu-system-x86_64 -cdrom bunnix.iso -display sdl -serial stdio
+```
+
+You can also write the ISO to a USB stick and boot it on real hardware. It will
+probably work on most AMD64 machines -- I have tested it on a ThinkPad X220 and
+a Starlabs Starbook Mk IV. Legacy boot and EFI are both supported. There are
+some limitations to keep in mind, in particular that there is no USB support, so
+a PS/2 keyboard (or PS/2 emulation via the BIOS) is required. Most laptops rig
+up the keyboard via PS/2, and <abbr title="your mileage may vary">YMMV</abbr>
+with USB keyboards via PS/2 emulation.
+
+## What's there?
+
+The Bunnix kernel is (mostly) written in [Hare](https://harelang.org), plus some
+C components, namely lwext4 for ext4 filesystem support and libvterm for the
+kernel video terminal.
+
+The kernel supports the following drivers:
+
+* PCI (legacy)
+* AHCI block devices
+* GPT and MBR partition tables
+* PS/2 keyboards
+* Platform serial ports
+* CMOS clocks
+* Framebuffers (configured by the bootloaders)
+* ext4 and memfs filesystems
+
+There are numerous supported kernel features as well:
+
+* A virtual filesystem
+* A /dev populated with block devices, null, zero, and full pseudo-devices,
+ /dev/kbd and /dev/fb0, serial and video TTYs, and the /dev/tty controlling
+ terminal.
+* Reasonably complete terminal emulator and somewhat passable termios support
+* Some 40 syscalls, including for example clock_gettime, poll, openat et al,
+ fork, exec, pipe, dup, dup2, ioctl, etc
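
To make the syscall list concrete, here is a minimal sketch of how a shell builds a two-stage pipeline out of `pipe`, `fork`, `dup2` and `exec`. It is written in Python against the host's POSIX interface purely for illustration; it is not Bunnix or dash code, and `run_pipeline` and `_child` are hypothetical names.

```python
import os


def _child(argv, stdin_fd, stdout_fd):
    """Runs in the forked child: rewire stdin/stdout, then exec. Never returns."""
    try:
        if stdin_fd is not None:
            os.dup2(stdin_fd, 0)   # the pipe becomes the child's stdin
        if stdout_fd is not None:
            os.dup2(stdout_fd, 1)  # the pipe becomes the child's stdout
        os.execvp(argv[0], argv)   # replaces the child's program image
    finally:
        os._exit(127)              # only reached if exec failed


def run_pipeline(producer, consumer):
    """Run `producer | consumer` and return the consumer's output as bytes."""
    link_r, link_w = os.pipe()     # producer stdout -> consumer stdin
    out_r, out_w = os.pipe()       # consumer stdout -> parent

    pids = []
    pid = os.fork()
    if pid == 0:
        _child(producer, None, link_w)
    pids.append(pid)

    pid = os.fork()
    if pid == 0:
        _child(consumer, link_r, out_w)
    pids.append(pid)

    # The parent must close its own copies, or the readers never see EOF.
    # Python marks pipe fds close-on-exec, so the children drop their spare
    # copies at exec time; C code would close them by hand.
    for fd in (link_r, link_w, out_w):
        os.close(fd)

    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:              # EOF: every write end has been closed
            break
        chunks.append(chunk)
    os.close(out_r)

    for pid in pids:
        os.waitpid(pid, 0)         # reap both children
    return b"".join(chunks)
```

On a typical Unix host, `run_pipeline(["echo", "hello bunnix"], ["tr", "a-z", "A-Z"])` should return the upper-cased line, which is roughly what a shell does for `echo hello bunnix | tr a-z A-Z`.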
+
+Bunnix is a single-user system and does not currently attempt to enforce Unix
+file modes and ownership, though it could be made multi-user relatively easily
+with a few more days of work.
+
+Included are two bootloaders, one for legacy boot which is multiboot-compatible
+and written in Hare, and another for EFI which is written in C. Both of them
+load the kernel as an ELF file plus an initramfs, if required. The EFI
+bootloader includes zlib to decompress the initramfs; multiboot-compatible
+bootloaders handle this decompression for us.
+
+The userspace is largely assembled from third-party sources. The following
+third-party software is included:
+
+* Colossal Cave Adventure (advent)
+* dash (/bin/sh)
+* Doom
+* gzip
+* less (pager)
+* lok (/bin/awk)
+* lolcat
+* mandoc (man pages)
+* sbase (core utils)[^1]
+* tcc (C compiler)
+* Vim 5.7
+
+The libc is derived from musl libc and contains numerous modifications to suit
+Bunnix's needs. The curses library is based on netbsd-curses.
+
+[^1]: sbase is good software written by questionable people. I do not endorse suckless.
+
+## How Bunnix came together
+
+I started documenting the process on Mastodon on day 3 -- check out [the
+Mastodon thread](https://fosstodon.org/@drewdevault/112319697309218275) for the
+full story. Here's what it looked like on day 3:
+
+
+
+Here are some thoughts after the fact.
+
+Some of Bunnix's code stems from an earlier project,
+[Helios](https://sr.ht/~sircmpwn/helios). This includes portions of the kernel
+which are responsible for some relatively generic CPU setup (GDT, IDT, etc), and
+some drivers like AHCI were adapted for the Bunnix system. I admit that it would
+probably not have been possible to build Bunnix so quickly without prior
+experience through Helios.
+
+Two of the more challenging aspects were ext4 support and the virtual terminal,
+for which I brought in two external dependencies, lwext4 and libvterm. Both
+proved to be challenging integrations. I had to rewrite my filesystem layer a
+few times, and it's still buggy today, but getting a proper Unix filesystem
+design (including openat and good handling of inodes) requires digging into
+lwext4 internals a bit more than I'd have liked. I also learned a lot about
+mixing source languages into a Hare project, since the kernel links together
+Hare, assembly, and C sources -- it works remarkably well but there are some
+pain points I noticed, particularly with respect to building the ABI integration
+riggings. It'd be nice to automate conversion of C headers into Hare forward
+declaration modules. Some of this work already exists in hare-c, but has a ways
+to go. If I were to start again, I would probably be more careful in my design
+of the filesystem layer.
+
+Getting the terminal right was difficult as well. I wasn't sure that I was going
+to add one at all, but I eventually decided that I wanted to port vim and that
+was that. libvterm is a great terminal state machine library, but it's poorly
+documented and required a lot of fine-tuning to integrate just right. I also
+ended up spending a lot of time on performance to make sure that the terminal
+worked smoothly.
+
+Another difficult part to get right was the scheduler. Helios has a simpler
+scheduler than Bunnix, and while I initially based the Bunnix scheduler on
+Helios I had to throw out and rewrite quite a lot of it. Both Helios and Bunnix
+are single-CPU systems, but unlike Helios, Bunnix allows context switching
+within the kernel -- in fact, even preemptive task switching enters and exits
+via the kernel. This necessitates multiple kernel stacks and a different
+approach to task switching. However, the advantages are numerous, one of which
+is that implementing blocking operations like disk reads and pipe(2) is much
+simpler with wait queues. With a robust enough scheduler, the rest of the kernel
+and its drivers come together pretty easily.
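
The wait queue idea can be sketched in a few lines. This is a toy model in Python, not the Bunnix scheduler: generators stand in for kernel tasks (each suspended generator keeps its own frame, much as each task keeps its own kernel stack), and `Scheduler`, `WaitQueue` and `Pipe` are hypothetical names. The point it tries to show is that a blocking read can be written as straight-line code once a task is allowed to go to sleep in the middle of a kernel operation.

```python
from collections import deque


class WaitQueue:
    """Tasks sleeping until some condition (e.g. 'pipe has data') changes."""
    def __init__(self):
        self.sleepers = deque()


class Scheduler:
    def __init__(self):
        self.ready = deque()

    def spawn(self, task):
        self.ready.append(task)

    def wake_all(self, wq):
        # Move every sleeper back to the run queue; each re-checks its condition.
        while wq.sleepers:
            self.ready.append(wq.sleepers.popleft())

    def run(self):
        while self.ready:
            task = self.ready.popleft()
            try:
                wq = next(task)              # run the task until it yields
            except StopIteration:
                continue                     # task finished
            if wq is None:
                self.ready.append(task)      # plain yield: still runnable
            else:
                wq.sleepers.append(task)     # blocked: park on the wait queue


class Pipe:
    def __init__(self, sched):
        self.sched = sched
        self.buf = deque()
        self.readers = WaitQueue()

    def write(self, item):
        self.buf.append(item)
        self.sched.wake_all(self.readers)

    def read(self):
        while not self.buf:                  # nothing to read yet
            yield self.readers               # sleep until a writer wakes us
        return self.buf.popleft()


def reader(pipe, log):
    item = yield from pipe.read()            # blocks inside the "kernel"
    log.append(("read", item))


def writer(pipe, log):
    log.append(("write", "hello"))
    pipe.write("hello")
    yield                                    # give up the CPU voluntarily


def demo():
    sched, log = Scheduler(), []
    pipe = Pipe(sched)
    sched.spawn(reader(pipe, log))           # runs first and blocks
    sched.spawn(writer(pipe, log))
    sched.run()
    return log
```

In `demo()` the reader is scheduled first, finds the pipe empty and parks itself; the writer then fills the pipe and wakes it. Without the ability to suspend inside `read`, the same logic would have to be split into a state machine driven from outside.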
+
+Another source of frustration was signals, of course. Helios does not attempt to
+be a Unix and gets away without these, but to build a Unix, I needed to
+implement signals, big messy hack though they may be. The signal implementation
+which ended up in Bunnix is pretty bare-bones: I mostly made sure that SIGCHLD
+worked correctly so that I could port dash.
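
For context on why SIGCHLD in particular matters to a shell, here is a small sketch of the usual userspace pattern: install a handler, fork a child, and reap it from the handler with a non-blocking wait. It uses Python's `os` and `signal` modules on the host system and says nothing about how Bunnix delivers signals internally; `spawn_and_reap` is a hypothetical name, and the sketch assumes it runs in the main thread of a process with no other children.

```python
import os
import signal
import time


def spawn_and_reap(exit_code, timeout=5.0):
    """Fork a child that exits with exit_code; reap it from a SIGCHLD handler."""
    reaped = {}

    def on_sigchld(signum, frame):
        # SIGCHLD only says "some child changed state", so reap in a loop.
        while True:
            try:
                pid, status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                return                      # no children left at all
            if pid == 0:
                return                      # children exist, none has exited
            if os.WIFEXITED(status):
                reaped[pid] = os.WEXITSTATUS(status)
            else:
                reaped[pid] = None          # killed by a signal, for example

    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(exit_code)             # child: exit immediately
        deadline = time.monotonic() + timeout
        while pid not in reaped and time.monotonic() < deadline:
            time.sleep(0.01)                # the handler runs between sleeps
        return reaped.get(pid)
    finally:
        signal.signal(signal.SIGCHLD, previous)
```

A kernel that never raises SIGCHLD, or that loses the exit status before `waitpid` collects it, leaves a program written this way waiting forever, which is presumably why this signal had to work before a job-controlling shell could be ported.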
+
+Porting third-party software was relatively easy thanks to basing my libc on
+musl libc. I imported large swaths of musl into my own libc and adapted it to
+run on Bunnix, which gave me a pretty comprehensive and reliable C library
+pretty fast. With this in place, porting third-party software was a breeze, and
+most of the software that's included was built with minimal patching.
+
+## What I learned
+
+Bunnix was an interesting project to work on. My other project, Helios, is a
+microkernel design that's Not Unix, while Bunnix is a monolithic kernel that is
+much, much closer to Unix.
+
+One thing I was surprised to learn a lot about is filesystems. Helios, as a
+microkernel, spreads the filesystem implementation across many drivers running
+in many separate processes. This works well enough, but one thing I discovered
+is that it's quite important to have caching in the filesystem layer, even if
+only to track living objects. When I revisit Helios, I will have a lot of work
+to do refactoring (or even rewriting) the filesystem code to this end.
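
One way to picture "caching to track living objects" is an inode table that hands every opener of the same file the same in-memory object. The sketch below is an assumption about what such a layer might look like, not a description of the Bunnix or Helios code: `Inode`, `InodeCache` and the `load` callback are hypothetical, and Python weak references stand in for the explicit reference counts a kernel would keep.

```python
import weakref


class Inode:
    """In-memory state for one on-disk inode."""
    def __init__(self, ino):
        self.ino = ino
        self.size = 0


class InodeCache:
    def __init__(self, load):
        self._load = load                           # hypothetical: reads an inode from disk
        self._live = weakref.WeakValueDictionary()  # ino -> Inode still in use

    def get(self, ino):
        inode = self._live.get(ino)
        if inode is None:
            inode = self._load(ino)                 # first user: go to disk
            self._live[ino] = inode
        return inode                                # later users share the same object
```

With this in place, two opens of the same file observe each other's changes (size, dirty state, locks) because they hold the same `Inode`, and the entry disappears once the last user drops it.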
+
+The approach to drivers is also, naturally, much simpler in a monolithic kernel
+design, though I'm not entirely pleased with all of the stuff I heaped into ring
+0. There might be room for an improved Helios scheduler design that incorporates
+some of the desirable control flow elements from the monolithic design into a
+microkernel system.
+
+I also finally learned how signals work from top to bottom, and boy is it ugly.
+I've always felt that this was one of the weakest points in the design of Unix
+and this project did nothing to disabuse me of that notion.
+
+I had also tried to avoid using a bitmap allocator in Helios, and generally
+memory management in Helios is a bit fussy altogether -- one of the biggest pain
+points with the system right now. However, Bunnix uses a simple bitmap allocator
+for all conventional pages on the system and I found that it works really,
+really well and does not have nearly as much overhead as I had feared it would.
+I will almost certainly take those lessons back to Helios.
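
For readers who have not written one, a bitmap page allocator can be sketched as follows. This is an illustrative Python model under stated assumptions (4 KiB pages, one bit per page, a simple next-fit hint), not the Bunnix implementation, and all names are hypothetical.

```python
PAGE_SIZE = 4096


def bitmap_bytes(ram_bytes):
    """Size of a one-bit-per-page bitmap covering ram_bytes of memory."""
    return (ram_bytes // PAGE_SIZE + 7) // 8


class BitmapAllocator:
    def __init__(self, base, npages):
        self.base = base                       # physical address of page 0
        self.npages = npages
        self.bitmap = bytearray((npages + 7) // 8)  # bit set = page in use
        self.hint = 0                          # where the next search starts

    def _in_use(self, i):
        return (self.bitmap[i >> 3] >> (i & 7)) & 1

    def alloc(self):
        """Return the address of a free page, marking it used."""
        for off in range(self.npages):
            i = (self.hint + off) % self.npages
            if not self._in_use(i):
                self.bitmap[i >> 3] |= 1 << (i & 7)
                self.hint = (i + 1) % self.npages
                return self.base + i * PAGE_SIZE
        raise MemoryError("out of physical pages")

    def free(self, addr):
        """Mark the page at addr free again."""
        i, rem = divmod(addr - self.base, PAGE_SIZE)
        if rem or not 0 <= i < self.npages or not self._in_use(i):
            raise ValueError("bad or double free")
        self.bitmap[i >> 3] &= ~(1 << (i & 7)) & 0xFF
        self.hint = i                          # reuse the freed page first
```

In terms of space at least, the cost is small: `bitmap_bytes(4 * 2**30)` is 131072, so tracking 4 GiB of RAM takes a 128 KiB bitmap, about 1/32768 of the memory it describes.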
+
+Finally, I'm quite sure that putting together Bunnix in just 30 days is a feat
+which would not have been possible with a microkernel design. At the end of the
+day, monolithic kernels are just much simpler to implement. The advantages of a
+microkernel design are compelling, however -- perhaps a better answer lies in a
+hybrid kernel.
+
+## What's next
+
+Bunnix was (note the past tense) a project that I wrote for the purpose of
+recreational programming, so its purpose is to be fun to work on. And I've had
+my fun! At this point I don't feel the need to invest more time and energy into
+it, though it would definitely benefit from some. In the future I may spend a
+few days on it here and there, and I would be happy to integrate improvements
+from the community -- send patches to my [public inbox][inbox]. But for the most
+part it is an art project which is now more-or-less complete.
+
+[inbox]: https://lists.sr.ht/~sircmpwn/public-inbox
+
+My next steps in OS development will be a return to Helios with a lot of lessons
+learned and some major redesigns in the pipeline. But I still think that Bunnix
+is a fun and interesting OS in its own right, in no small part due to its
+demonstration of Hare as a great language for kernel hacking. Some of the
+priorities for improvements include:
+
+* A directory cache for the filesystem and better caching generally
+* Ironing out ext4 bugs
+* procfs and top
+* mmapping files
+* More signals (e.g. SIGSEGV)
+* Multi-user support
+* NVMe block devices
+* IDE block devices
+* ATAPI and ISO 9660 support
+* Intel HD audio support
+* Network stack
+* Hare toolchain in the base system
+* Self hosting
+
+Whether it's me or one of you readers who works on these first
+remains to be seen.
+
+In any case, have fun playing with Bunnix!