bootstrap-initrd: A self-building environment based on tcc+musl
In late April 2024, I started working on bootstrap-initrd to build a small development environment capable of building a minimalist distro with as few binaries involved as possible, prioritizing a straightforward bootstrap path to ease reviewability. A somewhat related project ended up being utils-std, which grew from a few replacements for busybox/coreutils tools into a much more complete set of utilities that can be built without already having said utilities.
Constraints
- Readable code
  - For example, GNU code ends up excluded due to the prevalence of layers of macros and horribly long autogenerated code which is difficult to regenerate
- Clean bootstrap path
  - No historical versions or multiple stages with different patches. This is to ease both reviewability and maintenance
- Fast
  - Should build in a few minutes total, not hours. After all, if it takes hours for a computer to parse, it would be even worse for a human
Does it work?
Yeah, I managed to build parts of Oasis Linux: the core set (excluding openssh and rc), some of the extra set (file, netbsd-curses, vis), and the devel set (excluding strace).
There are still things to be done, like being able to run the make-initrd.sh script inside the environment itself, but I would consider the groundwork done, as further big improvements would have to come from changes in external projects / the Linux ecosystem.
That said a switch from the Linux kernel to the Fiwix kernel could make sense as the latter can be built with tcc while retaining Linux compatibility.
Is it fast?
Yeah! It takes ~3 seconds to boot up and build the base in QEMU on an AMD Ryzen 7 3700X desktop, then time /build-extras.sh hovers around 1 minute of real time.
Included software
Below is a quick list, ordered by compilation. Base are unpacked projects compiled at boot-up and guaranteed to be clean; Extras are packed tarballs compiled with /build-extras.sh.
- Base
  - tcc (binary seed), musl (binary seed), loksh, OpenBSD yacc, pdpmake, sed(1) from sbase, utils-std, heirloom-devtools, (One True) awk, heirloom, bzip2, zlib, pigz
- Extras
  - lua, bearssl, GNU make, e2fsprogs, gettext-tiny, pkgconf, skalibs, tiny-curl, xz, mdevd, iproute2, git
For details on the choices see bootstrap-initrd's README.md file.
Resulting initrd
I made the choice of seeding tcc+musl by reusing Alpine packages, so I'm not distributing custom binaries. Part of the reason is to avoid a portion of the horribly long bootstrapping path of live-bootstrap. After all, I want to ease reviewability, and I'd rather have a clear compromise against a known goal than settle on a complex and hard-to-maintain path which could look acceptable.
Somewhat interestingly, it ends up at ~1.2MiB of binaries (after removing libc.a to shave off ~9MiB), which is significantly smaller than GNU Guix's 25MiB bootstrap-guile as presented in The Full-Source Bootstrap: Building from source all the way down.
For x86_64 the initrd currently ends up at 41MiB gzip-compressed, 58MiB uncompressed. Pretty big for an initrd, but as it is in the ballpark of Alpine's base installation (~30MiB) I consider this quite a nice feat for a fairly complete development environment including git.
It should even get smaller once utils-std is complete enough to replace sbase (166KiB) and heirloom (977KiB and somewhat historical). But the biggest one I want to replace is curl (4.3MiB, urgh) with my httpc, as all that is needed to build systems like Oasis is an HTTP downloader, not a URL kitchen-sink.
/init
A bit less than 100 lines of C are in /init
:
#!/usr/bin/tcc -run
: tcc gets launched by the kernel and compiles+executes/init
in place- mounts
/sys
,/proc
and/dev
- creates
/dev/null
(no idea why it's not by default in adevtmpfs
mount like the other special devices) - executes a prepared command to compile a shell, I picked up loksh (Linux port of OpenBSD KornShell) as the other shells either themselves require a shell in their buildsystem, or in the case of mrsh are incomplete
- executes said shell against
/init.sh
A few different choices could have been made here. For example, lua can also be compiled from source similarly to loksh, and it could be an interesting choice for an environment exclusive to bootstrapping Oasis Linux. But I preferred to quickly launch into a shell script to use a familiar environment which can drop into a prompt if an error happens, allowing me to poke around, which proved to be very useful in the early stages.
Also, if tcc weren't used then a different /init would have to be written, probably not in C, hopefully not pre-compiled.
Discoveries
- Binutils is a 300+ MiB monster of autogenerated code, gigantic test fixtures, … Thankfully tcc can serve as a replacement
- pdpmake: Public-Domain POSIX Make: A small make implementation, can be compiled in a single command
- pigz is simpler to compile than the reference implementation of gzip