
bootstrap-initrd: A self-building environment based on tcc+musl

published on 2024-06-23T10:18:22Z, last updated on 2024-06-23T10:18:22Z

In late April 2024, I started working on bootstrap-initrd to build a small development environment capable of building a minimalist distro with as few binaries involved as possible, giving priority to a straightforward bootstrap path to ease reviewability. A somewhat related project ended up being utils-std, which went from a set of replacements for busybox/coreutils to a much more complete set of utilities that can be built without already having said utilities.

Constraints

Readable code
For example, GNU code ends up excluded due to the prevalence of layers of macros and horribly long autogenerated code which is difficult to regenerate
Clean bootstrap path
No historical versions or multiple stages with different patches. This eases both reviewability and maintenance
Fast
Should build in a few minutes total, not hours. After all, if it takes hours for a computer to parse, it would be even worse for a human

Does it work?

Yeah, I managed to build parts of Oasis Linux: core set (excluding openssh and rc), some in the extra set (file, netbsd-curses, vis), devel set (excluding strace).
There are still things to be done, like being able to run the make-initrd.sh script inside the environment itself, but I would consider the groundwork done, as the big remaining improvements would come from changes in external projects / the Linux ecosystem.

That said a switch from the Linux kernel to the Fiwix kernel could make sense as the latter can be built with tcc while retaining Linux compatibility.

Is it fast?

Yeah! It takes ~3 seconds to boot up + build the base in QEMU on an AMD Ryzen 7 3700X desktop, then time /build-extras.sh hovers around 1 minute of real time.

Included software

Below is a quick list, in compilation order. Base are unpacked projects compiled at boot-up and guaranteed to be clean; Extras are packed tarballs compiled with /build-extras.sh.

Base
tcc (binary seed), musl (binary seed), loksh, OpenBSD yacc, pdpmake, sed(1) from sbase, utils-std, heirloom-devtools, (One True) awk, heirloom, bzip2, zlib, pigz
Extras
lua, bearssl, GNU make, e2fsprogs, gettext-tiny, pkgconf, skalibs, tiny-curl, xz, mdevd, iproute2, git

For details on the choices see bootstrap-initrd's README.md file.

Resulting initrd

I made the choice of seeding tcc+musl by reusing Alpine packages, so I'm not distributing custom binaries. Part of the reason is to avoid part of the horribly long bootstrapping path of live-bootstrap. After all, I want to ease reviewability, and I'd rather have a clear compromise against a known goal than settle on a complex and hard-to-maintain path which could look acceptable.

Somewhat interestingly, it ends up at ~1.2MiB of binaries (after removing libc.a to shave off ~9MiB), which is significantly smaller than GNU Guix's 25MiB bootstrap-guile as presented in The Full-Source Bootstrap: Building from source all the way down.

For x86_64 the initrd currently ends up at 41MiB gzip-compressed, 58MiB uncompressed. Pretty big for an initrd, but as it is in the ballpark of Alpine's base installation (~30MiB), I consider this quite a nice feat for a fairly complete development environment including git.

It should get even smaller once utils-std is complete enough to replace sbase (166KiB) and heirloom (977KiB and somewhat historical). But the biggest one I want to replace is curl (4.3MiB, urgh) with my httpc, as all that is needed to build systems like Oasis is an HTTP downloader, not a URL kitchen-sink.

/init

A bit less than 100 lines of C are in /init:

  1. #!/usr/bin/tcc -run: tcc gets launched by the kernel and compiles+executes /init in place
  2. mounts /sys, /proc and /dev
  3. creates /dev/null (no idea why it's not by default in a devtmpfs mount like the other special devices)
  4. executes a prepared command to compile a shell; I picked loksh (Linux port of OpenBSD KornShell) as the other shells either require a shell in their own build system or, in the case of mrsh, are incomplete
  5. executes said shell against /init.sh
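
The steps above could be sketched roughly as follows. This is not the actual /init from bootstrap-initrd, only a minimal illustration under stated assumptions: the shell output path, the loksh source path and the tcc arguments are placeholders, and the entry point is named init_steps() so the helpers can be tried on their own. A real file would start with the #!/usr/bin/tcc -run line and call init_steps() from main().

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder paths, NOT taken from bootstrap-initrd. */
#define SHELL_PATH "/bin/ksh"
#define SHELL_SRC  "/src/loksh/placeholder.c"

/* Fork, exec argv[0], wait for it. Returns the exit status, or -1 on error. */
static int run_and_wait(char *const argv[])
{
	assert(argv != NULL && argv[0] != NULL);

	pid_t pid = fork();
	if (pid < 0) {
		perror("fork");
		return -1;
	}
	if (pid == 0) {
		execv(argv[0], argv);
		perror(argv[0]);
		_exit(127); /* exec failed */
	}

	int status;
	if (waitpid(pid, &status, 0) < 0) {
		perror("waitpid");
		return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Create the mount point if needed, then mount a pseudo-filesystem on it. */
static int mount_pseudo(const char *type, const char *target)
{
	if (mkdir(target, 0755) < 0 && errno != EEXIST) {
		perror(target);
		return -1;
	}
	if (mount(type, target, type, 0, NULL) < 0) {
		fprintf(stderr, "mount %s: %s\n", target, strerror(errno));
		return -1;
	}
	return 0;
}

int init_steps(void)
{
	/* Step 2: mount /sys, /proc and /dev */
	if (mount_pseudo("sysfs", "/sys") < 0) return 1;
	if (mount_pseudo("proc", "/proc") < 0) return 1;
	if (mount_pseudo("devtmpfs", "/dev") < 0) return 1;

	/* Step 3: /dev/null is the character device with major 1, minor 3 */
	if (mknod("/dev/null", S_IFCHR | 0666, makedev(1, 3)) < 0 && errno != EEXIST) {
		perror("/dev/null");
		return 1;
	}

	/* Step 4: compile a shell; this argument list is purely illustrative */
	char *build_shell[] = { "/usr/bin/tcc", "-o", SHELL_PATH, SHELL_SRC, NULL };
	if (run_and_wait(build_shell) != 0) return 1;

	/* Step 5: replace this process with the shell running /init.sh */
	execl(SHELL_PATH, "ksh", "/init.sh", (char *)NULL);
	perror(SHELL_PATH);
	return 1;
}
```

Using execl() for the last step keeps the shell as PID 1, which is one way to do it; forking and waiting instead would let /init react when the script exits.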

A few different choices could have been made here. For example, lua can also be compiled from source similarly to loksh, and it could be an interesting choice for a setup dedicated exclusively to bootstrapping Oasis Linux. But I preferred to quickly launch into a shell script, to use a familiar environment which can drop into a prompt if an error happens and lets me poke around, which proved to be very useful in the early stages.

Also, if tcc weren't used, a different /init would have to be written, probably not in C, hopefully not pre-compiled.

Discoveries

  1. Binutils is a 300+ MiB monster of autogenerated code, gigantic test fixtures, … Thankfully tcc can serve as a replacement
  2. pdpmake: Public-Domain POSIX Make: A small make implementation, can be compiled in a single command
  3. pigz is more trivial to compile than the reference implementation of gzip
