bootstrap-initrd.xml (6746B)
- <entry>
- <title>bootstrap-initrd: A self-building environment based on tcc+musl</title>
- <link rel="alternate" type="text/html" href="https://hacktivis.me/articles/bootstrap-initrd"/>
- <id>https://hacktivis.me/articles/bootstrap-initrd</id>
- <published>2024-06-23T10:18:22Z</published>
- <updated>2024-06-23T10:18:22Z</updated>
- <link rel="external replies" type="application/activity+json" href="https://queer.hacktivis.me/objects/7d2cd28e-7550-475c-ba13-28288a705297" />
- <link rel="external replies" type="text/html" href="https://queer.hacktivis.me/objects/7d2cd28e-7550-475c-ba13-28288a705297" />
- <content type="xhtml">
- <div xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" class="h-entry">
- <p>
- In late April 2024, I started working on <a href="https://hacktivis.me/git/bootstrap-initrd/">bootstrap-initrd</a> to build a small development environment capable of building a minimalist distro with as few binaries involved as possible, prioritizing a straightforward bootstrap path to make it easier to review.
- A somewhat related project ended up being <a href="https://hacktivis.me/git/utils-std/">utils-std</a>, which grew from a handful of replacements for busybox/coreutils tools into a much more complete set of utilities which can be built without already having said utilities.
- </p>
- <h2>Constraints</h2>
- <dl>
- <dt>Readable code</dt>
- <dd>For example, GNU code ends up excluded because of its prevalent layers of macros and horribly long autogenerated code which is difficult to regenerate</dd>
- <dt>Clean bootstrap path</dt>
- <dd>No historical versions or multiple stages with different patches. This makes both review and maintenance easier</dd>
- <dt>Fast</dt>
- <dd>Should build in a few minutes total, not hours. After all, if it takes hours for a computer to parse, it would be even worse for a human</dd>
- </dl>
- <h2>Does it work?</h2>
- <p>
- Yeah, I managed to build parts of <a href="https://git.sr.ht/~mcf/oasis">Oasis Linux</a>: the core set (excluding openssh and rc), some of the extra set (file, netbsd-curses, vis), and the devel set (excluding strace).<br />
- There are still things to be done, like being able to run the <code>make-initrd.sh</code> script inside the environment itself, but I consider the groundwork done, as further big improvements would have to come from changes in external projects and the Linux ecosystem.
- </p>
- <p>
- That said, a switch from the Linux kernel to the <a href="https://www.fiwix.org/">Fiwix</a> kernel could make sense, as the latter can be built with tcc while retaining Linux compatibility.
- </p>
- <h2>Is it fast?</h2>
- <p>
- Yeah! It takes ~3 seconds to boot up and build the base in QEMU on an AMD Ryzen 7 3700X desktop, then <code>time /build-extras.sh</code> hovers around 1 minute of real time.
- </p>
- <h2>Included software</h2>
- <p>
- Below is a quick list, in compilation order. Base consists of unpacked projects compiled at boot-up and guaranteed to be clean; Extras are packed tarballs compiled with <code>/build-extras.sh</code>.
- </p>
- <dl>
- <dt>Base</dt><dd>tcc (binary seed), musl (binary seed), loksh, OpenBSD yacc, pdpmake, <code>sed(1)</code> from sbase, utils-std, heirloom-devtools, (One True) awk, heirloom, bzip2, zlib, pigz</dd>
- <dt>Extras</dt><dd>lua, bearssl, GNU make, e2fsprogs, gettext-tiny, pkgconf, skalibs, tiny-curl, xz, mdevd, iproute2, git</dd>
- </dl>
- <p>
- For details on the choices, see <a href="https://hacktivis.me/git/bootstrap-initrd/file/README.md.html">bootstrap-initrd's README.md file</a>.
- </p>
- <h2>Resulting initrd</h2>
- <p>
- I chose to seed tcc+musl, reusing Alpine packages so that I'm not distributing custom binaries.
- Part of the reason is to avoid some of the horribly long <a href="https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst">bootstrapping path of live-bootstrap</a>.
- After all, I want to make review easier, and I'd rather have a clear compromise against a known goal than settle on a complex and hard-to-maintain path which could merely look acceptable.
- </p>
- <p>
- Somewhat interestingly, it ends up at ~1.2MiB of binaries (after removing <code>libc.a</code> to shave off ~9MiB), which is significantly smaller than GNU Guix's 25MiB bootstrap-guile as presented in <a href="https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/">The Full-Source Bootstrap: Building from source all the way down</a>.
- </p>
- <p>
- For x86_64 the initrd currently ends up at 41MiB gzip-compressed, 58MiB uncompressed. Pretty big for an initrd, but as it is in the ballpark of Alpine's base installation (~30MiB), I consider this quite a nice feat for a fairly complete development environment including git.
- </p>
- <p>
- It should get even smaller once utils-std is complete enough to replace sbase (166KiB) and heirloom (977KiB and somewhat historical).
- But the biggest one I want to replace is curl (4.3MiB, urgh) with my <a href="https://hacktivis.me/git/httpc/">httpc</a>, as all that is needed to build systems like Oasis is an HTTP downloader, not a URL kitchen sink.
- </p>
- <h2><code>/init</code></h2>
- <p>A bit less than 100 lines of C are in <code>/init</code>:</p>
- <ol>
- <li><code>#!/usr/bin/tcc -run</code>: tcc gets launched by the kernel and compiles+executes <code>/init</code> in place</li>
- <li>mounts <code>/sys</code>, <code>/proc</code> and <code>/dev</code></li>
- <li>creates <code>/dev/null</code> (no idea why it's not there by default in a <code>devtmpfs</code> mount like the other special devices)</li>
- <li>executes a prepared command to compile a shell. I picked loksh (Linux port of OpenBSD KornShell), as the other shells either require a shell themselves in their build system or, in the case of mrsh, are incomplete</li>
- <li>executes said shell against <code>/init.sh</code></li>
- </ol>
- <p>
- A few different choices could have been made here.
- For example, lua can also be compiled from source similarly to loksh, and it could be an interesting choice for an environment dedicated exclusively to bootstrapping <a href="https://git.sr.ht/~mcf/oasis">Oasis Linux</a>.
- But I preferred to quickly launch into a shell script, to use a familiar environment which can drop into a prompt when an error happens and lets me poke around, which proved to be very useful in the early stages.
- </p>
- <p>
- Also, if tcc weren't used, a different <code>/init</code> would have to be written, probably not in C, hopefully not pre-compiled.
- </p>
- <h2>Discoveries</h2>
- <ol>
- <li>Binutils is a 300+ MiB monster of autogenerated code, gigantic test fixtures, … Thankfully tcc can serve as a replacement</li>
- <li><a href="https://frippery.org/make">pdpmake: Public-Domain POSIX Make</a>: A small make implementation, can be compiled in a single command</li>
- <li>pigz is simpler to compile than the reference implementation of gzip</li>
- </ol>
- </div>
- </content>
- </entry>