bootstrap-initrd.xml (6726B)


  1. <entry>
  2. <title>bootstrap-initrd: A self-building environment based on tcc+musl</title>
  3. <link rel="alternate" type="text/html" href="/articles/bootstrap-initrd"/>
  4. <id>https://hacktivis.me/articles/bootstrap-initrd</id>
  5. <published>2024-06-23T10:18:22Z</published>
  6. <updated>2024-06-23T10:18:22Z</updated>
  7. <link rel="external replies" type="application/activity+json" href="https://queer.hacktivis.me/objects/7d2cd28e-7550-475c-ba13-28288a705297" />
  8. <link rel="external replies" type="text/html" href="https://queer.hacktivis.me/objects/7d2cd28e-7550-475c-ba13-28288a705297" />
  9. <content type="xhtml">
  10. <div xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" class="h-entry">
  11. <p>
  12. In late April 2024, I started working on <a href="https://hacktivis.me/git/bootstrap-initrd/">bootstrap-initrd</a> to build a small development environment capable of building a minimalist distro with the fewest binaries involved, prioritising a straightforward bootstrap path to ease review.
  13. A somewhat related project that came out of this is <a href="https://hacktivis.me/git/utils-std/">utils-std</a>, which went from a replacement for busybox/coreutils to a much more complete set of utilities that can be built without already having said utilities.
  14. </p>
  15. <h2>Constraints</h2>
  16. <dl>
  17. <dt>Readable code</dt>
  18. <dd>For example, GNU code ends up excluded due to the prevalence of layered macros and horribly long autogenerated code which is difficult to regenerate.</dd>
  19. <dt>Clean bootstrap path</dt>
  20. <dd>No historical versions or multiple stages with different patches. This eases both review and maintenance.</dd>
  21. <dt>Fast</dt>
  22. <dd>Should build in a few minutes total, not hours. After all, if it takes hours for a computer to parse, it would be even worse for a human.</dd>
  23. </dl>
  24. <h2>Does it work?</h2>
  25. <p>
  26. Yeah, I managed to build parts of <a href="https://git.sr.ht/~mcf/oasis">Oasis Linux</a>: the core set (excluding openssh and rc), some of the extra set (file, netbsd-curses, vis), and the devel set (excluding strace).<br />
  27. There are still things to be done, like being able to run the <code>make-initrd.sh</code> script inside the environment itself, but I consider the groundwork done, as further big improvements would require changes in external projects and the Linux ecosystem.
  28. </p>
  29. <p>
  30. That said, a switch from the Linux kernel to the <a href="https://www.fiwix.org/">Fiwix</a> kernel could make sense, as the latter can be built with tcc while retaining Linux compatibility.
  31. </p>
  32. <h2>Is it fast?</h2>
  33. <p>
  34. Yeah! It takes ~3 seconds to boot up and build the base in QEMU on an AMD Ryzen 7 3700X desktop, then <code>time /build-extras.sh</code> hovers around 1 minute of real time.
  35. </p>
  36. <h2>Included software</h2>
  37. <p>
  38. Below is a quick list, in compilation order. Base projects are shipped unpacked, compiled at boot-up and guaranteed to be clean; Extras are packed tarballs compiled with <code>/build-extras.sh</code>.
  39. </p>
  40. <dl>
  41. <dt>Base</dt><dd>tcc (binary seed), musl (binary seed), loksh, OpenBSD yacc, pdpmake, <code>sed(1)</code> from sbase, utils-std, heirloom-devtools, (One True) awk, heirloom, bzip2, zlib, pigz</dd>
  42. <dt>Extras</dt><dd>lua, bearssl, GNU make, e2fsprogs, gettext-tiny, pkgconf, skalibs, tiny-curl, xz, mdevd, iproute2, git</dd>
  43. </dl>
  44. <p>
  45. For details on the choices see <a href="https://hacktivis.me/git/bootstrap-initrd/file/README.md.html">bootstrap-initrd's README.md file</a>.
  46. </p>
  47. <h2>Resulting initrd</h2>
  48. <p>
  49. I chose to seed tcc+musl by reusing Alpine packages, so I'm not distributing custom binaries.
  50. Part of the reason is to avoid part of the horribly long <a href="https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst">bootstrapping path of live-bootstrap</a>.
  51. After all, I want to ease review, and I'd rather make a clear compromise against a known goal than settle on a complex and hard-to-maintain path which could merely look acceptable.
  52. </p>
  53. <p>
  54. Somewhat interestingly, it ends up at ~1.2MiB of binaries (after removing <code>libc.a</code> to shave off ~9MiB), which is significantly smaller than GNU Guix's 25MiB bootstrap-guile as presented in <a href="https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/">The Full-Source Bootstrap: Building from source all the way down</a>.
  55. </p>
  56. <p>
  57. For x86_64 the initrd currently ends up at 41MiB gzip-compressed, 58MiB uncompressed. Pretty big for an initrd, but as it is in the ballpark of Alpine's base installation (~30MiB), I consider this quite a nice feat for a fairly complete development environment including git.
  58. </p>
  59. <p>
  60. It should get even smaller once utils-std is complete enough to replace sbase (166KiB) and heirloom (977KiB and somewhat historical).
  61. But the biggest one I want to replace is curl (4.3MiB, urgh) with my <a href="https://hacktivis.me/git/httpc/">httpc</a>, as all that is needed to build systems like Oasis is an HTTP downloader, not a URL kitchen sink.
  62. </p>
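<p>
As a rough back-of-the-envelope check, the sketch below adds up the three sizes quoted above. It assumes those figures are uncompressed sizes comparable to the 58MiB initrd, and that the replacements take no space at all, so the result is only an upper bound on the saving. The function names are made up for illustration.
</p>

```c
/* Figures quoted in the article; nothing here is measured by this sketch. */
#define SBASE_KIB    166.0
#define HEIRLOOM_KIB 977.0
#define CURL_MIB     4.3
#define INITRD_MIB   58.0 /* uncompressed x86_64 initrd */

/*
 * Upper bound on the saving, in MiB.
 * Assumption: the replacements (utils-std growth, httpc) cost nothing,
 * which is optimistic.
 */
double max_saving_mib(void)
{
	return (SBASE_KIB + HEIRLOOM_KIB) / 1024.0 + CURL_MIB;
}

/* Same bound as a percentage of the uncompressed initrd. */
double max_saving_percent(void)
{
	return 100.0 * max_saving_mib() / INITRD_MIB;
}
```

<p>
Under those assumptions the bound comes out at roughly 5.4MiB, a bit over 9% of the uncompressed initrd, most of it from curl.
</p>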
  63. <h2><code>/init</code></h2>
  64. <p>A bit less than 100 lines of C are in <code>/init</code>:</p>
  65. <ol>
  66. <li><code>#!/usr/bin/tcc -run</code>: tcc gets launched by the kernel and compiles+executes <code>/init</code> in place</li>
  67. <li>mounts <code>/sys</code>, <code>/proc</code> and <code>/dev</code></li>
  68. <li>creates <code>/dev/null</code> (no idea why it's not there by default in a <code>devtmpfs</code> mount like the other special devices)</li>
  69. <li>executes a prepared command to compile a shell. I picked loksh (a Linux port of the OpenBSD KornShell), as the other shells either require a shell in their own build system or, in the case of mrsh, are incomplete</li>
  70. <li>executes said shell against <code>/init.sh</code></li>
  71. </ol>
  72. <p>
  73. A few different choices could have been made here.
  74. For example, lua can also be compiled from source similarly to loksh, and it could be an interesting choice for an environment dedicated exclusively to bootstrapping <a href="https://git.sr.ht/~mcf/oasis">Oasis Linux</a>.
  75. But I preferred to quickly launch into a shell script, to get a familiar environment which can drop into a prompt when an error happens and lets me poke around; this proved very useful in the early stages.
  76. </p>
  77. <p>
  78. Also, if tcc weren't used, a different <code>/init</code> would have to be written, probably not in C, and hopefully not pre-compiled.
  79. </p>
  80. <h2>Discoveries</h2>
  81. <ol>
  82. <li>Binutils is a 300+ MiB monster of autogenerated code, gigantic test fixtures, … Thankfully tcc can serve as a replacement</li>
  83. <li><a href="https://frippery.org/make">pdpmake: Public-Domain POSIX Make</a>: A small make implementation, can be compiled in a single command</li>
  84. <li>pigz is easier to compile than the reference implementation of gzip</li>
  85. </ol>
  86. </div>
  87. </content>
  88. </entry>