My website can't be that messy, right? git clone https://anongit.hacktivis.me/git/blog.git/

bootstrapping.shtml (12596B)


<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<!--#include file="/templates/head.shtml" -->
<title>Bootstrapping — lanodan’s cyber-home</title>
</head>
<body>
<!--#include file="/templates/en/nav.shtml" -->
<main>
<h1>Bootstrapping</h1>
<ul>
<li><a href="https://bootstrappable.org/">Bootstrappable Builds</a> (GNU Guix focus)</li>
<li><a href="https://bootstrapping.miraheze.org/wiki/Main_Page">bootstrapping wiki</a></li>
<li><a href="https://dwheeler.com/trusting-trust/">David A. Wheeler’s Page on Fully Countering Trusting Trust through Diverse Double-Compiling (DDC) - Countering Trojan Horse attacks on Compilers</a> (Note: requires trustworthy bootstrap compiler(s) as a starting point)</li>
<li><a href="https://www.quora.com/What-is-a-coders-worst-nightmare/answer/Mick-Stute?srid=tQ46&amp;share=1">Mick Stute's answer to What is a coder's worst nightmare?</a></li>
<li><a href="https://research.swtch.com/nih">research!rsc: Running the “Reflections on Trusting Trust” Compiler</a>: This notably contains the code that Ken Thompson used, together with explanations</li>
</ul>
<h2>Reasons</h2>
<dl>
<dt>Security</dt>
<dd>See <a href="https://niconiconi.neocities.org/posts/ken-thompson-really-did-launch-his-trusting-trust-trojan-attack-in-real-life/">Ken Thompson Really Did Launch His "Trusting Trust" Trojan Attack in Real Life</a>,
and <a href="https://manishearth.github.io/blog/2016/12/02/reflections-on-rusting-trust/">Reflections on Rusting Trust</a>: a proof of concept backdooring The One True Rust Compiler.
</dd>
<dt>Portability</dt>
<dd>Binary executables suffer much more <a href="https://en.wikipedia.org/wiki/Software_rot">bitrot</a> than source code, and keeping obsolete binary interfaces often means keeping known security issues.</dd>
<dt>Maintainability</dt>
<dd>Bootstrapping makes sure someone else can actually continue maintaining the software, whether as the canonical version or as a fork.</dd>
<dt>Reproducibility's other side of the coin</dt>
<dd>One effect of <a href="https://reproducible-builds.org/">reproducibility</a> is that source code can be audited instead of binaries, but that source code needs to be what is actually used.</dd>
</dl>
<h2 id="tools">Tools</h2>
<dl>
<dt><a href="https://hacktivis.me/projects/deblob">deblob</a></dt>
<dd>Removes files in known binary executable formats (including bytecode). Designed to be fast enough to barely impact distro-scale package building performance; cannot detect all blobs.</dd>
<dt>Debian's <a href="https://salsa.debian.org/debian/devscripts/-/blob/master/scripts/suspicious-source">suspicious-source</a> script</dt>
<dd>Lists files whose format isn't in a list of source code formats, which is good for manual audits. Being Python+<code>magic(5)</code>, it is quite slow.</dd>
</dl>
<h2>Problematic software</h2>
<h3 id="erlang">Erlang</h3>
<p>Documented as originally implemented in Prolog; now version <i class="math">n</i> requires binaries of version <i class="math">n-1</i> or <i class="math">n</i> to build. No alternative compiler is known so far.</p>
<h3 id="rust">Rust</h3>
<p>
There is <a href="https://github.com/thepowersgang/mrustc">mrustc</a>, but it's quite unstable and so far GuixSD seems to be the only distro using it.
Getting to stable also involves compiling the intermediary versions.
Rustc also vendors several other projects like LLVM and Rust crates (enjoy non-installable libraries), similarly to other Rust software.
</p>
<p>
GCC Rust Frontend is also not ready yet (2023-03) for userland, as <a href="#cargo">cargo</a> doesn't bootstrap…
</p>
<h3 id="cargo">Cargo</h3>
<p>
As if rustc not bootstrapping weren't enough, cargo, the buildsystem+dependency-installer for Rust software, depends on <a href="https://github.com/rust-lang/cargo/blob/master/Cargo.toml">~60 direct libraries</a>, notably including 2+ git libraries, HTTP Authentication, and OpenSSL.<br />
Cargo isn't a buildsystem, it's a full-blown package manager, supply chain troublemaker (<a href="https://drewdevault.com/2022/05/12/Supply-chain-when-will-we-learn.html">via designed-vulnerable crates.io</a>), …
</p>
<p>
It really ought to be replaced by something which only takes care of building code (or even just generating a <code>Makefile</code> or a <code>build.ninja</code> file), as was done in the C ecosystem many times in the past (pkg-config ⇒ <a href="https://gitea.treehouse.systems/ariadne/pkgconf">pkgconf</a>, ninja ⇒ <a href="https://github.com/michaelforney/samurai">samurai</a>, …).<br />
This isn't a system that scales; it just creates a gigantic blob of software that cannot be reasonably audited, right in the toolchain.
</p>
<h3 id="java">Java</h3>
<p>Requires compilers abandoned ~10 years ago; the chain currently doesn't build up to OpenJDK for me.</p>
<h3>Free-Pascal Compiler / Object Pascal</h3>
<p><a href="https://bootstrapping.miraheze.org/wiki/Aesop">Aesop</a> seems to still be at the vaporware stage, no code is available.</p>
<h3 id="nim">Nim</h3>
<p>
The transpiled C non-source code used for bootstrapping, contained in <code>./c_code/</code>, is pretty much what you would get with C++ mangled symbols auto-decompiled to C.<br />
<a href="https://bootstrapping.miraheze.org/wiki/Bootstrapping_Nim">Bootstrapping Nim via historical releases</a> would need a bootstrap path for Object Pascal, which doesn't exist (yet?). Another way would be to have a minimal Nim compiler, written in another language, which is capable of compiling the current compiler.
</p>
<h3 id="qemu">QEMU</h3>
<p>
QEMU 7.0 <a href="https://github.com/gentoo/gentoo/commit/11c7bca43160b3d893dc8d846d8da2838332123c">needs a quick fix on the <code>pc-bios/meson.build</code> file</a> so you can choose not to use the binaries it ships; this is fixed in QEMU 7.1.<br />
Those binaries are still required, so this means identifying the source of all of them and having proper from-source packaging. This is already done in Gentoo for SeaBIOS and EDK2-OVMF (UEFI), which is enough to boot machines but not for full x86 support, non-x86 being even more problematic (e.g. which upstream is used for OpenBIOS/OpenFirmware as used for sparc32, sparc64 and ppc32).
</p>
<h3 id="wine-mono">wine-mono</h3>
<p>In Gentoo it's a collection of binaries. The upstream repository is at <a href="https://github.com/madewokherd/wine-mono">https://github.com/madewokherd/wine-mono</a> but it still includes binaries…</p>
<h3 id="mono">mono / .NET</h3>
<p>
Source-only building is unsupported and nearly impossible (massive chain + intermediary unstable versions).<br />
It should also be noted that Mono itself started with the Microsoft C# compiler (<a href="https://www.mono-project.com/docs/about-mono/history/">History | Mono</a>) instead of <a href="https://www.gnu.org/software/dotgnu/">dotGNU</a> (which has been dead since 2012).
</p>
<p>2024-11-30 Update: unmush managed to build mono all the way to 6.12.0 on Guix: <a href="https://debbugs.gnu.org/cgi/bugreport.cgi?bug=74609">[PATCH] Adding a fully-bootstrapped mono</a></p>
<ul>
<li><a href="https://issues.guix.gnu.org/55026">potential prebuilt binaries in the Mono package</a></li>
<li><a href="https://github.com/mono/mono/issues/7445">Cannot build without binary-reference-assemblies</a></li>
<li><a href="https://github.com/dotnet/source-build/issues/1930">Full source bootstrap · Issue #1930 · dotnet/source-build</a></li>
<li><a href="/notes/mono-6.12.0.122_deblob.log">Automatically generated list of blobs via deblob on mono-6.12.0.122 tarball</a></li>
</ul>
<h3 id="chez">Chez Scheme</h3>
<p>Requires bootstrap files; <a href="https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/chez.scm">GNU GuixSD packaging</a> doesn't seem to have it figured out yet.</p>
<h3 id="neko">NekoVM</h3>
<p>Doesn't seem possible to build without <code>boot/*.n</code> files being present, which are NekoVM bytecode files.</p>
<h3 id="nqp">Not Quite Perl (NQP)</h3>
<p>
Doesn't seem possible to build without <code>src/vm/moar/stage0/*.moarvm</code> files being present, which are MoarVM bytecode files.
This means no Rakudo/Perl6.
</p>
<h3 id="gnulib">GNU gnulib</h3>
<p>Ships <code>lib/javaversion.class</code>. I made <a href="https://hacktivis.me/tmp/0001-lib-javaversion.class-Remove-build-from-source.patch">[PATCH] lib/javaversion.class: Remove, build from source</a> to have it built from source.</p>
<h3 id="gettext">GNU gettext</h3>
<p>Contains the gnulib Java blob, 3 Java class files in <code>gettext-tools/examples</code>, and <code>gettext-tools/m4/csharpexec-test.exe</code>, which doesn't have source code (C# is effectively proprietary anyway). I did <a href="https://github.com/gentoo/gentoo/commit/54b36e80f7c3910ae1557c2faafda3d6d62daf49">sys-devel/gettext: deblob</a> to fix it.</p>
<h3 id="typescript">TypeScript</h3>
<p>The compiler itself is written in TypeScript, and no bootstrap path is possible as the <a href="https://github.com/microsoft/TypeScript/commit/214df64e287804577afa1fea0184c18c40f7d1ca">commit introducing the compiler</a> is TypeScript code. Want a TypeScript compiler? Get a blob from <code>npmjs.org</code>, like the <a href="https://github.com/microsoft/TypeScript/commit/99ec3a96880649eeaa08c3df30e3ae802048f4fe">Initial commit</a> tells you.</p>
<p>
An alternative might be <a href="https://github.com/swc-project/swc">swc</a> (<a href="#rust">Rust</a>). Note that <a href="https://deno.land/">Deno</a> (also <a href="#rust">Rust</a>) just <a href="https://github.com/denoland/deno/blob/main/tools/update_typescript.md">grabs pre-transpiled JS from Microsoft</a> and <a href="https://babeljs.io/">Babel</a> simply seems to depend on the <code>typescript</code> package.<br />
It should also be noted that TypeScript seems to have no specification anymore. (Commit: <a href="https://github.com/microsoft/TypeScript/commit/91822db8e01e38e1f9d80142df67d3849851571d">Remove doc folder (old archived spec and assets), word2md script</a>)
</p>
<h3 id="dart">Dart</h3>
<p>
Yet another chicken-egg language without a single documented way to bootstrap it from source. I wish they had learned from the other language from Google: Go.
</p>
<h3 id="rollup">rollup</h3>
<dl>
<dt>chicken-egg</dt><dd>Uses rollup to build itself</dd>
<dt>one-step circular dependency</dt><dd>rollup → acorn → rollup</dd>
<dt>links to a two-step circular dependency</dt><dd>rollup → eslint → webpack → acorn → eslint</dd>
</dl>
<p>
I guess web development can also mean creating cyclic graphs of dependencies.<br />
Note: acorn doesn't list its dependencies on npmjs because it publishes a pre-compiled version…
</p>
<h2>Potentially problematic</h2>
<h3>OCaml</h3>
<p>Has binary seeds in <code>./boot</code>. There is <a href="https://github.com/Ekdohibs/camlboot">camlboot</a>, but it seems to be pretty inefficient (it takes hours to compile when regular OCaml takes minutes).</p>
<h3 id="zig">Zig</h3>
<p>
<a href="https://ziglang.org/news/goodbye-cpp/">Threw out the C++ implementation in favor of a <strong>large</strong> WASM binary seed</a>, for now it's chained-bootstrapping.
Hopefully an alternative compiler written in a bootstrapped language will appear, because keeping versions of LLVM all the way to 15 working properly just doesn't seem sane.
</p>
<p>
<a href="https://jakstys.lt/2024/zig-reproduced-without-binaries/">Zig Reproduced Without Binaries</a>
(<a href="https://debbugs.gnu.org/cgi/bugreport.cgi?bug=74217">related debbugs.gnu.org entry</a>):
Successfully reproducing Zig binaries within Guix, sadly impractical as it uses 53+ intermediate versions between 0.10 and 0.13.
</p>
<h2>Historically problematic</h2>
<h3 id="firefox_python2">Firefox &gt;=68 &lt;=78</h3>
<p>Firefox would bundle Python 2 and refuse to build if it was removed. See <a href="https://salsa.debian.org/mozilla-team/firefox/-/commits/esr78/master/obj-x86_64-pc-linux-gnu/_virtualenvs/init/bin">Debian firefox-esr source history</a>.</p>
<h2>Non-Problematic / Praise</h2>
<h3 id="go">Go</h3>
<p><a href="https://golang.org/doc/install/source">Installing Go from source</a> in the official Go documentation details it; both gccgo and a branch of Go 1.4 are supported.</p>
</main>
<!--#include file="/templates/en/footer.shtml" -->
</body>
</html>