bootstrapping.shtml (12400B)
- <!DOCTYPE html>
- <html xmlns="http://www.w3.org/1999/xhtml">
- <head>
- <!--#include file="/templates/head.shtml" -->
- <title>Bootstrapping — lanodan’s cyber-home</title>
- </head>
- <body>
- <!--#include file="/templates/en/nav.shtml" -->
- <main>
- <h1>Bootstrapping</h1>
- <ul>
- <li><a href="https://bootstrappable.org/">Bootstrappable Builds</a> (GNU Guix focus)</li>
- <li><a href="https://bootstrapping.miraheze.org/wiki/Main_Page">bootstrapping wiki</a></li>
- <li><a href="https://dwheeler.com/trusting-trust/">David A. Wheeler’s Page on Fully Countering Trusting Trust through Diverse Double-Compiling (DDC) - Countering Trojan Horse attacks on Compilers</a> (Note: Requires trustworthy bootstrap compiler(s) as starting point)</li>
- <li><a href="https://www.quora.com/What-is-a-coders-worst-nightmare/answer/Mick-Stute?srid=tQ46&share=1">Mick Stute's answer to What is a coder's worst nightmare?</a></li>
- <li><a href="https://research.swtch.com/nih">research!rsc: Running the “Reflections on Trusting Trust” Compiler</a>: This notably contains the code that Ken Thompson used, together with explanations</li>
- </ul>
- <h2>Reasons</h2>
- <dl>
- <dt>Security</dt>
- <dd>See <a href="https://niconiconi.neocities.org/posts/ken-thompson-really-did-launch-his-trusting-trust-trojan-attack-in-real-life/">Ken Thompson Really Did Launch His "Trusting Trust" Trojan Attack in Real Life</a>.
- And <a href="https://manishearth.github.io/blog/2016/12/02/reflections-on-rusting-trust/">Reflections on Rusting Trust</a>: Proof of Concept, backdooring The One True Rust Compiler.
- </dd>
- <dt>Portability</dt>
- <dd>Binary executables have much higher <a href="https://en.wikipedia.org/wiki/Software_rot">bitrot</a> than source code and keeping obsolete binary interfaces often means keeping known security issues.</dd>
- <dt>Maintainability</dt>
- <dd>Bootstrappability makes sure someone else can actually continue maintaining the software, whether as the canonical version or as a fork.</dd>
- <dt>Reproducibility's other side of the coin</dt>
- <dd>One effect of <a href="https://reproducible-builds.org/">reproducibility</a> is that source code can be audited instead of binaries, but that only helps if said source code is what actually gets built.</dd>
- </dl>
- <h2 id="tools">Tools</h2>
- <dl>
- <dt><a href="https://hacktivis.me/projects/deblob">deblob</a></dt>
- <dd>Removes files in known binary executable formats (including bytecode). Designed to be fast enough to barely impact distro-scale package building performance. Cannot detect all blobs.</dd>
- <dt>Debian's <a href="https://salsa.debian.org/debian/devscripts/-/blob/master/scripts/suspicious-source">suspicious-source</a> script</dt>
- <dd>Lists files whose type isn't in a list of known source code formats, good for manual audits. Being Python+<code>magic(5)</code>, it is quite slow.</dd>
- </dl>
- <h2>Problematic software</h2>
- <h3 id="erlang">Erlang</h3>
- <p>Documented as originally implemented in Prolog; now version <i class="math">n</i> requires binaries of version <i class="math">n-1</i> or <i class="math">n</i> to build. No alternative compiler known so far.</p>
- <h3 id="rust">Rust</h3>
- <p>
- There is <a href="https://github.com/thepowersgang/mrustc">mrustc</a> but it's quite unstable and so far GuixSD seems to be the only distro using it.
- Getting to stable also involves compiling the intermediary versions.
- Rustc also vendors several other projects like LLVM and Rust crates (enjoy non-installable libraries), similarly to other Rust software.
- </p>
- <p>
- GCC Rust Frontend is also not ready yet (2023-03) for userland, as <a href="#cargo">cargo</a> doesn't bootstrap…
- </p>
- <h3 id="cargo">Cargo</h3>
- <p>
- As if rustc not bootstrapping weren't enough, cargo, the buildsystem+dependency-installer for Rust software, depends on <a href="https://github.com/rust-lang/cargo/blob/master/Cargo.toml">~60 direct libraries</a>, notably including 2+ git libraries, HTTP Authentication, OpenSSL.<br />
- Cargo isn't just a buildsystem, it's a full-blown package manager, supply chain troublemaker (<a href="https://drewdevault.com/2022/05/12/Supply-chain-when-will-we-learn.html">via designed-vulnerable crates.io</a>), …
- </p>
- <p>
- It really ought to be replaced by something which only takes care of building code (or even just generating a <code>Makefile</code> or a <code>build.ninja</code> file), as was done in the C ecosystem many times in the past (pkg-config ⇒ <a href="https://gitea.treehouse.systems/ariadne/pkgconf">pkgconf</a>, ninja ⇒ <a href="https://github.com/michaelforney/samurai">samurai</a>, …).<br />
- This isn't a system that scales, this is just creating a gigantic blob of software that cannot be reasonably audited, right in the toolchain.
- </p>
- <h3 id="java">Java</h3>
- <p>Requires compilers abandoned ~10 years ago, and the chain currently doesn't build up to OpenJDK for me.</p>
- <h3>Free-Pascal Compiler / Object Pascal</h3>
- <p><a href="https://bootstrapping.miraheze.org/wiki/Aesop">Aesop</a> seems to still be at the vaporware stage, no code is available.</p>
- <h3 id="nim">Nim</h3>
- <p>
- The transpiled C non-source code used for bootstrapping contained in <code>./c_code/</code> is pretty much what you would get with C++ mangled symbols auto-decompiled to C.<br />
- <a href="https://bootstrapping.miraheze.org/wiki/Bootstrapping_Nim">Bootstrapping Nim via historical releases</a> would need a bootstrap path for Object Pascal, which doesn't exist (yet?). Another way would be to have a minimal Nim compiler, written in another language, which is capable of compiling the current compiler.
- </p>
- <h3 id="qemu">QEMU</h3>
- <p>
- QEMU 7.0 <a href="https://github.com/gentoo/gentoo/commit/11c7bca43160b3d893dc8d846d8da2838332123c">needs a quick fix on the <code>pc-bios/meson.build</code> file</a> so you can choose to not use the binaries it ships; this is fixed in QEMU 7.1.<br />
- The firmware is still required, so this means identifying the source of each binary and having proper from-source packaging. This is already done in Gentoo for SeaBIOS and EDK2-OVMF (UEFI), which is enough to boot machines but not for full x86 support. Non-x86 is even more problematic (for example, which upstream is used for OpenBIOS/OpenFirmware as used for sparc32, sparc64 and ppc32).
- </p>
- <h3 id="wine-mono">wine-mono</h3>
- <p>In Gentoo it's a collection of binaries. The upstream repository is at <a href="https://github.com/madewokherd/wine-mono">https://github.com/madewokherd/wine-mono</a> but it still includes binaries…</p>
- <h3 id="mono">mono / .NET</h3>
- <p>
- Source-only building is unsupported and nearly impossible (massive chain + intermediary unstable versions).<br />
- It should also be noted that Mono itself was started with the Microsoft C# compiler (<a href="https://www.mono-project.com/docs/about-mono/history/">History | Mono</a>) instead of <a href="https://www.gnu.org/software/dotgnu/">dotGNU</a> (which has been dead since 2012).
- </p>
- <ul>
- <li><a href="https://issues.guix.gnu.org/55026">potential prebuilt binaries in the Mono package</a></li>
- <li><a href="https://github.com/mono/mono/issues/7445">Cannot build without binary-reference-assemblies</a></li>
- <li><a href="https://github.com/dotnet/source-build/issues/1930">Full source bootstrap · Issue #1930 · dotnet/source-build</a></li>
- <li><a href="/notes/mono-6.12.0.122_deblob.log">Automatically generated list of blobs via deblob on mono-6.12.0.122 tarball</a></li>
- </ul>
- <h3 id="chez">Chez Scheme</h3>
- <p>Requires bootstrap files, <a href="https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/chez.scm">GNU GuixSD packaging</a> doesn't seem to have it figured out yet.</p>
- <h3 id="neko">NekoVM</h3>
- <p>Doesn't seem possible to build without <code>boot/*.n</code> files being present, which are NekoVM bytecode files.</p>
- <h3 id="nqp">Not Quite Perl (NQP)</h3>
- <p>
- Doesn't seem possible to build without <code>src/vm/moar/stage0/*.moarvm</code> files being present, which are MoarVM bytecode files.
- This means no Rakudo/Perl6.
- </p>
- <h3 id="gnulib">GNU gnulib</h3>
- <p><code>lib/javaversion.class</code>. Made <a href="https://hacktivis.me/tmp/0001-lib-javaversion.class-Remove-build-from-source.patch">[PATCH] lib/javaversion.class: Remove, build from source</a> to have it built from source.</p>
- <h3 id="gettext">GNU gettext</h3>
- <p>gnulib java blob; 3 Java class files in <code>gettext-tools/examples</code>; <code>gettext-tools/m4/csharpexec-test.exe</code> which doesn't have source code (C# is effectively proprietary anyway). Did <a href="https://github.com/gentoo/gentoo/commit/54b36e80f7c3910ae1557c2faafda3d6d62daf49">sys-devel/gettext: deblob</a> to fix it.</p>
- <h3 id="typescript">TypeScript</h3>
- <p>Compiler itself is written in TypeScript, no bootstrap path possible as the <a href="https://github.com/microsoft/TypeScript/commit/214df64e287804577afa1fea0184c18c40f7d1ca">commit introducing the compiler</a> is TypeScript code. Want TypeScript compiler? Get a blob from <code>npmjs.org</code>, like the <a href="https://github.com/microsoft/TypeScript/commit/99ec3a96880649eeaa08c3df30e3ae802048f4fe">Initial commit</a> tells you.</p>
- <p>
- Alternative might be <a href="https://github.com/swc-project/swc">swc</a> (<a href="#rust">Rust</a>). Note that <a href="https://deno.land/">Deno</a> (also <a href="#rust">Rust</a>) just <a href="https://github.com/denoland/deno/blob/main/tools/update_typescript.md">grabs pre-transpiled JS from Microsoft</a> and <a href="https://babeljs.io/">Babel</a> simply seems to depend on the <code>typescript</code> package.<br />
- And it should be noted that TypeScript seems to have no specification anymore. (Commit: <a href="https://github.com/microsoft/TypeScript/commit/91822db8e01e38e1f9d80142df67d3849851571d">Remove doc folder (old archived spec and assets), word2md script</a>)
- </p>
- <h3 id="dart">Dart</h3>
- <p>
- Yet another chicken-egg language without a single documented way to bootstrap it from source. I wish they had learned from Google's other language: Go.
- </p>
- <h3 id="rollup">rollup</h3>
- <p>
- <dl>
- <dt>chicken-egg</dt><dd>Uses rollup to build itself</dd>
- <dt>one-step circular dependency</dt><dd>rollup → acorn → rollup</dd>
- <dt>links to a two-step circular dependency</dt><dd>rollup → eslint → webpack → acorn → eslint</dd>
- </dl>
- I guess web development can also mean creating cyclic graphs of dependencies.<br />
- Note: acorn doesn't list its dependencies on npmjs because it publishes a pre-compiled version…
- </p>
- <h2>Potentially problematic</h2>
- <h3>OCaml</h3>
- <p>Has binary seeds in <code>./boot</code>. There is <a href="https://github.com/Ekdohibs/camlboot">camlboot</a>, but it seems to be pretty inefficient (it takes hours to compile where regular OCaml takes minutes).</p>
- <h3 id="zig">Zig</h3>
- <p>
- <a href="https://ziglang.org/news/goodbye-cpp/">Threw out the C++ implementation in favor of a <strong>large</strong> WASM binary seed</a>, for now it's chained-bootstrapping.
- Hopefully an alternative compiler written in a bootstrapped language will appear, because keeping versions of LLVM all the way to 15 working properly just doesn't seem sane.
- </p>
- <p>
- <a href="https://jakstys.lt/2024/zig-reproduced-without-binaries/">Zig Reproduced Without Binaries</a>
- (<a href="https://debbugs.gnu.org/cgi/bugreport.cgi?bug=74217">related debbugs.gnu.org entry</a>):
- Successfully reproducing Zig binaries within Guix, sadly impractical as it uses 53+ intermediate versions between 0.10 and 0.13.
- </p>
- <h2>Historically problematic</h2>
- <h3 id="firefox_python2">Firefox &gt;=68 &lt;=78</h3>
- <p>Firefox would bundle python2 and refuse to build if removed. See <a href="https://salsa.debian.org/mozilla-team/firefox/-/commits/esr78/master/obj-x86_64-pc-linux-gnu/_virtualenvs/init/bin">Debian firefox-esr source history</a></p>
- <h2>Non-Problematic / Praise</h2>
- <h3 id="go">Go</h3>
- <p><a href="https://golang.org/doc/install/source">Installing Go from source</a> in the official Go documentation details the process; bootstrapping via either GCCGO or a branch of Go 1.4 is supported.</p>
- </main>
- <!--#include file="/templates/en/footer.shtml" -->
- </body>
- </html>