Bootstrapping
- Bootstrappable Builds (GNU Guix focus)
- bootstrapping wiki
- David A. Wheeler’s Page on Fully Countering Trusting Trust through Diverse Double-Compiling (DDC) - Countering Trojan Horse attacks on Compilers (Note: Requires trustworthy bootstrap compiler(s) as starting point)
- Mike Stute's answer to What is a coder's worst nightmare?
- research!rsc: Running the “Reflections on Trusting Trust” Compiler: This notably contains the code that Ken Thompson used, together with explanations
Reasons
- Security
- See Ken Thompson Really Did Launch His "Trusting Trust" Trojan Attack in Real Life. And Reflections on Rusting Trust: Proof of Concept, backdooring The One True Rust Compiler.
- Portability
- Binary executables have much higher bitrot than source code and keeping obsolete binary interfaces often means keeping known security issues.
- Maintainability
- By making sure someone else can actually continue maintaining the software, whether as the canonical version or as a fork
- The other side of the reproducibility coin
- One of reproducibility's effects is allowing source code to be audited instead of binaries, but said source code needs to be what is actually used.
Tools
- deblob
- Removes files in known binary executable formats (including bytecode). Designed to be fast enough to barely impact distro-scale package building performance. Cannot detect all blobs.
- Debian's suspicious-source script
- Lists files whose format isn't in a list of known source code formats, good for manual audits. Being Python + magic(5) means it is quite slow.
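The denylist approach can be illustrated with a minimal Python sketch that flags files by their leading magic bytes. This is not deblob's actual implementation or format list; the `BINARY_MAGICS` table and the function names `identify_blob` and `find_blobs` are made up for illustration, and only a handful of well-known signatures are included.

```python
from pathlib import Path
from typing import List, Optional, Tuple

# A few well-known magic numbers. Illustrative only: this is an assumption,
# not the list that deblob uses.
BINARY_MAGICS = {
    b"\x7fELF": "ELF executable or object",
    b"\xca\xfe\xba\xbe": "Java class file (or Mach-O fat binary)",
    b"\x00asm": "WebAssembly module",
    b"MZ": "DOS/PE executable (.exe, .dll)",
}


def identify_blob(header: bytes) -> Optional[str]:
    """Return a description if the header starts with a known binary magic."""
    for magic, description in BINARY_MAGICS.items():
        if header.startswith(magic):
            return description
    return None


def find_blobs(root: Path) -> List[Tuple[Path, str]]:
    """Walk a source tree and list files starting with a known binary magic."""
    found = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        with path.open("rb") as f:
            description = identify_blob(f.read(8))
        if description is not None:
            found.append((path, description))
    return found
```

Calling `find_blobs(Path("some-unpacked-tarball"))` returns the matching paths with a description each. Reading only a few bytes per file is what keeps this kind of check cheap, and it also shows the limits: a short signature like `MZ` can match a text file by accident, and any format missing from the table (or generated code stored as text) goes unnoticed.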
Problematic software
Erlang
Documented as originally implemented in Prolog, now version n requires binaries of version n-1 or n to build. No alternative compiler known so far.
Rust
There is mrustc, but it's quite unstable and so far GuixSD seems to be the only distro using it. Getting from there to the current stable rustc also involves compiling the intermediary versions. Rustc also vendors several other projects like LLVM and Rust crates (enjoy non-installable libraries), similarly to other Rust software.
GCC Rust Frontend is also not ready yet (2023-03) for userland, as cargo doesn't bootstrap…
Cargo
As if rustc not bootstrapping weren't enough, cargo, the buildsystem+dependency-installer for Rust software, depends on ~60 direct libraries, notably including 2+ git libraries, HTTP Authentication, OpenSSL.
Cargo isn't a buildsystem, it's a full-blown package manager, supply chain troublemaker (via designed-vulnerable crates.io), …
It really ought to be replaced by something which only takes care of building code (or even just generating a Makefile or a build.ninja file), as was done in the C ecosystem many times in the past (pkg-config ⇒ pkgconf, ninja ⇒ samurai, …).
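As a rough idea of what "just generating a build.ninja" could look like, here is a minimal Python sketch that emits ninja rules calling rustc directly. Everything about the input is an assumption made for illustration: the function name `ninja_for_crates`, the dict keys (`name`, `root`, `kind`, `deps`), and the output naming. It ignores build scripts, features, proc macros and everything else a real replacement would have to handle.

```python
def ninja_for_crates(crates):
    """Return build.ninja text that compiles each crate with a direct rustc call.

    `crates` is a list of dicts with hypothetical keys: name, root (path to
    lib.rs or main.rs), kind ("lib" or "bin") and deps (names of crates
    listed earlier). Crates must come after the crates they depend on.
    """
    lines = [
        "rule rustc",
        "  command = rustc --edition 2021 --crate-name $crate_name "
        "--crate-type $crate_type -L dependency=. $externs -o $out $in",
        "  description = RUSTC $out",
        "",
    ]
    outputs = {}
    for crate in crates:
        name, kind = crate["name"], crate["kind"]
        out = f"lib{name}.rlib" if kind == "lib" else name
        outputs[name] = out
        deps = crate.get("deps", [])
        build = f"build {out}: rustc {crate['root']}"
        if deps:
            # Implicit inputs: rebuild when a dependency's rlib changes.
            build += " | " + " ".join(outputs[d] for d in deps)
        lines += [build, f"  crate_name = {name}", f"  crate_type = {kind}"]
        if deps:
            externs = " ".join(f"--extern {d}={outputs[d]}" for d in deps)
            lines.append(f"  externs = {externs}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical two-crate project: a binary depending on one library.
    print(ninja_for_crates([
        {"name": "util", "root": "util/src/lib.rs", "kind": "lib", "deps": []},
        {"name": "app", "root": "app/src/main.rs", "kind": "bin", "deps": ["util"]},
    ]))
```

The generator only needs a Python interpreter, and the resulting file only needs ninja (or samurai) and rustc, so none of the network, git or TLS code has to sit in the build path.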
This isn't a system that scales, this is just creating a gigantic blob of software that cannot be reasonably audited, right in the toolchain.
Java
Requires compilers abandoned ~10 years ago, and currently the chain doesn't build up to OpenJDK for me.
Free-Pascal Compiler / Object Pascal
Aesop seems to still be at the vaporware stage, no code is available.
Nim
The transpiled C non-source code used for bootstrapping, contained in ./c_code/, is pretty much what you would get with C++ mangled symbols auto-decompiled to C.
Bootstrapping Nim via historical releases would need a bootstrap path for Object Pascal, which doesn't exist (yet?). Another way would be to have a minimal Nim compiler, written in another language, which is capable of compiling the current compiler.
QEMU
QEMU 7.0 needs a quick fix on the pc-bios/meson.build file so you can choose to not use the binaries it ships, fixed in QEMU 7.1.
They are still required, so it means identifying the source of all of them and having proper from-source packaging. This is already done in Gentoo for SeaBIOS and EDK2-OVMF (UEFI), which is enough to boot machines but not for full x86 support. Non-x86 is even more problematic (i.e. which upstream is used for OpenBIOS/OpenFirmware as used for sparc32, sparc64 and ppc32).
wine-mono
In Gentoo it's a collection of binaries. The upstream repository is at https://github.com/madewokherd/wine-mono but it still includes binaries…
mono / .NET
Source-only building is unsupported and nearly impossible (massive chain + intermediary unstable versions).
It should also be noted that Mono itself started with the Microsoft C# compiler (History | Mono) instead of dotGNU (which has been dead since 2012).
- potential prebuilt binaries in the Mono package
- Cannot build without binary-reference-assemblies
- Full source bootstrap · Issue #1930 · dotnet/source-build
- Automatically generated list of blobs via deblob on mono-6.12.0.122 tarball
Chez Scheme
Requires bootstrap files, GNU GuixSD packaging doesn't seem to have it figured out yet.
NekoVM
Doesn't seem possible to build without the boot/*.n files being present, which are NekoVM bytecode files.
Not Quite Perl (NQP)
Doesn't seem possible to build without the src/vm/moar/stage0/*.moarvm files being present, which are MoarVM bytecode files.
This means no Rakudo/Perl6.
GNU gnulib
lib/javaversion.class. Made "[PATCH] lib/javaversion.class: Remove, build from source" to have it built from source.
GNU gettext
gnulib java blob; 3 Java class files in gettext-tools/examples; gettext-tools/m4/csharpexec-test.exe which doesn't have source code (C# is effectively proprietary anyway). Did "sys-devel/gettext: deblob" to fix it.
TypeScript
Compiler itself is written in TypeScript, no bootstrap path possible as the commit introducing the compiler is TypeScript code. Want a TypeScript compiler? Get a blob from npmjs.org, like the Initial commit tells you.
Alternative might be swc (Rust). Note that Deno (also Rust) just grabs pre-transpiled JS from Microsoft and Babel simply seems to depend on the typescript package.
And it should be noted that TypeScript seems to have no specification anymore. (Commit: Remove doc folder (old archived spec and assets), word2md script)
Dart
Yet another chicken-egg language without a single documented way to bootstrap it from source. I wish they had learned from the other language from Google: Go.
rollup
- chicken-egg
- Uses rollup to build itself
- one-step circular dependency
- rollup → acorn → rollup
- links to a two-step circular dependency
- rollup → eslint → webpack → acorn → eslint
Note: acorn doesn't list its dependencies on npmjs because it publishes a pre-compiled version…
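Circular dependencies like these can be found mechanically with a depth-first search over the dependency graph. The sketch below is a minimal illustration: the `GRAPH` edges are only the ones named in the list above (not a full dependency dump from npmjs), and the function name `find_cycle` is made up here.

```python
def find_cycle(graph, start):
    """Depth-first search; return the first dependency cycle reachable
    from `start` as a list of package names, or None if there is none."""
    path = []        # packages on the current DFS path, in order
    on_path = set()  # same content as `path`, for fast membership tests
    done = set()     # packages fully explored without finding a cycle

    def visit(node):
        if node in on_path:
            # Reached a package that is already on the current path.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for dep in graph.get(node, ()):
            cycle = visit(dep)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)


# "a": ["b"] means that building a needs b. Edges taken from the list above.
GRAPH = {
    "rollup": ["acorn", "eslint"],
    "acorn": ["rollup", "eslint"],
    "eslint": ["webpack"],
    "webpack": ["acorn"],
}

if __name__ == "__main__":
    print(" → ".join(find_cycle(GRAPH, "rollup")))  # rollup → acorn → rollup
```

Breaking one cycle is not enough: with the acorn → rollup edge removed from `GRAPH`, the same search reports acorn → eslint → webpack → acorn instead.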
Potentially problematic
OCaml
Has binary seeds in ./boot. There is camlboot, but it seems to be pretty inefficient (takes hours to compile when regular OCaml takes minutes to compile).
Zig
Threw out the C++ implementation in favor of a large WASM binary seed. For now it's chained bootstrapping; hopefully an alternative compiler written in a bootstrapped language will appear.
Historically problematic
Firefox >=68 <=78
Firefox would bundle python2 and refuse to build if it was removed. See Debian firefox-esr source history
Non-Problematic / Praise
Go
"Installing Go from source" in the official Go documentation details it; both gccgo and a branch of Go 1.4 are supported.