commit: c9d17fa2d730553cd67c74ece7a968766a7283e3
parent b337e5b57b87453f8d9b78500ac37b9db33913de
Author: Haelwenn (lanodan) Monnier <contact@hacktivis.me>
Date: Sat, 6 May 2023 14:25:50 +0200
notes/unix-defects: null-termination is also lists
Diffstat:
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/notes/unix-defects.xhtml b/notes/unix-defects.xhtml
@@ -11,12 +11,16 @@
<p>This tries to list all the defects that are present in Unix, an OS from the early 70's. I consider "Unix" what current Unix clones (BSDs, illumos, Linux, …) have implemented.</p>
<p>None of this should be present in brand new systems except within a cleanly-separated compatibility layer (like Plan9 ape).</p>
- <h3 id="cstr"><code>NULL</code>-Terminated strings</h3>
+ <h3 id="lists"><code>NULL</code>-Terminated lists</h3>
<dd>
<dt>Slow to parse</dt><dd>Time taken to obtain the length increases with each <em>byte</em> aka <code class="math">O(n)</code> while length prefix is constant-time aka <code class="math">O(1)</code>.</dd>
- <dt>Inefficient & Unsafe string slices</dt><dd>For a slice without modifying the source, you need to copy the wanted part and terminate it with <code>NULL</code>. While for length prefix you can reuse the source string as-is via an offset (or pointer), followed by a byte length.</dd>
- <dt>Unsafe</dt><dd>How do you handle <code>NULL</code> being present in the middle of a string? Or <code>NULL</code> being absent?</dd>
+ <dt>Inefficient & Unsafe slices</dt><dd>To take a slice without modifying the source, you still need to copy the wanted part and terminate it with <code>NULL</code>, whereas with a length prefix you can reuse the source as-is via an offset (or pointer) and a different length.</dd>
+ <dt>Unsafe</dt><dd>How do you handle <code>NULL</code> being present in the middle of the list? Or <code>NULL</code> being absent?</dd>
</dd>
+ <p>
+ And as C doesn't have a specific type for strings (<code>char</code> represents a character in the same way a <a href="https://en.wikipedia.org/wiki/Memory_word">"word" of memory</a> represents some kind of word), these defects apply to all lists.
+ This is why most of the C API regarding strings cannot be used safely (<code>strcpy</code> vs <code>strncpy</code> or just <code>memcpy</code>), and why the APIs of so many third-party C libraries are architecturally broken.
+ </p>
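The slice argument above can be sketched in C. This is only an illustration under assumptions: <code>struct slice</code>, <code>slice_sub</code> and <code>cstr_sub</code> are hypothetical names made up for this example, not part of any standard or existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed view: a pointer plus a byte length. */
struct slice {
	const char *ptr;
	size_t len;
};

/* Constant-time sub-slice: reuses the source bytes, no copy, no terminator. */
struct slice
slice_sub(struct slice s, size_t off, size_t len)
{
	assert(off <= s.len && len <= s.len - off);
	return (struct slice){ s.ptr + off, len };
}

/*
 * Terminated equivalent: strlen() walks every byte up to the terminator,
 * then the wanted part has to be copied so it can get its own terminator.
 * Returns NULL on out-of-range input or allocation failure; caller frees.
 */
char *
cstr_sub(const char *src, size_t off, size_t len)
{
	size_t n = strlen(src); /* O(n) scan */
	if (off > n || len > n - off)
		return NULL;

	char *out = malloc(len + 1);
	if (out == NULL)
		return NULL;

	memcpy(out, src + off, len);
	out[len] = '\0';
	return out;
}
```

With <code>slice_sub</code> the result points into the original buffer, while <code>cstr_sub</code> has to allocate, copy, and hand ownership of a new buffer to the caller. Note that <code>cstr_sub</code> is still only as safe as its input: if <code>src</code> lacks a terminator, <code>strlen</code> reads out of bounds.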
<h3 id="errno"><code>errno</code></h3>
<p>