blog

My website can't be that messy, right?
git clone https://hacktivis.me/git/blog.git
commit: c9d17fa2d730553cd67c74ece7a968766a7283e3
parent b337e5b57b87453f8d9b78500ac37b9db33913de
Author: Haelwenn (lanodan) Monnier <contact@hacktivis.me>
Date:   Sat,  6 May 2023 14:25:50 +0200

notes/unix-defects: null-termination is also lists

Diffstat:

M notes/unix-defects.xhtml | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/notes/unix-defects.xhtml b/notes/unix-defects.xhtml
@@ -11,12 +11,16 @@
  <p>This tries to list all the defects that are present in Unix, an OS from the early 70's. I consider "Unix" what current Unix clones (BSDs, illumos, Linux, …) have implemented.</p>
  <p>None of this should be present in brand new systems except within a cleanly-separated compatibility layer (like Plan9 ape).</p>
 
- <h3 id="cstr"><code>NULL</code>-Terminated strings</h3>
+ <h3 id="lists"><code>NULL</code>-Terminated lists</h3>
  <dd>
  <dt>Slow to parse</dt><dd>Time taken to obtain the length increases with each <em>byte</em> aka <code class="math">O(n)</code> while length prefix is constant-time aka <code class="math">O(1)</code>.</dd>
- <dt>Inefficient &amp; Unsafe string slices</dt><dd>For a slice without modifying the source, you need to copy the wanted part and terminate it with <code>NULL</code>. While for length prefix you can reuse the source string as-is via an offset (or pointer), followed by a byte length.</dd>
- <dt>Unsafe</dt><dd>How do you handle <code>NULL</code> being present in the middle of a string? Or <code>NULL</code> being absent?</dd>
+ <dt>Inefficient &amp; Unsafe slices</dt><dd>For a slice without modifying the source, you still need to copy the wanted part and terminate it with <code>NULL</code>. While with length prefix you can reuse the source as-is via an offset (or pointer) and setting a different length.</dd>
+ <dt>Unsafe</dt><dd>How do you handle <code>NULL</code> being present in the middle of the list? Or <code>NULL</code> being absent?</dd>
  </dd>
+ <p>
+ And as C doesn't have a specific type for strings (<code>char</code> represents a character in the same way a <a href="https://en.wikipedia.org/wiki/Memory_word">"word" of memory</a> represents some kind of word), the defects applies to all lists.
+ This is why most of the C API regarding strings cannot be used safely (<code>strcpy</code> vs <code>strncpy</code> or just <code>memcpy</code>), or why so many third-party C libraries APIs are architecturally broken.
+ </p>
 
  <h3 id="errno"><code>errno</code></h3>
  <p>
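
The slicing and length arguments in the patched note can be illustrated with a small C sketch. It is not part of the commit: `struct slice`, `slice_of` and `cstr_slice` are hypothetical names made up for this illustration, and the code only assumes a C99 compiler with the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed view: a pointer plus a byte count. */
struct slice {
	const char *ptr;
	size_t len;
};

/* O(1): describe a sub-range of the source without copying or modifying it. */
static struct slice slice_of(struct slice src, size_t off, size_t len)
{
	/* The bounds check is possible because the length is known up front. */
	assert(off <= src.len && len <= src.len - off);
	return (struct slice){ src.ptr + off, len };
}

/*
 * Terminator-based equivalent: the source has to be scanned to learn its
 * length (O(n)), then the wanted part is copied and terminated in dst.
 * Returns 0 on success, -1 if the range or the destination does not fit.
 */
static int cstr_slice(char *dst, size_t dst_size,
                      const char *src, size_t off, size_t len)
{
	size_t n = strlen(src); /* walks every byte up to the terminator */

	if (off > n || len > n - off || len >= dst_size)
		return -1;
	memcpy(dst, src + off, len);
	dst[len] = '\0';
	return 0;
}
```

With `struct slice s = { "foo\0bar", 7 };`, `slice_of(s, 4, 3)` refers to the bytes `bar` inside the original buffer, while `strlen(s.ptr)` stops at the embedded terminator and reports 3. That is the "present in the middle" case from the note.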