drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git

---
title: A tale of two libcs
date: 2020-09-25
outputs: ["html", "gemtext"]
---

I received a bug report from Debian today, who had fed some garbage into
[scdoc](https://git.sr.ht/~sircmpwn/scdoc), and it gave them a SIGSEGV back.
Diving into this problem gave me a good opportunity to draw a comparison between
musl libc and glibc. Let's start with the stack trace:

```
==26267==ERROR: AddressSanitizer: SEGV on unknown address 0x7f9925764184
(pc 0x0000004c5d4d bp 0x000000000002 sp 0x7ffe7f8574d0 T0)
==26267==The signal is caused by a READ memory access.
    #0 0x4c5d4d in parse_text /scdoc/src/main.c:223:61
    #1 0x4c476c in parse_document /scdoc/src/main.c
    #2 0x4c3544 in main /scdoc/src/main.c:763:2
    #3 0x7f99252ab0b2 in __libc_start_main
       /build/glibc-YYA7BZ/glibc-2.31/csu/../csu/libc-start.c:308:16
    #4 0x41b3fd in _start (/scdoc/scdoc+0x41b3fd)
```

And if we pull up that line of code, we find...

```c
if (!isalnum(last) || ((p->flags & FORMAT_UNDERLINE) && !isalnum(next))) {
```

Hint: `p` is a valid pointer. `last` and `next` are both `uint32_t`. The
segfault happens in the second call to `isalnum`. And the key: it can only be
reproduced on glibc, not on musl libc. If you did a double-take, you're not
alone. There's nothing here which could have caused a segfault.

Since it was narrowed down to glibc, I pulled up the source code and went
digging for the isalnum implementation, expecting some stupid bullshit. But
before I get into their stupid bullshit, of which I can assure you there is *a
lot*, let's briefly review the happy version. This is what the musl libc
`isalnum` implementation looks like:

```c
int isalnum(int c)
{
    return isalpha(c) || isdigit(c);
}

int isalpha(int c)
{
    return ((unsigned)c|32)-'a' < 26;
}

int isdigit(int c)
{
    return (unsigned)c-'0' < 10;
}
```

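Those range checks lean on unsigned wraparound: subtracting the lower bound turns anything below it into a huge unsigned value, so a single comparison covers both ends of the range. Here's a minimal sketch of the same idea, assuming a 32-bit `int`. The names `in_range` and `ascii_alnum` are made up for illustration and are not part of musl:

```c
#include <stdint.h>

/* Hypothetical helper, not from musl: true when lo <= c < lo + span.
 * If c is below lo, the unsigned subtraction wraps around to a huge
 * value, so the one comparison rejects it as well. */
static int in_range(uint32_t c, uint32_t lo, uint32_t span)
{
    return c - lo < span;
}

/* Same arithmetic as the musl functions above: no table and no
 * pointer to chase, so any 32-bit input is safe. */
static int ascii_alnum(uint32_t c)
{
    /* OR-ing with 32 folds 'A'..'Z' onto 'a'..'z' */
    return in_range(c | 32, 'a', 26) || in_range(c, '0', 10);
}
```

Under those assumptions, `ascii_alnum('Z')` and `ascii_alnum('7')` are true, while `ascii_alnum('[')` and `ascii_alnum(0x1F600)` are false, and none of them touches memory.
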
As expected, for any value of `c`, `isalnum` will never segfault. Because why
the fuck would isalnum segfault? Okay, now, let's compare this to the
[glibc implementation][ctype]. When opening this header, you're greeted with the
typical GNU bullshit, but let's trudge through and grep for isalnum.

[ctype]: https://sourceware.org/git/?p=glibc.git;a=blob;f=ctype/ctype.h;h=351495aa4feaf23993fe65afc0760615268d044e;hb=HEAD

The first result is this:

```c
enum
{
  _ISupper = _ISbit (0),  /* UPPERCASE. */
  _ISlower = _ISbit (1),  /* lowercase. */
  // ...
  _ISalnum = _ISbit (11)  /* Alphanumeric. */
};
```

This looks like an implementation detail, let's move on.

```c
__exctype (isalnum);
```

But what's `__exctype`? Back up the file a few lines...

```c
#define __exctype(name) extern int name (int) __THROW
```

Okay, apparently that's just the prototype. Not sure why they felt the need to
write a macro for that. Next search result...

```c
#if !defined __NO_CTYPE
# ifdef __isctype_f
__isctype_f (alnum)
// ...
```

Okay, this looks useful. What is `__isctype_f`? Back up the file now...

```c
#ifndef __cplusplus
# define __isctype(c, type) \
  ((*__ctype_b_loc ())[(int) (c)] & (unsigned short int) type)
#elif defined __USE_EXTERN_INLINES
# define __isctype_f(type) \
  __extern_inline int \
  is##type (int __c) __THROW \
  { \
    return (*__ctype_b_loc ())[(int) (__c)] & (unsigned short int) _IS##type; \
  }
#endif
```

Oh... oh dear. It's okay, we'll work through this together. Let's see,
`__isctype_f` is some kind of inline function... wait, this is the else branch
of `#ifndef __cplusplus`. Dead end. Where the fuck is isalnum *actually*
defined? Grep again... okay... here we are?

```c
#if !defined __NO_CTYPE
# ifdef __isctype_f
__isctype_f (alnum)
// ...
# elif defined __isctype
# define isalnum(c) __isctype((c), _ISalnum) // <- this is it
```

Hey, there's that implementation detail from earlier! Remember this?

```c
enum
{
  _ISupper = _ISbit (0),  /* UPPERCASE. */
  _ISlower = _ISbit (1),  /* lowercase. */
  // ...
  _ISalnum = _ISbit (11)  /* Alphanumeric. */
};
```

Let's suss out that macro real quick:

```c
# include <bits/endian.h>
# if __BYTE_ORDER == __BIG_ENDIAN
#  define _ISbit(bit) (1 << (bit))
# else /* __BYTE_ORDER == __LITTLE_ENDIAN */
#  define _ISbit(bit) ((bit) < 8 ? ((1 << (bit)) << 8) : ((1 << (bit)) >> 8))
# endif
```

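For the curious, here's a small sketch that works out what the little-endian branch actually produces. It copies that branch under a made-up name (`ISBIT_LE`, chosen so it can't collide with the real header) and evaluates it for the three flags quoted from the enum. The `LE_` enumerator names are likewise hypothetical:

```c
/* Copy of the little-endian branch of glibc's _ISbit, renamed so it
 * cannot clash with <ctype.h>. */
#define ISBIT_LE(bit) ((bit) < 8 ? ((1 << (bit)) << 8) : ((1 << (bit)) >> 8))

/* The three flags quoted from the enum earlier in the post. */
enum {
    LE_ISupper = ISBIT_LE(0),   /* (1 << 0) << 8  == 0x0100 */
    LE_ISlower = ISBIT_LE(1),   /* (1 << 1) << 8  == 0x0200 */
    LE_ISalnum = ISBIT_LE(11)   /* (1 << 11) >> 8 == 0x0008 */
};
```

In other words, the macro swaps the two bytes of a 16-bit mask, and the magic number for `_ISalnum` on a little-endian machine works out to 8.
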
Oh, for fuck's sake. Whatever, let's move on and just assume this is a magic
number. The other macro is `__isctype`, which is similar to the `__isctype_f` we
were looking at a moment ago. Let's go look at that `ifndef __cplusplus` branch
again:

```c
#ifndef __cplusplus
# define __isctype(c, type) \
  ((*__ctype_b_loc ())[(int) (c)] & (unsigned short int) type)
#elif defined __USE_EXTERN_INLINES
// ...
#endif
```

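Expanded by hand, `isalnum(c)` under that macro comes out roughly as the function below. This is a sketch, not glibc's code: `expanded_isalnum` is a made-up name, the glibc branch relies on `__ctype_b_loc` and `_ISalnum` being visible from glibc's `<ctype.h>`, and other libcs simply fall back to the public `isalnum`. It is only safe for arguments the spec permits.

```c
#include <ctype.h>

/* Hand-expanded form of glibc's isalnum(c) macro. Only valid for c in
 * the range the C standard allows (unsigned char values or EOF). */
static int expanded_isalnum(int c)
{
#ifdef __GLIBC__
    const unsigned short **loc = __ctype_b_loc(); /* call into libc       */
    const unsigned short *table = *loc;           /* first dereference    */
    return table[c] & (unsigned short)_ISalnum;   /* second: indexed load */
#else
    return isalnum(c); /* non-glibc: use the public interface instead */
#endif
}
```

The indexed load on the last glibc line is the one that can wander off the end of the table when `c` is out of range.
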
...

Well, at least we have a pointer dereference now, that could explain the
segfault. What's `__ctype_b_loc`?

```c
/* These are defined in ctype-info.c.
   The declarations here must match those in localeinfo.h.
   In the thread-specific locale model (see `uselocale' in <locale.h>)
   we cannot use global variables for these as was done in the past.
   Instead, the following accessor functions return the address of
   each variable, which is local to the current thread if multithreaded.
   These point into arrays of 384, so they can be indexed by any `unsigned
   char' value [0,255]; by EOF (-1); or by any `signed char' value
   [-128,-1). ISO C requires that the ctype functions work for `unsigned
   char' values and for EOF; we also support negative `signed char' values
   for broken old programs. The case conversion arrays are of `int's
   rather than `unsigned char's because tolower (EOF) must be EOF, which
   doesn't fit into an `unsigned char'. But today more important is that
   the arrays are also used for multi-byte character sets. */
extern const unsigned short int **__ctype_b_loc (void)
     __THROW __attribute__ ((__const__));
extern const __int32_t **__ctype_tolower_loc (void)
     __THROW __attribute__ ((__const__));
extern const __int32_t **__ctype_toupper_loc (void)
     __THROW __attribute__ ((__const__));
```

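That comment pins down the valid index range: 384 entries, covering -128 through 255. A rough model of that layout, using a zeroed stand-in table and made-up names rather than glibc's real data, shows how little room there is:

```c
#define TABLE_SIZE 384  /* entry count, per the glibc comment */
#define TABLE_BIAS 128  /* entries that sit in front of index 0 */

/* Stand-in for the locale table; the real contents live inside glibc. */
static const unsigned short table_storage[TABLE_SIZE] = {0};

/* Points at the entry for index 0, so table[-128] .. table[255] are valid. */
static const unsigned short *const table = table_storage + TABLE_BIAS;

/* Hypothetical check: would table[c] stay inside the 384 entries? */
static int index_in_table(long c)
{
    return c >= -TABLE_BIAS && c < TABLE_SIZE - TABLE_BIAS;
}
```

Anything that fails this check, which includes every codepoint above 255, indexes memory the table doesn't own.
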
That is just so, super cool of you, glibc. I just *love* dealing with locales.
Anyway, my segfaulted process is sitting in gdb, and equipped with all of this
information I wrote the following monstrosity:

```
(gdb) print ((unsigned int **(*)(void))__ctype_b_loc)()[next]
Cannot access memory at address 0x11dfa68
```

Segfault found. Reading that comment again, we see "ISO C requires that the
ctype functions work for 'unsigned char' values and for EOF". If we
cross-reference that with the specification:

> In all cases [of functions defined by ctype.h,] the argument is an int, the
> value of which shall be representable as an unsigned char or shall equal the
> value of the macro EOF.

So the fix is obvious at this point. Okay, fine, my bad. My code is wrong. I
apparently cannot just hand a UTF-32 codepoint to isalnum and expect it to tell
me if it's between 0x30-0x39, 0x41-0x5A, or 0x61-0x7A.

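One way to stay within what the spec allows is to reject anything outside the ASCII range before it ever reaches `isalnum`. This is only a sketch: `codepoint_isalnum` is an illustrative name and not necessarily what scdoc ended up with.

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical wrapper: only values representable as an unsigned char
 * (narrowed further here to ASCII) are ever passed to isalnum(). */
static int codepoint_isalnum(uint32_t c)
{
    return c < 0x80 && isalnum((int)c);
}
```

With a guard like this, the condition from `parse_text` could call `codepoint_isalnum(last)` and `codepoint_isalnum(next)` and get the same answer on either libc.
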
But, I'm going to go out on a limb here: maybe isalnum should never cause a
program to segfault no matter what input you give it. Maybe just because the
spec says you *can* does not mean you *should*. Maybe, just maybe, the behavior
of this function should not depend on five macros, whether or not you're using a
C++ compiler, the endianness of your machine, a look-up table, thread-local
storage, and two pointer dereferences.

Here's the musl version as a quick reminder:

```c
int isalnum(int c)
{
    return isalpha(c) || isdigit(c);
}

int isalpha(int c)
{
    return ((unsigned)c|32)-'a' < 26;
}

int isdigit(int c)
{
    return (unsigned)c-'0' < 10;
}
```

Bye!