drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git

Re-Slow.md (9404B)


  1. ---
  2. date: 2020-01-08
  3. title: Following up on "Hello world"
  4. layout: post
  5. tags: [followup]
  6. ---
  7. This is a follow-up to my last article, [Hello
  8. world](https://drewdevault.com/2020/01/04/Slow.html), which is easily the most
  9. negatively received article I've written — a remarkable feat for someone
  10. who's written as much flame bait as me. Naturally, the fault lies with the
  11. readers.
  12. <a href="https://xkcd.com/1984/" rel="noopener"><img src="https://imgs.xkcd.com/comics/misinterpretation_2x.png" width="294" /></a>
  13. All jokes aside, I'll try to state my point better. The "Hello world" article
  14. was a lot of work to put together &mdash; frustrating work &mdash; by the time
  15. I had finished collecting numbers, I was exhausted and didn't pay much mind to
  16. giving them context. This left a lot open to interpretation, and a lot
  17. of those interpretations didn't give me the benefit of the doubt.
  18. First, it's worth clarifying that the assembly program I gave is a
  19. *hypothetical, idealized* hello world program, and in practice not even the
  20. assembly program is safe from bloat. After it's wrapped up in an ELF, even after
  21. stripping, the binary bloats up to <strong>157&times;</strong> the size of the
  22. actual machine code. I had hoped this would be more intuitively clear, but the
  23. take-away is that the ideal program is a pipe dream, not a standard to which the
  24. others are held. As the infinite frictionless plane in vacuum is to physics,
  25. that assembly program is to compilers.
  26. I also made the mistake of including the runtime in the table. What I wanted you
  27. to notice about the runtime column is that it *rounds to zero* for 15 of the 21 test
  28. cases, and arguably only one or two approach the realm of human perception.
  29. It's meant to lend balance to the point I'm making with the number of syscalls:
  30. despite the complexity on display, the user generally can't even tell. The other
  31. problem with including the runtimes is that it makes it look like a benchmark,
  32. which it's not (you'll notice that if you grep for "benchmark", you will find no
  33. results).
  34. Another improvement would have been to group rows of the table by orders of
  35. magnitude (in terms of number of syscalls), and maybe separate the outliers in
  36. each group. There is little difference between many of the languages in the
  37. middle of the table, but when one of them is your favorite language, "stacking
  38. it up" against its competitors like this is a good way to get the reader's blood
  39. pumping and bait some flames. If your language appears to be represented
  40. unfavorably on this chart, you're likely to point out the questionable
  41. methodology, golf your way to more generous sample code, etc.: things I could
  42. have done myself were I trying to make a benchmark rather than a point about
  43. complexity.
  44. And hidden therein is my actual point: complexity. There has long been a trend
  45. in computing of endlessly piling on the abstractions, with no regard for the
  46. consequences. The web is an ever growing mess of complexity, with larger and
  47. larger blobs of inscrutable JavaScript being shoved down pipes with no regard
  48. for the pipe's size or the bridge toll charged by the end-user's telecom.
  49. Electron apps are so far removed from hardware that their jarring non-native UIs
  50. can take seconds to respond and eat up the better part of your RAM to merely
  51. show a text editor or chat application.
  52. The PC in front of me is literally five thousand times faster than the graphing
  53. calculator in my closet - but the latter can boot to a useful system in a
  54. fraction of a millisecond, while my PC takes almost a minute. Productivity per
  55. CPU cycle per Watt is the lowest it's been in decades, and is orders of
  56. magnitude (plural) beneath its potential. So far as most end-users are
  57. concerned, computers haven't improved in meaningful ways in the past 10 years,
  58. and in many respects have become worse. The cause is well-known: programmers
  59. have spent the entire lifetime of our field recklessly piling abstraction on top
  60. of abstraction on top of abstraction. We're more concerned with shoving more
  61. spyware at the problem than we are with optimization, outside of a small number
  62. of high-value problems like video decoding.[^1] Programs have grown fat and
  63. reckless in scope, and it affects literally everything, even down to the last
  64. bastion of low-level programming: C.
  65. I use syscalls as an approximation of this complexity. Even for one of the
  66. simplest possible programs, there is a huge amount of abstraction and complexity
  67. that comes with many approaches to its implementation. If I just print "hello
  68. world" in Python, users are going to bring along almost a million lines of code
  69. to run it, and the fraction of it that isn't dead code is basically a rounding error.
  70. This isn't *always* a bad thing, but it often is and no one is thinking about
  71. it.
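One rough way to see some of this for yourself is to ask a fresh Python interpreter how many modules it has imported by the time it has printed "hello world". This is only a sketch, and it makes assumptions: the helper name `modules_loaded_for_hello` is invented for this example, the module count is a crude proxy rather than a count of lines of code, and the number will differ between Python versions and platforms.

```python
import subprocess
import sys


def modules_loaded_for_hello() -> int:
    """Run hello world in a fresh interpreter and count the imported modules."""
    code = 'import sys; print("hello world"); print(len(sys.modules))'
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    # The last line of output is the module count printed by the child.
    return int(result.stdout.splitlines()[-1])


if __name__ == "__main__":
    print(modules_loaded_for_hello())
```

Whatever number comes back, none of those modules were asked for by the one-line program itself.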
  72. That's the true message I wanted you to take away from my article: most
  73. programmers aren't thinking about this complexity. Many choose tools because
  74. it's easier for them, or because it's what they know, or because developer time
  75. is more expensive than the user's CPU cycles or battery life and the engineers
  76. aren't signing the checks. I hoped that many people would be surprised at just
  77. how much work their average programming language could end up doing even when
  78. given simple tasks.
  79. The point was not that your programming language is wrong, or that being higher
  80. up on the table is better, or that programming languages should be blindly
  81. optimizing these numbers. The point is, if these numbers surprised you, then you
  82. should find out why! I'm a systems programmer &mdash; I want you to be
  83. interested in your systems! And if this surprises you, I wonder what else
  84. might...
  85. I know that article didn't do a good job of explaining any of this. I'm sorry.
  86. ---
  87. Now to address more specific comments:
  88. **What the fuck is a syscall**?
  89. This question is more common with users of the languages which make more of
  90. them, ironically. A syscall is when your program asks the kernel to do something
  91. for it. This causes a transition from *user space* to *kernel space*. This
  92. transition is one of the more expensive things your programs can do, but a
  93. program that doesn't make any syscalls is not a useful program: syscalls are
  94. necessary to do any kind of I/O (input or output). See the [Wikipedia
  95. page](https://en.wikipedia.org/wiki/System_call) for more detail.
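To make that concrete, here is a minimal Python sketch of a program asking the kernel to do I/O more or less directly. `os.write` is a thin wrapper around the `write(2)` syscall. The helper name `say_hello` is made up for this example, and the interpreter will of course make plenty of other syscalls on its own before this code ever runs.

```python
import os


def say_hello(fd: int = 1) -> int:
    """Write the greeting to file descriptor `fd` via os.write.

    os.write hands the bytes to the kernel's write(2) and returns how
    many bytes the kernel accepted.
    """
    return os.write(fd, b"hello world\n")


if __name__ == "__main__":
    say_hello()  # fd 1 is standard output
```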
  96. On Linux, you can use the [strace](https://linux.die.net/man/1/strace) tool to
  97. analyze the syscalls your programs are making, which is how I obtained the
  98. numbers in the original article.
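If you want to tally the calls yourself, something like the following Python sketch could work. It is not the script used for the original article, and it assumes strace's default line format, where each line starts with the syscall name followed by `(`, optionally preceded by a `[pid N]` prefix. The names `SYSCALL_LINE` and `count_syscalls` are invented here, and the sample trace is hand-written for illustration.

```python
import re
from collections import Counter

# A line of strace output normally starts with the syscall name followed by
# "(", for example:  write(1, "hello world\n", 12) = 12
SYSCALL_LINE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?([a-z_][a-z0-9_]*)\(")


def count_syscalls(trace: str) -> Counter:
    """Count syscalls by name in the text produced by strace."""
    counts = Counter()
    for line in trace.splitlines():
        match = SYSCALL_LINE.match(line)
        if match:  # skips lines like "+++ exited with 0 +++" and signals
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hand-written sample in strace's format, not output from the article.
    sample = (
        'execve("./hello", ["./hello"], 0x0 /* 20 vars */) = 0\n'
        'write(1, "hello world\\n", 12) = 12\n'
        "exit(0) = ?\n"
        "+++ exited with 0 +++\n"
    )
    counts = count_syscalls(sample)
    print(sum(counts.values()), dict(counts))
```

To feed it real data, `strace -o trace.txt ./hello` writes the trace to a file you can read back in. For a quick total, `strace -c` prints a per-syscall summary without any scripting.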
  99. **This "benchmark" is biased against JIT'd and interpreted languages**.
  100. Yes, it is. It *is* true that many programming environments have to factor
  101. in a "warm up" time. At face value, this argument is validated by
  102. the cargo-culted (and often correct) wisdom that benchmarks should be conducted
  103. with timers in-situ, post warm-up period, with the measured task being
  104. repeated many times so that trends become more obvious.[^2] It's precisely these
  105. details, which the conventional benchmarking wisdom aims to obscure, that I'm
  106. trying to cast a light on. While a benchmark which shows how quickly a bunch of
  107. programming languages can print "hello world" a million times[^3] might be
  108. interesting, it's not what I'm going for here.
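For reference, the conventional approach described above looks roughly like this in Python. It is a sketch under assumptions: the function name `bench`, its default `warmup` and `repeats` values, and the `first_call` field are all invented for this example. The first call is timed separately to show the kind of detail the steady-state number leaves out.

```python
import time


def bench(task, warmup: int = 100, repeats: int = 1000) -> dict:
    """Time `task` the conventional way and also report the very first call."""
    # First call: includes whatever one-time setup the task drags along.
    start = time.perf_counter()
    task()
    first_call = time.perf_counter() - start

    # Warm-up runs, deliberately left out of the measurement.
    for _ in range(warmup):
        task()

    # Timed runs, repeated so that noise averages out.
    start = time.perf_counter()
    for _ in range(repeats):
        task()
    steady = (time.perf_counter() - start) / repeats

    return {"first_call": first_call, "steady_state": steady}


if __name__ == "__main__":
    print(bench(lambda: sum(range(1000))))
```

Note that even `first_call` is measured after the interpreter has already started, so it still hides the startup work that the numbers in the original article include.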
  109. **Rust is doing important things with those syscalls**.
  110. My opinion on this is mixed: yes, stack guards are useful. However, my "hello
  111. world" program has zero chance of causing a stack overflow. In theory, Rust
  112. should be able to reckon whether or not many programs are at risk of stack
  113. overflow. If not, it can ask the programmer to specify some bounds, or it can
  114. emit the stack guards *only in those cases*. The worst option is panicking, and
  115. I'm surprised that Crustaceans feel like this is sufficient. Funny, given their
  116. obsession with "zero cost" abstractions, that a nonzero-cost abstraction would
  117. be so fiercely defended. They're already used to overlong compile times, adding
  118. more analysis probably won't be noticed ;)
  119. **Go is doing important things with those syscalls**.
  120. On this I wholly disagree. I hate the Go runtime, it's the worst thing about an
  121. otherwise great language. Go programs are almost impossible to debug for having
  122. to sift through mountains of unrelated bullshit the program is doing, all to
  123. support a concurrency/parallelism model that I also strongly dislike. There are
  124. some bad design decisions in Golang and stracing the average Go program brings a
  125. lot of them to light. Illumos has many of its own problems, but [this
  126. article](http://dtrace.org/blogs/wesolows/2014/12/29/golang-is-trash/) about
  127. porting Go to it covers a number of related problems.
  128. **Wow, Zig is competitive with assembly?**
  129. Yeah, I totally had the same reaction. I'm interested to see how it measures up
  130. under more typical workloads. People keep asking me what I think about Zig in
  131. general, and I think it has potential, but I also have a lot of complaints. It's
  132. not likely to replace C for me, but it might have a place somewhere in my stack.
  133. [^1]: For efficient display of unskippable 30 second video ads, of course.
  134. [^2]: This approach is the most "fair" for comparison's sake, but it also often obscures a lot of the practical value of the benchmark in the first place. For example, how often are the branch predictor and L1 cache going to be warmed up in favor of the measured code in practice?
  135. [^3]: All of them being handily beaten by `/bin/yes "hello world"`