drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git

---
title: Linux development is distributed - profoundly so
date: 2020-09-02
---

The standard introduction to git starts with an explanation of what it means to
use a "distributed" version control system. It's pointed out that every
developer has a complete local copy of the repository and can work independently
and offline, a design often contrasted with systems like SVN and CVS. The
explanation usually stops here. If you want to learn more, consider git's roots:
it is the version control system purpose-built for Linux, the largest and most
active open source project in the world. To learn more about the true nature of
distributed development, we should observe Linux.

Pull up your local copy of the Linux source code (you have one of those,
right?[^1]) and open the MAINTAINERS file. Scroll down to line 150 or so and
let's start reading some of these entries.

[^1]: Okay, just in case: `git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git`

Each of these represents a different individual or group which has some interest
in the Linux kernel, often a particular driver. Most of them have an "F" entry,
which indicates which files they're responsible for in the source code. Most
have an "L" entry, which gives a mailing list you can post questions, bug
reports, and patches to, as well as an individual maintainer ("M") or
maintainers who are known to have expertise and autonomy over this part of the
kernel. Many of them (but, hmm, not all) also have a tree ("T"), which is a
dedicated git repo with their copy of Linux, for staging changes to the kernel.
This is common with larger drivers or with "meta" organizations, which oversee
development of entire subsystems.
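
If you'd rather poke at these entries with a script than by eye, here is a
minimal sketch in Python. It assumes the layout described above: a title line
followed by one-letter tagged fields, with blank lines between entries. The
sample entries, names, and addresses are invented for illustration, and
`parse_entries` is a hypothetical helper, not part of any kernel tooling.

```python
from collections import defaultdict

# Invented sample in the MAINTAINERS layout: a title line followed by
# "X:<tab>value" fields, with blank lines between entries.
SAMPLE = """\
EXAMPLE FOO DRIVER
M:\tJane Hacker <jane@example.org>
L:\tfoo-devel@lists.example.org
S:\tMaintained
T:\tgit git://git.example.org/foo.git
F:\tdrivers/foo/

EXAMPLE BAR DRIVER
M:\tJoe Coder <joe@example.org>
L:\tfoo-devel@lists.example.org
S:\tMaintained
F:\tdrivers/bar/
"""


def parse_entries(text):
    """Split MAINTAINERS-style text into {title: {tag: [values]}}."""
    entries = {}
    for block in text.strip().split("\n\n"):
        lines = [line for line in block.splitlines() if line.strip()]
        if not lines:
            continue
        title, fields = lines[0].strip(), defaultdict(list)
        for line in lines[1:]:
            # Field lines look like "M:<tab>value"; skip anything else.
            if len(line) > 2 and line[1] == ":":
                fields[line[0]].append(line[2:].strip())
        entries[title] = dict(fields)
    return entries


entries = parse_entries(SAMPLE)
for title, fields in entries.items():
    has_tree = "yes" if "T" in fields else "no"
    print(f"{title}: lists={fields.get('L', [])} tree={has_tree}")
```

To try it on the real thing, you could read the MAINTAINERS file from your
clone and pass its contents to `parse_entries`; the file's introductory text
will need skipping or will show up as odd entries.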

However, this presents a simplified view. Look carefully at the "DRM" drivers
([Direct Rendering Manager][0]), a group of drivers and maintainers who are
collectively responsible for graphics on Linux. There are many drivers and many
maintainers, but a careful eye will notice that there are many similarities as
well. A lot of them use the same mailing list, dri-devel@lists.freedesktop.org,
and many of them use the same git repository:
`git://anongit.freedesktop.org/drm/drm-misc`. It's not mentioned in this file,
but many of them also shared the FreeDesktop bugzilla until recently, then moved
to the FreeDesktop GitLab; and many of them share the `#dri-devel` IRC channel
on Freenode. And again I'm simplifying: there are also many related IRC
channels and git repos, and some larger drivers like AMDGPU have dedicated
mailing lists and trees.

[0]: https://en.wikipedia.org/wiki/Direct_Rendering_Manager
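
One way to make that overlap visible is to count how often each list and tree
shows up across entries. The sketch below assumes the entries have already been
reduced to (title, lists, trees) tuples. The dri-devel address and the drm-misc
URL are the ones mentioned above; the driver titles and the `example.org`
values are invented.

```python
from collections import Counter

# Hypothetical, already-parsed entries: (title, mailing lists, trees).
# The dri-devel list and drm-misc tree come from the text above; the
# driver titles and the example.org values are invented.
ENTRIES = [
    ("EXAMPLE DRM DRIVER A", ["dri-devel@lists.freedesktop.org"],
     ["git://anongit.freedesktop.org/drm/drm-misc"]),
    ("EXAMPLE DRM DRIVER B", ["dri-devel@lists.freedesktop.org"],
     ["git://anongit.freedesktop.org/drm/drm-misc"]),
    ("EXAMPLE DRM DRIVER C",
     ["dri-devel@lists.freedesktop.org", "big-gpu@lists.example.org"],
     ["git://git.example.org/big-gpu.git"]),
]

# Count how many entries name each list and each tree.
list_counts = Counter(ml for _, lists, _ in ENTRIES for ml in lists)
tree_counts = Counter(tree for _, _, trees in ENTRIES for tree in trees)

# Keep only the ones that more than one entry shares.
shared_lists = {name: n for name, n in list_counts.items() if n > 1}
shared_trees = {name: n for name, n in tree_counts.items() if n > 1}

print("shared lists:", shared_lists)
print("shared trees:", shared_trees)
```

In this toy data, driver C stands in for the AMDGPU-style case: it still posts
to the common list but also has a dedicated list and tree of its own.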

There's more complexity to this system still. For example, not all of these
subsystems are using git. The Intel TXT subsystem uses Mercurial. The Device
Mapper team (one of the largest and most important Linux subsystems) uses
[Quilt][1]. And just as Linux DRM is a meta-project for many DRM-related
subsystems and drivers, there are higher-level meta-projects still, such as
driver-core, which manages code and subsystems common to *all* I/O drivers.
There are also cross-cutting concerns, such as the interaction between linux-usb
and various network driver teams.

[1]: https://savannah.nongnu.org/projects/quilt
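
The tool in use is recorded in the "T" entry itself: the first word names the
tree type and the rest gives its location. A rough tally by tool might look
like the sketch below. The tree values are invented, and `scm_type` is a
hypothetical helper.

```python
from collections import Counter

# Invented "T:" values. The first word names the tool and the rest is
# the location of the tree.
TREES = [
    "git git://git.example.org/foo.git",
    "git git://git.example.org/bar.git",
    "hg http://hg.example.org/baz",
    "quilt http://example.org/patches/qux/",
]


def scm_type(tree_value):
    """Return the leading tool name of a "T:" value, e.g. "git"."""
    return tree_value.split(None, 1)[0]


by_tool = Counter(scm_type(tree) for tree in TREES)
for tool, count in by_tool.most_common():
    print(f"{tool}: {count}")
```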

Patches to any particular driver could first end up on a domain-specific mailing
list, with a particular maintainer being responsible for reviewing and
integrating the patch, with their own policies and workflows and tooling. Then
it might flow upwards towards another subsystem with its own similar features,
and then up again towards meta-meta trees like linux-staging, and eventually to
Linus' tree[^2]. Along the way it might receive feedback from other projects if it
has cross-cutting concerns, tracing out an ever growing and shrinking bubble of
inclusion among the trees, ultimately ending up in every tree. And that's
*still* a simplification: for example, an important bug fix may sidestep
all of this entirely and get applied on top of a downstream distribution kernel,
ending up on end-user machines before it's made much progress upstream at all.

[^2]: That's not the only destination; for example, some patches will end up in the LTS kernels as well.
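
To picture that flow, you can treat the trees as nodes in a graph and follow a
patch outward from where it first lands. This is a toy model only: every tree
name below is an invented stand-in, the edges are my assumption about who pulls
from whom, and real patch flow involves review, rework, and cherry-picks that a
graph walk doesn't capture.

```python
from collections import deque

# Toy model: each tree maps to the trees that pull from it. All names
# are invented stand-ins, not real repositories.
PULLED_BY = {
    "driver-tree": ["subsystem-tree"],
    "subsystem-tree": ["integration-tree"],
    "integration-tree": ["mainline"],
    "mainline": ["stable", "distro-kernel"],
    "stable": ["distro-kernel"],
    "distro-kernel": [],
}


def reach_order(start):
    """Breadth-first order in which trees receive a patch from start."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        tree = queue.popleft()
        order.append(tree)
        for downstream in PULLED_BY.get(tree, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return order


print(" -> ".join(reach_order("driver-tree")))
```

The bug-fix shortcut described above corresponds to starting the walk at
`distro-kernel` instead: the patch reaches end users without touching any of
the upstream trees.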

This complex *graph* of Linux development has code flowing smoothly between
hundreds of repositories, emails exchanged across hundreds of mailing lists,
passing through the hands of dozens of maintainers, several bug trackers,
various CI systems, all day, every day, ten-thousand-fold. This is truly
illustrative of **distributed** software development, well above and beyond the
typical explanation given to a new git user. The profound potential of the
distributed git system can be plainly seen in the project for which it was
principally designed. It's also plain to see how difficult it would be to adapt
this system to something like GitHub pull requests, however easy those who are
perplexed by the email-driven workflow might wish it to be[^3]. As a matter of
fact, several Linux teams are already using GitHub and GitLab and even pull or
merge requests on their respective platforms. However, scaling this system up
to the entire kernel would be a great challenge indeed.

[^3]: If you are among the perplexed, [my interactive git send-email tutorial](https://git-send-email.io) takes about 10 minutes and is often recommended to new developers by Greg KH himself.

By the way, that MAINTAINERS file? Scroll to the bottom. My copy is
*19,000 lines long*.