logo

drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git

Porting-DOOM-to-Helios.md (15860B)


  1. ---
  2. title: Porting Doom to Helios
  3. date: 2022-07-01
  4. ---
  5. [Doom] was an incredibly popular video game by Id software which, six years
  6. following its release, was made [open source] under the GPLv2 license. Thanks to
  7. this release, combined with the solid software design and lasting legacy of
  8. backwards compatibility in C, Doom has been ported to countless platforms by
  9. countless programmers. And I recently added myself to this number :)
  10. [Doom]: https://en.wikipedia.org/wiki/Doom_(1993_video_game)
  11. [open source]: https://github.com/id-Software/DOOM
  12. <video src="https://mirror.drewdevault.com/doom.webm" controls muted autoplay></video>
  13. I'm working on a new kernel called [Helios], and I thought that porting Doom
  14. would present a good opportunity for proving the kernel design &mdash; you never
  15. know if you have a good design until you try to use it for real. Doom is a good
  16. target because it does not require much to get working, but it is a useful (and
  17. fun) program to port. It calls for the following features:
  18. [Helios]: https://sr.ht/~sircmpwn/helios
  19. 1. A working C programming environment
  20. 1. Dynamic memory allocation
  21. 1. A place to draw the screen (a framebuffer)
  22. 1. Keyboard input
  23. As I was working, I gradually came to understand that Helios was pretty close to
  24. supporting all of these features, and thought that the time to give Doom a shot
  25. was coming soon. In my [last status update], I shared a picture of a Helios
  26. userspace program utilizing the framebuffer provided by multiboot, ticking one
  27. box. We've had dynamic memory allocation in userspace working since June 8th.
  28. The last pieces were a keyboard driver and a C library.
  29. [last status update]: https://drewdevault.com/2022/06/15/Status-update-June-2022.html
  30. I started with the keyboard driver, since that would let me continue to work on
  31. Hare for a little bit longer, providing a more direct benefit to the long-term
  32. goals (rather than the short-term goal of "get Doom to work"). Since Helios is a
  33. micro-kernel, the keyboard driver is implemented in userspace. A PS/2 keyboard
  34. driver requires two features which are reserved to ring 0: I/O ports and IRQ
  35. handling. To simplify the interface to the essentials for this use-case,
  36. pressing or releasing a key causes IRQ 1 to be fired on the PIC, then reading
  37. from port 0x60 provides a scancode. We already had support for working with I/O
  38. ports in userspace, so the blocker here was IRQ handling.
  39. Helios implements IRQs similarly to seL4, by using a "notification" object (an
  40. IPC primitive) which is signalled by the kernel when an IRQ occurs. I was
  41. pleased to have this particular blocker, as developing out our IPC
  42. implementation further was a welcome task. The essential usage of a notification
  43. involves two operations: wait and signal. The former blocks until the
  44. notification is signalled, and the later signals the notification and unblocks
  45. any tasks which are waiting on it. Unlike sending messages to endpoints, signal
  46. never blocks.
  47. After putting these pieces together, I was able to write a simple PS/2 keyboard
  48. driver which echos pressed keys to the kernel console:
  49. ```hare
  50. const irq1_notify = helios::newnotification()?;
  51. const irq1 = helios::irqcontrol_issue(rt::INIT_CAP_IRQCONTROL, irq1_notify, 1)?;
  52. const ps2 = helios::iocontrol_issue(rt::INIT_CAP_IOCONTROL, 0x60, 0x64)?;
  53. for (true) {
  54. helios::wait(irq1_notify)?;
  55. const scancode = helios::ioport_in8(ps2, 0x60)?;
  56. helios::irq_ack(irq1)!;
  57. };
  58. ```
  59. This creates a notification capability to wait on IRQs, then creates a
  60. capability for IRQ 1 registered for that notification. It also issues an I/O
  61. port capability for the PS/2 ports, 0x60-0x64 (inclusive). Then it loops,
  62. waiting until an interrupt occurs, reading the scancode from the port, and
  63. printing it. Simple!
  64. I now turned my attention to a C library for Doom. The first step for writing
  65. userspace programs in C for a new operating system is to produce a suitable C
  66. cross-compiler toolchain. I adapted the instructions from [this OSdev wiki
  67. tutorial](https://wiki.osdev.org/GCC_Cross-Compiler) for my needs and produced
  68. the working patches for [binutils] and [gcc]. I started on a simple C library
  69. that included [some assembly glue][0] for syscalls, [an entry point][1], and
  70. [a couple of syscall wrappers][2]. With great anticipation, I wrote the
  71. following C program and loaded it into Helios:
  72. [binutils]: https://git.sr.ht/~sircmpwn/binutils/commit/b104dee8b4d5f6fb57d585132775e22f0eba80df
  73. [gcc]: https://git.sr.ht/~sircmpwn/gcc/commit/20df2b4d99670f2db51f84dc57d2253fd71d0b2b
  74. [0]: https://git.sr.ht/~sircmpwn/mercury/tree/f80bb66373ab12a66a9a86894d212cbbdfcf53bf/item/libc/syscall.s
  75. [1]: https://git.sr.ht/~sircmpwn/mercury/tree/f80bb66373ab12a66a9a86894d212cbbdfcf53bf/item/libc/crt
  76. [2]: https://git.sr.ht/~sircmpwn/mercury/tree/1217b54b7bd09bebcd672d1e9cdae14f2e2e545f/item/libc/syscalls.c
  77. ```c
  78. #include <helios/syscall.h>
  79. #include <string.h>
  80. int main() {
  81. const char *message = "Hello from userspace in C!\n";
  82. sys_writecons(message, strlen(message));
  83. return 0;
  84. }
  85. ```
  86. ```
  87. $ qemu-system-x86_64 -m 1G -no-reboot -no-shutdown \
  88. -drive file=boot.iso,format=raw \
  89. -display none \
  90. -chardev stdio,id=char0 \
  91. -serial chardev:char0
  92. Booting Helios kernel
  93. Hello from userspace in C!
  94. ```
  95. Woohoo! After a little bit more work setting up the basics, I started rigging
  96. [doomgeneric] (a Doom fork designed to be easy to port) up to my cross
  97. environment and seeing what would break.
  98. [doomgeneric]: https://github.com/ozkl/doomgeneric
  99. As it turned out, a lot of stuff would break. doomgeneric is designed to be
  100. portable, but it actually depends on a lot of stuff to be available from the C
  101. environment: stdio, libmath, string.h stuff, etc. Not too much, but more than I
  102. cared to write from scratch. So, I started pulling in large swaths of [musl
  103. libc], trimming out as much as I could, and wriggling it into a buildable state.
  104. I also wrote a lot of shims to fake out having a real Unix system to run it in,
  105. like this code for defining stdout & stderr to just write to the kernel console:
  106. [musl libc]: https://musl.libc.org
  107. ```c
  108. static size_t writecons(FILE *f, const unsigned char *buf, size_t size) {
  109. sys_writecons(f->wbase, f->wpos - f->wbase);
  110. sys_writecons(buf, size);
  111. f->wend = f->buf + f->buf_size;
  112. f->wpos = f->wbase = f->buf;
  113. return size;
  114. }
  115. #undef stdout
  116. static unsigned char stdoutbuf[BUFSIZ+UNGET];
  117. hidden FILE __stdout_FILE = {
  118. .buf = stdoutbuf+UNGET,
  119. .buf_size = sizeof stdoutbuf-UNGET,
  120. .fd = 1,
  121. .flags = F_PERM | F_NORD,
  122. .lbf = '\n',
  123. .write = &writecons,
  124. .seek = NULL,
  125. .close = NULL,
  126. .lock = -1,
  127. };
  128. FILE *const stdout = &__stdout_FILE;
  129. FILE *volatile __stdout_used = &__stdout_FILE;
  130. #undef stderr
  131. static unsigned char stderrbuf[UNGET];
  132. hidden FILE __stderr_FILE = {
  133. .buf = stderrbuf+UNGET,
  134. .buf_size = 0,
  135. .fd = 2,
  136. .flags = F_PERM | F_NORD,
  137. .lbf = -1,
  138. .write = &writecons,
  139. .seek = NULL,
  140. .close = NULL,
  141. .lock = -1,
  142. };
  143. FILE *const stderr = &__stderr_FILE;
  144. FILE *volatile __stderr_used = &__stderr_FILE;
  145. ```
  146. The result of all of this hacking and slashing is quite a mess, and none of this
  147. is likely to be useful in the long term. I did this work over the course of a
  148. couple of afternoons just to get everything "working" enough to support Doom,
  149. but an actual useful C programming environment for Helios is likely some ways
  150. off. Much of the near-term work will be in Mercury, which will be a Hare
  151. environment for writing drivers, and we won't see a serious look at better C
  152. support until we get to Luna, the POSIX compatibility layer a few milestones
  153. away.
  154. Anyway, in addition to pulling in lots of musl libc, I had to write some
  155. original code to create C implementations of the userspace end for working with
  156. Helios kernel services. Some of this is pretty straightforward, such as the
  157. equivalent of the helios::ioport_issue code from the keyboard driver you saw
  158. earlier:
  159. ```c
  160. cap_t
  161. iocontrol_issue(cap_t ctrl, uint16_t min, uint16_t max)
  162. {
  163. uint64_t tag = mktag(IO_ISSUE, 1, 1);
  164. cap_t cap = capalloc();
  165. ipc_buffer->caddr = cap;
  166. struct sysret ret = sys_send(ctrl, tag, min, max, 0);
  167. assert(ret.status == 0);
  168. return cap;
  169. }
  170. ```
  171. A more complex example is the code which maps a page of physical memory into the
  172. current process's virtual address space. In Helios, similar to L4, userspace
  173. must allocate its own page tables. However, these page tables are semantically
  174. *owned* by userspace, but they're not actually *reachable* by userspace &mdash;
  175. the page tables themselves are not mapped into their address space (for obvious
  176. reasons, I hope). A consequence of this is that the user cannot examine the page
  177. tables to determine which, if any, intermediate page tables have to be allocated
  178. in order to perform a desired memory mapping. The solution is to try the mapping
  179. anyway, and if the page tables are missing, the kernel will reply telling you
  180. which table it needs to complete the mapping request. You allocate the
  181. appropriate table and try again.
  182. Some of this workload falls on userspace. I had already done this part in Hare,
  183. but I had to revisit it in C:
  184. ```c
  185. struct sysret
  186. page_map(cap_t page, cap_t vspace, uintptr_t vaddr)
  187. {
  188. uint64_t tag = mktag(PAGE_MAP, 1, 1);
  189. ipc_buffer->caps[0] = vspace;
  190. return sys_send(page, tag, (uint64_t)vaddr, 0, 0);
  191. }
  192. static void
  193. map_table(uintptr_t vaddr, enum pt_type kind)
  194. {
  195. int r;
  196. cap_t table;
  197. switch (kind) {
  198. case PT_PDPT:
  199. r = retype(&table, CT_PDPT);
  200. break;
  201. case PT_PD:
  202. r = retype(&table, CT_PD);
  203. break;
  204. case PT_PT:
  205. r = retype(&table, CT_PT);
  206. break;
  207. default:
  208. assert(0);
  209. }
  210. assert(r == 0);
  211. struct sysret ret = page_map(table, INIT_CAP_VSPACE, vaddr);
  212. if (ret.status == -MISSING_TABLES) {
  213. map_table(vaddr, ret.value);
  214. map_table(vaddr, kind);
  215. }
  216. }
  217. void *
  218. map(cap_t page, uintptr_t vaddr)
  219. {
  220. while (1) {
  221. struct sysret ret = page_map(page, INIT_CAP_VSPACE, vaddr);
  222. if (ret.status == -MISSING_TABLES) {
  223. map_table(vaddr, ret.value);
  224. } else {
  225. assert(ret.status == 0);
  226. break;
  227. }
  228. }
  229. return (void *)vaddr;
  230. }
  231. ```
  232. Based on this work, I was able to implement a very stupid malloc, which rounds
  233. all allocations up to 4096 and never frees them. Hey! It works, okay?
  234. ```c
  235. uintptr_t base = 0x8000000000;
  236. static cap_t
  237. page_alloc()
  238. {
  239. cap_t page;
  240. int r = retype(&page, CT_PAGE);
  241. assert(r == 0);
  242. return page;
  243. }
  244. void *
  245. malloc(size_t n)
  246. {
  247. if (n % 4096 != 0) {
  248. n += 4096 - (n % 4096);
  249. }
  250. uintptr_t ret = base;
  251. while (n != 0) {
  252. cap_t page = page_alloc();
  253. map(page, base);
  254. base += 4096;
  255. n -= 4096;
  256. }
  257. return (void *)ret;
  258. }
  259. ```
  260. There is also [devmap], which you can read in your own time, which is used for
  261. mapping device memory into your address space. This is neccessary to map the
  262. framebuffer. It's more complex because it has to allocate a *specific* physical
  263. page address into userspace, rather than whatever page happens to be free.
  264. [devmap]: https://git.sr.ht/~sircmpwn/mercury/tree/f80bb66373ab12a66a9a86894d212cbbdfcf53bf/item/libc/helios/device.c
  265. So, to revisit our progress, we have:
  266. ✓ A working C programming environment
  267. ✓ Dynamic memory allocation
  268. ✓ A place to draw the screen (a framebuffer)
  269. ✓ Keyboard input
  270. It's time for Doom, baby. Doomgeneric expects the porter to implement the
  271. following functions:
  272. - DG\_Init
  273. - DG\_DrawFrame
  274. - DG\_GetKey
  275. - DG\_SetWindowTitle
  276. - DG\_SleepMs
  277. - DG\_GetTicksMs
  278. Easy peasy. Uh, except for that last one. I forgot that our requirements list
  279. should have included a means of sleeping for a specific period of time.
  280. Hopefully that won't be a problem later.
  281. I started with DG\_Init, allocating the pieces that we'll need and stashing the
  282. important bits in some globals.
  283. ```c
  284. int fb_width, fb_height, fb_pitch;
  285. uint8_t *fb;
  286. cap_t irq1_notify;
  287. cap_t irq1;
  288. cap_t ps2;
  289. void DG_Init()
  290. {
  291. uintptr_t vbeaddr = bootinfo->arch->vbe_mode_info;
  292. uintptr_t vbepage = vbeaddr / 4096 * 4096;
  293. struct vbe_mode_info *vbe = devmap(vbepage, 1) + (vbeaddr % 4096);
  294. fb_width = vbe->width;
  295. fb_height = vbe->height;
  296. fb_pitch = vbe->pitch;
  297. assert(vbe->bpp == 32);
  298. unsigned int npage = (vbe->pitch * vbe->height) / 4096;
  299. fb = devmap((uintptr_t)vbe->framebuffer, npage);
  300. irq1_notify = mknotification();
  301. irq1 = irqcontrol_issue(INIT_CAP_IRQCONTROL, irq1_notify, 1);
  302. ps2 = iocontrol_issue(INIT_CAP_IOCONTROL, 0x60, 0x64);
  303. }
  304. ```
  305. If the multiboot loader is configured to set up a framebuffer, it gets handed
  306. off to the kernel, and Helios provides it to userspace as mappable device
  307. memory, so that saves us from doing all of the annoying VBE crap (or heaven
  308. forbid, write an actual video driver). This lets us map the framebuffer into our
  309. process. Second, we do the same notification+IRQ+IOControl thing we did from the
  310. keyboard driver you saw earlier, except in C, so that we can process scancodes
  311. later.
  312. Next is DG\_DrawFrame, which is pretty straightforward. We just copy scanlines
  313. from the internal buffer to the framebuffer whenever it asks us to.
  314. ```c
  315. void DG_DrawFrame()
  316. {
  317. for (int i = 0; i < DOOMGENERIC_RESY; ++i) {
  318. memcpy(fb + i * fb_pitch, DG_ScreenBuffer + i * DOOMGENERIC_RESX, DOOMGENERIC_RESX * 4);
  319. }
  320. }
  321. ```
  322. Then we have DG\_GetKey, similar to our earlier keyboard driver, plus actually
  323. interpeting the scancodes we get, plus making use of a new non-blocking wait
  324. syscall I added to Helios:
  325. ```c
  326. int DG_GetKey(int *pressed, unsigned char *doomKey)
  327. {
  328. struct sysret ret = sys_nbwait(irq1_notify);
  329. if (ret.status != 0) {
  330. return 0;
  331. }
  332. uint8_t scancode = ioport_in8(ps2, 0x60);
  333. irq_ack(irq1);
  334. uint8_t mask = (1 << 7);
  335. *pressed = (scancode & mask) == 0;
  336. scancode = scancode & ~mask;
  337. switch (scancode) {
  338. case K_AD05:
  339. *doomKey = KEY_ENTER;
  340. break;
  341. case K_AE08:
  342. *doomKey = KEY_UPARROW;
  343. break;
  344. case K_AD07:
  345. *doomKey = KEY_LEFTARROW;
  346. break;
  347. case K_AD08:
  348. *doomKey = KEY_DOWNARROW;
  349. break;
  350. case K_AD09:
  351. *doomKey = KEY_RIGHTARROW;
  352. break;
  353. case K_AB03:
  354. *doomKey = KEY_FIRE;
  355. break;
  356. case K_AB06:
  357. *doomKey = KEY_USE;
  358. break;
  359. case 1:
  360. *doomKey = KEY_ESCAPE;
  361. break;
  362. }
  363. return *doomKey;
  364. }
  365. ```
  366. Then, uh, we have a problem. Here's what I ended up doing for DG\_SleepMs:
  367. ```c
  368. uint32_t ticks = 0;
  369. void DG_SleepMs(uint32_t ms)
  370. {
  371. // TODO: sleep properly
  372. int64_t _ms = ms;
  373. while (_ms > 0) {
  374. sys_yield();
  375. ticks += 5;
  376. _ms -= 5;
  377. }
  378. }
  379. uint32_t DG_GetTicksMs()
  380. {
  381. return ticks;
  382. }
  383. ```
  384. Some fellow on IRC said he'd implement a sleep syscall for Helios, but didn't
  385. have time before I was ready to carry on with this port. So instead of trampling
  386. on his feet, I just yielded the thread (which immediately returns to the caller,
  387. since there are no other threads at this point) and pretend it took 5ms to do
  388. so, hoping for the best. It does not work! This port plays at wildly different
  389. speeds depending on the performance of the hardware you run it on.
  390. I'm not too torn up about it, though. My goal was not to make a particularly
  391. nice or fully featured port of Doom. The speed is problematic, I hardcoded the
  392. shareware doom1.wad as the only supported level, you can't save the game, and it
  393. crashes when you try to pick up the shotgun. But it does its job: it
  394. demonstrates the maturity of the kernel's features thus far and provides good
  395. feedback on the API design and real-world utility.
  396. If you'd like to try it, you can download a bootable ISO here:
  397. https://mirror.drewdevault.com/doom.iso
  398. You can run it on qemu like so:
  399. ```
  400. $ qemu-system-x86_64 -m 1G -no-reboot -no-shutdown \
  401. -drive file=doom.iso,format=raw \
  402. -display sdl \
  403. -chardev stdio,id=char0 \
  404. -serial chardev:char0
  405. ```
  406. Enter to start, WASD to move, right shift to fire, space to open doors. It
  407. *might* work on real hardware, but the framebuffer stuff is pretty hacky and not
  408. guaranteed to work on most stuff, and the PS/2 keyboard driver will only work
  409. with a USB keyboard if you have legacy USB emulation configured in your BIOS,
  410. and even then it might not work well. YMMV. It works on my ThinkPad X230. Have
  411. fun!