drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git

---
title: Writing Helios drivers in the Mercury driver environment
date: 2023-04-08
---

*[Helios] is a microkernel written in the [Hare] programming language and is
part of the larger [Ares](https://ares-os.org) operating system. You can watch
my FOSDEM 2023 talk introducing Helios [on PeerTube][0].*

[0]: https://spacepub.space/w/wpKXfhqqr7FajEAf4B2Vc2
[Helios]: https://git.sr.ht/~sircmpwn/helios
[Hare]: https://harelang.org

Let's take a look at the new Mercury driver development environment for Helios.
As you may remember from my FOSDEM talk, the Ares operating system is built out
of several layers which provide progressively higher-level environments for an
operating system. At the bottom is the Helios microkernel, and today we're going
to talk about the second layer: the [Mercury] environment, which is used for
writing and running device drivers in userspace. We'll walk through a serial
driver written against Mercury and introduce some of the primitives used by
driver authors in the Mercury environment.

[Mercury]: https://git.sr.ht/~sircmpwn/mercury

Drivers for Mercury are written as normal ELF executables with an extra section
called .manifest, which includes a file similar to the following (the provided
example is for the serial driver we'll be examining today):

```ini
[driver]
name=pcserial
desc=Serial driver for x86_64 PCs

[capabilities]
0:ioport = min=3F8, max=400
1:ioport = min=2E8, max=2F0
2:note =
3:irq = irq=3, note=2
4:irq = irq=4, note=2
_:cspace = self
_:vspace = self
_:memory = pages=32

[services]
devregistry=
```

Helios uses a capability-based design, in which access to system resources (such
as I/O ports, IRQs, or memory) is governed by capability objects. Each process
has a *capability space*, which is a table of capabilities assigned to that
process, and when performing operations (such as writing to an I/O port) the
user provides the index of the desired capability in a register when invoking
the appropriate syscall.

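
To make the capability-space idea concrete, here is a small Python model (not
Hare, and not the Helios API): a table of capabilities indexed by slot, where an
operation names a slot and the "kernel" checks the capability stored there
before acting. The class names, the `ioport_out` function, and the choice to
treat `max` as exclusive are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IOPortCap:
    """Hypothetical capability granting access to ports in [min, max)."""
    min: int
    max: int  # assumed exclusive; the real semantics may differ


class CapabilityError(Exception):
    pass


class CSpace:
    """A capability space modelled as a table indexed by slot number."""

    def __init__(self, nslots: int) -> None:
        self.slots = [None] * nslots

    def lookup(self, slot: int):
        if not 0 <= slot < len(self.slots) or self.slots[slot] is None:
            raise CapabilityError(f"no capability in slot {slot}")
        return self.slots[slot]


def ioport_out(cspace: CSpace, slot: int, port: int, value: int, log: list) -> None:
    """Stand-in for a port-write syscall: the caller names a slot, not a range."""
    cap = cspace.lookup(slot)
    if not isinstance(cap, IOPortCap):
        raise CapabilityError(f"slot {slot} is not an I/O port capability")
    if not cap.min <= port < cap.max:
        raise CapabilityError(f"port {port:#x} outside {cap.min:#x}..{cap.max:#x}")
    log.append((port, value & 0xFF))  # a real kernel would perform the write here


cspace = CSpace(8)
cspace.slots[0] = IOPortCap(0x3F8, 0x400)  # mirrors "0:ioport = min=3F8, max=400"
writes = []
ioport_out(cspace, 0, 0x3F8, ord("A"), writes)  # inside the granted range
try:
    ioport_out(cspace, 0, 0x60, 0xFF, writes)   # outside the range: refused
    refused = False
except CapabilityError:
    refused = True
```

The point of the model is that the process never names a resource directly: it
can only reach what happens to be in its table.
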
The manifest first specifies a list of capabilities required to operate the
serial port. It requests capabilities for the required I/O ports and IRQs, each
assigned a static capability address, as well as a notification object to which
the IRQs will be delivered. Some capability types, such as I/O ports, have
configuration parameters, in this case the minimum and maximum port numbers
which are relevant. The IRQ capabilities require a reference to a notification
as well.

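
As a rough illustration of how the `[capabilities]` section breaks down, the
following Python sketch splits each line into a slot, a capability type, and its
parameters. This is not the Mercury loader; `parse_capabilities` is a
hypothetical helper, and parameter values are kept as strings because the
manifest format does not state which number base they use.

```python
MANIFEST = """\
[driver]
name=pcserial
desc=Serial driver for x86_64 PCs
[capabilities]
0:ioport = min=3F8, max=400
1:ioport = min=2E8, max=2F0
2:note =
3:irq = irq=3, note=2
4:irq = irq=4, note=2
_:cspace = self
_:vspace = self
_:memory = pages=32
[services]
devregistry=
"""


def parse_capabilities(text: str) -> list:
    """Parse lines such as '3:irq = irq=3, note=2' into (slot, type, params)."""
    caps = []
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("["):
            in_section = line == "[capabilities]"
            continue
        if not in_section:
            continue
        key, _, value = line.partition("=")
        slot_text, _, cap_type = key.strip().partition(":")
        # "_" means any slot will do, represented here as None.
        slot = None if slot_text == "_" else int(slot_text)
        params = {}
        for item in value.split(","):
            item = item.strip()
            if not item:
                continue
            name, sep, val = item.partition("=")
            # Bare words such as "self" are stored as flags set to True.
            params[name.strip()] = val.strip() if sep else True
        caps.append((slot, cap_type.strip(), params))
    return caps


caps = parse_capabilities(MANIFEST)
```

Note how the two IRQ entries carry `note=2`, pointing back at the notification
requested in slot 2.
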
Limiting access to these capabilities provides very strong isolation between
device drivers. On a monolithic kernel like Linux, a bug in the serial driver
could compromise the entire system, but a vulnerability in our driver could, at
worst, write garbage to your serial port. This model also provides better
security than something like OpenBSD's pledge by declaratively specifying what
we need and nothing else.

Following the statically allocated capabilities, we request our own capability
space and virtual address space, the former so we can copy and destroy our
capabilities, and the latter so that we can map shared memory to perform reads
and writes for clients. We also request 32 pages of memory, which we use to
allocate page tables to perform those mappings; this will be changed later.
These capabilities do not require any specific address for the driver to work,
so we use "\_" to indicate that any slot will suit our needs.

Mercury uses some vendor extensions over the System-V ABI to communicate
information about these capabilities to the runtime. Notes about each of the
\_'d capabilities are provided by the auxiliary vector, and picked up by the
Mercury runtime -- for instance, the presence of a memory capability is detected
on startup and is used to set up the allocator; the presence of a vspace
capability is automatically wired up to the mmap implementation.

Each of these capabilities is implemented by the kernel, but additional services
are available in userspace via endpoint capabilities. Each of these endpoints
implements a particular API, as defined by a protocol definition file. This
driver requires access to the device registry, so that it can create devices for
its serial ports and expose them to clients.

These protocol definitions are written in a domain-specific language and parsed
by [ipcgen] to generate client and server implementations of each. Here's a
simple protocol to start us off:

[ipcgen]: https://git.sr.ht/~sircmpwn/ipcgen

```
namespace io;

# The location with respect to which a seek operation is performed.
enum whence {
	# From the start of the file
	SET,
	# From the current offset
	CUR,
	# From the end of the file
	END,
};

# An object with file-like semantics.
interface file {
	# Reads up to amt bytes of data from a file.
	call read{pages: page...}(buf: uintptr, amt: size) size;

	# Writes up to amt bytes of data to a file.
	call write{pages: page...}(buf: uintptr, amt: size) size;

	# Seeks a file to a given offset, returning the new offset.
	call seek(offs: i64, w: whence) size;
};
```

Each interface includes a list of methods, each of which can take a number of
capabilities (listed in braces) and parameters (listed in parentheses), and
return a value. The "read" call here, when implemented by a file-like object,
accepts a list of memory pages to perform the read with (shared memory), as
well as the address and size of the buffer within those pages. Error handling
is still a to-do.

ipcgen consumes these files and writes client or server code as appropriate.
These are generated as part of the Mercury build process and end up in
\*\_gen.ha files. The generated client code is filed away into the relevant
modules (this protocol ends up at io/file\_gen.ha), alongside various
hand-written files which provide additional functionality and often wrap the IPC
calls in a higher-level interface. The server implementations end up in the
"serv" module, e.g. serv/io/file\_gen.ha.

Let's look at some of the generated client code for io::file objects:

```hare
// This file was generated by ipcgen; do not modify by hand
use helios;
use rt;

// ID for the file IPC interface.
export def FILE_ID: u32 = 0x9A533BB3;

// Labels for operations against file objects.
export type file_label = enum u64 {
	READ = FILE_ID << 16u64 | 1,
	WRITE = FILE_ID << 16u64 | 2,
	SEEK = FILE_ID << 16u64 | 3,
};

export fn file_read(
	ep: helios::cap,
	pages: []helios::cap,
	buf: uintptr,
	amt: size,
) size = {
	// ...
};
```

Each interface has a unique ID (generated from the FNV-1a hash of its fully
qualified name), which is shifted left by 16 bits and bitwise-OR'd with each
operation's number to form call labels. The interface ID is used elsewhere;
we'll refer to it again later. Then each method generates an implementation
which arranges the IPC details as necessary and invokes the "call" syscall
against the endpoint capability.

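
The label arithmetic is easy to reproduce outside of Hare. The Python sketch
below implements the standard 32-bit FNV-1a hash and the shift-and-OR used for
labels. The exact string ipcgen feeds to the hash is not shown in this post, so
the sketch takes `FILE_ID` from the generated code rather than trying to
recompute it; `make_label` and `split_label` are hypothetical helper names, and
`split_label` assumes operation numbers fit in the low 16 bits.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h


def make_label(iface_id: int, op: int) -> int:
    """Combine an interface ID and an operation number into a call label."""
    return (iface_id << 16) | op


def split_label(label: int) -> tuple:
    """Recover (interface ID, operation number) from a call label."""
    return label >> 16, label & 0xFFFF


FILE_ID = 0x9A533BB3  # value taken from the generated client code
READ = make_label(FILE_ID, 1)
WRITE = make_label(FILE_ID, 2)
SEEK = make_label(FILE_ID, 3)
```

Because the interface ID occupies the upper bits, a server can tell from the
label alone both which interface a request targets and which operation it asks
for.
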
The generated server code is a bit more involved. Some of the details are
similar -- FILE\_ID is generated again, for instance -- but there are some
additional details as well. First is the generation of a vtable defining the
functions implementing each operation:

```hare
// Implementation of a [[file]] object.
export type file_iface = struct {
	read: *fn_file_read,
	write: *fn_file_write,
	seek: *fn_file_seek,
};
```

We also define a file object which is subtyped by the implementation to store
implementation details, and which provides to the generated code the required
bits of state.

```hare
// Instance of a file object. Users may subtype this object to add
// instance-specific state.
export type file = struct {
	_iface: *file_iface,
	_endpoint: helios::cap,
};
```

Here's an example of a subtype of file used by the initramfs to store additional
state:

```hare
// An open file in the bootstrap filesystem
type bfs_file = struct {
	serv::io::file,
	fs: *bfs,
	ent: tar::entry,
	cur: io::off,
	padding: size,
};
```

The embedded serv::io::file structure is populated with an implementation of
file\_iface, simplified here for illustrative purposes:

```hare
const bfs_file_impl = serv_io::file_iface {
	read = &bfs_file_read,
	write = &bfs_file_write,
	seek = &bfs_file_seek,
};

fn bfs_file_read(
	obj: *serv_io::file,
	pages: []helios::cap,
	buf: uintptr,
	amt: size,
) size = {
	let file = obj: *bfs_file;
	const fs = file.fs;
	// Offset of the client's buffer within the first shared page
	const offs = (buf & rt::PAGEMASK): size;
	defer helios::destroy(pages...)!;

	// The buffer must fit within the pages the client shared with us
	assert(offs + amt <= len(pages) * rt::PAGESIZE);
	const buf = helios::map(rt::vspace, 0, map_flags::W, pages...)!: *[*]u8;
	let buf = buf[offs..offs+amt];
	// Not shown: reading the file data into this buffer
};
```

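
The offset arithmetic at the top of bfs\_file\_read is worth spelling out. The
Python sketch below reproduces it under two assumptions that the post does not
state: a 4096-byte page, and a `PAGEMASK` that selects the offset bits (that
is, `PAGESIZE - 1`). `buffer_window` and `pages_needed` are hypothetical names.

```python
PAGESIZE = 4096          # assumed page size
PAGEMASK = PAGESIZE - 1  # assumed: a mask selecting the offset within a page


def buffer_window(buf: int, amt: int, npages: int) -> tuple:
    """Return (start, end) byte offsets of the buffer within the mapped pages."""
    offs = buf & PAGEMASK
    if offs + amt > npages * PAGESIZE:
        raise ValueError("buffer extends past the pages provided")
    return offs, offs + amt


def pages_needed(buf: int, amt: int) -> int:
    """Number of pages a client must share to cover amt bytes starting at buf."""
    offs = buf & PAGEMASK
    return (offs + amt + PAGESIZE - 1) // PAGESIZE


# A 32-byte buffer that starts 16 bytes before a page boundary spans two pages.
window = buffer_window(0x70000FF0, 32, 2)
needed = pages_needed(0x70000FF0, 32)
```

The server only ever sees the pages the client chose to share, so the bounds
check is what keeps a bad `buf` or `amt` from reaching outside them.
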
The implementation can prepare a file object and call dispatch on it to process
client requests: this function blocks until a request arrives, decodes it, and
invokes the appropriate function. Often this is incorporated into an event loop
with poll to service many objects at once.

```hare
// Prepare a file object
const ep = helios::newendpoint()!;
append(fs.files, bfs_file {
	_iface = &bfs_file_impl,
	_endpoint = ep,
	fs = fs,
	ent = ent,
	cur = io::tell(fs.buf)!,
	padding = fs.rd.padding,
});

// ...

// Process requests associated with this file
serv::io::file_dispatch(file);
```

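
The decode-and-invoke step can be modelled in a few lines of Python. This is
only a sketch of the idea, not the generated code: the real dispatch function
blocks on the endpoint until a request arrives, whereas this model is handed
the label and arguments directly. `MemFile` and its handlers are hypothetical,
and the sketch assumes the operation number lives in the low 16 bits of the
label.

```python
FILE_ID = 0x9A533BB3  # from the generated client code


class FileIface:
    """Stand-in for file_iface: one callable per operation number."""

    def __init__(self, read, write, seek):
        self.ops = {1: read, 2: write, 3: seek}


class File:
    """Stand-in for the generated file object; subclasses add their own state."""

    def __init__(self, iface: FileIface):
        self._iface = iface


def file_dispatch(obj: File, label: int, args: tuple):
    """Decode a call label and invoke the matching vtable entry."""
    iface_id, op = label >> 16, label & 0xFFFF
    if iface_id != FILE_ID:
        raise ValueError("request is for a different interface")
    handler = obj._iface.ops.get(op)
    if handler is None:
        raise ValueError(f"unknown operation {op}")
    return handler(obj, *args)


class MemFile(File):
    """Hypothetical in-memory file, used only to exercise the dispatcher."""

    def __init__(self, data: bytes):
        super().__init__(MEMFILE_IMPL)
        self.data = data
        self.cur = 0


def memfile_read(obj: MemFile, amt: int) -> bytes:
    chunk = obj.data[obj.cur:obj.cur + amt]
    obj.cur += len(chunk)
    return chunk


def memfile_write(obj: MemFile, data: bytes) -> int:
    return 0  # read-only in this sketch


def memfile_seek(obj: MemFile, offs: int) -> int:
    obj.cur = max(0, min(offs, len(obj.data)))
    return obj.cur


MEMFILE_IMPL = FileIface(memfile_read, memfile_write, memfile_seek)

f = MemFile(b"hello world")
first = file_dispatch(f, (FILE_ID << 16) | 1, (5,))   # read 5 bytes
file_dispatch(f, (FILE_ID << 16) | 3, (6,))           # seek to offset 6
second = file_dispatch(f, (FILE_ID << 16) | 1, (5,))  # read 5 more bytes
```

The handlers receive the base object and recover their own state from it, which
mirrors the `obj: *bfs_file` cast in the Hare implementation above.
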
Okay, enough background: back to the serial driver. It needs to implement the
following protocol:

```
namespace dev;
use io;

# TODO: Add busy error and narrow semantics

# Note: TWO is interpreted as 1.5 for some char lengths (5)
enum stop_bits {
	ONE,
	TWO,
};

enum parity {
	NONE,
	ODD,
	EVEN,
	MARK,
	SPACE,
};

# A serial device, which implements the file interface for reading from and
# writing to a serial port. Typical implementations may only support one read
# in-flight at a time, returning errors::busy otherwise.
interface serial :: io::file {
	# Returns the baud rate in Hz.
	call get_baud() uint;

	# Returns the configured number of bits per character.
	call get_charlen() uint;

	# Returns the configured number of stop bits.
	call get_stopbits() stop_bits;

	# Returns the configured parity setting.
	call get_parity() parity;

	# Sets the baud rate in Hz.
	call set_baud(hz: uint) void;

	# Sets the number of bits per character. Must be 5, 6, 7, or 8.
	call set_charlen(bits: uint) void;

	# Configures the number of stop bits to use.
	call set_stopbits(bits: stop_bits) void;

	# Configures the desired parity.
	call set_parity(parity: parity) void;
};
```

This protocol *inherits* the io::file interface, so the serial port is usable
like any other file for reads and writes. It additionally defines
serial-specific methods, such as configuring the baud rate or parity. The
generated interface we'll have to implement looks something like this, embedding
the io::file\_iface struct:

```hare
export type serial_iface = struct {
	io::file_iface,
	get_baud: *fn_serial_get_baud,
	get_charlen: *fn_serial_get_charlen,
	get_stopbits: *fn_serial_get_stopbits,
	get_parity: *fn_serial_get_parity,
	set_baud: *fn_serial_set_baud,
	set_charlen: *fn_serial_set_charlen,
	set_stopbits: *fn_serial_set_stopbits,
	set_parity: *fn_serial_set_parity,
};
```

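
For a sense of what the serial-specific settings in this protocol turn into at
the hardware level, here is a Python sketch of the conventional encoding on a
16550-compatible UART, which is what PC COM ports traditionally provide. The
post does not show how the Mercury driver programs these registers, so treat
this as background on the hardware convention; `baud_divisor` and
`line_control` are hypothetical names, and requiring an exact divisor is a
simplification.

```python
from enum import Enum


class StopBits(Enum):
    ONE = 0
    TWO = 1


class Parity(Enum):
    NONE = 0
    ODD = 1
    EVEN = 2
    MARK = 3
    SPACE = 4


MAX_BAUD = 115200  # 1.8432 MHz reference clock divided by 16

# Line Control Register parity bits (bit 3: enable, bit 4: even, bit 5: stick).
_PARITY_BITS = {
    Parity.NONE: 0x00,
    Parity.ODD: 0x08,
    Parity.EVEN: 0x18,
    Parity.MARK: 0x28,
    Parity.SPACE: 0x38,
}


def baud_divisor(hz: int) -> int:
    """Value for the 16-bit divisor latch that selects the given baud rate."""
    if hz <= 0 or MAX_BAUD % hz != 0:
        raise ValueError("baud rate must evenly divide 115200")
    divisor = MAX_BAUD // hz
    if divisor > 0xFFFF:
        raise ValueError("divisor does not fit in 16 bits")
    return divisor


def line_control(charlen: int, stop: StopBits, parity: Parity) -> int:
    """Encode character length, stop bits, and parity as an LCR value."""
    if charlen not in (5, 6, 7, 8):
        raise ValueError("character length must be 5, 6, 7, or 8")
    lcr = charlen - 5  # bits 0-1: word length
    if stop is StopBits.TWO:
        lcr |= 0x04    # bit 2: 1.5 stop bits for 5-bit characters, else 2
    return lcr | _PARITY_BITS[parity]


lcr_8n1 = line_control(8, StopBits.ONE, Parity.NONE)
div_9600 = baud_divisor(9600)
```

The single stop-bit flag in the register is why the protocol notes that TWO is
interpreted as 1.5 for 5-bit characters.
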
Time to dive into the implementation. Recall the driver manifest, which provides
the serial driver with a suitable environment:

```ini
[driver]
name=pcserial
desc=Serial driver for x86_64 PCs

[capabilities]
0:ioport = min=3F8, max=400
1:ioport = min=2E8, max=2F0
2:note =
3:irq = irq=3, note=2
4:irq = irq=4, note=2
_:cspace = self
_:vspace = self
_:memory = pages=32

[services]
devregistry=
```

This gives us I/O ports for reading from and writing to the serial devices,
IRQs for receiving serial-related interrupts, a device registry to add our
serial devices to the system, and a few extra things for implementation needs.
Some of these are statically allocated; others are provided via the auxiliary
vector.

Our [serial driver][driver] opens by defining constants for the statically
allocated capabilities:

[driver]: https://git.sr.ht/~sircmpwn/mercury/tree/5e12977a0cb773331b9b3b8421da63b85eed232c/item/cmd/serial

```hare
// Slot numbers match the static addresses assigned in the manifest
def IOPORT_A: helios::cap = 0;
def IOPORT_B: helios::cap = 1;
// Notification (slot 2) to which both IRQs are delivered
def IRQ: helios::cap = 2;
def IRQ3: helios::cap = 3;
def IRQ4: helios::cap = 4;
```

The first thing we do on startup is create a serial device.

```hare
export fn main() void = {
	let serial0: helios::cap = 0;
	const registry = helios::service(sys::DEVREGISTRY_ID);
	sys::devregistry_new(registry, dev::SERIAL_ID, &serial0);
	helios::destroy(registry)!;
	// ...
```

The device registry is provided via the aux vector, and we can use
helios::service to look it up by its interface ID. Then we use the
devregistry::new operation to create a serial device:

```
# Device driver registry.
interface devregistry {
	# Creates a new device implementing the given interface ID using the
	# provided endpoint capability and returns its assigned serial number.
	call new{; out}(iface: u64) uint;
};
```

After this we can destroy the registry -- we won't need it again and it's best
to get rid of it so that we can work with the minimum possible privileges at
runtime. Next we initialize the serial port, acknowledge any interrupts that
might have been pending before we got started, and enter the main loop.

```hare
	com_init(&ports[0], serial0);

	helios::irq_ack(IRQ3)!;
	helios::irq_ack(IRQ4)!;

	let poll: [_]pollcap = [
		pollcap { cap = IRQ, events = pollflags::RECV, ... },
		pollcap { cap = serial0, events = pollflags::RECV, ... },
	];
	for (true) {
		helios::poll(poll)!;
		if (poll[0].revents & pollflags::RECV != 0) {
			dispatch_irq();
		};
		if (poll[1].revents & pollflags::RECV != 0) {
			dispatch_serial(&ports[0]);
		};
	};
```

The dispatch\_serial function is of interest, as this provides the
implementation of the serial object we just created with the device registry.

```hare
type comport = struct {
	dev::serial,
	port: u16,
	rbuf: [4096]u8,
	wbuf: [4096]u8,
	rpending: []u8,
	wpending: []u8,
};

fn dispatch_serial(dev: *comport) void = {
	dev::serial_dispatch(dev);
};

const serial_impl = dev::serial_iface {
	read = &serial_read,
	write = &serial_write,
	seek = &serial_seek,
	get_baud = &serial_get_baud,
	get_charlen = &serial_get_charlen,
	get_stopbits = &serial_get_stopbits,
	get_parity = &serial_get_parity,
	set_baud = &serial_set_baud,
	set_charlen = &serial_set_charlen,
	set_stopbits = &serial_set_stopbits,
	set_parity = &serial_set_parity,
};

fn serial_read(
	obj: *io::file,
	pages: []helios::cap,
	buf: uintptr,
	amt: size,
) size = {
	const port = obj: *comport;
	const offs = (buf & rt::PAGEMASK): size;
	const buf = helios::map(rt::vspace, 0, map_flags::W, pages...)!: *[*]u8;
	const buf = buf[offs..offs+amt];

	if (len(port.rpending) != 0) {
		defer helios::destroy(pages...)!;
		return rconsume(port, buf);
	};

	pages_static[..len(pages)] = pages[..];
	pending_read = read {
		reply = helios::store_reply(helios::CADDR_UNDEF)!,
		pages = pages_static[..len(pages)],
		buf = buf,
	};
	return 0;
};

// (other functions omitted)
```

We'll skip much of the implementation details for this specific driver, but I'll
show you how read works at least. It's relatively straightforward: first we mmap
the buffer provided by the caller. If there's already readable data pending from
the serial port (stored in that rpending slice in the comport struct, which is a
slice of the statically-allocated rbuf field), we copy it into the buffer and
return the number of bytes we had ready. Otherwise, we stash details about the
caller, storing the special reply capability in our cspace (this is one of the
reasons we need cspace = self in our manifest) so we can reply to this call
once data is available. Then we return to the main loop.

The main loop also wakes up on an interrupt, and we have an interrupt unmasked
on the serial device to wake us whenever there's data ready to be read.
Eventually this gets us here, which finishes the call we saved earlier:

```hare
// Reads data from the serial port's RX FIFO.
fn com_read(com: *comport) size = {
	let n: size = 0;
	for (comin(com.port, LSR) & RBF == RBF; n += 1) {
		const ch = comin(com.port, RBR);
		if (len(com.rpending) < len(com.rbuf)) {
			// If the buffer is full we just drop chars
			static append(com.rpending, ch);
		};
	};

	if (pending_read.reply != 0) {
		const n = rconsume(com, pending_read.buf);
		helios::send(pending_read.reply, 0, n)!;
		pending_read.reply = 0;
		helios::destroy(pending_read.pages...)!;
	};

	return n;
};
```

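
The interplay between serial\_read and com\_read is easier to follow with the
hardware and IPC stripped away. The Python model below keeps only the
buffering logic: a bounded receive buffer, an immediate return when data is
already pending, and a deferred reply that the interrupt path completes later.
It is a sketch, not a port of the driver; the `ComPort` class, its `replies`
list (which stands in for sending on the saved reply capability), and the
"busy" error are illustrative.

```python
class ComPort:
    """Bounded receive buffer plus at most one read waiting for data."""

    def __init__(self, capacity: int = 4096):
        self.capacity = capacity
        self.rpending = bytearray()
        self.pending_read = None  # requested length, or None if nobody waits
        self.replies = []         # stands in for replying via the saved capability

    def _rconsume(self, amt: int) -> bytes:
        data = bytes(self.rpending[:amt])
        del self.rpending[:amt]
        return data

    def read(self, amt: int):
        """Return buffered data immediately, or defer the reply and return None."""
        if self.rpending:
            return self._rconsume(amt)
        if self.pending_read is not None:
            raise RuntimeError("busy: only one read may be in flight")
        self.pending_read = amt
        return None  # the caller stays blocked until on_irq replies

    def on_irq(self, fifo: bytes) -> int:
        """Drain the simulated RX FIFO, then complete a deferred read if any."""
        for ch in fifo:
            if len(self.rpending) < self.capacity:
                self.rpending.append(ch)  # when full, extra characters are dropped
        if self.pending_read is not None:
            self.replies.append(self._rconsume(self.pending_read))
            self.pending_read = None
        return len(fifo)


port = ComPort(capacity=4)
deferred = port.read(16)   # nothing buffered yet, so the reply is deferred
port.on_irq(b"hi")         # data arrives and the saved read is completed
port.on_irq(b"abcdefgh")   # only 4 bytes fit; the rest are dropped
immediate = port.read(16)  # data is already pending, so this returns at once
```

The key property is that the server never blocks inside a read: it either
answers on the spot or records enough state to answer from the interrupt path.
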
I hope that gives you a general idea of how drivers work in this environment!
I encourage you to read the full implementation if you're curious to know more
about the serial driver in particular -- it's just 370 lines of code.

The last thing I want to show you is how the driver gets executed in the first
place. When Helios boots up, it starts /sbin/sysinit, which is provided by
Mercury and offers various low-level userspace runtime services, such as the
device registry and bootstrap filesystem we saw earlier. After setting up its
services, sysinit executes /sbin/usrinit, which is provided by the next layer
up (Gaia, eventually) and sets up the rest of the system according to user
policy, mounting filesystems and starting up drivers and such. At the moment,
usrinit is fairly simple, and just runs a little demo. Here it is in full:

```hare
use dev;
use fs;
use helios;
use io;
use log;
use rt;
use sys;

export fn main() void = {
	const fs = helios::service(fs::FS_ID);
	const procmgr = helios::service(sys::PROCMGR_ID);
	const devmgr = helios::service(sys::DEVMGR_ID);
	const devload = helios::service(sys::DEVLOADER_ID);

	log::printfln("[usrinit] Running /sbin/drv/serial");
	let proc: helios::cap = 0;
	const image = fs::open(fs, "/sbin/drv/serial")!;
	sys::procmgr_new(procmgr, &proc);
	sys::devloader_load(devload, proc, image);
	sys::process_start(proc);

	let serial: helios::cap = 0;
	log::printfln("[usrinit] open device serial0");
	sys::devmgr_open(devmgr, dev::SERIAL_ID, 0, &serial);

	let buf: [rt::PAGESIZE]u8 = [0...];
	for (true) {
		const n = match (io::read(serial, buf)!) {
		case let n: size =>
			yield n;
		case io::EOF =>
			break;
		};

		// CR => LF
		for (let i = 0z; i < n; i += 1) {
			if (buf[i] == '\r') {
				buf[i] = '\n';
			};
		};

		// echo
		io::write(serial, buf[..n])!;
	};
};
```

Each of the services shown at the start is automatically provided in usrinit's
aux vector by sysinit; together they make up everything required to bootstrap
the system. This includes a filesystem (the initramfs), a process manager (to
start up new processes), the device manager, and the driver loader service.

usrinit starts by opening up /sbin/drv/serial (the serial driver, of course)
from the provided initramfs using fs::open, which is a convenience wrapper
around the filesystem protocol. Then we create a new process with the process
manager, which by default has an empty address space -- we could load a normal
process into it with sys::process\_load, but we want to load a driver, so we
use the devloader interface instead. Then we start the process and boom: the
serial driver is online.

The serial driver registers itself with the device registry, which means that we
can use the device manager to open the 0th device which implements the serial
interface. Since this is compatible with the io::file interface, it can simply
be used normally with io::read and io::write to utilize the serial port. The
main loop simply echoes data read from the serial port back out. Simple!

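
If you want to poke at the echo logic without booting Helios, the same loop can
be run against a fake device. The Python sketch below is only an analogue of
usrinit's main loop: `LoopbackSerial` is a hypothetical stand-in for the serial
device, and an empty read stands in for io::EOF.

```python
import io


def echo_loop(dev, bufsize: int = 4096) -> None:
    """Read from dev until EOF, rewrite CR as LF, and write the data back."""
    while True:
        buf = dev.read(bufsize)
        if not buf:  # an empty read stands in for io::EOF
            break
        dev.write(buf.replace(b"\r", b"\n"))


class LoopbackSerial:
    """Hypothetical serial device: scripted input, captured output."""

    def __init__(self, incoming: bytes):
        self._rx = io.BytesIO(incoming)
        self.tx = bytearray()

    def read(self, n: int) -> bytes:
        return self._rx.read(n)

    def write(self, data: bytes) -> int:
        self.tx += data
        return len(data)


dev = LoopbackSerial(b"help\rls\r")
echo_loop(dev, bufsize=3)  # a small buffer forces several short reads
```

Because the loop only relies on read and write, anything that implements the
file interface could be dropped in place of the serial port.
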
---

That's a quick introduction to the driver environment provided by Mercury. I
intend to write a few more drivers soon myself -- PC keyboard, framebuffer,
etc. -- and set up a simple shell. We have seen a few sample drivers written
pre-Mercury which would be nice to bring into this environment, such as virtio
networking and block devices. It will be nice to see them re-introduced in an
environment where they can provide useful services to the rest of userspace.

If you're interested in learning more about Helios or Mercury, consult
[ares-os.org](https://ares-os.org) for documentation -- though beware of the
many stub pages. If you have any questions or want to get involved in writing
some drivers yourself, jump into our IRC channel: #helios on Libera Chat.