Back That ‘S’ Up: Moving to RISC-V’s Supervisor Mode
This continues an ongoing series that starts here: https://osblog.stephenmarz.com.
Contents
- What is Supervisor Mode?
- Why Supervisor Mode?
- Complications while in Supervisor Mode
- Complications with Interrupts
- Conclusion
What is Supervisor Mode?
My original OS Blog (see here: https://osblog.stephenmarz.com) ran the operating system in RISC-V’s machine mode, which is the most privileged mode in the RISC-V architecture. It has access to all control and status registers, runs in physical memory only, and has no restrictions placed upon it by the CPU.
I recently ran across OpenSBI (the RISC-V Open Source Supervisor Binary Interface), hosted on GitHub here: https://github.com/riscv/opensbi. It’s been around a little while, and it seeks to be an interface between the operating system and the machine, that is, the low-level system.
Currently, OpenSBI has a legacy interface that abstracts the UART device for you, as well as a HART (hardware thread, RISC-V’s name for a CPU core) management system: start a hart at a given address, stop a hart, get the status of a hart, etc.
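To make that interface a bit more concrete, here’s a rough C sketch of what two of those calls look like from the operating system’s side. The extension and function IDs are the ones the SBI specification assigns (legacy console putchar is extension 0x01, and hart state management is extension 0x48534D). The `sbi_call` struct and the helper names are hypothetical, made up for this illustration, and are not part of OpenSBI. On real hardware the fields would be loaded into registers a7, a6, and a0 through a2 before executing `ecall`.

```c
#include <stdint.h>

/* Extension and function IDs as assigned by the SBI specification. */
#define SBI_EXT_LEGACY_PUTCHAR 0x01ULL
#define SBI_EXT_HSM            0x48534DULL /* ASCII "HSM" */
#define SBI_HSM_HART_START     0ULL
#define SBI_HSM_HART_STOP      1ULL
#define SBI_HSM_HART_STATUS    2ULL

/* Hypothetical container for the registers an SBI call uses. */
struct sbi_call {
    uint64_t a7;         /* extension ID */
    uint64_t a6;         /* function ID (unused by legacy calls) */
    uint64_t a0, a1, a2; /* arguments */
};

/* Legacy console output: the character goes in a0. */
struct sbi_call sbi_putchar(char c)
{
    struct sbi_call call = { .a7 = SBI_EXT_LEGACY_PUTCHAR,
                             .a0 = (uint64_t)(unsigned char)c };
    return call;
}

/* HSM hart start: begin running `hartid` at `start_addr`; `opaque` is
 * handed to the started hart unchanged. */
struct sbi_call sbi_hart_start(uint64_t hartid, uint64_t start_addr,
                               uint64_t opaque)
{
    struct sbi_call call = { .a7 = SBI_EXT_HSM, .a6 = SBI_HSM_HART_START,
                             .a0 = hartid, .a1 = start_addr, .a2 = opaque };
    return call;
}
```

Keeping the register assignment in one place like this means the only architecture-specific piece left is the small `ecall` stub that consumes the struct.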
Supervisor mode, known as S mode in RISC-V, is two levels below machine mode in the privilege encoding (level 1 versus level 3, with level 2 reserved), as shown in the privilege-level table of the RISC-V specification. This means that our operating system uses the OpenSBI set of utilities for very low-level things, which OpenSBI abstracts away. This makes designing an operating system for the myriad of boards a bit easier.
Why Supervisor Mode?
Why run the operating system at S mode? Well, just as a high-level language can rely on the application binary interface, or ABI, the operating system can rely on a certain set of utilities given by a RISC-V “BIOS”, for lack of a better term.
This allows our operating system to abstract much of the machine architecture away. Instead of relying on the actual system’s specification, we can program the operating system using more or less the RISC-V specification only.
To understand what’s going on, let’s take a look at the user level. This is where applications live. Whenever a user application runs afoul of the “rules”, such as dereferencing an invalid memory location or executing an illegal instruction, the CPU will trap to the machine mode, unless the trap is delegated lower. Luckily for us, OpenSBI delegates user traps to the supervisor mode, so our operating system can handle it.
Now, let’s move up one level. What happens when the operating system runs afoul of the rules? In most respects, the system crashes. An illegal instruction in machine mode will trap into machine mode. This can potentially cause a loop of traps in machine mode, as one compounds another.
So, as U-mode is to S-mode, S-mode is to M-mode, meaning that if the operating system runs at S-mode and messes with the CPU, then the OpenSBI will handle the trap. Usually, this means trying to load the hart into a stable, known state.
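The delegation rule can be sketched as a small C model. The exception codes are from the RISC-V privileged specification; the function name and the `medeleg` values you pass in are assumptions for illustration, since which causes actually get delegated is up to the firmware.

```c
#include <stdint.h>

enum priv { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };

/* Exception codes from the RISC-V privileged specification. */
#define CAUSE_ILLEGAL_INSTRUCTION 2u
#define CAUSE_LOAD_PAGE_FAULT     13u

/* Which mode handles a synchronous exception raised while running in `cur`? */
enum priv trap_target(enum priv cur, unsigned cause, uint64_t medeleg)
{
    /* Traps never move to a less privileged mode, so a fault in M stays in M. */
    if (cur == PRIV_M)
        return PRIV_M;
    /* A set medeleg bit hands that cause to S mode; otherwise M takes it. */
    return ((medeleg >> cause) & 1) ? PRIV_S : PRIV_M;
}
```

With bit 2 of `medeleg` set, an illegal instruction in U mode lands in the operating system. With it clear, the same fault goes to the firmware. A fault raised in M mode always stays in M mode, which is where the loop of traps comes from.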
Complications while in Supervisor Mode
I wrote my operating system in machine mode to make it easier to understand the RISC-V architecture. Now, switching to S-mode has complicated some things. One of the major issues I have found is that the hart’s unique id is in the mhartid register. That little m in front of it means that it is a machine-mode register. Since we’re at a lower level, we’re not allowed to access any of the machine-mode registers. If we try, we will get an illegal instruction exception.
This means that we have to keep track of our own core id. This makes context switching and putting applications on a different core slightly more complicated. We can’t just read the mhartid register and know our hart id. Instead, we have to follow the OpenSBI interface. This is easier said than done!
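One way to do that bookkeeping is sketched below. SBI firmware hands the hart id to the supervisor in register a0 when it starts a hart, so the kernel can record it once and look it up afterwards. Everything else here (the `hart_info` struct, `hart_register`, and the `MAX_HARTS` bound) is a hypothetical layout for illustration, not how OpenSBI or my kernel necessarily does it.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_HARTS 8 /* assumed upper bound for this sketch */

/* Hypothetical per-hart record kept by the kernel. */
struct hart_info {
    uint64_t hartid; /* id the firmware passed in a0 */
    int online;      /* set once the hart has checked in */
};

static struct hart_info hart_table[MAX_HARTS];

/* Called once per hart, early in boot, with the value received in a0.
 * Returns that hart's record so the caller can stash the pointer somewhere
 * hart-local (for example sscratch or tp). */
struct hart_info *hart_register(uint64_t hartid)
{
    assert(hartid < MAX_HARTS);
    hart_table[hartid].hartid = hartid;
    hart_table[hartid].online = 1;
    return &hart_table[hartid];
}
```

After this, “which hart am I?” becomes a read through the hart-local pointer rather than a read of mhartid.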
Now that we don’t have access to any of the machine-mode registers, we have to use the supervisor-level ones. Luckily, RISC-V gives us “views” of the same register with an s in front of it. For example, mstatus becomes sstatus. The neat thing is that if we write to sstatus, the machine-only bits are masked, and we can only set the supervisor bits. This means we can’t switch ourselves into machine mode while in supervisor mode by simply setting the MPP bits of mstatus (bits 12 and 11).
The sstatus register is the exact same register as the mstatus register, but when we make writes to it, the CPU will not change any machine-only bits. This is called a “view” of the machine status register. Here’s the supervisor status register. Take note of the machine bits that are masked.
Notice that bits 12 and 11 (Machine Previous Privilege Mode [MPP]) are WPRI, which the specification defines as “reserved, writes preserve values, reads ignore values”. Writes-preserve means that whenever we write sstatus, those bits should keep their original value, and in practice the CPU does not let a supervisor-mode write change them, which prevents us from writing to MPP in supervisor mode. Reads-ignore means that we should not rely on whatever we read back from bits 12 and 11. We won’t get the actual data; usually we will just get 0.
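The “view” behavior can be modeled in a few lines of C. The bit positions are from the privileged specification, but the mask below is deliberately simplified (it leaves out fields such as FS, VS, XS, UXL, and SD), and the function names are made up for this sketch.

```c
#include <stdint.h>

#define STATUS_SIE  (1ULL << 1)
#define STATUS_SPIE (1ULL << 5)
#define STATUS_SPP  (1ULL << 8)
#define STATUS_MPP  (3ULL << 11) /* machine-only field */
#define STATUS_SUM  (1ULL << 18)
#define STATUS_MXR  (1ULL << 19)

/* Simplified set of bits that supervisor mode can see and change. */
#define SSTATUS_MASK \
    (STATUS_SIE | STATUS_SPIE | STATUS_SPP | STATUS_SUM | STATUS_MXR)

/* Reading sstatus: machine-only bits come back as 0. */
uint64_t sstatus_read(uint64_t mstatus)
{
    return mstatus & SSTATUS_MASK;
}

/* Writing sstatus: returns the new mstatus. Machine-only bits keep their
 * old value no matter what the supervisor tried to write. */
uint64_t sstatus_write(uint64_t mstatus, uint64_t value)
{
    return (mstatus & ~SSTATUS_MASK) | (value & SSTATUS_MASK);
}
```

Trying to set MPP through `sstatus_write` changes nothing in the underlying register, while SPP in the same write goes through.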
This changes the way we switch from supervisor mode into user mode. If we put 1 into SPP (bit 8 of sstatus), then when we execute sret (supervisor return), that bit becomes the new privilege level, and we stay in supervisor mode. Recall that level 1 is supervisor mode. If we put 0 in bit 8, then after an sret, we will be in user mode. Recall that user mode is level 0.
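Here’s a small C model of what sret does to the hart, following the privileged specification: the privilege level comes from SPP, SIE is restored from SPIE, SPIE is set to 1, SPP is reset to user, and execution continues at sepc. The `hart_state` struct and `do_sret` name are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

#define SSTATUS_SIE  (1ULL << 1)
#define SSTATUS_SPIE (1ULL << 5)
#define SSTATUS_SPP  (1ULL << 8)

/* Hypothetical snapshot of the pieces of hart state that sret touches. */
struct hart_state {
    uint64_t sstatus;
    uint64_t sepc;
    uint64_t pc;
    unsigned priv; /* 0 = user, 1 = supervisor */
};

void do_sret(struct hart_state *h)
{
    /* SPP selects the mode we return to. */
    h->priv = (h->sstatus & SSTATUS_SPP) ? 1u : 0u;

    /* SIE takes the value saved in SPIE when the trap was taken. */
    if (h->sstatus & SSTATUS_SPIE)
        h->sstatus |= SSTATUS_SIE;
    else
        h->sstatus &= ~SSTATUS_SIE;

    h->sstatus |= SSTATUS_SPIE; /* SPIE is set to 1 */
    h->sstatus &= ~SSTATUS_SPP; /* SPP is reset to user (0) */
    h->pc = h->sepc;            /* resume where sepc points */
}
```

So to launch a user process, the kernel clears SPP, points sepc at the process entry, and executes sret.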
Complications with Interrupts
Interrupts trigger a pin on the CPU to cause a trap. This is usually in response to something, such as a timer, a software interrupt, or an external, platform-level interrupt, such as a keyboard input or wifi notification.
The interrupts I’m concerned about are at the platform level. RISC-V has a specification for the platform-level interrupt controller, or PLIC. In this, we can configure the PLIC to trap in supervisor mode or even in machine mode. Since the PLIC is directly connected to the CPU, we, at supervisor mode, can tell the PLIC where to send interrupts. This makes our job a little bit harder since there are many possible configurations: different numbers of harts, different modes supported by each hart, and so on.
To demonstrate this, here’s the memory map of the PLIC on a 5-hart CPU where the 0-hart only has M mode, and harts 1, 2, 3, and 4 have both M and S mode.
As you can see, our stride isn’t the same for every hart, so we will have to configure our operating system nearly at machine level. You can compare this with the virt machine in QEMU (see qemu/virt.h at master · qemu/qemu (github.com)). So, I can’t go off of just configuring each hart to enable an interrupt; I have to specify the mode where I want the interrupt to be trapped. Furthermore, each hart is not necessarily the same, as you can see with the diagram above. The FU540 has hart 0 (called the “monitor core”) to supervise the other cores, which runs in machine mode only.
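The difference between the two layouts can be sketched in C. The PLIC gives each (hart, mode) pair its own “context”, with enable bits starting at offset 0x2000 (0x80 apart) and the threshold and claim registers starting at offset 0x200000 (0x1000 apart). How contexts are numbered is up to the platform. The two numbering functions below reflect my reading of the QEMU virt machine and the FU540 memory map, and the function names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PLIC_BASE           0x0C000000ULL
#define PLIC_ENABLE_BASE    0x2000ULL
#define PLIC_ENABLE_STRIDE  0x80ULL
#define PLIC_CONTEXT_BASE   0x200000ULL
#define PLIC_CONTEXT_STRIDE 0x1000ULL

enum plic_mode { PLIC_M = 0, PLIC_S = 1 };

/* QEMU virt: every hart has an M context followed by an S context. */
unsigned virt_context(unsigned hart, enum plic_mode mode)
{
    return 2 * hart + (unsigned)mode;
}

/* FU540: hart 0 has only an M context; harts 1 to 4 have M then S. */
unsigned fu540_context(unsigned hart, enum plic_mode mode)
{
    if (hart == 0) {
        assert(mode == PLIC_M); /* the monitor core has no S mode */
        return 0;
    }
    return 2 * hart - 1 + (unsigned)mode;
}

uint64_t plic_enable_addr(unsigned ctx)
{
    return PLIC_BASE + PLIC_ENABLE_BASE + ctx * PLIC_ENABLE_STRIDE;
}

uint64_t plic_threshold_addr(unsigned ctx)
{
    return PLIC_BASE + PLIC_CONTEXT_BASE + ctx * PLIC_CONTEXT_STRIDE;
}

/* The claim/complete register sits 4 bytes after the threshold. */
uint64_t plic_claim_addr(unsigned ctx)
{
    return plic_threshold_addr(ctx) + 4;
}
```

The register math is identical on both machines. Only the hart-to-context mapping changes, which is exactly the part the operating system now has to know about.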
Traps also require us to context switch by saving the current set of registers into the context running on the hart, scheduling the next context, and loading that context onto the hart. This was fairly simple in machine mode for one reason: the MMU is turned off automatically in machine mode. This is not the case in supervisor mode. Furthermore, the MMU register, called supervisor address translation and protection (SATP), takes effect immediately: if I set the mode, the MMU immediately turns on. This can be a problem because I have to juggle certain registers. Take a look at the trap handler written in RISC-V assembly below.
(Updated screenshot): I originally had sstatus instead of sscratch. The point of csrrw is to make an atomic swap of sscratch into t6 and the old value of t6 into sscratch. This allows us to keep both values. As a side note, this is actually the second time I’ve done this. My typing fingers just like sstatus better than sscratch.
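If the swap is hard to picture, here’s a tiny C model of what `csrrw` does. It is a model only: the real instruction does this in a single step on the hart, and the variable names in the usage note are placeholders.

```c
#include <stdint.h>

/* Model of `csrrw rd, csr, rs`: the CSR receives rs, and the value the CSR
 * held beforehand is returned (it would land in rd). */
uint64_t csrrw(uint64_t *csr, uint64_t rs)
{
    uint64_t old = *csr;
    *csr = rs;
    return old;
}
```

With `t6 = csrrw(&sscratch, t6);` the two values trade places: t6 now holds whatever was parked in sscratch (typically a pointer to the context), and sscratch holds the interrupted program’s t6 so it can be saved later.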
As you can see, we have to be careful not to destroy a register in our context. I usually use the t6 register since it is register number 31, which is the last register for an ascending loop. In the code above, I’m making sure that no memory accesses are made after the SATP register is written to. Remember, it’s immediate. As soon as I write to the SATP register and set the mode, it is up and running since we’re in supervisor mode.
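For reference, here’s how the value written to SATP is put together on a 64-bit system, as a short C sketch. The field layout (MODE in bits 63:60, ASID in bits 59:44, and the root page table’s physical page number in bits 43:0) is from the privileged specification. The `make_satp` helper is a made-up name.

```c
#include <stdint.h>

#define SATP_MODE_BARE 0ULL /* no translation */
#define SATP_MODE_SV39 8ULL /* 39-bit virtual addressing */

/* Build a 64-bit satp value from a mode, an address-space id, and the
 * physical address of the root page table (must be 4 KiB aligned). */
uint64_t make_satp(uint64_t mode, uint64_t asid, uint64_t root_table_pa)
{
    uint64_t ppn = (root_table_pa >> 12) & 0xFFFFFFFFFFFULL; /* 44 bits */
    return (mode << 60) | ((asid & 0xFFFFULL) << 44) | ppn;
}
```

For example, `make_satp(SATP_MODE_SV39, 0, 0x80400000)` gives 0x8000000000080400. Writing any value with a non-zero mode is what turns translation on, which is why the ordering of loads and stores around that write matters so much.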
This leads us to a little bit of a problem. Unless we map this handler, it will not be able to execute, and we still need to get to sret. Recall that X (execute) is one of our bits and so is the U (user) bit. So, how do we handle this? We will see in the next post. Stay tuned.
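To frame the problem, here’s a C sketch of the instruction-fetch permission rule those two bits imply. The bit positions are from the privileged specification, and so is the rule that supervisor mode may never execute from a page marked U. The `can_fetch` helper is a hypothetical name, and the model ignores everything else in a page table walk.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_V (1ULL << 0) /* valid */
#define PTE_R (1ULL << 1) /* read */
#define PTE_W (1ULL << 2) /* write */
#define PTE_X (1ULL << 3) /* execute */
#define PTE_U (1ULL << 4) /* user-accessible */

/* May the hart fetch an instruction through this leaf PTE? */
bool can_fetch(uint64_t pte, bool supervisor)
{
    if (!(pte & PTE_V) || !(pte & PTE_X))
        return false;
    bool user_page = (pte & PTE_U) != 0;
    /* User mode needs U set; supervisor mode needs U clear. */
    return supervisor ? !user_page : user_page;
}
```

Whatever mapping we choose for the handler has to pass this check in the mode the hart is in at the moment the fetch happens.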
Conclusion
I’m still working on migrating my machine-mode operating system into a supervisor-mode operating system. This is a work in progress, so I encourage you to keep up to date on this blog!
Thanks for reading!