Talking with our new Operating System by Handling Input Events and Devices

Input devices give our operating system the ability to accept mouse and keyboard inputs from the graphical user interface (GUI). We originally used the UART (universal asynchronous receiver/transmitter) for console I/O, but we're high class now: we have a GUI! So, we will use the virtio protocol to communicate with a mouse and a keyboard.


Contents

  1. VirtIO Input Device Protocol
  2. Tablet vs. Mouse (Absolute vs. Relative Positioning)
  3. Event and Status Queues
  4. Event System
  5. The Device’s Perspective
  6. Event Queues
  7. System Calls
  8. ABS Events Caveat
  9. Final Product
  10. Additional Reading

VirtIO Input Device Protocol

Yet again, we are using the virtio protocol, but this time we're targeting the input device (device ID 18). In particular, we will look at keyboard inputs as well as mouse inputs. In our case, we're going to use the mouse as a tablet to get absolute positioning, since we don't own the entire screen.

So, we look for device ID 18 and redirect it to initialize in the input.rs file. We will actually have two devices attached here: the mouse and the keyboard. One thing about input is that we get a stream of data from the input devices, whether that's key codes or mouse coordinates, so we need to make sure that we're notified of every little input. On my system, QEMU offers VIRTIO_F_RING_EVENT_IDX, which is a way to suppress notifications until the ring reaches a certain index. If we left this enabled, our inputs would be suppressed until a certain number of inputs had accumulated. So, we need to turn this off.

// Read the feature bits the device (host) offers.
let mut host_features = ptr.add(MmioOffsets::HostFeatures.scale32()).read_volatile();
// Clear EVENT_IDX so the device notifies us on every single event.
host_features &= !(1 << VIRTIO_F_RING_EVENT_IDX);
// Write our accepted feature set back to the device.
ptr.add(MmioOffsets::GuestFeatures.scale32()).write_volatile(host_features);

Tablet vs. Mouse (Absolute vs. Relative Positioning)

There are two different ways to handle input signals from a mouse or a tablet: (1) relative positioning and (2) absolute positioning. Relative positioning means that a mouse input gives you the change in position, whereas absolute positioning gives you the coordinates of the new position. So, there's a little bit more math involved with relative positioning. In QEMU, a virtio mouse uses relative positioning, whereas a virtio tablet uses absolute positioning. Both will use the host's mouse for inputs, so don't let the terminology throw you off.
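To make the distinction concrete, here is a small sketch of my own (not code from the driver) showing how a cursor would consume each kind of event:

```cpp
// A cursor position updated two ways. These helpers are mine, for
// illustration; the actual driver code comes later in the post.
struct Cursor {
    int x;
    int y;
};

// Relative positioning: each event carries a delta we accumulate.
void apply_rel(Cursor &c, int dx, int dy) {
    c.x += dx;
    c.y += dy;
}

// Absolute positioning: each event replaces the coordinate outright.
void apply_abs(Cursor &c, int ax, int ay) {
    c.x = ax;
    c.y = ay;
}
```

With relative events, a lost or dropped event shifts every later position; with absolute events, the very next event corrects the cursor, which is one reason the tablet model is convenient here.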

The interesting portion about how these events come to us is that we get an Abs event with an event code of either ABS_X or ABS_Y, and the event's value holds that coordinate. The coordinates are also in their own units. When I took a look, the coordinates went from 0 to 32,767, which is the maximum value of a signed, 16-bit integer. So, with our screen being 640×480, we need to map values from the range 0 through 32,767 onto our screen coordinates. To perform the conversion from input coordinates to screen coordinates, I applied the following equations: \(\text{screencoordx} = \frac{\text{evcoordx}}{32767}\times 640\) and \(\text{screencoordy} = \frac{\text{evcoordy}}{32767}\times 480\).

Essentially, the equation just takes where the current event coordinate is located in reference to the event coordinate space (0 – 32,767). That gives us a ratio. We then apply this ratio to a smaller coordinate space (0 – 640) or (0 – 480). Unfortunately, this equation requires me to use floating point math. There is probably a better way that uses integers only, but by the time I started writing this, I hadn't done it yet.

To visualize how this works, we divide each coordinate space (0 through 32,767 and 0 through 640) into the same number of equally spaced dashes. Then, we divide the given X coordinate by 32,767 to see which dash we're at. Since we divided both coordinate spaces equally, the same dash exists in the 0 through 640 space; it's just that the spacing between dashes is much smaller there. So, we multiply by 640 to convert back into that coordinate space. I know that's not clear, so hopefully the following diagram will explain.

Both dashed lines have about an equal number of dashes, but because the 0-640 coordinate space is smaller, the dashes are much closer together. So, when we divide the coordinate by 32,767, we get back which dash we're at (a number between 0.0 and 1.0). This corresponds to the same dash in the 0-640 coordinate space. However, to draw to the screen, we need to know the coordinate in that space, so we multiply the dash by 640 to get the X coordinate in the 0-640 coordinate space.

The vertical axis, Y, works the same way, except that our screen space is 0-480, which means the dashes are even closer together.
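To put numbers to it: suppose the device reports an X value of 16,384, roughly halfway across the tablet's range. Then \(\text{screencoordx} = \frac{16384}{32767}\times 640 \approx 0.5\times 640 = 320\), which lands at about the middle of our 640-pixel-wide screen, just as we'd expect.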

To see this in code form, here it is written in C++:

constexpr u32 lerp(u32 val, u32 mx1, u32 mx2) {
   // mx1 is FROM maximum coordinate and mx2 is TO maximum coordinate
   f64 r = val / static_cast<f64>(mx1);
   return r * mx2;
}
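If you'd rather avoid floating point entirely, one integer-only alternative (my own sketch, not code from this post) is to multiply before dividing so the ratio is never truncated to zero. It assumes the intermediate product fits in 64 bits, which easily holds for 32,767 × 640:

```cpp
#include <cstdint>

// Integer-only conversion: widen to 64 bits, multiply by the target
// maximum first, then divide by the source maximum.
constexpr uint32_t lerp_int(uint32_t val, uint32_t mx1, uint32_t mx2) {
    // mx1 is the FROM maximum coordinate and mx2 is the TO maximum.
    return static_cast<uint32_t>((static_cast<uint64_t>(val) * mx2) / mx1);
}
```

This truncates rather than rounds, but for mapping 0 – 32,767 onto a 640-pixel screen the error is at most one pixel.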

Event and Status Queues

We have two different virtqueues with an input device: (1) the event queue and (2) the status queue. The event queue is populated by the device (mouse or keyboard) with an Event structure, shown below. This is unlike the other queues, where we add information to make a request. Since these devices are simply feeding us input, we need to provide memory locations where these events can go.

The Event structure is the only data structure passed between the input device and the operating system. This is also the only structure used for both the event and status queues.

#[repr(C)]
#[derive(Copy, Clone)]
pub struct Event {
    pub event_type: EventType,
    pub code: u16,
    pub value: u32,
}
#[repr(u16)]
#[derive(Copy, Clone)]
pub enum EventType {
    Syn = 0x00,
    Key = 0x01,
    Rel = 0x02,
    Abs = 0x03,
    Msc = 0x04,
    Sw = 0x05,
    Led = 0x11,
    Snd = 0x12,
    Rep = 0x14,
    Ff = 0x15,
    Pwr = 0x16,
    FfStatus = 0x17,
    Max = 0x1f,
}

The first field of the Event structure is a type. This type tells us what kind of input it was. A Key event is something that came from a button being pushed. This can be a keyboard input or the mouse being pressed. The Abs (absolute) input is a position input. Unlike Rel (relative), each abs input gives us the absolute coordinate. Our input system will give us the X and Y coordinates in different events. Every event has a code and a value. The code is a sub-type of the original event type. In the Abs type, this is going to be ABS_X or ABS_Y to report the x coordinate or y coordinate, respectively.


Event System

The virtio event system is built on top of Linux's evdev system, and in fact, most of the codes are exactly the same. We will get the same reports that evdev would give for key presses or mouse movements. The following is a partial (very partial) list of the key codes that evdev defines:

#define KEY_ESC  1
#define KEY_1    2
#define KEY_2    3
#define KEY_3    4
#define KEY_4    5
#define KEY_5    6
// ...
#define KEY_Q    16
#define KEY_W    17
#define KEY_E    18
#define KEY_R    19

So, if we want to detect the W key being pressed, we would look for event type Key and event code 17. In userspace, we can make a rough event handler by using a switch statement:

switch (ev.code) {
   case BTN_MOUSE:
      pressed = (ev.value & 1) == 1;
   break;
   case KEY_O:
      current_color = Pixel {255, 150, 0, 255};
   break;
   case KEY_B:
      current_color = Pixel {0, 0, 255, 255};
   break;
   case KEY_G:
      current_color = Pixel {0, 255, 0, 255};
   break;
   case KEY_R:
      current_color = Pixel {255, 0, 0, 255};
   break;
}

I will explain the supporting code first, since we need some way of getting queued events from the operating system. However, the code above should show you how an event translates into human-understandable input.


The Device’s Perspective

The device, the keyboard or mouse, is connected to the virtio bus. It will send inputs to the virtio system, which then in turn looks to find a place to put the event.

The virtio spec is a little bit unclear here, so I had to look at the virtio implementation in QEMU and the driver in Linux to fully understand what's going on. I think the virtio spec is unclear because it leaves the libevdev implementation out, so I had to figure out what fuzz, res, and all the rest mean on my own.

The virtio bus will look in the available ring for the next buffer. The buffer is simply an 8-byte piece of memory, just big enough to store an Event structure. When the bus finds this buffer, it will consume it and write the Event into it. This is why we need to make all event buffers device-writable.

unsafe fn repopulate_event(dev: &mut Device, buffer: usize) {
// Populate eventq with buffers, these must be at least the size of struct virtio_input_event.
	let desc = Descriptor {
		addr: dev.event_buffer.add(buffer) as u64,
		len: EVENT_SIZE as u32,
		flags: VIRTIO_DESC_F_WRITE,
		next: 0
	};
	let head = dev.event_idx as u16;
	(*dev.event_queue).desc[dev.event_idx as usize] = desc;
	dev.event_idx = (dev.event_idx + 1) % VIRTIO_RING_SIZE as u16;
	(*dev.event_queue).avail.ring[(*dev.event_queue).avail.idx as usize % VIRTIO_RING_SIZE] = head;
	(*dev.event_queue).avail.idx = (*dev.event_queue).avail.idx.wrapping_add(1);
}

As you can see above, I don’t make a request per se through the event queue. Instead, I’m just providing memory locations where the virtio system can place an event. This will happen every time the device consumes an event buffer, so I put it in a function that will be called over and over again.

Let’s take a look at what happens when the virtio input device interrupts us to tell us something happened.

let ref desc = queue.desc[elem.id as usize];
let event = (desc.addr as *const Event).as_ref().unwrap();
repopulate_event(dev, elem.id as usize);
dev.event_ack_used_idx = dev.event_ack_used_idx.wrapping_add(1);
match event.event_type {
	EventType::Abs => {
		let mut ev = ABS_EVENTS.take().unwrap();
		ev.push_back(*event);
		ABS_EVENTS.replace(ev);	
	},
	EventType::Key => {
		let mut ev = KEY_EVENTS.take().unwrap();
		ev.push_back(*event);
		KEY_EVENTS.replace(ev);	
	},
	_ => {
	}
}

The code above is a snippet of the whole function, but I want to highlight how we handle the event. We get an interrupt, and the used ring tells us which buffer was just used, for lack of a better term. We can then look at this buffer and retrieve the event. Notice that these events get pushed back. I use (*event) to copy the event instead of grabbing a reference to it. Again, this is because we reuse the buffers in the virtio available ring.

Also, notice that any event not supported by our driver is silently dropped. This is how the specification tells us to handle unknown events.

One particular piece of code I’d like to draw your attention to is repopulate_event. We reuse the memory space over and over again so that we don’t have to keep freeing and reallocating the buffer. However, the event system requires that the next available ring entry contain a descriptor with a memory address, so this is a dog chasing its tail. It isn’t difficult to do; we just have to put the exact same memory address back into the descriptor and point to it through the available ring.

Finally, notice that we push a copy of each event onto the KEY_EVENTS or ABS_EVENTS queue. Since we will be reusing the buffer, I made events clonable and copyable by using #[derive(Clone, Copy)].


Event Queues

Input events will generate an interrupt. Therefore, the operating system must queue events so that it can hand them off whenever the userspace program is ready for them. I chose to use a VecDeque due to its amortized O(1) inserts and removals. We can use push_back to put an event on the queue and pop_front to pull an event off.

Our event queues must be global to stay resident in the OS:

pub static mut ABS_EVENTS: Option<VecDeque<Event>> = None;
pub static mut KEY_EVENTS: Option<VecDeque<Event>> = None;

This should look familiar if you’ve seen the process queue. We use the Option sum type so that the statics can be initialized (to None) at compile time. Now, to create the queues, we have to judge an initial capacity:

ABS_EVENTS = Some(VecDeque::with_capacity(100));
KEY_EVENTS = Some(VecDeque::with_capacity(15));

I used a little bit of creative license for these event queues. When I moved my mouse, it generated nearly 100 events in about a second. My keyboard was about 2 to 3 keys per second. I think 100 and 15 will keep us from having to resize without wasting too much memory.

If we exceed the capacity of the event queue, the VecDeque will grow the ring buffer. However, we don’t place a check on it to avoid exhausting our memory. This is probably a very bad idea, but for now, it suffices. The only way, right now, to drain the event buffer is for a user space application to make a system call. So, if no user space application cares about events, we could easily overfill the buffer.
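The kernel-side queues are Rust VecDeques, but the fix I'd eventually want is easy to sketch. Here is one drop-oldest policy in C++ terms (push_bounded is my own suggestion, not code from this post):

```cpp
#include <cstddef>
#include <deque>

// Capped queue: when full, discard the oldest event so memory stays
// bounded and the newest input is never lost.
template <typename T>
void push_bounded(std::deque<T> &q, const T &ev, std::size_t cap) {
    if (q.size() >= cap) {
        q.pop_front(); // drop the stalest event
    }
    q.push_back(ev);
}
```

Dropping old events is usually the right call for input: a stale mouse position is worthless once a newer one exists.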


System Calls

The whole point here is to make user space applications recognize events. At first, I wrote one system call for abs and one for key that would retrieve the top event and return it. Due to the sheer number of mouse events, this proved to be VERY slow, painfully slow in fact. So, I changed the system call to copy out an array of events, which I will show you below.

Right now, I just assigned numbers 1002 and 1004 for these system calls. More realistically, we would use the open() and read() system calls to open a mouse or keyboard device. However, for now, we can at least see how the system works.

let mut ev = KEY_EVENTS.take().unwrap();
let max_events = (*frame).regs[Registers::A1 as usize];
let vaddr = (*frame).regs[Registers::A0 as usize] as *const Event;
if (*frame).satp >> 60 != 0 {
	let process = get_by_pid((*frame).pid as u16);
	let table = ((*process).get_table_address() as *mut Table).as_mut().unwrap();
	(*frame).regs[Registers::A0 as usize] = 0;
	for i in 0..if max_events <= ev.len() { max_events } else { ev.len() } {
		let paddr = virt_to_phys(table, vaddr.add(i) as usize);
		if paddr.is_none() {
			break;
		}
		let paddr = paddr.unwrap() as *mut Event;
		*paddr = ev.pop_front().unwrap();
		(*frame).regs[Registers::A0 as usize] += 1;
	}
}
KEY_EVENTS.replace(ev);

First, we need to grab the user process’ page table (if one exists). Instead of returning a single 8-byte event, we are now given a memory address to fill with events. This is also why we need to keep running virt_to_phys for every memory address we get. It is possible that the user space application exceeds its memory space, so we want to make sure we don’t crash inside of our kernel.

From user space, we are given a buffer in A0 and a maximum number of events in A1. Since each event is 8 bytes, the buffer should be at least \(\text{max number of events}\times 8\) bytes. When we return, A0 will contain the number of events that the OS was able to put into the array.
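That sizing rule can be checked at compile time. The struct below mirrors the user space Event, and the buffer_bytes helper is mine, added only for illustration:

```cpp
#include <cstddef>
#include <cstdint>

// Same layout as the Event struct shared by the kernel and user space.
struct Event {
    uint16_t event_type;
    uint16_t code;
    uint32_t value;
};
static_assert(sizeof(Event) == 8, "Event must be exactly 8 bytes");

// Minimum buffer size for a syscall requesting up to max_events events.
constexpr std::size_t buffer_bytes(std::size_t max_events) {
    return max_events * sizeof(Event);
}
```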

In user space, our system call would look something like the following:

struct Event {
	u16 event_type;
	u16 code;
	u32 value;
} events[MAX_EVENTS];
//...
if ((num_events = syscall_get_key(events, MAX_EVENTS)) > 0) {
	for (u32 z = 0;z < num_events;z++) {
		Event &ev = events[z];
		switch (ev.code) {

So, events is an array of Event structures, and MAX_EVENTS is the size of the array (currently, I set it to 100). Notice that our system call will NOT block if there are no events. This is contrary to something like the select system call, where we can set it to block or not block.

In our function above, we process the events off of the event array. We really don’t need the if statement, since the for loop wouldn’t execute anyway if we didn’t have events, but I realized that when I copied the code into this blog post :(.

After I changed the system call to receive an array of events, we get a fully supported input system complete with key presses and mouse events.

ABS Events Caveat

Remember that we get an absolute event as either ABS_X or ABS_Y. That means we are required to keep up with the x and y coordinates ourselves. This is fairly straightforward, but it threw me for a loop the first time I encountered it. I thought that the value contained both the X and Y elements, like the X11 display server reports. However, that is not the case: we get one coordinate per event. So, to keep track, we do:

if ((num_events = syscall_get_abs(events, MAX_EVENTS)) < 1) {
	syscall_sleep(noevt_slptm);
	continue;
}
for (u32 z = 0;z < num_events;z++) {
	Event &ev = events[z];
	if (ev.code == ABS_X) {
		x = lerp(ev.value & 0x7fff, 32767, 640);
	}
	else if (ev.code == ABS_Y) {
		y = lerp(ev.value & 0x7fff, 32767, 480);
	}
	if (pressed) {
		fill_rect(fb, x, y, 5, 5, current_color);
	}
}

So, we have a variable x and a variable y that will keep track of where our mouse is moving. We also need to convert the coordinate from a space of 0 through 32767 to a space of 0 through 640 (width or x coordinate) or a space of 0 through 480 (height or y coordinate). That’s the purpose of the lerp (linear interpolation) function. It isn’t really a linear interpolation function in the nominal sense, but it does the trick.

Notice that if there are no events, I put the process to sleep. For many systems, this is how event processing works: instead of having the message queue block, we block ourselves. This way, we can update graphics or do whatever else we want while there are no events. Since our system call never blocks, we have to give the CPU a breather by putting our process to sleep.


Final Product

Let’s see how all of these systems work together to make a drawing application.

Screen capture of our GPU/event system.

You can see that our event input only delivers so many inputs per unit of time. That is why, when I move the mouse more quickly, we get more distance between squares. Also, I do not do anything to connect the squares, which is why each square is a discrete event. Finally, I can change colors by typing B (blue), O (orange), G (green), or R (red), as shown in the event handler in user space:

switch (ev.code) {
		case BTN_MOUSE:
			pressed = (ev.value & 1) == 1;
		break;
		case KEY_O:
			current_color = Pixel {255, 150, 0, 255};
		break;
		case KEY_B:
			current_color = Pixel {0, 0, 255, 255};
		break;
		case KEY_G:
			current_color = Pixel {0, 255, 0, 255};
		break;
		case KEY_R:
			current_color = Pixel {255, 0, 0, 255};
		break;
		case KEY_W:
			if (ev.value == 0) { //released
				fill_rect(fb, 0, 0, 640, 480, white_color);
				syscall_inv_rect(6, 0, 0, 640, 480);
			}
		break;
		case KEY_Q:
			if (ev.value == 0) { // released
				fill_rect(fb, 0, 0, 640, 480, black_color);
				syscall_inv_rect(6, 0, 0, 640, 480);
			}
		break;
	}
}

Each event is discrete, so to keep a state, we need to track it ourselves. That is where our x and y coordinates come from, as well as the pressed variable, which states whether the left mouse button is up (false) or down (true). We can use that to only draw while the mouse is held down:

if ((num_events = syscall_get_abs(events, MAX_EVENTS)) < 1) {
	syscall_sleep(noevt_slptm);
	continue;
}
for (u32 z = 0;z < num_events;z++) {
	Event &ev = events[z];
	if (ev.code == ABS_X) {
		x = lerp(ev.value & 0x7fff, 32767, 640);
	}
	else if (ev.code == ABS_Y) {
		y = lerp(ev.value & 0x7fff, 32767, 480);
	}
	if (pressed) {
		fill_rect(fb, x, y, 5, 5, current_color);
	}
}
if (pressed) {
	syscall_inv_rect(6, 0, 0, 640, 480);
}

We still have to keep up with the x and y coordinates every time we get an absolute event, but we only fill a rectangle and invalidate the screen when we have the mouse pressed down. Otherwise, the cursor just moves with nothing drawn to the screen.

As always, please head on over to my GitHub to look at the source for everything I discussed in this post.


Additional Reading