This is an old revision of the document!
In this section, we try to summarize some of the traditional reponsibilities of a CPU. If you are already acquainted with them (or some), feel free to skim past them!
Primarily developing for x86
, you may be familiar with its protection rings, or at least with two of them: the ones traditionally used for the separation between user-space and kernel-space.
When the x86
' architecture was first introduced in the late '70s, the designers expected OS developers to need a mechanism to isolate critical components (e.g.: drivers) from regular applications. As a result, they implemented four levels of isolation (that the OS developers more or less ignored :D ):
A special System Call instruction is required to get from a lower-privileged level to a higher one. It is usually implemented as a software interrupt / trap by the CPU and, when invoked by, e.g., a user program, it will pause and save its CPU state on the stack (program counter, flags) and invoke a special routine registered by the Operating System. The OS kernel will be passed control to and begin to analyze (via a standardized calling convention) and execute the request (such as read from / write to the filesystem / disk / network / USB devices etc. – which the application would not normally be privileged enough to accomplish; remember: hardware access is quite restricted from upper rings).
Pagination (see Fig. 1) is an architectural feature of all modern processors that allows the Operating System to present each process a different view of its memory.
When said process tries to access a page (e.g., 4KB block), the address is translated by the Memory Management Unit (MMU), a hardware component, by means of a data structure called Page Table, unique to each process and residing in kernel memory.
This allows the kernel to obscure parts of memory (e.g.: that of other processes) as a means of isolation. Or to over-commit resources, only to actually allocate them when needed (e.g.: malloc()
-ed memory is assigned to the process only after it is first accessed).
Because changing the active Page Table is an expensive operation (mostly due to CPU cache invalidation), the Virtual Address Space of each process also contains the kernel memory mapped in the higher half. Yes, the kernel also uses virtual addressing for its own memory. When the kernel needs to intervene on behalf of the unprivileged process (e.g.: when a System Call is performed), the CPU state transitions from Ring 3 to Ring 0, but the Page Table does not need to be switched out. Although this technique increases the overall system performance, it also raises a question: how do we stop an unprivileged process from accessing kernel memory, since it's already mapped in its virtual address space?
The answer is that, aside from information relevant to the address translation itself, the Page Table also contains access restrictions. Each memory transaction that leads to an address translation also presents its intent, e.g.: whether it wants to write data to memory, or fetch an instruction to execute. This allows the MMU to block such access depending on the Read-Write-Execute permissions associated with each page. However, this is only one example of restriction that can be enforced. The Page Table can also restrict access based on privilege levels. Unfortunately, the architecture defines only two such levels: Privileged (Rings 0-2) and unprivileged (Ring 3).
Nonetheless, the two inner protection rings are still implemented on x86 CPUs to this day. The question is: why? Based the announcement of the new x86s architecture that's supposed to eliminate 16-bit and 32-bit modes of 64-bit processors, it's safe to rule out backward compatibility. The real reason is probably that it's just cheaper this way (changing the logic design of a processor is risky, requires extensive testing and very costly prototyping).
Finally, please note that, although we described the virtual memory mechanisms of x86, the concepts are really the same for all other architectures (of course, the configuration registers and page entry structure will differ, but they all share a common feature set)!
In ARM's nomenclature, the CPU protection modes are called Exception Levels. Although they are analogous to x86's rings, they feature two significant improvements: first, the standardization of the most important modes for userspace, kernel space and hypervisor (for running multiple OSes in Virtual Machines); second, a secure separation between Secure and Non-Secure Worlds, but this will be discussed in Lecture 03.
Three exception levels:
~
Ring 0)~
Ring -1)Note: with the introduction of the ARM TrustZone security extensions, [almost] all of these modes were vertically partitioned into two security domains. To make it possible to switch back and forth between them, a new Exception Level – EL3 (Secure Monitor) – was added.