Working on x86-64, you may be familiar with its protection rings, or at least with two of them. When protected mode was added to the architecture in the early '80s (with the 80286), the designers expected OS developers to need a mechanism to isolate critical components (e.g.: drivers) from regular applications. As a result, they implemented four levels of isolation, ring0 (most privileged) through ring3 (least privileged)… two of which the OS developers more or less ignored.
ring1 and ring2 lost favor among kernel developers mostly because of another component of the modern protection model, namely paging (see Fig. 1). Paging is a system that allows the Operating System to present each process with a different view of memory. When a process accesses a page (typically a 4KB block), the address is translated by the Memory Management Unit (MMU), a hardware component, by means of a data structure called the Page Table, which is unique to each process and resides in kernel memory. This allows the kernel to hide parts of memory (e.g.: that of other processes) as a means of isolation, or to over-commit resources and allocate them only when they are actually needed (e.g.: malloc()-ed memory is assigned to the process only after it is first accessed).
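To make the two ideas above concrete (a private view of memory per process, and memory that is promised first and backed only on first access), here is a minimal toy model in Python. It is a sketch under loud assumptions, not how a real kernel or MMU is implemented: `ToyKernel`, `reserve` and `translate` are hypothetical names, the page table is a flat dictionary rather than a multi-level structure, and the page fault handler is folded into `translate` for brevity.

```python
PAGE_SIZE = 4096  # 4KB pages, as in the text


class PageFault(Exception):
    """Raised when a process touches an address it has no mapping for."""


class ToyKernel:
    """Hypothetical model: one page table per process, frames given out lazily."""

    def __init__(self):
        self.next_free_frame = 0
        # pid -> {virtual page number: physical frame number, or None}
        self.page_tables = {}

    def reserve(self, pid, vpn):
        # Over-commit: promise the page, but do not back it with a frame yet.
        self.page_tables.setdefault(pid, {})[vpn] = None

    def translate(self, pid, vaddr):
        """Play the role of the MMU: virtual address -> physical address."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        table = self.page_tables.get(pid, {})
        if vpn not in table:
            # This process has no view of that page at all.
            raise PageFault(f"pid {pid}: no mapping for page {vpn:#x}")
        if table[vpn] is None:
            # First access: only now is a physical frame assigned.
            # (On real hardware the MMU faults and the kernel handles it.)
            table[vpn] = self.next_free_frame
            self.next_free_frame += 1
        return table[vpn] * PAGE_SIZE + offset


kernel = ToyKernel()
kernel.reserve(pid=1, vpn=0x10)
kernel.reserve(pid=2, vpn=0x10)

addr = 0x10 * PAGE_SIZE + 8
phys_1 = kernel.translate(1, addr)  # process 1 gets its own frame
phys_2 = kernel.translate(2, addr)  # same virtual address, different frame
```

The same virtual address resolves to different physical addresses for the two processes, and neither frame exists until the first `translate` call touches it. An address that was never reserved raises `PageFault`, which is the toy equivalent of the kernel hiding memory from a process.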
Because changing the active Page Table is an expensive operation (mostly because it invalidates the address translations cached in the TLB), the Virtual Address Space of each process also contains the kernel memory, mapped in the higher half. Yes, the kernel also uses virtual addressing for its own memory. When the kernel needs to intervene on behalf of the unprivileged process (e.g.: when a System Call is performed), the CPU state transitions from ring3 to ring0, but the Page Table does not need to be switched out. Although this technique increases the overall system performance, it also raises a question: how do we stop an unprivileged process from accessing kernel memory, since it's already mapped in its virtual address space?
Aside from information relevant to the address translation itself, the Page Table also contains access restrictions. Each memory transaction that leads to an address translation also presents its intent, e.g.: whether it wants to write data to memory, or fetch an instruction to execute. This allows the MMU to block the access depending on the Read-Write-Execute permissions associated with each page. However, this is only one example of a restriction that can be enforced. The Page Table can also restrict access based on privilege levels. Unfortunately, the architecture defines only two such levels: privileged (ring0-2) and unprivileged (ring3).
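The check described above can be sketched in a few lines of Python. The flag positions are those of an x86-64 page table entry (Present, R/W, U/S and NX); everything else is an assumption made for illustration: `access_allowed` and the intent labels are hypothetical, only a single leaf entry is examined (real hardware combines the flags of every level of the hierarchy), CR0.WP and EFER.NXE are assumed to be set, and SMEP/SMAP are ignored.

```python
# Flag bits of an x86-64 page table entry.
PTE_PRESENT = 1 << 0    # page is mapped
PTE_WRITABLE = 1 << 1   # R/W: writes are allowed
PTE_USER = 1 << 2       # U/S: page is accessible from ring3
PTE_NX = 1 << 63        # no-execute: instruction fetches are blocked


def access_allowed(pte, ring, intent):
    """Return True if the access passes the page-level checks.

    intent is "read", "write" or "fetch" (hypothetical labels).
    Simplified: CR0.WP and EFER.NXE assumed set, SMEP/SMAP ignored.
    """
    if intent not in ("read", "write", "fetch"):
        raise ValueError(f"unknown intent: {intent!r}")
    if not pte & PTE_PRESENT:
        return False
    # Only two privilege levels exist here: ring3 versus everything else.
    if ring == 3 and not pte & PTE_USER:
        return False
    if intent == "write" and not pte & PTE_WRITABLE:
        return False
    if intent == "fetch" and pte & PTE_NX:
        return False
    return True


kernel_data = PTE_PRESENT | PTE_WRITABLE | PTE_NX
user_code = PTE_PRESENT | PTE_USER

user_reads_kernel = access_allowed(kernel_data, ring=3, intent="read")
ring1_reads_kernel = access_allowed(kernel_data, ring=1, intent="read")
user_runs_code = access_allowed(user_code, ring=3, intent="fetch")
```

Note that the `ring` argument only ever matters as "3 or not 3": code running in ring1 passes exactly the same checks as code in ring0, which is the limitation the paragraph above points out. There is also no separate read bit; any present page that passes the privilege check is readable.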
By now it should start becoming clear why ring1 and ring2 were abandoned. Since the same isolation guarantees that User Space and Kernel Space enjoy could not be extended to each protection domain, sacrificing performance only to restrict access to a few privileged instructions was simply not worth it. Nonetheless, ring1 and ring2 are still implemented on x86 CPUs to this day. The reason? Based on the announcement of the new x86S architecture, which is supposed to eliminate the 16-bit and 32-bit modes from 64-bit processors, it's safe to rule out backward compatibility. The real reason is probably that it's just cheaper this way.
Because a CPU architecture never stops evolving, new protection modes and extensions had to be added along the way. Some more unnerving than others:
In ARM's nomenclature, the protection modes are called Exception Levels. Although they are analogous to the (important) protection rings in x86, they benefit from one significant improvement: the separation of Secure and Non-Secure Worlds. Depending on the system configuration, access to certain resources (both memory and physical devices) can be restricted to Secure World code.
The Non-Secure world consists of three exception levels:
Nothing interesting here; it's the same as x86. On the Secure World side however:
Although chances are you haven't heard of it, Intel had a similar solution called the Software Guard Extensions (SGX). This extension was meant to protect small amounts (~72MB) of sensitive (user space) application data and code from a potentially malicious OS. This was realized by restricting access to the protected memory ranges (Enclaves) to code that already resided in the Enclave. Additionally, calls to Enclave functions could be made only via a strictly enforced API defined by the user at compile time; so no arbitrary jumps after a return to libc. There are numerous reasons why this technology failed. The main one would be that it did not work. Researchers have found dozens of ways to break the isolation guarantees that SGX was supposed to offer, most of them relying on side-channel attacks (i.e.: deducing user secrets by observing how the target process influences the system). Add to that the absence of anything like the isolation that ARM offers for privileged code (S-EL1), and the fact that Intel's remote attestation of SGX-capable CPUs and secure applications could not be offloaded to third parties, and its fade from relevance was more or less guaranteed.