In computing, many things can be virtualized, including networks, desktops, and applications. Virtualizing modern processors (i.e., those built on the x86 architecture) poses particular challenges, which have led to the development of several types of virtualization technology.
Hardware virtualization (also called processor or platform virtualization) is usually what people mean when they refer to “virtualization.” In hardware virtualization, the hardware of the actual system (the host) is “hidden,” and one or more simulated virtual environments are created in which virtual systems (the guests) can operate.
The software that makes virtualization possible is called the hypervisor. Also known as a virtual machine monitor (VMM), this intermediary manages resources and requests between the host and guest systems, thereby keeping them separate. A hypervisor is either bare-metal, meaning it is installed directly on the hardware (i.e., where the host OS usually sits), or hosted, meaning it runs on top of a host OS.
A bare-metal hypervisor is considered more efficient and robust because it has direct access to physical resources. A hosted hypervisor provides greater flexibility but can perform worse, because every request to the hardware must also pass through the host OS.
As the name suggests, full virtualization requires every single aspect of the physical hardware to be reflected in the virtual machine so that any software can run independently and unmodified in the virtual system. It also requires that the virtual computer be completely contained, as if in a bubble. Nothing done within the virtual system can affect anything outside of that bubble and vice versa.
In their 1974 article "Formal Requirements for Virtualizable Third Generation Architectures," Popek and Goldberg outline the conditions required for full virtualization to be considered successful.
IBM first achieved full virtualization in the 1960s: the computer architecture and processors they were using had everything needed to fulfill the requirements laid out by Popek and Goldberg. However, later processors, built using (what is now deemed) industry-standard x86 architecture, have some limitations, which mean they do not fulfill those requirements.
The fundamental difference between the two is the ability to “trap and emulate” privileged instructions.
Privileged instructions can affect the proper functioning of the OS and may therefore be executed only at the highest privilege level. Non-privileged instructions require no specific permissions and can be executed by user-level applications. Sensitive instructions fall into two groups: control-sensitive instructions change the processor privilege level, and behavior-sensitive instructions are those whose behavior depends on the privilege level at which they are executed.
When a program or application executes a privileged instruction without the correct privilege level, the processor stops the instruction from going any further and raises a trap. In a virtualized system, the hypervisor catches this trap and emulates the instruction, which completes the operation and keeps everything in the guest system running smoothly.
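The trap-and-emulate cycle described above can be sketched with a toy model. This is only an illustration: the instruction names (`ADD`, `SET_TIMER`, `DISABLE_INTERRUPTS`), the `VirtualCPU` fields, and the helper functions are hypothetical and do not correspond to any real processor or hypervisor.

```python
from dataclasses import dataclass


class Trap(Exception):
    """Raised by the toy CPU when a privileged instruction runs without privilege."""

    def __init__(self, instruction, operand):
        super().__init__(f"trap: {instruction}")
        self.instruction = instruction
        self.operand = operand


# Hypothetical privileged instructions of the toy machine.
PRIVILEGED = {"SET_TIMER", "DISABLE_INTERRUPTS"}


@dataclass
class VirtualCPU:
    """Per-guest state the hypervisor maintains in place of the real hardware."""
    accumulator: int = 0
    timer: int = 0
    interrupts_enabled: bool = True


def execute_deprivileged(vcpu, instruction, operand):
    """Run one guest instruction at user-level privilege."""
    if instruction in PRIVILEGED:
        raise Trap(instruction, operand)  # the processor refuses and traps
    if instruction == "ADD":
        vcpu.accumulator += operand  # harmless instructions run directly
    else:
        raise ValueError(f"unknown instruction: {instruction}")


def emulate(vcpu, trap):
    """Hypervisor handler: apply the instruction's effect to virtual state only."""
    if trap.instruction == "SET_TIMER":
        vcpu.timer = trap.operand
    elif trap.instruction == "DISABLE_INTERRUPTS":
        vcpu.interrupts_enabled = False


def run_guest(program):
    """Execute a guest program, emulating every instruction that traps."""
    vcpu = VirtualCPU()
    trap_count = 0
    for instruction, operand in program:
        try:
            execute_deprivileged(vcpu, instruction, operand)
        except Trap as trap:
            trap_count += 1
            emulate(vcpu, trap)
    return vcpu, trap_count


guest_program = [("ADD", 2), ("SET_TIMER", 100), ("ADD", 3), ("DISABLE_INTERRUPTS", 0)]
vcpu, trap_count = run_guest(guest_program)
print(vcpu, trap_count)
```

The guest never touches real hardware state: the two privileged instructions only change the `VirtualCPU` record that the hypervisor keeps on the guest's behalf.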
In x86 architecture, the trap and emulate process does not work for several reasons.
This is the main reason it was long considered impossible to virtualize processors built in this way.
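One well-documented obstacle is that some x86 instructions are sensitive without being privileged, so they never trap when run outside Ring 0. Popek and Goldberg's central condition is that every sensitive instruction must also be privileged. The sketch below checks that condition for two small hand-picked instruction tables; the tables are illustrative samples rather than complete instruction sets, and `CLASSIC_SAMPLE` describes a hypothetical machine.

```python
# Each entry maps an instruction to (traps_outside_ring_0, is_sensitive).
X86_SAMPLE = {
    "ADD": (False, False),   # ordinary arithmetic
    "LGDT": (True, True),    # loads the descriptor table register; privileged
    "HLT": (True, True),     # halts the processor; privileged
    "POPF": (False, True),   # silently ignores interrupt-flag changes in user mode
    "SGDT": (False, True),   # reveals the real descriptor table location (classic x86)
}

# A hypothetical architecture in which every sensitive instruction traps.
CLASSIC_SAMPLE = {
    "ADD": (False, False),
    "LOAD_PSW": (True, True),
    "SET_STORAGE_KEY": (True, True),
}


def non_virtualizable(instructions):
    """Return the sensitive instructions that do not trap when deprivileged."""
    return sorted(
        name
        for name, (privileged, sensitive) in instructions.items()
        if sensitive and not privileged
    )


def satisfies_popek_goldberg(instructions):
    """True when the sensitive instructions are a subset of the privileged ones."""
    return not non_virtualizable(instructions)


print(non_virtualizable(X86_SAMPLE))
print(satisfies_popek_goldberg(CLASSIC_SAMPLE))
```

Any instruction returned by `non_virtualizable` is one the hypervisor cannot intercept by trapping alone, which is why other techniques are needed on x86.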
The x86 architecture is organized into four rings of decreasing privilege, and the operating system is designed to sit directly on the hardware, where it has full control over the physical resources. This is Ring 0, the highest privilege level and the only ring in which privileged instructions can be executed. User-level applications occupy Ring 3, which is furthest from the hardware and offers the lowest privilege level.
Remember, the VMM can be either bare-metal (i.e., occupying Ring 0) or hosted within the host OS (Ring 3). The virtual system itself is a user application and, therefore, will always be in Ring 3.
The structure of x86 architecture makes virtualization difficult in the following ways:
These challenges were successfully overcome in 1998 by the software company VMware, which achieved full virtualization of the x86 processor through a combination of binary translation and direct execution.
As mentioned above, some sensitive instructions within the x86 architecture cannot be virtualized by trap and emulate, because they do not trap when executed outside Ring 0. VMware's solution was to rewrite the guest OS's binary code as it runs, replacing these instructions with new instruction sequences that have the intended effect on the virtual hardware. They called this binary translation.
Because the guest OS's instructions have been translated, it no longer matters whether they were originally classified as privileged, non-privileged, or sensitive: all instructions from the guest OS go through the VMM. The VMM therefore has to sit in Ring 0 to execute them.
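The idea of translating guest code before it runs can be sketched as follows. This is a heavily simplified illustration and not VMware's actual translator: guest code is modeled as a list of `(opcode, operand)` pairs, and the `CALL_VMM` pseudo-instruction, the `Translator` class, and the cache keyed by block address are all assumptions made for the example.

```python
# Instructions the translator must never let the guest run natively.
SENSITIVE = {"POPF", "SGDT"}


def translate_block(block):
    """Copy a block of guest code, replacing sensitive instructions with VMM calls."""
    translated = []
    for opcode, operand in block:
        if opcode in SENSITIVE:
            # Substitute a call into the VMM that emulates the original instruction.
            translated.append(("CALL_VMM", (opcode, operand)))
        else:
            translated.append((opcode, operand))  # safe instructions are kept as-is
    return translated


class Translator:
    """Translates each guest code block once and reuses the cached result."""

    def __init__(self):
        self.cache = {}
        self.translations = 0

    def fetch(self, address, block):
        if address not in self.cache:
            self.cache[address] = translate_block(block)
            self.translations += 1
        return self.cache[address]


guest_block = [("MOV", 1), ("POPF", None), ("ADD", 2)]
translator = Translator()
first = translator.fetch(0x1000, guest_block)
second = translator.fetch(0x1000, guest_block)  # served from the cache
print(first, translator.translations)
```

Caching matters in a scheme like this because guest kernels run the same code paths repeatedly, so the translation cost is paid once per block rather than once per execution.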
The guest OS is not itself modified and remains unaware of being virtualized. It occupies Ring 1, which gives it a higher privilege level than the user applications running within it. Those applications remain in Ring 3, and their non-privileged instructions are sent directly to the host system. VMware calls this direct execution.
Hardware-assisted virtualization was developed by Intel and AMD, and processors that include the extra features (named Intel VT and AMD-V, respectively) have been available on the market since 2006. This method overcomes the difficulties of executing privileged and sensitive instructions by building extra features into the actual hardware of the host system. For example, the CPU is designed with an additional execution layer below Ring 0.
The guest OS sits in Ring 0, and the VMM sits below it at the root-mode privilege level. All privileged and sensitive instructions now trap automatically to the VMM and, where necessary, are emulated in this new layer, which removes the need for any binary translation. Requests and instructions from user applications in Ring 3 still go directly to the host system hardware.
Hardware-assisted virtualization can match the performance of binary translation for the most part, but it imposes a very rigid programming model that is not easily changed. Where its performance does fall behind, improvement therefore depends on time and advances in the hardware.
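The control flow this creates between VMM and guest can be sketched as a loop: the VMM enters the guest, the guest runs until an instruction forces an exit, and the VMM handles the exit before resuming. The model below is a toy, not the Intel VT or AMD-V programming interface; `ToyCPU`, `vm_enter`, and the contents of `EXITING` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    ROOT = "root"          # where the VMM runs
    NON_ROOT = "non-root"  # where the guest runs, including its Ring 0 kernel


# Hypothetical set of instructions configured to hand control back to the VMM.
EXITING = {"CPUID", "OUT", "HLT"}


class ToyCPU:
    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.mode = Mode.ROOT
        self.executed_directly = []

    def vm_enter(self):
        """Run guest code in non-root mode until an instruction forces an exit."""
        self.mode = Mode.NON_ROOT
        while self.pc < len(self.program):
            opcode = self.program[self.pc]
            self.pc += 1
            if opcode in EXITING:
                self.mode = Mode.ROOT  # the hardware switches back to the VMM
                return opcode          # exit reason reported to the VMM
            self.executed_directly.append(opcode)
        self.mode = Mode.ROOT
        return None  # guest program finished


def run_vmm(cpu):
    """VMM main loop: enter the guest, handle each exit, then resume."""
    exits = []
    while True:
        reason = cpu.vm_enter()
        if reason is None:
            break
        exits.append(reason)  # a real VMM would emulate the instruction here
        if reason == "HLT":
            break             # treat a halt as the guest shutting down
    return exits


cpu = ToyCPU(["MOV", "CPUID", "ADD", "OUT", "HLT", "ADD"])
exits = run_vmm(cpu)
print(exits, cpu.executed_directly)
```

No guest code is rewritten in this model; the mode switch does the interception that binary translation otherwise has to do in software.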
In paravirtualization, also known as OS-assisted virtualization, the guest OS is modified in a way that replaces its non-virtualizable privileged instructions with hypercalls that go directly to the hypervisor. Communication between guest OS and hypervisor also allows for the relocation of complex tasks to the host system, where they can be completed more quickly than in the virtual system.
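A hypercall interface of this kind can be sketched as a dispatch table on the hypervisor side and a modified kernel routine on the guest side. The sketch is illustrative only: the hypercall name `update_page_table`, the `Hypervisor` and `ParavirtualizedKernel` classes, and their methods are hypothetical and do not reproduce any real hypervisor's interface.

```python
class Hypervisor:
    """Toy hypervisor that exposes services to guests through named hypercalls."""

    def __init__(self):
        self.page_tables = {}  # guest id -> {virtual page: physical frame}
        self.log = []
        self._handlers = {"update_page_table": self._update_page_table}

    def hypercall(self, guest_id, name, *args):
        handler = self._handlers.get(name)
        if handler is None:
            raise ValueError(f"unknown hypercall: {name}")
        self.log.append((guest_id, name))
        return handler(guest_id, *args)

    def _update_page_table(self, guest_id, virtual_page, physical_frame):
        # The hypervisor performs the privileged work on the guest's behalf.
        self.page_tables.setdefault(guest_id, {})[virtual_page] = physical_frame
        return True


class ParavirtualizedKernel:
    """Guest kernel whose privileged operations were replaced with hypercalls."""

    def __init__(self, guest_id, hypervisor):
        self.guest_id = guest_id
        self.hypervisor = hypervisor

    def map_page(self, virtual_page, physical_frame):
        # An unmodified kernel would execute a privileged instruction here;
        # the modified kernel asks the hypervisor explicitly instead.
        return self.hypervisor.hypercall(
            self.guest_id, "update_page_table", virtual_page, physical_frame
        )


hypervisor = Hypervisor()
kernel = ParavirtualizedKernel("guest-1", hypervisor)
ok = kernel.map_page(0x10, 0x80)
print(ok, hypervisor.page_tables)
```

The change lives in the guest kernel's source (`map_page` here), which is why this approach requires an OS that can be modified.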
While this improves performance and efficiency, paravirtualization cannot be considered full virtualization because the guest OS cannot run unmodified, is aware that it is virtualized, and can communicate with both the hypervisor and other guest systems.
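Because it requires changes to the guest kernel, paravirtualization is available for open-source operating systems such as Linux but is not compatible with unmodified closed-source systems such as Windows.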