Contents

Something about Process-Virtual-Machine

Process VM

  • Emulation of a user-level process (with possibly different ISA, different OS)
    – components required?
    – techniques?
    – correctness verification?

  • Process VM with same OS, same ISA
    – use case?

    • motivation
      • process migration
      • flexible control of used OS resources
    • "OS-level virtualization" is sufficient
    • not discussed here
  • Virtualization of a process environment

    • emulation of user ISA + OS environment (ABI)
    • every guest process gets own environment
    • guest and host OS often the same
  • View of the host machine user

    • processes running inside Process VM look exactly the same as host processes
    • Wanted: easy usage of VM (automatic startup on demand)

Process-level VMs provide user applications with a virtual ABI environment. In their various implementations, process VMs can provide replication, emulation, and optimization. The following subsections describe each of these.

Components of a Process VM

  • Initialization

    – loader for code/data into guest memory

    – creation of tables, translation cache

    – redirection of exceptions

      • for all potentially happening exceptions

      • example: division by zero; scenarios:

        – guest does own handling

        – guest ISA does not generate this exception

  • Emulation Manager, translation

    – see ISA part

  • emulation of OS calls

    – needs mapping of guest to host OS

  • emulation of exceptions

    – e.g. page fault, division by zero

    – detection by interpreter / by hardware (OS)

    – reconstruction of precise state needed

  • handling of interrupts – examples: signals, OS callbacks
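
The "emulation of OS calls" component can be pictured as a dispatch table from guest call numbers to host-side handlers. The following is a minimal sketch only: the guest call numbers, the error code, and all function names are hypothetical and not taken from any real ABI.

```python
import os

# Hypothetical guest OS call numbers; a real guest ABI defines its own.
GUEST_SYS_WRITE = 1
GUEST_SYS_GETPID = 2
GUEST_ENOSYS = -38  # assumed guest error code for "call not implemented"


def emu_write(guest_mem, fd, addr, length):
    """Copy bytes out of guest memory and forward them to the host OS."""
    data = bytes(guest_mem[addr:addr + length])
    return os.write(fd, data)


def emu_getpid(guest_mem):
    """The host process ID stands in for the guest process ID."""
    return os.getpid()


# Mapping of guest OS calls to host implementations.
OS_CALL_TABLE = {
    GUEST_SYS_WRITE: emu_write,
    GUEST_SYS_GETPID: emu_getpid,
}


def emulate_os_call(guest_mem, number, *args):
    """Dispatch a guest OS call; unknown calls yield the guest error code."""
    handler = OS_CALL_TABLE.get(number)
    if handler is None:
        return GUEST_ENOSYS
    return handler(guest_mem, *args)
```

Note that the write handler has to read its buffer from guest memory: arguments that are guest addresses must go through the VM's state mapping before the host OS sees them.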

Types of Compatibility

  • Strict Compatibility („Intrinsic“) – every operation of a real machine is precisely emulated by Process VM, including

    • processor errata
    • visible implementation behavior (e.g. „undefined“ behavior)
    • usually not included: performance

    – real properties have to be known!

    – result: guest process is unable to detect any difference in virtual vs. real environment

    – complete check of VM implementation required

    – example: „emulation“ of Intel x86 processor by AMD HW

  • Problems

    – exec has to be known completely

    – precise mapping can be very inefficient

    – high development costs

  • Real motivation for a Process VM?

    – example: MS Office running on PowerPC/Linux

    – observation: not every operation is used, so a precise mapping of reality is not required for execution, e.g. "don't care" bits in the result of an OS call

    – relaxation of compatibility possible

  • Relaxed Compatibility („Extrinsic“)

    • construction of exec'' such that exec' is sufficiently approximated, e.g.
      – only mapping for operations generated by a given compiler
      – implementation by similar operations (80-bit floating point emulated by 64-bit FP)
    • "sufficiently" is defined by the guest SW being able to run
      – significantly simpler verification
      – ISA documentation is enough
      – precise mapping of operations can be replaced by more efficient, similar ones

State Mapping Guest => Host

  • Memory

    – direct mapping with fixed offset

    • guest address A => host address A' = offset + A
    • special case: offset = 0
      – possible if guest OS allocates address ranges (i.e. user-level guest needs to be able to cope with arbitrary addresses from guest OS)
      – efficient emulation possible

    – issues

    • available host memory smaller than guest memory
    • 64-bit guest on 32-bit host => needs indirect address mapping in software (= the general solution)
  • Address Mapping in Software

    A' = (A & (PageLen - 1)) + AddrTab[A / PageLen]

    • effort per access into guest memory: temporary register + 6 host instructions + 2x host memory access
    • length of AddrTab: guest memory size / PageLen * host pointer size
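
The table lookup above can be sketched as follows. This is an illustration under assumptions: a page size of 4 KB, a Python list standing in for AddrTab, and `None` marking an unmapped guest page; the function names are made up for this example.

```python
PAGE_LEN = 4096                          # assumed page size, a power of two
PAGE_SHIFT = PAGE_LEN.bit_length() - 1   # log2(PAGE_LEN)


def translate(addr_tab, guest_addr):
    """Return host address A' for guest address A.

    addr_tab[i] holds the host base address of guest page i,
    or None if that guest page is not mapped.
    """
    page = guest_addr >> PAGE_SHIFT          # A / PageLen
    offset = guest_addr & (PAGE_LEN - 1)     # A & (PageLen - 1)
    base = addr_tab[page] if page < len(addr_tab) else None
    if base is None:
        raise LookupError(f"guest page {page} is not mapped")
    return base + offset


def table_bytes(guest_mem_size, pointer_size):
    """Length of AddrTab: guest memory size / PageLen * host pointer size."""
    return guest_mem_size // PAGE_LEN * pointer_size
```

For a 4 GB guest address space with 8-byte host pointers, `table_bytes(2**32, 8)` gives 8 MB for the table alone, which illustrates why the direct mapping with a fixed offset is preferred whenever it is possible.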

Emulation of Memory

  • Compatibility: often a tradeoff between performance and compatibility

  • Issues
    – how to protect VM from accesses of guest?
    – linear address space vs. segmentation
    – memory protection

    • capabilities: Read/Write/Execute (RWE)
    • granularity: page size
      – x86 can use either 4 KB or 2/4 MB
      – MIPS/IA-64: almost any power-of-two

    => Use HW if possible (e.g. mprotect + SIGSEGV handler)

Emulation of Exceptions and Interrupts

  • Notifications from the outside that something has happened
    – delivered to OS by hardware
    – delivered to user space from OS

    • Linux/UNIX: spontaneous call into a signal handler
  • Exceptions ("traps")
    – synchronous with program execution, deterministic
    – examples: memory violation, division by 0

  • Interrupts
    – asynchronous, external event
    – examples

    • asynchronous I/O (notify if data available)
    • asynchronous inter-process communication (IPC)
  • Classes of notifications for specific events
    – visible in ABI (i.e. for user level)
      • guest OS sends a signal / terminates program
      • example: program registered handler for segmentation faults (SIGSEGV)
    – invisible in ABI
      • exception/interrupt does not result in triggering a notification for the program
      • example: ABI does not know about segmentation faults; handling is done completely by the guest OS

  • Issues

    – exception ABI-invisible in host, visible in guest

    • needs interpretative detection

    – exception ABI-visible in host, invisible in guest

    • needs check whether exception should be ignored

    – one exception type in host-ABI for multiple exception types in guest

    • needs differentiation in SW
    • example: host has same overflow exception for integers and FP
  • Requirement for compatibility
    – "Precise Exception Handling" (PEH): if instruction X triggers an exception, then

    • all instructions before X are executed
    • no instruction after X is executed
    • X itself not executed yet

    – assumption: host supports PEH

  • Specifics of interrupts: they also need PEH semantics, but X can be chosen by the VM (to some extent), e.g. after execution of a basic block (BB)

  • Detection of Exception Situations

    • manual check – part of interpretation = interpretative detection

    • hook into HW-triggered exception

      – needs equal semantics of exception situations in guest and host ISAs

      – registration of notification for all possible exception types at host OS by VMM

      – translation of host OS notifications into the corresponding guest OS notifications

  • Detection of Interrupts

    • delays acceptable

      – interpretation

        • finish current guest instruction
        • create exception state (e.g. stack …)
        • call into interrupt handler

      – binary translation

        • delay until jumping back into emulation manager
        • chaining can result in unacceptable delays; in the interrupt handler of the VMM: unchain current translation
        • required: no loop in translation block (ok with BBs); state is consistent at end of translation block
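
Interpretative detection of exceptions and the delayed acceptance of interrupts can both be shown in one small interpreter loop. This is a sketch under assumptions: the mini-ISA with `add` and `div`, the `GuestTrap` class, and the `deliver` callback are all hypothetical and only stand in for a real guest ISA and its handlers.

```python
class GuestTrap(Exception):
    """Synchronous guest exception; pc identifies the faulting instruction."""

    def __init__(self, kind, pc):
        super().__init__(f"{kind} at pc={pc}")
        self.kind = kind
        self.pc = pc


def interpret(program, regs, pending, deliver):
    """Run (op, dst, a, b) instructions of a hypothetical mini-ISA.

    pending -- list of interrupt names queued asynchronously by the host
    deliver -- callback(name, pc) standing in for the guest interrupt handler
    """
    pc = 0
    while pc < len(program):
        op, dst, a, b = program[pc]
        if op == "div":
            # Interpretative detection: test the operand before any guest
            # state changes, so instruction pc has not executed on a trap.
            if regs[b] == 0:
                raise GuestTrap("division by zero", pc)
            regs[dst] = regs[a] // regs[b]
        elif op == "add":
            regs[dst] = regs[a] + regs[b]
        else:
            raise GuestTrap("illegal instruction", pc)
        pc += 1
        # Interrupts are accepted only here, after the current guest
        # instruction has finished, so the guest state is consistent.
        while pending:
            deliver(pending.pop(0), pc)
    return regs
```

When the trap is raised, all instructions before `pc` have taken effect and none at or after `pc` has, which is the precise state that PEH asks for. A binary translator would poll for pending interrupts at the end of a translation block instead of after every instruction.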

Emulators and Dynamic Binary Translators

One task of process-level virtual machines is to support program binaries compiled to a different instruction set than the one executed by the host's hardware, i.e., to emulate one instruction set on hardware designed for another. Application programs are compiled for a source ISA, but the hardware implements a different target ISA.

The most straightforward emulation method is interpretation. An interpreter program executing the target ISA fetches, decodes, and emulates the execution of individual source instructions. This can be a relatively slow process, requiring tens of native target instructions for each source instruction interpreted.

For better performance, binary translation is typically used. With binary translation, blocks of source instructions are converted to target instructions that perform equivalent functions. There can be a relatively high overhead associated with the translation process, but once a block of instructions is translated, the translated instructions can be cached and repeatedly executed much faster than they can be interpreted. Because binary translation is the most important feature of this type of process virtual machine, they are sometimes called dynamic binary translators. [1]
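
The translate-once, execute-many idea can be sketched with a toy translator that turns each basic block into a Python function and keeps it in a translation cache. Everything here is an assumption for illustration: the three-instruction source ISA (`addi`, `bnez`, `halt`), the function names, and the use of Python source text as the "target code".

```python
def translate_block(program, start):
    """Translate the basic block beginning at `start` into a host function.

    The generated function updates the register dict and returns the
    source pc of the next block, or None when the guest halts.
    """
    lines = ["def block(regs):"]
    pc = start
    while True:
        op, *args = program[pc]
        if op == "addi":                      # reg += immediate
            reg, imm = args
            lines.append(f"    regs[{reg!r}] += {imm!r}")
            pc += 1
        elif op == "bnez":                    # branch if reg != 0
            reg, target = args
            lines.append(
                f"    return {target} if regs[{reg!r}] != 0 else {pc + 1}")
            break
        elif op == "halt":
            lines.append("    return None")
            break
        else:
            raise ValueError(f"unknown source instruction {op!r}")
    namespace = {}
    exec("\n".join(lines), namespace)         # "code generation" step
    return namespace["block"]


def run(program, regs):
    """Emulation manager loop; returns how many blocks were translated."""
    cache = {}                                # source pc -> translated block
    translations = 0
    pc = 0
    while pc is not None:
        block = cache.get(pc)
        if block is None:                     # cache miss: translate once
            block = cache[pc] = translate_block(program, pc)
            translations += 1
        pc = block(regs)                      # execute translated code
    return translations
```

For a loop body that runs many times, the translation cost is paid once and every later iteration is a cache hit. Control returns to the `run` loop after each block, which is also the point where a VM could check for pending interrupts.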

Of course,

Same-ISA Binary Optimizers

Most dynamic binary translators not only translate from source to target code but also perform some code optimizations. In a same-ISA binary optimizer, guest and host use the same instruction set, and optimization of the program binary is the primary purpose of the virtual machine.

High-Level Language Virtual Machines: Platform Independence

Full cross-platform portability is more easily achieved by taking a step back and designing it into an overall software framework. One way of accomplishing this is to design a process-level VM at the same time as the application development environment is defined. Such a VM environment does not directly correspond to any real platform. Rather, it is designed for ease of application program development. These high-level language VMs (HLL VMs) are similar to the process VMs described earlier. However, they are focused on minimizing hardware-specific and OS-specific features because these would compromise platform independence.

In a conventional system, the compiler consists of a frontend that performs lexical, syntax, and semantic analysis to generate simple intermediate code – similar to machine code but more abstract. Typically the intermediate code does not contain specific register assignments, for example. Then a code generator takes the intermediate code and generates a binary containing machine code for a specific ISA and OS. This binary file is distributed and executed on platforms that support the given ISA/OS combination. To execute the program on a different platform, however, it must be recompiled for that platform.

In HLL VMs, this model is changed. The steps are similar to the conventional ones, but the point at which program distribution takes place is at a higher level. A conventional compiler frontend generates abstract machine code, which is very similar to an intermediate form. In many HLL VMs, this is a rather generic stack-based ISA. This virtual ISA is in essence the machine code for a virtual machine. The portable virtual ISA code is distributed for execution on different platforms. For each platform, a VM capable of executing the virtual ISA is implemented. In its simplest form, the VM contains an interpreter that takes each instruction, decodes it, and then performs the corresponding operation as part of the VM. In more sophisticated, higher-performance VMs, the abstract machine code may be compiled (binary translated) into host machine code for direct execution on the host platform. [1]
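
The simplest form described above, an interpreter for a stack-based virtual ISA, might look like the sketch below. The opcodes `push`, `add`, and `mul` and the tuple encoding are invented for this illustration and do not correspond to any real HLL VM.

```python
def run_stack_vm(code):
    """Interpret a list of (opcode, operand) pairs; return the top of stack."""
    stack = []
    pc = 0
    while pc < len(code):
        op, arg = code[pc]          # fetch and decode
        pc += 1
        if op == "push":            # push a constant operand
            stack.append(arg)
        elif op == "add":           # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":           # pop two values, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack[-1] if stack else None


# Virtual ISA code for (2 + 3) * 4, as a compiler frontend might emit it.
EXAMPLE = [("push", 2), ("push", 3), ("add", None),
           ("push", 4), ("mul", None)]
```

The same `EXAMPLE` list runs unchanged wherever this interpreter runs. No instruction names a register, which is one reason stack-based virtual ISAs are easy to keep independent of the host hardware.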


  1. James E. Smith, Ravi Nair, Virtual Machines, 2015, ISBN 9781558609105