Virtual Machine Construction, Part I: The Essential Elements of a VM

In this series, I will be developing a virtual machine, as well as developing the instruction set for it. Along the way, I will look at what goes into creating an assembler for it, as well. This, the first part of that series, will go over the simplest possible implementation that will still provide a useful base for extension into a full-fledged VM. The very basics of VM construction are gone over at length elsewhere. I suggest reading Felix Angell’s Implementing a Virtual Machine in C, which inspired me to make my own VM project; if only because there were so many things I wanted to do in a VM that were not possible with “mac”, the VM created in Felix’s post. For that reason, I started from the ground up. In this post, I will create the basic infrastructure for a project I will call AstridVM (naming it after my daughter sounds better than BrandonVM or *gag* FelgerVM).

By the end of this series, AstridVM will be a fully-usable VM with an attendant assembler, and we will be ready to write a cool compiler for it. But for now, on to the basics.

What Compiler to Use?

The first technical decision we need to make is what language we will implement our VM in. Most example VM’s on the internet use C, and for good reason. It’s lean, lacks the overhead of higher-level languages like C++, and most C compilers implement generated code optimizations learned over its 46 year history.

Stack vs Register

The second technical decision a VM has to make is whether it operates on a stack, or via registers. Both have their advantages. Stack-based VM’s include Java and .NET CLR, while register-based VM’s are demonstrated by Dalvik, the Java VM that powers Android. Stack-based VM’s allow us to be brief in our instructions, as many commands implicitly operate on the stack, whereas register operations often require multiple operands per opcode. Register VM’s, however, have certain optimizations that are not possible via the stack. Mark Sinnathamby’s Stack based vs Register based Virtual Machine Architecture, and the Dalvik VM has more information that should be considered by anyone implementing a VM from scratch.

For this reason, AstridVM will have both stack and register operations. This is one aspect of Felix’s mac VM that will prove useful to us in the long run. In this article, however, we will restrict our use of registers to the three needed to get our stack operations up and running. Later articles, especially those dealing with loop and conditional operations, will be well-served by the use of additional general purpose registers.

Deciding on Types

In most online tutorials, both the stack and the instructions that make up the program are given as integers. This is helpful for getting our feet wet with the mechanics of how a VM works, but it becomes less useful when we need to implement a VM that operates realistically. This is for two reasons:

  • A real VM needs to read its instruction set from a file, which will be a set of 8-bit bytes instead of double or quadruple byte integers.
  • We will eventually need to perform text operations on the stack. Byte arrays can be manipulated as integer arrays, but vice-versa requires more mental gymnastics to reconcile.

The instructions that make up to program, as well as the stack, will be an array of bytes. The instructions themselves, however, will be 2 bytes, as will addresses. We can define these in a header file so we can modify them at will to suit our preferences:

#ifndef __TYPES_H
#define __TYPES_H

#include <stdint.h>

#define OPCODE_TYPE uint16_t
#define OPCODE_SIZE 2
#define OFFSET_TYPE uint16_t
#define OFFSET_SIZE 2

#endif // __TYPES_H
Figure 1: types.h

 

One pattern you’ll need to get comfortable with is mapping integers to and from a byte array. Imagine we have stack defined as unsigned char stack[], and instructions stored in unsigned char instructions[], and we want to move an integer from the instruction set to a integer register, and from the integer register to the stack:

ax = *(uint16_t*)(instructions + ip);
*(uint16_t*)(stack + sp) = ax;

Figure 2: Mapping int[] to and from char[]

What’s going on here? We did some pointer arithmetic (don’t worry, we’ll do a LOT more!) to get the address of the current instruction. We then cast that to be a pointer to an integer instead of a pointer to character, and dereferenced that address to get the actual integer value stored there. In the instruction set, that integer spanned two bytes; we grabbed both of them. You can work out the reverse operation in the second line.

Note that the GNU C compiler will give you a warning because it knows we are casting a pointer from one type to another; it just doesn’t know why. But that’s alright; we do.

The Instruction Set

Let us define the opcodes that our VM is going to use. These will be the building blocks of all programs written for AstridVM.

#ifndef __OPCODES_H
#define __OPCODES_H

typedef enum
{
    HALT = 0x0000,
    NOP  = 0x0001,
    PUSH = 0x0002, // Push onto stack. Operand is a literal integer.
    POP  = 0x0003, // Remove integer from the top of the stack.
    MAX
} opcodes;

#endif // __OPCODES_H

Figure 3: opcodes.h

Most other introductions to VM construction also throw in some math operations to demonstrate how to write and execute meaningful algorithms. We are going to be a bit more ambitious, so we will skip these rudimentary operations for now.

Defining the Necessary Registers

I said earlier that we would not be implementing general purpose registers; nevertheless, there are three essential registers we will need to get our basic VM up and running:

  • IP: Instruction Pointer. A program is a sequence of bytes. This pointer tells us which byte is currently being executed. As our program runs, this pointer advances until we reach a HALT.

  • SP: Stack Pointer. This points to the top of the stack. When our program begins, SP is zero. As we push bytes on the stack, this pointer gets higher; we pop bytes off the stack, it gets lower.

  • FP: Frame Pointer. This is a very important pointer, as all local relative stack addresses are offsets of this pointer. In our initial implementation, however, we won’t have method calls, so there is only one local context. For this reason, FP will remain zero for now. In the next post, this register will take center stage.

Earlier, we also said we would use integers for our offsets. Most examples of VM’s use a simple integer array with an enum defining the position in the array of each register. We will be a little bit more ambitious (so as to set ourselves up for future success).

Our implementation will use 16-bit addressing. But this is quite antiquated, and in the future we will want to use a more modern 32-bit or 64-bit addressing system. To keep these disparate addressing systems in sync, modern processors overlap the different sizes for each registers. For instance on x86_64, the 32-bit register EAX is also the lower half of the 64-bit RAX register, and the lower half of THAT is the 16-bit AX register. HAX and LAX are the respective 8-bit higher and lower halves of that. This scheme also lets us do some handy bitwise operations, so we will model it in our VM thusly:

#ifndef __REGISTERS_H
#define __REGISTERS_H

#include <stdint.h>

typedef union vm_register 
{
    uint64_t r64;
    uint32_t r32;
    uint16_t r16;
    uint8_t r[2];
} vm_register;

extern vm_register registers[];

Figure 4: registers.h, part 1

We then define our mission critical registers:

typedef enum
{
    IP, SP, FP,
    NUM_REGS
} registers_names;

#define ip (registers[IP].r16)
#define sp (registers[SP].r16)
#define fp (registers[FP].r16)

#endif
Figure 5: registers.h, part 2

When we implement our general purpose registers, they will look something like this:

#define rax (registers[AX].r64)
#define eax (registers[AX].r32)
#define ax  (registers[AX].r16)
#define hax (registers[AX].r8[1])
#define lax (registers[AX].r8[0])

Figure 6: Definition of general purpose registers

The union operator overlays these registers on top of each other, such that r32 is the first half of r64, r16 is the first half of r32, and so forth. This makes conversion from one integer size to another very easy. Let’s say we want to convert from a 16 bit to a 32 bit:

ax = 0x1234;

At this point, we can access EAX, and its value will be 0x00001234.

A Discussion About Endian-ness

But how? Since AX contains bytes that overlay with the beginning of EAX, why is the value of EAX not now 0x12340000? Its because x86 is little-endian, meaning the least significant byte is at the lowest address. Take the above assignment to AX. Here’s what the memory looks like:

     BYTE 0 | BYTE 1 | BYTE 2 | BYTE 3 
E/AX   34       12       00       00

Note that whether you access it as AX or EAX, the least significant bytes are the same.

Scaffolding the Helper Methods

Here I have the advantage of having already implemented the VM and knowing what methods need to expose throughout the program. Here is the globals.h file we will include in astridvm.c:

#ifndef __GLOBALS_H
#define __GLOBALS_H

#include "types.h"

#define DEBUG

#define STACK_SIZE 1024*4
#define PROGRAM_SIZE 1024*4

extern unsigned char stack[];
extern unsigned char *instructions;

void push_int(uint16_t val);
uint16_t pop_int();
uint16_t fetch_int();
OPCODE_TYPE fetch_op();

#endif // __GLOBALS_H
Figure 7: globals.h
Hopefully most of this is self-explanatory. We use unsigned char for both the instruction list and the stack, as we already established that both will be byte arrays. The size of the stack and the program are arbitrary. We will probably revisit them, later. The methods listed there will make more sense when we build out the main code file.

Building Out the Virtual Machine Itself

Now it’s time to build out the main program. Like most other online examples, this initial implementation will use a pre-populated list of instructions, given as an array of integers. In Part 3, we will discard this prepropulation entirely, and will begin using an assembler to make a binary VM image instead. Baby steps.

These are the headers needed for astridvm.c:

#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>

#include "globals.h"
#include "opcodes.h"
#include "registers.h"

Figure 8: astridvm.c, part 1

Next we define the registers, stack, and instruction list that our program is going to use. You’ll note that we have already extern’d these variables in global.h to use in other code files, should we add them later (we will).

vm_register registers[NUM_REGS];
unsigned char stack[STACK_SIZE];
unsigned char *instructions = NULL;

Figure 9: astridvm.c, part 2

Our VM is going to take a list of inputs (the instruction list) and keep iterating over it until it comes to HALT. At that point, we need to signal the byte parser to cease execution and exit the VM. The easiest way to do that is to establish global variable we can turn off like a switch:

bool running = true;

Figure 10: astridvm.c, part 3

We have already defined the helper methods we’ll be using to parse the byte array. Now we define them. Taking a look at fetch_op(), we write something like this:

OPCODE_TYPE fetch_op()
{
    ip += OPCODE_SIZE;
    return *(OPCODE_TYPE*)(instructions + ip - OPCODE_SIZE);
}

Figure 11: astridvm.c, part 4

At the cost of an extra math operation, we avoid using a local variable (dear assembly wizards: let me know if creating extra locals on the stack and popping them off at the end of scope is more or less efficient than the extra SUB; then tell me if using a static local would change your answer). Note that we used the reference/dereference pattern shown before to grab a 16-bit integer from a 8-bit byte array.

We also need to fetch some integers from the instructions:

uint16_t fetch_int()
{
    ip += 2;
    return *(uint16_t*)(instructions + ip - 2);
}

Figure 12: astridvm.c, part 5

Yes, technically, with the VM as written, fetch_int() is exactly the same as fetch_op(). This will change, however, if we ever decide the change the size of our instruction set op codes.

We also need a way of pushing and popping integer values on and off the stack:

void push_int(uint16_t val)
{
    *(uint16_t*)(stack + sp) = val;
    sp += 2;
}

uint16_t pop_int()
{
    sp -= 2;
    return *(uint16_t*)(stack + sp);
}
Figure 13: astridvm.c, part 6

These helper utilities are called from our main logic routine that evaluates opcodes and executes them.

void eval(OPCODE_TYPE opcode)
{
    switch (opcode)
    {

Figure 14: astridvm.c, part 7

In order to evaluate opcodes (we will very soon have an enormous list), we could have used chained if…else blocks. This would have worked fine; there is a good chance the C compiler would know what to do. But by using a switch statement, we tell the compiler to make deeper optimizations, such as using a branch table or binary search.

Now we look, in turn, at each of the instruction implementations:

    case HALT:
        running = false;

#ifdef DEBUG
        printf("HALT\n");
#endif

        break;

Figure 15: astridvm.c, part 8

HALT is simple; we set the flag that tells the main VM loop to stop continuing.

    case PUSH:
        push_int(fetch_int());

#ifdef DEBUG
        printf("PUSH[sp=%d] %d\n", sp, *(uint16_t*)(stack + sp - 2));
#endif

        break;

Figure 16: astridvm.c, part 9

PUSH is made easier through the previous implementation of our helper functions. Note that we continue our convention of not using additional local variables,

    case POP:
        pop_int();

#ifdef DEBUG
        printf("POP[sp=%d] %d\n", sp, *(uint16_t*)(stack + sp));
#endif

        break;
    }
}

Figure 17: astridvm.c, part 10

Our final opcode (and ending the eval() method) is made trivial by the pop_int() method. In practical usage, this instruction will be used to discard unwanted stack members.

We aren’t yet ready to read from a file, yet, so we need to define a sample program that demonstrates all of our instructions and successfully terminates:

uint16_t program[] = 
{
    PUSH, 1,
    POP,
    PUSH, 5,
    PUSH, 6,
    POP,
    POP,
    HALT
};

Figure 18: astridvm.c, part 11

In part 3, we will dispense with this “dummy” data and transition to using a binary file to store our program. For now, this is a necessary evil. We now just need the main VM loop:

int main(int argc, char **argv)
{
    sp = 0;
    ip = 0;
    fp = 0;

    instructions = (unsigned char*)program;

    while (running)
    {
        eval(fetch_op());
    }

    return 0;
}

Figure 19: astridvm.c, part 12

You can see here that we initialize our essential registers, map our program (an array of integers) to the byte array that we will iterate over, and proceed to evaluate opcodes until the running flag is turned off.

The Results

Here is what we get when we run the program:

$ ./astridvm
PUSH[sp=2] 1
POP[sp=0] 1
PUSH[sp=2] 5
PUSH[sp=4] 6
POP[sp=2] 6
POP[sp=0] 5
HALT

Figure 20: The results

If you trace these results through the sample program, you’ll see the results are as expected. The VM is in working order, and we’re ready to begin expanding on it.

Next Steps

We will continue putting off “useful” commands (like math or string operations) in favor of scaffolding out more complex operations that will set us up for future success. In part 2, we will add functions calls, as well as our call stack. In this, I hope to go a bit farther than most VM construction tutorials. Along the way, we’ll learn how stacks and frames work in non-VM environments, as well.

Virtual Machine Construction Part II – Procedure Calls

Get the Source

https://github.com/bfelger/AstridVM/tree/master/src-part1