remedyvm

A toy RISC virtual machine inspired by Bell Lab's `dis' and Tsoding's `bm'
git clone git://git.ethandl.dev/remedyvm
Log | Files | Refs

readme.org (6343B)


      1 * RemedyVM, the register-memory dynamic virtual machine
      2 *NOTE:* This is a personal project for funzies pls don't bully.
      3 
      4 ** Overview
      5 RemedyVM will be a (memory↔stack↔heap)-based VM target, with an AOT
      6 compiler, a JIT compiler, and a simple interpreter.
      7 
      8 The memory/stack/heap approach allows for register allocation
      9 optimisations at run/compile-time, whilst still offering compilers
     10 targeting RemedyVM the option to put memory into a stack or the heap.
     11 
     12 A memory machine (MM) is a machine with "infinite" registers, each
     13 corresponding with a temporary value. Much like a register machine,
     14 temporaries can easily occupy real registers in JIT compilation, but
     15 much like a stack machine, the registers are in a growing list.
     16 
     17 There are currently no plans for garbage collection, but if it is to be
     18 implemented in the future, it will likely be done similarly to Bell
     19 Labs' =dis= VM, taking a hybrid reference count/mark and sweep approach.
     20 According to Winterbottom and Pike, the memory machine architecture
     21 makes this easier to accomplish.
     22 
     23 The bytecode should be compilable straight to machine code ahead of time
     24 (AOT), removing /any/ runtime in the process. The JIT compiler will need
     25 a psuedo-runtime, all functions are /thunks/ until first run, then they
     26 are "evaluated" (JIT compiled) to a function that runs with no overhead.
     27 
     28 Some functions/labels may force themselves to be JIT compiled or
     29 interpreted so as to adopt a structure that could only come about from
     30 being lazily evaluated on the fly. Potentially a full runtime could be
     31 set-up to allow for AOT compiled lazy-thunks.
     32 
     33 The runtime JIT process may need to have multiple steps, akin to the V8
     34 JavaScript runtime. Register allocation may be heavy on resources as it
     35 is a graph-colouring problem, so an initial round may not allocate
     36 machine registers optimally, or may even run interpreted.
     37 
     38 Whilst the standard implementation will only really be a psuedo-runtime,
     39 the initial implementation will simply be an interpreted VM.
     40 
     41 To be clear, this project is defining a VM target, a VM interpreter, and
     42 a VM compiler with JIT/AOT.
     43 
     44 Given the stack-register hybrid approach, temporaries will be stored in
     45 one of 2 places to be determined at runtime: 1. A dynamic array of
     46 registers that is determined at VM startup/compile-time to match the
     47 host architecture. This is the "memory" in memory machine. 2. The
     48 regular stack. 3. The heap.
     49 
     50 Given the simplicity, beauty, and popularity of RISC machines, the
     51 instruction set of RemedyVM will be very simple and should be similar
     52 to ARM, MIPS, and other RISC instruction sets.
     53 
     54 RemedyVM should be as platform agnostic as possible, though it's hard
     55 to get a "one implementation fits all", the underlying platform
     56 assumptions should try to be as minimal as possible.
     57 
     58 Given that RemedyVM is inherently not a very stable target, compilers
     59 do not have much control over what goes into registers, and what goes
     60 onto the stack. This is the hardest problem that this project aims to
     61 solve. This issue is discussed in more depth in the
     62 [[#unstable-target-solutions][unstable target section]], but
     63 the solution picked by RemedyVM is to have a /memory machine/ RISC
     64 architecture.
     65 
     66 ** Unstable compiler target solutions
     67 :PROPERTIES:
     68 :CUSTOM_ID: unstable-target-solutions
     69 :END:
     70 Given the virtual machine decides the number of registers to allocate at
     71 run/comptime, compilers have an unstable target!
     72 
     73 How can a compiler be sure that an architecture will be able to put 20
     74 variables into registers? It can't!
     75 
     76 Some solutions: 1. Treat all temporaries as being in an ordered dynamic
     77 list, the runtime register allocation takes the \(K\) lowest temporaries
     78 and puts them into registers. (Where \(K\) is the number of registers
     79 for a target) - This is the method used in the =dis= VM by Bell Labs,
     80 they use a very similar Memory Machine architecture. 2. Leave it up to
     81 the compiler, any excess "registers" get treated as temporaries. - This
     82 would be bad for targets that have too few or many registers. 3.
     83 Stabilise the VM target and just pick a number of registers. - The JVM
     84 takes this to the extreme and has only machine registers, PC registers,
     85 and the stack. 4. Restrict compilers to just use a stack, the JIT will
     86 decide on registers. - This would make RemedyVM a simple stack-based
     87 VM
     88 
     89 It seems from these that the best solution is solution 1, to go with an
     90 infinite-register hybrid VM, or rather a Memory machine. Unlike =dis=,
     91 RemedyVM will be a RISC-like architecture for ease of instruction
     92 selection. RemedyVM is thus a /memory virtual machine/ with stack &
     93 heap components for records that might not fit well into the
     94 architecture.
     95 
     96 ** Goals
     97 - [ ] Lay-out the basic instruction set
     98   - [ ] Move
     99   - [ ] Stack
    100   - [ ] Heap
    101   - [ ] Arithmetic
    102   - [ ] Logical
    103   - [ ] Jumps
    104   - [ ] Conditional instructions
    105     - [ ] All instructions should be conditional
    106 - [ ] Implement an interpreter
    107 - [ ] Compilation
    108   - [ ] Assembly generation
    109   - [ ] Linking
    110     - [ ] Static linking
    111     - [ ] Dynamic linking (?)
    112     - /Note: Linking will probably just be done with the regular gcc
    113       tooling, at least initially./
    114   - [ ] Optimise instruction selection for RISC platforms
    115   - [ ] Optimise instruction selection for x86 platforms
    116   - [ ] Implement AOT compiler
    117     - [ ] Link to libc etc
    118   - [ ] Implement JIT compiler
    119     - [ ] Runtime implementation (?)
    120     - [ ] Lazily compiled thunk implementation
    121 - [ ] Implement a basic compiler targeting RemedyVM
    122   - [ ] Basic lisp-like LL(1) language
    123   - [ ] Simpler Lisp-Haskell hybrid
    124   - [ ] Mojo
    125 - [ ] Tests
    126   - [ ] Unit VM tests
    127   - [ ] Fibonacci
    128   - [ ] Factorial
    129   - [ ] Mandelbrot
    130   - [ ] Game of life
    131   - [ ] Memory tests
    132   - [ ] Arithmetic tests
    133 - [ ] Cross-Platform
    134   - [ ] Linux
    135   - [ ] Plan9(front)
    136   - [ ] OpenBSD
    137   - [ ] MacOS
    138   - [ ] Windows
    139 - [ ] Documentation (ugh)
    140   - [ ] Design doc
    141   - [ ] Target spec
    142 - [ ] Implementation languages
    143   - [ ] C
    144     - The most cross-platform, will support every target
    145   - [ ] Haskell
    146     - Will support most targets
    147   - [ ] Rust (?)
    148   - [ ] Go (?)
    149   - [ ] Zig (?)
    150   - [ ] Java (?)
    151     - Icky, but also have already written /some/ of a compiler in Java.
    152 
    153 *** Note
    154 - This is a personal project, and is not going to be a serious virtual
    155   machine.
    156 - It is just experience, my first virtual machine!
    157 - The more advanced features may not ever be implemented.