readme.md (6287B)
1 # RemedyVM, the register-memory dynamic virtual machine 2 3 **NOTE:** This is a personal project for funzies pls don't bully. 4 5 ## Overview 6 7 RemedyVM will be a (memory↔stack↔heap)-based VM target, with an AOT 8 compiler, a JIT compiler, and a simple interpreter. 9 10 The memory/stack/heap approach allows for register allocation 11 optimisations at run/compile-time, whilst still offering compilers 12 targeting RemedyVM the option to put memory into a stack or the heap. 13 14 A memory machine (MM) is a machine with "infinite" registers, each 15 corresponding with a temporary value. Much like a register machine, 16 temporaries can easily occupy real registers in JIT compilation, but 17 much like a stack machine, the registers are in a growing list. 18 19 There are currently no plans for garbage collection, but if it is to be 20 implemented in the future, it will likely be done similarly to Bell 21 Labs' `dis` VM, taking a hybrid reference count/mark and sweep approach. 22 According to Winterbottom and Pike, the memory machine architecture 23 makes this easier to accomplish. 24 25 The bytecode should be compilable straight to machine code ahead of time 26 (AOT), removing *any* runtime in the process. The JIT compiler will need 27 a psuedo-runtime, all functions are *thunks* until first run, then they 28 are "evaluated" (JIT compiled) to a function that runs with no overhead. 29 30 Some functions/labels may force themselves to be JIT compiled or 31 interpreted so as to adopt a structure that could only come about from 32 being lazily evaluated on the fly. Potentially a full runtime could be 33 set-up to allow for AOT compiled lazy-thunks. 34 35 The runtime JIT process may need to have multiple steps, akin to the V8 36 JavaScript runtime. Register allocation may be heavy on resources as it 37 is a graph-colouring problem, so an initial round may not allocate 38 machine registers optimally, or may even run interpreted. 39 40 Whilst the standard implementation will only really be a psuedo-runtime, 41 the initial implementation will simply be an interpreted VM. 42 43 To be clear, this project is defining a VM target, a VM interpreter, and 44 a VM compiler with JIT/AOT. 45 46 Given the stack-register hybrid approach, temporaries will be stored in 47 one of 2 places to be determined at runtime: 1. A dynamic array of 48 registers that is determined at VM startup/compile-time to match the 49 host architecture. This is the "memory" in memory machine. 2. The 50 regular stack. 3. The heap. 51 52 Given the simplicity, beauty, and popularity of RISC machines, the 53 instruction set of RemedyVM will be very simple and should be similar to 54 ARM, MIPS, and other RISC instruction sets. 55 56 RemedyVM should be as platform agnostic as possible, though it's hard to 57 get a "one implementation fits all", the underlying platform assumptions 58 should try to be as minimal as possible. 59 60 Given that RemedyVM is inherently not a very stable target, compilers do 61 not have much control over what goes into registers, and what goes onto 62 the stack. This is the hardest problem that this project aims to solve. 63 This issue is discussed in more depth in the [unstable target 64 section](#unstable-target-solutions), but the solution picked by 65 RemedyVM is to have a *memory machine* RISC architecture. 66 67 ## Unstable compiler target solutions 68 69 Given the virtual machine decides the number of registers to allocate at 70 run/comptime, compilers have an unstable target! 71 72 How can a compiler be sure that an architecture will be able to put 20 73 variables into registers? It can't! 74 75 Some solutions: 1. Treat all temporaries as being in an ordered dynamic 76 list, the runtime register allocation takes the $K$ lowest temporaries 77 and puts them into registers. (Where $K$ is the number of registers for 78 a target) - This is the method used in the `dis` VM by Bell Labs, they 79 use a very similar Memory Machine architecture. 2. Leave it up to the 80 compiler, any excess "registers" get treated as temporaries. - This 81 would be bad for targets that have too few or many registers. 3. 82 Stabilise the VM target and just pick a number of registers. - The JVM 83 takes this to the extreme and has only machine registers, PC registers, 84 and the stack. 4. Restrict compilers to just use a stack, the JIT will 85 decide on registers. - This would make RemedyVM a simple stack-based VM 86 87 It seems from these that the best solution is solution 1, to go with an 88 infinite-register hybrid VM, or rather a Memory machine. Unlike `dis`, 89 RemedyVM will be a RISC-like architecture for ease of instruction 90 selection. RemedyVM is thus a *memory virtual machine* with stack & heap 91 components for records that might not fit well into the architecture. 92 93 ## Goals 94 95 - [ ] Lay-out the basic instruction set 96 - [ ] Move 97 - [ ] Stack 98 - [ ] Heap 99 - [ ] Arithmetic 100 - [ ] Logical 101 - [ ] Jumps 102 - [ ] Conditional instructions 103 - [ ] All instructions should be conditional 104 - [ ] Implement an interpreter 105 - [ ] Compilation 106 - [ ] Assembly generation 107 - [ ] Linking 108 - [ ] Static linking 109 - [ ] Dynamic linking (?) 110 - *Note: Linking will probably just be done with the regular gcc 111 tooling, at least initially.* 112 - [ ] Optimise instruction selection for RISC platforms 113 - [ ] Optimise instruction selection for x86 platforms 114 - [ ] Implement AOT compiler 115 - [ ] Link to libc etc 116 - [ ] Implement JIT compiler 117 - [ ] Runtime implementation (?) 118 - [ ] Lazily compiled thunk implementation 119 - [ ] Implement a basic compiler targeting RemedyVM 120 - [ ] Basic lisp-like LL(1) language 121 - [ ] Simpler Lisp-Haskell hybrid 122 - [ ] Mojo 123 - [ ] Tests 124 - [ ] Unit VM tests 125 - [ ] Fibonacci 126 - [ ] Factorial 127 - [ ] Mandelbrot 128 - [ ] Game of life 129 - [ ] Memory tests 130 - [ ] Arithmetic tests 131 - [ ] Cross-Platform 132 - [ ] Linux 133 - [ ] Plan9(front) 134 - [ ] OpenBSD 135 - [ ] MacOS 136 - [ ] Windows 137 - [ ] Documentation (ugh) 138 - [ ] Design doc 139 - [ ] Target spec 140 - [ ] Implementation languages 141 - [ ] C 142 - The most cross-platform, will support every target 143 - [ ] Haskell 144 - Will support most targets 145 - [ ] Rust (?) 146 - [ ] Go (?) 147 - [ ] Zig (?) 148 - [ ] Java (?) 149 - Icky, but also have already written *some* of a compiler in Java. 150 151 ### Note 152 153 - This is a personal project, and is not going to be a serious virtual 154 machine. 155 - It is just experience, my first virtual machine! 156 - The more advanced features may not ever be implemented.