readme.org (6343B)
1 * RemedyVM, the register-memory dynamic virtual machine 2 *NOTE:* This is a personal project for funzies pls don't bully. 3 4 ** Overview 5 RemedyVM will be a (memory↔stack↔heap)-based VM target, with an AOT 6 compiler, a JIT compiler, and a simple interpreter. 7 8 The memory/stack/heap approach allows for register allocation 9 optimisations at run/compile-time, whilst still offering compilers 10 targeting RemedyVM the option to put memory into a stack or the heap. 11 12 A memory machine (MM) is a machine with "infinite" registers, each 13 corresponding with a temporary value. Much like a register machine, 14 temporaries can easily occupy real registers in JIT compilation, but 15 much like a stack machine, the registers are in a growing list. 16 17 There are currently no plans for garbage collection, but if it is to be 18 implemented in the future, it will likely be done similarly to Bell 19 Labs' =dis= VM, taking a hybrid reference count/mark and sweep approach. 20 According to Winterbottom and Pike, the memory machine architecture 21 makes this easier to accomplish. 22 23 The bytecode should be compilable straight to machine code ahead of time 24 (AOT), removing /any/ runtime in the process. The JIT compiler will need 25 a psuedo-runtime, all functions are /thunks/ until first run, then they 26 are "evaluated" (JIT compiled) to a function that runs with no overhead. 27 28 Some functions/labels may force themselves to be JIT compiled or 29 interpreted so as to adopt a structure that could only come about from 30 being lazily evaluated on the fly. Potentially a full runtime could be 31 set-up to allow for AOT compiled lazy-thunks. 32 33 The runtime JIT process may need to have multiple steps, akin to the V8 34 JavaScript runtime. Register allocation may be heavy on resources as it 35 is a graph-colouring problem, so an initial round may not allocate 36 machine registers optimally, or may even run interpreted. 37 38 Whilst the standard implementation will only really be a psuedo-runtime, 39 the initial implementation will simply be an interpreted VM. 40 41 To be clear, this project is defining a VM target, a VM interpreter, and 42 a VM compiler with JIT/AOT. 43 44 Given the stack-register hybrid approach, temporaries will be stored in 45 one of 2 places to be determined at runtime: 1. A dynamic array of 46 registers that is determined at VM startup/compile-time to match the 47 host architecture. This is the "memory" in memory machine. 2. The 48 regular stack. 3. The heap. 49 50 Given the simplicity, beauty, and popularity of RISC machines, the 51 instruction set of RemedyVM will be very simple and should be similar 52 to ARM, MIPS, and other RISC instruction sets. 53 54 RemedyVM should be as platform agnostic as possible, though it's hard 55 to get a "one implementation fits all", the underlying platform 56 assumptions should try to be as minimal as possible. 57 58 Given that RemedyVM is inherently not a very stable target, compilers 59 do not have much control over what goes into registers, and what goes 60 onto the stack. This is the hardest problem that this project aims to 61 solve. This issue is discussed in more depth in the 62 [[#unstable-target-solutions][unstable target section]], but 63 the solution picked by RemedyVM is to have a /memory machine/ RISC 64 architecture. 65 66 ** Unstable compiler target solutions 67 :PROPERTIES: 68 :CUSTOM_ID: unstable-target-solutions 69 :END: 70 Given the virtual machine decides the number of registers to allocate at 71 run/comptime, compilers have an unstable target! 72 73 How can a compiler be sure that an architecture will be able to put 20 74 variables into registers? It can't! 75 76 Some solutions: 1. Treat all temporaries as being in an ordered dynamic 77 list, the runtime register allocation takes the \(K\) lowest temporaries 78 and puts them into registers. (Where \(K\) is the number of registers 79 for a target) - This is the method used in the =dis= VM by Bell Labs, 80 they use a very similar Memory Machine architecture. 2. Leave it up to 81 the compiler, any excess "registers" get treated as temporaries. - This 82 would be bad for targets that have too few or many registers. 3. 83 Stabilise the VM target and just pick a number of registers. - The JVM 84 takes this to the extreme and has only machine registers, PC registers, 85 and the stack. 4. Restrict compilers to just use a stack, the JIT will 86 decide on registers. - This would make RemedyVM a simple stack-based 87 VM 88 89 It seems from these that the best solution is solution 1, to go with an 90 infinite-register hybrid VM, or rather a Memory machine. Unlike =dis=, 91 RemedyVM will be a RISC-like architecture for ease of instruction 92 selection. RemedyVM is thus a /memory virtual machine/ with stack & 93 heap components for records that might not fit well into the 94 architecture. 95 96 ** Goals 97 - [ ] Lay-out the basic instruction set 98 - [ ] Move 99 - [ ] Stack 100 - [ ] Heap 101 - [ ] Arithmetic 102 - [ ] Logical 103 - [ ] Jumps 104 - [ ] Conditional instructions 105 - [ ] All instructions should be conditional 106 - [ ] Implement an interpreter 107 - [ ] Compilation 108 - [ ] Assembly generation 109 - [ ] Linking 110 - [ ] Static linking 111 - [ ] Dynamic linking (?) 112 - /Note: Linking will probably just be done with the regular gcc 113 tooling, at least initially./ 114 - [ ] Optimise instruction selection for RISC platforms 115 - [ ] Optimise instruction selection for x86 platforms 116 - [ ] Implement AOT compiler 117 - [ ] Link to libc etc 118 - [ ] Implement JIT compiler 119 - [ ] Runtime implementation (?) 120 - [ ] Lazily compiled thunk implementation 121 - [ ] Implement a basic compiler targeting RemedyVM 122 - [ ] Basic lisp-like LL(1) language 123 - [ ] Simpler Lisp-Haskell hybrid 124 - [ ] Mojo 125 - [ ] Tests 126 - [ ] Unit VM tests 127 - [ ] Fibonacci 128 - [ ] Factorial 129 - [ ] Mandelbrot 130 - [ ] Game of life 131 - [ ] Memory tests 132 - [ ] Arithmetic tests 133 - [ ] Cross-Platform 134 - [ ] Linux 135 - [ ] Plan9(front) 136 - [ ] OpenBSD 137 - [ ] MacOS 138 - [ ] Windows 139 - [ ] Documentation (ugh) 140 - [ ] Design doc 141 - [ ] Target spec 142 - [ ] Implementation languages 143 - [ ] C 144 - The most cross-platform, will support every target 145 - [ ] Haskell 146 - Will support most targets 147 - [ ] Rust (?) 148 - [ ] Go (?) 149 - [ ] Zig (?) 150 - [ ] Java (?) 151 - Icky, but also have already written /some/ of a compiler in Java. 152 153 *** Note 154 - This is a personal project, and is not going to be a serious virtual 155 machine. 156 - It is just experience, my first virtual machine! 157 - The more advanced features may not ever be implemented.