remedyvm

A toy RISC virtual machine inspired by Bell Lab's `dis' and Tsoding's `bm'
git clone git://git.ethandl.dev/remedyvm
Log | Files | Refs

commit 2e787e6068fbc21faf2f15e9369cf4fa497993ad
parent 183dd716e0fd3c4de87c10b31eedbe84a499b5e4
Author: Ethan Long <Ethan.Long@anu.edu.au>
Date:   Wed,  7 Jun 2023 23:04:10 +1000

Updated readme.md with the github markdown

Diffstat:
Mreadme.md | 170+++++++++++++++++++++++++++++++++++++++----------------------------------------
1 file changed, 83 insertions(+), 87 deletions(-)

diff --git a/readme.md b/readme.md @@ -1,6 +1,6 @@ # RemedyVM, the register-memory dynamic virtual machine -**NOTE:** This is a personal project for funzies pls don\'t bully. +**NOTE:** This is a personal project for funzies pls don't bully. ## Overview @@ -11,22 +11,21 @@ The memory/stack/heap approach allows for register allocation optimisations at run/compile-time, whilst still offering compilers targeting RemedyVM the option to put memory into a stack or the heap. -A memory machine (MM) is a machine with \"infinite\" registers, each +A memory machine (MM) is a machine with "infinite" registers, each corresponding with a temporary value. Much like a register machine, temporaries can easily occupy real registers in JIT compilation, but much like a stack machine, the registers are in a growing list. There are currently no plans for garbage collection, but if it is to be implemented in the future, it will likely be done similarly to Bell -Labs\' `dis`{.verbatim} VM, taking a hybrid reference count/mark and -sweep approach. According to Winterbottom and Pike, the memory machine -architecture makes this easier to accomplish. +Labs' `dis` VM, taking a hybrid reference count/mark and sweep approach. +According to Winterbottom and Pike, the memory machine architecture +makes this easier to accomplish. The bytecode should be compilable straight to machine code ahead of time (AOT), removing *any* runtime in the process. The JIT compiler will need a psuedo-runtime, all functions are *thunks* until first run, then they -are \"evaluated\" (JIT compiled) to a function that runs with no -overhead. +are "evaluated" (JIT compiled) to a function that runs with no overhead. Some functions/labels may force themselves to be JIT compiled or interpreted so as to adopt a structure that could only come about from @@ -47,16 +46,16 @@ a VM compiler with JIT/AOT. Given the stack-register hybrid approach, temporaries will be stored in one of 2 places to be determined at runtime: 1. A dynamic array of registers that is determined at VM startup/compile-time to match the -host architecture. This is the \"memory\" in memory machine. 2. The +host architecture. This is the "memory" in memory machine. 2. The regular stack. 3. The heap. Given the simplicity, beauty, and popularity of RISC machines, the instruction set of RemedyVM will be very simple and should be similar to ARM, MIPS, and other RISC instruction sets. -RemedyVM should be as platform agnostic as possible, though it\'s hard -to get a \"one implementation fits all\", the underlying platform -assumptions should try to be as minimal as possible. +RemedyVM should be as platform agnostic as possible, though it's hard to +get a "one implementation fits all", the underlying platform assumptions +should try to be as minimal as possible. Given that RemedyVM is inherently not a very stable target, compilers do not have much control over what goes into registers, and what goes onto @@ -65,96 +64,93 @@ This issue is discussed in more depth in the [unstable target section](#unstable-target-solutions), but the solution picked by RemedyVM is to have a *memory machine* RISC architecture. -## Unstable compiler target solutions {#unstable-target-solutions} +## Unstable compiler target solutions Given the virtual machine decides the number of registers to allocate at run/comptime, compilers have an unstable target! How can a compiler be sure that an architecture will be able to put 20 -variables into registers? It can\'t! +variables into registers? It can't! Some solutions: 1. Treat all temporaries as being in an ordered dynamic list, the runtime register allocation takes the $K$ lowest temporaries and puts them into registers. (Where $K$ is the number of registers for -a target) - This is the method used in the `dis`{.verbatim} VM by Bell -Labs, they use a very similar Memory Machine architecture. 2. Leave it -up to the compiler, any excess \"registers\" get treated as -temporaries. - This would be bad for targets that have too few or many -registers. 3. Stabilise the VM target and just pick a number of -registers. - The JVM takes this to the extreme and has only machine -registers, PC registers, and the stack. 4. Restrict compilers to just -use a stack, the JIT will decide on registers. - This would make -RemedyVM a simple stack-based VM +a target) - This is the method used in the `dis` VM by Bell Labs, they +use a very similar Memory Machine architecture. 2. Leave it up to the +compiler, any excess "registers" get treated as temporaries. - This +would be bad for targets that have too few or many registers. 3. +Stabilise the VM target and just pick a number of registers. - The JVM +takes this to the extreme and has only machine registers, PC registers, +and the stack. 4. Restrict compilers to just use a stack, the JIT will +decide on registers. - This would make RemedyVM a simple stack-based VM It seems from these that the best solution is solution 1, to go with an -infinite-register hybrid VM, or rather a Memory machine. Unlike -`dis`{.verbatim}, RemedyVM will be a RISC-like architecture for ease of -instruction selection. RemedyVM is thus a *memory virtual machine* with -stack & heap components for records that might not fit well into the -architecture. +infinite-register hybrid VM, or rather a Memory machine. Unlike `dis`, +RemedyVM will be a RISC-like architecture for ease of instruction +selection. RemedyVM is thus a *memory virtual machine* with stack & heap +components for records that might not fit well into the architecture. ## Goals -- [ ] Lay-out the basic instruction set - - [ ] Move - - [ ] Stack - - [ ] Heap - - [ ] Arithmetic - - [ ] Logical - - [ ] Jumps - - [ ] Conditional instructions - - [ ] All instructions should be conditional -- [ ] Implement an interpreter -- [ ] Compilation - - [ ] Assembly generation - - [ ] Linking - - [ ] Static linking - - [ ] Dynamic linking (?) - - *Note: Linking will probably just be done with the regular - gcc tooling, at least initially.* - - [ ] Optimise instruction selection for RISC platforms - - [ ] Optimise instruction selection for x86 platforms - - [ ] Implement AOT compiler - - [ ] Link to libc etc - - [ ] Implement JIT compiler - - [ ] Runtime implementation (?) - - [ ] Lazily compiled thunk implementation -- [ ] Implement a basic compiler targeting RemedyVM - - [ ] Basic lisp-like LL(1) language - - [ ] Simpler Lisp-Haskell hybrid - - [ ] Mojo -- [ ] Tests - - [ ] Unit VM tests - - [ ] Fibonacci - - [ ] Factorial - - [ ] Mandelbrot - - [ ] Game of life - - [ ] Memory tests - - [ ] Arithmetic tests -- [ ] Cross-Platform - - [ ] Linux - - [ ] Plan9(front) - - [ ] OpenBSD - - [ ] MacOS - - [ ] Windows -- [ ] Documentation (ugh) - - [ ] Design doc - - [ ] Target spec -- [ ] Implementation languages - - [ ] C - - The most cross-platform, will support every target - - [ ] Haskell - - Will support most targets - - [ ] Rust (?) - - [ ] Go (?) - - [ ] Zig (?) - - [ ] Java (?) - - Icky, but also have already written *some* of a compiler in - Java. +- [ ] Lay-out the basic instruction set + - [ ] Move + - [ ] Stack + - [ ] Heap + - [ ] Arithmetic + - [ ] Logical + - [ ] Jumps + - [ ] Conditional instructions + - [ ] All instructions should be conditional +- [ ] Implement an interpreter +- [ ] Compilation + - [ ] Assembly generation + - [ ] Linking + - [ ] Static linking + - [ ] Dynamic linking (?) + - *Note: Linking will probably just be done with the regular gcc + tooling, at least initially.* + - [ ] Optimise instruction selection for RISC platforms + - [ ] Optimise instruction selection for x86 platforms + - [ ] Implement AOT compiler + - [ ] Link to libc etc + - [ ] Implement JIT compiler + - [ ] Runtime implementation (?) + - [ ] Lazily compiled thunk implementation +- [ ] Implement a basic compiler targeting RemedyVM + - [ ] Basic lisp-like LL(1) language + - [ ] Simpler Lisp-Haskell hybrid + - [ ] Mojo +- [ ] Tests + - [ ] Unit VM tests + - [ ] Fibonacci + - [ ] Factorial + - [ ] Mandelbrot + - [ ] Game of life + - [ ] Memory tests + - [ ] Arithmetic tests +- [ ] Cross-Platform + - [ ] Linux + - [ ] Plan9(front) + - [ ] OpenBSD + - [ ] MacOS + - [ ] Windows +- [ ] Documentation (ugh) + - [ ] Design doc + - [ ] Target spec +- [ ] Implementation languages + - [ ] C + - The most cross-platform, will support every target + - [ ] Haskell + - Will support most targets + - [ ] Rust (?) + - [ ] Go (?) + - [ ] Zig (?) + - [ ] Java (?) + - Icky, but also have already written *some* of a compiler in Java. ### Note -- This is a personal project, and is not going to be a serious virtual - machine. -- It is just experience, my first virtual machine! -- The more advanced features may not ever be implemented. +- This is a personal project, and is not going to be a serious virtual + machine. +- It is just experience, my first virtual machine! +- The more advanced features may not ever be implemented.