remedyvm

A toy RISC virtual machine inspired by Bell Lab's `dis' and Tsoding's `bm'
git clone git://git.ethandl.dev/remedyvm
Log | Files | Refs

commit 183dd716e0fd3c4de87c10b31eedbe84a499b5e4
parent 2052a724daca6d1fc71fa71b3872144d87cdd24e
Author: Ethan Long <Ethan.Long@anu.edu.au>
Date:   Wed,  7 Jun 2023 23:01:33 +1000

Gaming

Diffstat:
Doutline.md | 158-------------------------------------------------------------------------------
Doutline.org | 155-------------------------------------------------------------------------------
Areadme.md | 160+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Areadme.org | 157+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 317 insertions(+), 313 deletions(-)

diff --git a/outline.md b/outline.md @@ -1,158 +0,0 @@ -# RemedyVM, the register-memory dynamic virtual machine - -## Overview - -RemedyVM will be a (memory↔stack↔heap)-based VM target, with an AOT -compiler, a JIT compiler, and a simple interpreter. - -The memory/stack/heap approach allows for register allocation -optimisations at run/compile-time, whilst still offering compilers -targeting RemedyVM the option to put memory into a stack or the heap. - -A memory machine (MM) is a machine with \"infinite\" registers, each -corresponding with a temporary value. Much like a register machine, -temporaries can easily occupy real registers in JIT compilation, but -much like a stack machine, the registers are in a growing list. - -There are currently no plans for garbage collection, but if it is to be -implemented in the future, it will likely be done similarly to Bell -Labs\' `dis`{.verbatim} VM, taking a hybrid reference count/mark and -sweep approach. According to Winterbottom and Pike, the memory machine -architecture makes this easier to accomplish. - -The bytecode should be compilable straight to machine code ahead of time -(AOT), removing *any* runtime in the process. The JIT compiler will need -a psuedo-runtime, all functions are *thunks* until first run, then they -are \"evaluated\" (JIT compiled) to a function that runs with no -overhead. - -Some functions/labels may force themselves to be JIT compiled or -interpreted so as to adopt a structure that could only come about from -being lazily evaluated on the fly. Potentially a full runtime could be -set-up to allow for AOT compiled lazy-thunks. - -The runtime JIT process may need to have multiple steps, akin to the V8 -JavaScript runtime. Register allocation may be heavy on resources as it -is a graph-colouring problem, so an initial round may not allocate -machine registers optimally, or may even run interpreted. - -Whilst the standard implementation will only really be a psuedo-runtime, -the initial implementation will simply be an interpreted VM. - -To be clear, this project is defining a VM target, a VM interpreter, and -a VM compiler with JIT/AOT. - -Given the stack-register hybrid approach, temporaries will be stored in -one of 2 places to be determined at runtime: 1. A dynamic array of -registers that is determined at VM startup/compile-time to match the -host architecture. This is the \"memory\" in memory machine. 2. The -regular stack. 3. The heap. - -Given the simplicity, beauty, and popularity of RISC machines, the -instruction set of RemedyVM will be very simple and should be similar to -ARM, MIPS, and other RISC instruction sets. - -RemedyVM should be as platform agnostic as possible, though it\'s hard -to get a \"one implementation fits all\", the underlying platform -assumptions should try to be as minimal as possible. - -Given that RemedyVM is inherently not a very stable target, compilers do -not have much control over what goes into registers, and what goes onto -the stack. This is the hardest problem that this project aims to solve. -This issue is discussed in more depth in the [unstable target -section](#unstable-target-solutions), but the solution picked by -RemedyVM is to have a *memory machine* RISC architecture. - -## Unstable compiler target solutions {#unstable-target-solutions} - -Given the virtual machine decides the number of registers to allocate at -run/comptime, compilers have an unstable target! - -How can a compiler be sure that an architecture will be able to put 20 -variables into registers? It can\'t! - -Some solutions: 1. Treat all temporaries as being in an ordered dynamic -list, the runtime register allocation takes the $K$ lowest temporaries -and puts them into registers. (Where $K$ is the number of registers for -a target) - This is the method used in the `dis`{.verbatim} VM by Bell -Labs, they use a very similar Memory Machine architecture. 2. Leave it -up to the compiler, any excess \"registers\" get treated as -temporaries. - This would be bad for targets that have too few or many -registers. 3. Stabilise the VM target and just pick a number of -registers. - The JVM takes this to the extreme and has only machine -registers, PC registers, and the stack. 4. Restrict compilers to just -use a stack, the JIT will decide on registers. - This would make -RemedyVM a simple stack-based VM - -It seems from these that the best solution is solution 1, to go with an -infinite-register hybrid VM, or rather a Memory machine. Unlike -`dis`{.verbatim}, RemedyVM will be a RISC-like architecture for ease of -instruction selection. RemedyVM is thus a *memory virtual machine* with -stack & heap components for records that might not fit well into the -architecture. - -## Goals - -- [ ] Lay-out the basic instruction set - - [ ] Move - - [ ] Stack - - [ ] Heap - - [ ] Arithmetic - - [ ] Logical - - [ ] Jumps - - [ ] Conditional instructions - - [ ] All instructions should be conditional -- [ ] Implement an interpreter -- [ ] Compilation - - [ ] Assembly generation - - [ ] Linking - - [ ] Static linking - - [ ] Dynamic linking (?) - - *Note: Linking will probably just be done with the regular - gcc tooling, at least initially.* - - [ ] Optimise instruction selection for RISC platforms - - [ ] Optimise instruction selection for x86 platforms - - [ ] Implement AOT compiler - - [ ] Link to libc etc - - [ ] Implement JIT compiler - - [ ] Runtime implementation (?) - - [ ] Lazily compiled thunk implementation -- [ ] Implement a basic compiler targeting RemedyVM - - [ ] Basic lisp-like LL(1) language - - [ ] Simpler Lisp-Haskell hybrid - - [ ] Mojo -- [ ] Tests - - [ ] Unit VM tests - - [ ] Fibonacci - - [ ] Factorial - - [ ] Mandelbrot - - [ ] Game of life - - [ ] Memory tests - - [ ] Arithmetic tests -- [ ] Cross-Platform - - [ ] Linux - - [ ] Plan9(front) - - [ ] OpenBSD - - [ ] MacOS - - [ ] Windows -- [ ] Documentation (ugh) - - [ ] Design doc - - [ ] Target spec -- [ ] Implementation languages - - [ ] C - - The most cross-platform, will support every target - - [ ] Haskell - - Will support most targets - - [ ] Rust (?) - - [ ] Go (?) - - [ ] Zig (?) - - [ ] Java (?) - - Icky, but also have already written *some* of a compiler in - Java. - -### Note - -- This is a personal project, and is not going to be a serious virtual - machine. -- It is just experience, my first virtual machine! -- The more advanced features may not ever be implemented. diff --git a/outline.org b/outline.org @@ -1,155 +0,0 @@ -* RemedyVM, the register-memory dynamic virtual machine -** Overview -RemedyVM will be a (memory↔stack↔heap)-based VM target, with an AOT -compiler, a JIT compiler, and a simple interpreter. - -The memory/stack/heap approach allows for register allocation -optimisations at run/compile-time, whilst still offering compilers -targeting RemedyVM the option to put memory into a stack or the heap. - -A memory machine (MM) is a machine with "infinite" registers, each -corresponding with a temporary value. Much like a register machine, -temporaries can easily occupy real registers in JIT compilation, but -much like a stack machine, the registers are in a growing list. - -There are currently no plans for garbage collection, but if it is to be -implemented in the future, it will likely be done similarly to Bell -Labs' =dis= VM, taking a hybrid reference count/mark and sweep approach. -According to Winterbottom and Pike, the memory machine architecture -makes this easier to accomplish. - -The bytecode should be compilable straight to machine code ahead of time -(AOT), removing /any/ runtime in the process. The JIT compiler will need -a psuedo-runtime, all functions are /thunks/ until first run, then they -are "evaluated" (JIT compiled) to a function that runs with no overhead. - -Some functions/labels may force themselves to be JIT compiled or -interpreted so as to adopt a structure that could only come about from -being lazily evaluated on the fly. Potentially a full runtime could be -set-up to allow for AOT compiled lazy-thunks. - -The runtime JIT process may need to have multiple steps, akin to the V8 -JavaScript runtime. Register allocation may be heavy on resources as it -is a graph-colouring problem, so an initial round may not allocate -machine registers optimally, or may even run interpreted. - -Whilst the standard implementation will only really be a psuedo-runtime, -the initial implementation will simply be an interpreted VM. - -To be clear, this project is defining a VM target, a VM interpreter, and -a VM compiler with JIT/AOT. - -Given the stack-register hybrid approach, temporaries will be stored in -one of 2 places to be determined at runtime: 1. A dynamic array of -registers that is determined at VM startup/compile-time to match the -host architecture. This is the "memory" in memory machine. 2. The -regular stack. 3. The heap. - -Given the simplicity, beauty, and popularity of RISC machines, the -instruction set of RemedyVM will be very simple and should be similar -to ARM, MIPS, and other RISC instruction sets. - -RemedyVM should be as platform agnostic as possible, though it's hard -to get a "one implementation fits all", the underlying platform -assumptions should try to be as minimal as possible. - -Given that RemedyVM is inherently not a very stable target, compilers -do not have much control over what goes into registers, and what goes -onto the stack. This is the hardest problem that this project aims to -solve. This issue is discussed in more depth in the -[[#unstable-target-solutions][unstable target section]], but -the solution picked by RemedyVM is to have a /memory machine/ RISC -architecture. - -** Unstable compiler target solutions -:PROPERTIES: -:CUSTOM_ID: unstable-target-solutions -:END: -Given the virtual machine decides the number of registers to allocate at -run/comptime, compilers have an unstable target! - -How can a compiler be sure that an architecture will be able to put 20 -variables into registers? It can't! - -Some solutions: 1. Treat all temporaries as being in an ordered dynamic -list, the runtime register allocation takes the \(K\) lowest temporaries -and puts them into registers. (Where \(K\) is the number of registers -for a target) - This is the method used in the =dis= VM by Bell Labs, -they use a very similar Memory Machine architecture. 2. Leave it up to -the compiler, any excess "registers" get treated as temporaries. - This -would be bad for targets that have too few or many registers. 3. -Stabilise the VM target and just pick a number of registers. - The JVM -takes this to the extreme and has only machine registers, PC registers, -and the stack. 4. Restrict compilers to just use a stack, the JIT will -decide on registers. - This would make RemedyVM a simple stack-based -VM - -It seems from these that the best solution is solution 1, to go with an -infinite-register hybrid VM, or rather a Memory machine. Unlike =dis=, -RemedyVM will be a RISC-like architecture for ease of instruction -selection. RemedyVM is thus a /memory virtual machine/ with stack & -heap components for records that might not fit well into the -architecture. - -** Goals -- [ ] Lay-out the basic instruction set - - [ ] Move - - [ ] Stack - - [ ] Heap - - [ ] Arithmetic - - [ ] Logical - - [ ] Jumps - - [ ] Conditional instructions - - [ ] All instructions should be conditional -- [ ] Implement an interpreter -- [ ] Compilation - - [ ] Assembly generation - - [ ] Linking - - [ ] Static linking - - [ ] Dynamic linking (?) - - /Note: Linking will probably just be done with the regular gcc - tooling, at least initially./ - - [ ] Optimise instruction selection for RISC platforms - - [ ] Optimise instruction selection for x86 platforms - - [ ] Implement AOT compiler - - [ ] Link to libc etc - - [ ] Implement JIT compiler - - [ ] Runtime implementation (?) - - [ ] Lazily compiled thunk implementation -- [ ] Implement a basic compiler targeting RemedyVM - - [ ] Basic lisp-like LL(1) language - - [ ] Simpler Lisp-Haskell hybrid - - [ ] Mojo -- [ ] Tests - - [ ] Unit VM tests - - [ ] Fibonacci - - [ ] Factorial - - [ ] Mandelbrot - - [ ] Game of life - - [ ] Memory tests - - [ ] Arithmetic tests -- [ ] Cross-Platform - - [ ] Linux - - [ ] Plan9(front) - - [ ] OpenBSD - - [ ] MacOS - - [ ] Windows -- [ ] Documentation (ugh) - - [ ] Design doc - - [ ] Target spec -- [ ] Implementation languages - - [ ] C - - The most cross-platform, will support every target - - [ ] Haskell - - Will support most targets - - [ ] Rust (?) - - [ ] Go (?) - - [ ] Zig (?) - - [ ] Java (?) - - Icky, but also have already written /some/ of a compiler in Java. - -*** Note -- This is a personal project, and is not going to be a serious virtual - machine. -- It is just experience, my first virtual machine! -- The more advanced features may not ever be implemented. diff --git a/readme.md b/readme.md @@ -0,0 +1,160 @@ +# RemedyVM, the register-memory dynamic virtual machine + +**NOTE:** This is a personal project for funzies pls don\'t bully. + +## Overview + +RemedyVM will be a (memory↔stack↔heap)-based VM target, with an AOT +compiler, a JIT compiler, and a simple interpreter. + +The memory/stack/heap approach allows for register allocation +optimisations at run/compile-time, whilst still offering compilers +targeting RemedyVM the option to put memory into a stack or the heap. + +A memory machine (MM) is a machine with \"infinite\" registers, each +corresponding with a temporary value. Much like a register machine, +temporaries can easily occupy real registers in JIT compilation, but +much like a stack machine, the registers are in a growing list. + +There are currently no plans for garbage collection, but if it is to be +implemented in the future, it will likely be done similarly to Bell +Labs\' `dis`{.verbatim} VM, taking a hybrid reference count/mark and +sweep approach. According to Winterbottom and Pike, the memory machine +architecture makes this easier to accomplish. + +The bytecode should be compilable straight to machine code ahead of time +(AOT), removing *any* runtime in the process. The JIT compiler will need +a psuedo-runtime, all functions are *thunks* until first run, then they +are \"evaluated\" (JIT compiled) to a function that runs with no +overhead. + +Some functions/labels may force themselves to be JIT compiled or +interpreted so as to adopt a structure that could only come about from +being lazily evaluated on the fly. Potentially a full runtime could be +set-up to allow for AOT compiled lazy-thunks. + +The runtime JIT process may need to have multiple steps, akin to the V8 +JavaScript runtime. Register allocation may be heavy on resources as it +is a graph-colouring problem, so an initial round may not allocate +machine registers optimally, or may even run interpreted. + +Whilst the standard implementation will only really be a psuedo-runtime, +the initial implementation will simply be an interpreted VM. + +To be clear, this project is defining a VM target, a VM interpreter, and +a VM compiler with JIT/AOT. + +Given the stack-register hybrid approach, temporaries will be stored in +one of 2 places to be determined at runtime: 1. A dynamic array of +registers that is determined at VM startup/compile-time to match the +host architecture. This is the \"memory\" in memory machine. 2. The +regular stack. 3. The heap. + +Given the simplicity, beauty, and popularity of RISC machines, the +instruction set of RemedyVM will be very simple and should be similar to +ARM, MIPS, and other RISC instruction sets. + +RemedyVM should be as platform agnostic as possible, though it\'s hard +to get a \"one implementation fits all\", the underlying platform +assumptions should try to be as minimal as possible. + +Given that RemedyVM is inherently not a very stable target, compilers do +not have much control over what goes into registers, and what goes onto +the stack. This is the hardest problem that this project aims to solve. +This issue is discussed in more depth in the [unstable target +section](#unstable-target-solutions), but the solution picked by +RemedyVM is to have a *memory machine* RISC architecture. + +## Unstable compiler target solutions {#unstable-target-solutions} + +Given the virtual machine decides the number of registers to allocate at +run/comptime, compilers have an unstable target! + +How can a compiler be sure that an architecture will be able to put 20 +variables into registers? It can\'t! + +Some solutions: 1. Treat all temporaries as being in an ordered dynamic +list, the runtime register allocation takes the $K$ lowest temporaries +and puts them into registers. (Where $K$ is the number of registers for +a target) - This is the method used in the `dis`{.verbatim} VM by Bell +Labs, they use a very similar Memory Machine architecture. 2. Leave it +up to the compiler, any excess \"registers\" get treated as +temporaries. - This would be bad for targets that have too few or many +registers. 3. Stabilise the VM target and just pick a number of +registers. - The JVM takes this to the extreme and has only machine +registers, PC registers, and the stack. 4. Restrict compilers to just +use a stack, the JIT will decide on registers. - This would make +RemedyVM a simple stack-based VM + +It seems from these that the best solution is solution 1, to go with an +infinite-register hybrid VM, or rather a Memory machine. Unlike +`dis`{.verbatim}, RemedyVM will be a RISC-like architecture for ease of +instruction selection. RemedyVM is thus a *memory virtual machine* with +stack & heap components for records that might not fit well into the +architecture. + +## Goals + +- [ ] Lay-out the basic instruction set + - [ ] Move + - [ ] Stack + - [ ] Heap + - [ ] Arithmetic + - [ ] Logical + - [ ] Jumps + - [ ] Conditional instructions + - [ ] All instructions should be conditional +- [ ] Implement an interpreter +- [ ] Compilation + - [ ] Assembly generation + - [ ] Linking + - [ ] Static linking + - [ ] Dynamic linking (?) + - *Note: Linking will probably just be done with the regular + gcc tooling, at least initially.* + - [ ] Optimise instruction selection for RISC platforms + - [ ] Optimise instruction selection for x86 platforms + - [ ] Implement AOT compiler + - [ ] Link to libc etc + - [ ] Implement JIT compiler + - [ ] Runtime implementation (?) + - [ ] Lazily compiled thunk implementation +- [ ] Implement a basic compiler targeting RemedyVM + - [ ] Basic lisp-like LL(1) language + - [ ] Simpler Lisp-Haskell hybrid + - [ ] Mojo +- [ ] Tests + - [ ] Unit VM tests + - [ ] Fibonacci + - [ ] Factorial + - [ ] Mandelbrot + - [ ] Game of life + - [ ] Memory tests + - [ ] Arithmetic tests +- [ ] Cross-Platform + - [ ] Linux + - [ ] Plan9(front) + - [ ] OpenBSD + - [ ] MacOS + - [ ] Windows +- [ ] Documentation (ugh) + - [ ] Design doc + - [ ] Target spec +- [ ] Implementation languages + - [ ] C + - The most cross-platform, will support every target + - [ ] Haskell + - Will support most targets + - [ ] Rust (?) + - [ ] Go (?) + - [ ] Zig (?) + - [ ] Java (?) + - Icky, but also have already written *some* of a compiler in + Java. + +### Note + +- This is a personal project, and is not going to be a serious virtual + machine. +- It is just experience, my first virtual machine! +- The more advanced features may not ever be implemented. diff --git a/readme.org b/readme.org @@ -0,0 +1,157 @@ +* RemedyVM, the register-memory dynamic virtual machine +*NOTE:* This is a personal project for funzies pls don't bully. + +** Overview +RemedyVM will be a (memory↔stack↔heap)-based VM target, with an AOT +compiler, a JIT compiler, and a simple interpreter. + +The memory/stack/heap approach allows for register allocation +optimisations at run/compile-time, whilst still offering compilers +targeting RemedyVM the option to put memory into a stack or the heap. + +A memory machine (MM) is a machine with "infinite" registers, each +corresponding with a temporary value. Much like a register machine, +temporaries can easily occupy real registers in JIT compilation, but +much like a stack machine, the registers are in a growing list. + +There are currently no plans for garbage collection, but if it is to be +implemented in the future, it will likely be done similarly to Bell +Labs' =dis= VM, taking a hybrid reference count/mark and sweep approach. +According to Winterbottom and Pike, the memory machine architecture +makes this easier to accomplish. + +The bytecode should be compilable straight to machine code ahead of time +(AOT), removing /any/ runtime in the process. The JIT compiler will need +a psuedo-runtime, all functions are /thunks/ until first run, then they +are "evaluated" (JIT compiled) to a function that runs with no overhead. + +Some functions/labels may force themselves to be JIT compiled or +interpreted so as to adopt a structure that could only come about from +being lazily evaluated on the fly. Potentially a full runtime could be +set-up to allow for AOT compiled lazy-thunks. + +The runtime JIT process may need to have multiple steps, akin to the V8 +JavaScript runtime. Register allocation may be heavy on resources as it +is a graph-colouring problem, so an initial round may not allocate +machine registers optimally, or may even run interpreted. + +Whilst the standard implementation will only really be a psuedo-runtime, +the initial implementation will simply be an interpreted VM. + +To be clear, this project is defining a VM target, a VM interpreter, and +a VM compiler with JIT/AOT. + +Given the stack-register hybrid approach, temporaries will be stored in +one of 2 places to be determined at runtime: 1. A dynamic array of +registers that is determined at VM startup/compile-time to match the +host architecture. This is the "memory" in memory machine. 2. The +regular stack. 3. The heap. + +Given the simplicity, beauty, and popularity of RISC machines, the +instruction set of RemedyVM will be very simple and should be similar +to ARM, MIPS, and other RISC instruction sets. + +RemedyVM should be as platform agnostic as possible, though it's hard +to get a "one implementation fits all", the underlying platform +assumptions should try to be as minimal as possible. + +Given that RemedyVM is inherently not a very stable target, compilers +do not have much control over what goes into registers, and what goes +onto the stack. This is the hardest problem that this project aims to +solve. This issue is discussed in more depth in the +[[#unstable-target-solutions][unstable target section]], but +the solution picked by RemedyVM is to have a /memory machine/ RISC +architecture. + +** Unstable compiler target solutions +:PROPERTIES: +:CUSTOM_ID: unstable-target-solutions +:END: +Given the virtual machine decides the number of registers to allocate at +run/comptime, compilers have an unstable target! + +How can a compiler be sure that an architecture will be able to put 20 +variables into registers? It can't! + +Some solutions: 1. Treat all temporaries as being in an ordered dynamic +list, the runtime register allocation takes the \(K\) lowest temporaries +and puts them into registers. (Where \(K\) is the number of registers +for a target) - This is the method used in the =dis= VM by Bell Labs, +they use a very similar Memory Machine architecture. 2. Leave it up to +the compiler, any excess "registers" get treated as temporaries. - This +would be bad for targets that have too few or many registers. 3. +Stabilise the VM target and just pick a number of registers. - The JVM +takes this to the extreme and has only machine registers, PC registers, +and the stack. 4. Restrict compilers to just use a stack, the JIT will +decide on registers. - This would make RemedyVM a simple stack-based +VM + +It seems from these that the best solution is solution 1, to go with an +infinite-register hybrid VM, or rather a Memory machine. Unlike =dis=, +RemedyVM will be a RISC-like architecture for ease of instruction +selection. RemedyVM is thus a /memory virtual machine/ with stack & +heap components for records that might not fit well into the +architecture. + +** Goals +- [ ] Lay-out the basic instruction set + - [ ] Move + - [ ] Stack + - [ ] Heap + - [ ] Arithmetic + - [ ] Logical + - [ ] Jumps + - [ ] Conditional instructions + - [ ] All instructions should be conditional +- [ ] Implement an interpreter +- [ ] Compilation + - [ ] Assembly generation + - [ ] Linking + - [ ] Static linking + - [ ] Dynamic linking (?) + - /Note: Linking will probably just be done with the regular gcc + tooling, at least initially./ + - [ ] Optimise instruction selection for RISC platforms + - [ ] Optimise instruction selection for x86 platforms + - [ ] Implement AOT compiler + - [ ] Link to libc etc + - [ ] Implement JIT compiler + - [ ] Runtime implementation (?) + - [ ] Lazily compiled thunk implementation +- [ ] Implement a basic compiler targeting RemedyVM + - [ ] Basic lisp-like LL(1) language + - [ ] Simpler Lisp-Haskell hybrid + - [ ] Mojo +- [ ] Tests + - [ ] Unit VM tests + - [ ] Fibonacci + - [ ] Factorial + - [ ] Mandelbrot + - [ ] Game of life + - [ ] Memory tests + - [ ] Arithmetic tests +- [ ] Cross-Platform + - [ ] Linux + - [ ] Plan9(front) + - [ ] OpenBSD + - [ ] MacOS + - [ ] Windows +- [ ] Documentation (ugh) + - [ ] Design doc + - [ ] Target spec +- [ ] Implementation languages + - [ ] C + - The most cross-platform, will support every target + - [ ] Haskell + - Will support most targets + - [ ] Rust (?) + - [ ] Go (?) + - [ ] Zig (?) + - [ ] Java (?) + - Icky, but also have already written /some/ of a compiler in Java. + +*** Note +- This is a personal project, and is not going to be a serious virtual + machine. +- It is just experience, my first virtual machine! +- The more advanced features may not ever be implemented.