commit 5d6f2c6fea73bf85e89cf450721e52f6f89b0af5
parent 8db092141a41362e7ae6537e44f44df3a62ce0bb
Author: Ethan Long <ethandavidlong@gmail.com>
Date: Mon, 10 Jun 2024 23:36:08 +1000
Added a bit more content to the automaton writeup
Diffstat:
1 file changed, 21 insertions(+), 0 deletions(-)
diff --git a/content/projects/dfa-regex-engine/writeup.md b/content/projects/dfa-regex-engine/writeup.md
@@ -67,6 +67,27 @@ See [the automaton included](#automaton-included) for info on these data structu
This is exactly what the `regex` binary does.
+The mapping \\(R_L \mapsto \epsilon\text{NFA}^\*_L\\) is not necessarily trivial,
+the language of regular expressions is unfortunately not regular (it would have
+been really nice to describe the language of regular expressions exactly with a
+regular expression). I implemented an LL(1) predictive recursive descent
+parser[^1] to parse the regular expression into an AST, which is then converted
+into an \\(\epsilon\text{NFA}^\*_L\\) through a set of algorithms described in
+the COMP1600 lectures.
+
+# Conversion
+
+Conversion between various finite state automaton is achieved through the
+algorithms described in the lectures, with not much care for the memory
+complexity of the resulting conversion. Conversions are also not perfectly
+reversible, converting an \\(\text{NFA}\\) to a \\(\text{DFA}\\) and then back
+will just yield a \\(\text{DFA}\\) in the struct of an \\(\text{NFA}\\).
+
# Demo
See the demo I have hosted here:
{{< article link="/projects/dfa-regex-engine/demo/" >}}
+
+[^1]: In all honesty I'm not sure if it is really LL(1), I think it may
+ technically be nondeterministic given it has some notion of context. As of
+ writing, it has been months since I actually implemented it and did any
+ formal theory on grammars so I'm a bit vague.