Jooser (Java Compiler)
About
Jooser is a compiler for the Joos language. The Joos language is essentially Java 1.3 without exceptions. The compiler outputs x86 assembly and includes a command line interface that uses an external assembler to generate executables.
Being a school project, the compiler does not use any libraries or frameworks (other than the Python standard library). To avoid cheating, the code and the documentation are not publicly available; however, you can contact me and I'll be happy to give you access (assuming you're not a UW student currently taking CS 444).
Extra Features
    delta = {
        ("q0", NFA.EPSILON_CHAR): {"q1", "q2"},
        ("q1", "a"): "q3",
        ("q2", "b"): "q3",
    }
    # create NFA that accepts char 'a' or 'b'
    nfa, _ = create_automata(NFA, "q0 q1 q2 q3", delta, "q0", "q3")
    everything = nfa.union(nfa.inverse())  # accepts everything
    starts = nfa.cat(everything)  # anything that starts with 'a' or 'b'
    dfa = everything.to_dfa()
A regex language that generates DFAs that can be used with the scanner. To make the language usable, I also implemented a syntax highlighter for Vim (I know, super cool 🤓).
Each line is a regex followed by || and the token name associated with that regex. \p produces a DFA that recognizes nothing (Φ), so it's a clever way of writing comments without extending the language. The regex parser optimizes those out before generating the final DFA.
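To make the NFA-to-DFA step more concrete, here is a minimal sketch of the textbook subset construction applied to a transition table shaped like the one above. It is not Jooser's code: the `EPSILON` constant, the function names, and the choice to store every transition target as a set are assumptions made for illustration.

```python
from collections import deque

EPSILON = ""  # assumed stand-in for NFA.EPSILON_CHAR


def epsilon_closure(states, delta):
    """Return every NFA state reachable from `states` via epsilon moves only."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in delta.get((state, EPSILON), ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return frozenset(closure)


def to_dfa(delta, start, accepting):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    alphabet = {char for (_, char) in delta if char != EPSILON}
    dfa_start = epsilon_closure({start}, delta)
    dfa_delta, dfa_accepting = {}, set()
    seen, queue = {dfa_start}, deque([dfa_start])
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(current)
        for char in alphabet:
            moved = set()
            for state in current:
                moved |= delta.get((state, char), set())
            if not moved:
                continue  # no transition: the DFA rejects on this input
            target = epsilon_closure(moved, delta)
            dfa_delta[(current, char)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_start, dfa_delta, dfa_accepting


def accepts(dfa, text):
    """Run the DFA over `text`; a missing transition means rejection."""
    state, delta, accepting = dfa
    for char in text:
        state = delta.get((state, char))
        if state is None:
            return False
    return state in accepting


# Same automaton as above: accepts the single character 'a' or 'b'.
delta = {
    ("q0", EPSILON): {"q1", "q2"},
    ("q1", "a"): {"q3"},
    ("q2", "b"): {"q3"},
}
dfa = to_dfa(delta, "q0", {"q3"})
```

Using frozensets as DFA states keeps them hashable, so they can serve directly as dictionary keys in the new transition table.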
    ast = parse(tokens)  # tokens is a list of Token objects
    ASTNode.add_recursive_prop("environment")
    ast.environment = 3
    ast.vchildren[0].vchildren[2].environment  # produces 3
    ast.vchildren[0].environment = 4
    ast.vchildren[0].vchildren[1].environment  # produces 4
Recursive properties that can be set on any ASTNode and looked up recursively from child nodes. This feature is very useful for storing structures used by the compiler (like the environment) and for generating code. To keep lookup of non-recursive properties efficient, only properties added using add_recursive_prop are looked up recursively.
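One way such inherited lookup could work in Python is through `__getattr__`, which only runs after normal attribute lookup fails. The sketch below is a guess at the mechanism, not the actual Jooser implementation: the `parent` link, the `_recursive_props` registry, and the constructor signature are all assumptions.

```python
class ASTNode:
    # Assumed registry: names listed here are inherited from ancestors.
    _recursive_props = set()

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None  # assumed back-link, set when a node is adopted
        self.vchildren = list(children)
        for child in self.vchildren:
            child.parent = self

    @classmethod
    def add_recursive_prop(cls, prop_name):
        cls._recursive_props.add(prop_name)

    def __getattr__(self, attr):
        # Called only when normal lookup fails, so ordinary attributes
        # (and recursive ones set directly on this node) skip the parent walk.
        if attr in ASTNode._recursive_props:
            parent = self.__dict__.get("parent")
            if parent is not None:
                return getattr(parent, attr)
        raise AttributeError(attr)


leaf_a, leaf_b = ASTNode("a"), ASTNode("b")
inner = ASTNode("inner", [leaf_a, leaf_b])
root = ASTNode("root", [inner])

ASTNode.add_recursive_prop("environment")
root.environment = 3
seen_from_leaf = leaf_b.environment      # resolved two levels up, at root
inner.environment = 4
seen_after_override = leaf_a.environment  # nearest ancestor now wins
```

Because unregistered names raise `AttributeError` immediately, a typo in a non-recursive attribute still fails fast instead of walking the whole ancestor chain.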
    traverser = ASTTraverser()

    @traverser.rule_handler("primaryExpr -> LPAREN expression RPAREN")
    def _(node: ASTNode):
        # handle all nodes that have name "primaryExpr"
        # and 3 children with names "LPAREN" "expression" and "RPAREN"
        ...

    @traverser.name_handler("expression")
    def _(node: ASTNode):
        # handle all nodes with name "expression"
        ...

    # do the traversal
    traverser.breadth_first()
    traverser.depth_first()
A pythonic way of implementing tree traversals.
Each function declares the kind of grammar rule it handles and does not worry about the rest of the traversal.
Since function names don't matter in this context (because they never get called directly), I just used "_" for all of them.
Traverser also supports arbitrary types of traversing the tree by subclassing TraveralQueue and using the traverse function. breadth_first, for example, is just a shortcut for traverse(queue_type=FIFOQueue).
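The decorator registration and the pluggable queue can be sketched roughly as follows. This is an illustration only, not the real traverser: `Node`, `SketchTraverser`, `LIFOQueue`, the `extend`/`pop` queue interface, passing the root to the constructor, and giving rule handlers priority over name handlers are all assumptions.

```python
from collections import deque


class Node:
    """Hypothetical stand-in for ASTNode: a name plus child nodes."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class FIFOQueue:
    """Oldest node comes out first, which gives a breadth-first order."""
    def __init__(self):
        self._items = deque()

    def extend(self, nodes):
        self._items.extend(nodes)

    def pop(self):
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


class LIFOQueue(FIFOQueue):
    """Newest node comes out first, which gives a depth-first order."""
    def extend(self, nodes):
        # Push in reverse so the leftmost child is popped first.
        self._items.extend(reversed(nodes))

    def pop(self):
        return self._items.pop()


class SketchTraverser:
    def __init__(self, root):
        self.root = root
        self._rule_handlers = {}
        self._name_handlers = {}

    def rule_handler(self, rule):
        lhs, rhs = rule.split("->")
        key = (lhs.strip(), tuple(rhs.split()))
        def register(func):
            self._rule_handlers[key] = func
            return func
        return register

    def name_handler(self, name):
        def register(func):
            self._name_handlers[name] = func
            return func
        return register

    def traverse(self, queue_type):
        queue = queue_type()
        queue.extend([self.root])
        while len(queue):
            node = queue.pop()
            key = (node.name, tuple(child.name for child in node.children))
            # Assumed priority: an exact rule match beats a name match.
            handler = self._rule_handlers.get(key) or self._name_handlers.get(node.name)
            if handler is not None:
                handler(node)
            queue.extend(node.children)

    def breadth_first(self):
        self.traverse(queue_type=FIFOQueue)

    def depth_first(self):
        self.traverse(queue_type=LIFOQueue)


tree = Node("primaryExpr", [
    Node("LPAREN"), Node("expression", [Node("ID")]), Node("RPAREN"),
])
traverser = SketchTraverser(tree)
log = []

@traverser.rule_handler("primaryExpr -> LPAREN expression RPAREN")
def _(node):
    log.append(node.name)

@traverser.name_handler("expression")
def _(node):
    log.append(node.name)

@traverser.name_handler("ID")
def _(node):
    log.append(node.name)

@traverser.name_handler("RPAREN")
def _(node):
    log.append(node.name)

traverser.breadth_first()
bfs_order = list(log)
log.clear()
traverser.depth_first()
dfs_order = list(log)
```

The only difference between the two traversals is the queue class, which is why a new visiting order needs nothing more than a new queue subclass.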