Most of the work done in tools supporting a language consists in manipulating the AST. In this post we are going to see how to perform transformation and processing on the Abstract Syntax Tree through model-to-model transformations. These techniques will be useful to perform operation like:
- validation: finding errors in the AST
- remove syntax sugar: by transforming the AST into equivalent but more explicit forms
- fill symbol tables and resolve symbols
All these techniques will be based on generic ways to navigate and change the AST. Let’s see how.
This post is the part of a series. The goal of the series is to describe how to create a useful language and all the supporting tools.
- Building a lexer
- Building a parser
- Creating an editor with syntax highlighting
- Build an editor with autocompletion
- Mapping the parse tree to the abstract syntax tree
- Model to model transformations
- Validation
- Generating bytecode
After writing this series of posts I refined my method, expanded it, and clarified into this book titled How to create pragmatic, lightweight languages
Code
Code is available on GitHub under the tag 06_transformations
The operations we need
We will need to:
- process the AST: we want to do some operation to extract information from the AST
- transform the AST: we want to transform the single nodes of the AST
Implement the transformations manually
We could implement the transformation manually on each single class of the AST. For example, this is how we could implement these methods on a SumExpression
fun SumExpression.process(operation: (Node) -> Unit) { operation.invoke(this) this.left.process(operation) this.right.process(operation) } fun SumExpression.transform(operation: (Node) -> Node) : Node { val newLeft = this.left.transform(operation) as Expression val newRight = this.right.transform(operation) as Expression return operation(SumExpression(newLeft, newRight)) }
We would have just to repeat it for each single class of our metamodel. Quite boring, eh?
Transformations using reflection
The alternative is to use reflection. In this way we can specify these operations for Node and they will work for every class of every metamodel (so basically, for every language we are going to work on).
fun Node.process(operation: (Node) -> Unit) { operation(this) this.javaClass.kotlin.memberProperties.forEach { p -> val v = p.get(this) when (v) { is Node -> v.process(operation) is Collection<*> -> v.forEach { if (it is Node) it.process(operation) } } } } fun Node.transform(operation: (Node) -> Node) : Node { operation(this) val changes = HashMap<String, Any>() this.javaClass.kotlin.memberProperties.forEach { p -> val v = p.get(this) when (v) { is Node -> { val newValue = v.transform(operation) if (newValue != v) changes[p.name] = newValue } is Collection<*> -> { val newValue = v.map { if (it is Node) it.transform(operation) else it } if (newValue != v) changes[p.name] = newValue } } } var instanceToTransform = this if (!changes.isEmpty()) { val constructor = this.javaClass.kotlin.primaryConstructor!! val params = HashMap<KParameter, Any?>() constructor.parameters.forEach { param -> if (changes.containsKey(param.name)) { params[param] = changes[param.name] } else { params[param] = this.javaClass.kotlin.memberProperties.find { param.name == it.name }!!.get(this) } } instanceToTransform = constructor.callBy(params) } return operation(instanceToTransform) }
Testing the transformations
It all looks nice and well but the question is: does it actually work?
Let’s try.
In this little test we will transform references to a variable A into references to a variable B.
class ModelTest { @test fun transformVarName() { val startTree = SandyFile(listOf( VarDeclaration("A", IntLit("10")), Assignment("A", IntLit("11")), Print(VarReference("A")))) val expectedTransformedTree = SandyFile(listOf( VarDeclaration("B", IntLit("10")), Assignment("B", IntLit("11")), Print(VarReference("B")))) assertEquals(expectedTransformedTree, startTree.transform { when (it) { is VarDeclaration -> VarDeclaration("B", it.value) is VarReference -> VarReference("B") is Assignment -> Assignment("B", it.value) else -> it } }) } }
Conclusions
Another building block for the future. In the next steps we will see how to use this, for example to implement validation.