As beginners we used to write very large functions and then giant God classes. As we improve our skills, our classes become smaller but more numerous: we get a swarm of small, focuses classes that collaborate to create the final system. In other words we shift the complexity from the relations between the components of a class to the relations between classes. Indeed most Design Patterns just describe how different classes should be related.
We shift the complexity from the relations between the
components of a class to the relations between classes
We have different kinds of relationships between classes: aggregation and composition, unidirectional and bidirectional associations. We have constraints on associations and we could have attributes associated to these relations. However we do not have language support for relations.
This is very different in meta-modelling languages or frameworks, such EMF, which instead permit to describe relations. When building complex models, for example Abstract Syntax Trees, a good representation of relations helps a lot.
So I am considering adding support for relations in Turin (my very own programming language).
Example: an AST
Consider an abstract syntax tree. The basic relation is a composition relation, so each node can contain zero or more children nodes, while all nodes have one single parent node (or none, for the root of the tree).
Suppose we can represent this relation like this:
relation Ast { one Node parent many Node children }
Ideally we would like to use it like this:
// We create some nodes Node nodeA = Node() Node nodeB = Node() Node nodeC = Node() // and we should be able to easily navigate the relations assertEquals(false, nodeA.parent.isPresent()) assertEquals(false, nodeB.parent.isPresent()) assertEquals(false, nodeC.parent.isPresent()) assertEquals(Collections.emptyList(), nodeA.children) assertEquals(Collections.emptyList(), nodeB.children) assertEquals(Collections.emptyList(), nodeC.children) // we should also be able to easily set new relations nodeA.parent.set(nodeB) nodeB.children.add(nodeC) assertEquals(nodeB, nodeA.parent.get()) assertEquals(false, nodeB.parent.isPresent()) assertEquals(nodeB, nodeC.parent.get()) assertEquals([], nodeA.children) // note that we preserved the order of insertion assertEquals([nodeA, nodeC], nodeB.children) assertEquals([], nodeC.children)
So we would expect a fully bidirectional relation: when we update one endpoint, the other one is automatically aware of the change:
- if we set the parent of A to be B, the children of B will list A
- if we add A to the children of B, the parent of A will be B
And each node can have one parent so:
Node nodeA = Node() Node nodeB = Node() Node nodeC = Node() nodeA.parent = nodeB assertEquals([nodeA], nodeB.children) assertEquals([], nodeC.children) nodeA.parent = nodeC assertEquals([], nodeB.children) assertEquals([nodeA], nodeC.children)
No book-keeping is needed: all the parties involved have always the most updated picture. Normally in Java you would have a field parent and a list of children of each single node and changing the parent of a node would mean:
- alerting the old parent so that it can remove the node from its children
- alerting new new parent so that it can add the node to its children
- update the parent field in the node
Boring and error prone. In Turin instead it just works out of the box. Multiple endpoints (such as children) are seen as auto-updating lists while single endpoints (such as parent) are instead seen as a sort of Optional (with the possibility to set the value, not just verifying if it is present and read it).
Relationships and subsets
That is great, however we would like to have something more. We would like to have different specializations of those relationships. For example:
- a Method (which is a Node) can have multiple FormalArguments and one return type (TypeUsage)
- a FormalArgument (which is a Node) has one type (TypeUsage)
Now, we would like to add a FormalArgument instance as the param of a method and see it both as part of the “params” and of the “children” of that Node.
In other words “params” would represent a subset of the “children” of Method. Also “returnType” would be a subset (with exactly one element) of the “children” of method.
type Method extends Node { FormalArgument* params = subset of AST.children(parent=this) TypeUsage returnType = subset of AST.children(parent=this) }
Note that I want to be able to:
- add a node into a subset (e.g., params). I should see that node both among the params and the children
- add a node directly among the children: I should see the node as part of the children but not as part of any subset
Status
This is the syntax I am working on and most of the implementation for relationships is ready. Time to battle test it!
I am still working on relationships subsets but many tests are already passing and I should be ready to commit it soon.
For now I plan to support only relations with two endpoints which can be either single or multiple. In the future relations could hold values or have additional constraints, like we have for fields. I am still exploring the idea and experimenting how it works in practice, after all there are not many programming languages supporting relationships out there :)
A few references
No, I am not the first one to think about relations as first-class citizens in object-oriented programming languages:
A few useful references to understand more on the theoretical level:
- First-Class Relationships for Object-Orientated Programming Languages, PhD thesis from Stephen Nelson
- Relations: Abstracting Object Collaborations, Balzer et al., Technical Report
- Unifying and Generalizing Relations in Role-Based Data Modeling and Navigation, Harkes and Visser, Technical Report
I have to study better myself these papers, and then be sure to deliver a good compromise to make easier to think about some classes of problems which involves relationships and associations. After all, this is the goal of Turin: be a tool which support thinking about problems and solutions.