When I was working at TripAdvisor we had this internal book club. We started by reading Effective Java, this very famous book from Joshua Bloch. It is interesting and all but I think this book was so successful that most of his advices are now part of the Java culture: most of Java developers know them and sort of apply them. The point is that these advices, while very reasonable, are not consistently applied in large codebases. For this reason I decided I wanted to create a tool to verify that our codebase (which was… big) was actually adopting the suggestions from the book, so I started writing a tool named Effective Java and I started writing it in Clojure.
Running Queries From The Command Line
The basic idea is that you can run several queries against your code. The queries implemented are loosely based on the advices of the book. For example, one obvious advice is that you should not have tons of constructors and use factory methods instead. So let’s suppose we want to verify which classes have 5 constructors or more; you run this:
java -jar effectivejava-0.1.3-SNAPSHOT-standalone.jar -q mc -d "<myJavaProjectDir>" -t 5
And you get back something like this:
Considering 109 Java files japa.parser.ast.expr.ArrayCreationExpr : 5 japa.parser.ast.body.MethodDeclaration : 5 japa.parser.ast.body.BaseParameter : 5 japa.parser.ast.body.FieldDeclaration : 5
At this point you can look at this code and decide which parts you should refactor to improve the quality of your code. I think that the only way that works when dealing with a large codebase is to consistently improve it with an infinite patience and love. I find very useful to have some way to narrow your focus on something actionable (e.g., one single class or one single method) because if you stop and stare at the whole codebase you will just leave in despair and become a monk somewhere far, far away from classes 10K of lines long or constructors taking 20 parameters. When faced with such a daunting task you should not think, you should instead find one single problem and fix it. One way to focus is having someone finding issues for you.
Let a tool be your blinkers.
Queries Implemented
The number of queries currently implemented are very limited:
- number of non private constructors
- number of arguments for the non private constructors
- type of singleton implemented by a class (public field, static factory, enum)
The number of queries is limited because I focused more on building several ways of using the tool (listed below) and because I am working on far too many things :)
How the Model Is Built
Another reason for the limited number of queries is that I am currently using JavaParser to build a model of the parsed code. While it is a great tool (not saying this because I am a contributor to the project…:D) it is not able to resolve symbols.
On one hand it means less configuration for the user but it limits the kind of analysis which is possible to do. Things could change in the future because JavaParser is evolving to support this kind of analysis (update: not it does). I could also switch to use other tools like JaMoPP or MoDisco. Such tools are definitely more complex and less lightweight but it could be worthy to take a closer look to them.
Ideally we want a tool which both parse source code and it is also able to build a model for compiled code (to consider our dependencies). Such tools should be able to integrate the different type of models performing analysis on the resulting megamodel. So the model obtained by parsing a Java file could have references to the model obtained by analyzing a Jar. It is the kind of things which require some work and building a parser it is just a fraction of the effort.
The Interactive Mode
I think that in some cases you do not exactly what you are looking for. You want just to explore the code and figure out things as you go. So you could parse some source code, ask some queries, then changing the thresholds to limit your results and ask some queries more. Then you fix something and run your query again. Something like this:
So I spent some time implementing the interactive mode instead of writing more queries. Stupid me, but I have to say it was fun to write that piece of Clojure code :D
Using Effective Java as a Linter for Java
This feature is still in the very early stages. It currently runs a couple of queries for the number of constructors and the number of parameters for each constructor
lein run -l ⏎ [info] Linter, no directory indicated. Using current directory Loaded 440 files japa.parser.ast.stmt.CatchClause : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.stmt.AssertStmt : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.body.AnnotationMemberDeclaration : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.body.EnumDeclaration : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.body.Parameter : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.body.EnumConstantDeclaration : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.body.VariableDeclarator : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.body.TypeDeclaration : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.body.BaseParameter : This class has too many constructors (5). Consider using static factory methods or the Builder pattern japa.parser.ast.body.AnnotationDeclaration : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.body.MethodDeclaration : This class has too many constructors (5). Consider using static factory methods or the Builder pattern japa.parser.ast.body.FieldDeclaration : This class has too many constructors (5). Consider using static factory methods or the Builder pattern japa.parser.ast.body.ConstructorDeclaration : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.body.ClassOrInterfaceDeclaration : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.type.ClassOrInterfaceType : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.type.WildcardType : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.type.ReferenceType : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.PackageDeclaration : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.expr.ArrayCreationExpr : This class has too many constructors (5). Consider using static factory methods or the Builder pattern japa.parser.ast.expr.VariableDeclarationExpr : This class has too many constructors (4). Consider using static factory methods or the Builder pattern japa.parser.ast.expr.MethodCallExpr : This class has too many constructors (4). Consider using static factory methods or the Builder pattern org.springframework.jdbc.object.BatchSqlUpdate : This class has too many constructors (4). Consider using static factory methods or the Builder pattern org.springframework.jdbc.object.SqlUpdate : This class has too many constructors (4). Consider using static factory methods or the Builder pattern org.springframework.jdbc.object.SqlFunction : This class has too many constructors (4). Consider using static factory methods or the Builder pattern org.springframework.jdbc.datasource.DriverManagerDataSource : This class has too many constructors (4). Consider using static factory methods or the Builder pattern org.springframework.jdbc.datasource.SingleConnectionDataSource : This class has too many constructors (4). Consider using static factory methods or the Builder pattern org.springframework.jdbc.datasource.SimpleDriverDataSource : This class has too many constructors (4). Consider using static factory methods or the Builder pattern org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator : This class has too many constructors (4). Consider using static factory methods or the Builder pattern org.springframework.jdbc.core.SqlOutParameter : This class has too many constructors (7). Consider using static factory methods or the Builder pattern org.springframework.jdbc.core.SqlParameter : This class has too many constructors (7). Consider using static factory methods or the Builder pattern org.springframework.jdbc.core.support.SqlLobValue : This class has too many constructors (8). Consider using static factory methods or the Builder pattern org.springframework.jdbc.core.ResultSetSupportingSqlParameter : This class has too many constructors (6). Consider using static factory methods or the Builder pattern org.springframework.jdbc.core.SqlParameterValue : This class has too many constructors (4). Consider using static factory methods or the Builder pattern org.springframework.jdbc.core.SqlInOutParameter : This class has too many constructors (7). Consider using static factory methods or the Builder pattern japa.parser.ast.stmt.CatchClause.CatchClause(int, int, int, int, int, List<AnnotationExpr>, List<Type>, VariableDeclaratorId, BlockStmt) : This constructor has too many parameters (9). Consider using the Builder pattern japa.parser.ast.stmt.LabeledStmt.LabeledStmt(int, int, int, int, String, Statement) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.stmt.DoStmt.DoStmt(int, int, int, int, Statement, Expression) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.stmt.AssertStmt.AssertStmt(int, int, int, int, Expression, Expression) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.stmt.ExplicitConstructorInvocationStmt.ExplicitConstructorInvocationStmt(int, int, int, int, List<Type>, boolean, Expression, List<Expression>) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.stmt.SwitchStmt.SwitchStmt(int, int, int, int, Expression, List<SwitchEntryStmt>) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.stmt.IfStmt.IfStmt(int, int, int, int, Expression, Statement, Statement) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.stmt.SwitchEntryStmt.SwitchEntryStmt(int, int, int, int, Expression, List<Statement>) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.stmt.SynchronizedStmt.SynchronizedStmt(int, int, int, int, Expression, BlockStmt) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.stmt.ForeachStmt.ForeachStmt(int, int, int, int, VariableDeclarationExpr, Expression, Statement) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.stmt.ForStmt.ForStmt(int, int, int, int, List<Expression>, Expression, List<Expression>, Statement) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.stmt.WhileStmt.WhileStmt(int, int, int, int, Expression, Statement) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.stmt.TryStmt.TryStmt(int, int, int, int, List<VariableDeclarationExpr>, BlockStmt, List<CatchClause>, BlockStmt) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.body.AnnotationMemberDeclaration.AnnotationMemberDeclaration(int, int, int, int, int, List<AnnotationExpr>, Type, String, Expression) : This constructor has too many parameters (9). Consider using the Builder pattern japa.parser.ast.body.EnumDeclaration.EnumDeclaration(int, List<AnnotationExpr>, String, List<ClassOrInterfaceType>, List<EnumConstantDeclaration>, List<BodyDeclaration>) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.body.EnumDeclaration.EnumDeclaration(int, int, int, int, int, List<AnnotationExpr>, String, List<ClassOrInterfaceType>, List<EnumConstantDeclaration>, List<BodyDeclaration>) : This constructor has too many parameters (10). Consider using the Builder pattern japa.parser.ast.body.Parameter.Parameter(int, int, int, int, int, List<AnnotationExpr>, Type, boolean, VariableDeclaratorId) : This constructor has too many parameters (9). Consider using the Builder pattern japa.parser.ast.body.EnumConstantDeclaration.EnumConstantDeclaration(int, int, int, int, List<AnnotationExpr>, String, List<Expression>, List<BodyDeclaration>) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.body.VariableDeclarator.VariableDeclarator(int, int, int, int, VariableDeclaratorId, Expression) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.body.TypeDeclaration.TypeDeclaration(int, int, int, int, List<AnnotationExpr>, int, String, List<BodyDeclaration>) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.body.VariableDeclaratorId.VariableDeclaratorId(int, int, int, int, String, int) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.body.BaseParameter.BaseParameter(int, int, int, int, int, List<AnnotationExpr>, VariableDeclaratorId) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.body.AnnotationDeclaration.AnnotationDeclaration(int, int, int, int, int, List<AnnotationExpr>, String, List<BodyDeclaration>) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.body.MethodDeclaration.MethodDeclaration(int, List<AnnotationExpr>, List<TypeParameter>, Type, String, List<Parameter>, int, List<NameExpr>, BlockStmt) : This constructor has too many parameters (9). Consider using the Builder pattern japa.parser.ast.body.MethodDeclaration.MethodDeclaration(int, int, int, int, int, List<AnnotationExpr>, List<TypeParameter>, Type, String, List<Parameter>, int, List<NameExpr>, BlockStmt) : This constructor has too many parameters (13). Consider using the Builder pattern japa.parser.ast.body.InitializerDeclaration.InitializerDeclaration(int, int, int, int, boolean, BlockStmt) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.body.FieldDeclaration.FieldDeclaration(int, int, int, int, int, List<AnnotationExpr>, Type, List<VariableDeclarator>) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.body.ConstructorDeclaration.ConstructorDeclaration(int, List<AnnotationExpr>, List<TypeParameter>, String, List<Parameter>, List<NameExpr>, BlockStmt) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.body.ConstructorDeclaration.ConstructorDeclaration(int, int, int, int, int, List<AnnotationExpr>, List<TypeParameter>, String, List<Parameter>, List<NameExpr>, BlockStmt) : This constructor has too many parameters (11). Consider using the Builder pattern japa.parser.ast.body.MultiTypeParameter.MultiTypeParameter(int, int, int, int, int, List<AnnotationExpr>, List<Type>, VariableDeclaratorId) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.body.ClassOrInterfaceDeclaration.ClassOrInterfaceDeclaration(int, List<AnnotationExpr>, boolean, String, List<TypeParameter>, List<ClassOrInterfaceType>, List<ClassOrInterfaceType>, List<BodyDeclaration>) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.body.ClassOrInterfaceDeclaration.ClassOrInterfaceDeclaration(int, int, int, int, int, List<AnnotationExpr>, boolean, String, List<TypeParameter>, List<ClassOrInterfaceType>, List<ClassOrInterfaceType>, List<BodyDeclaration>) : This constructor has too many parameters (12). Consider using the Builder pattern japa.parser.ast.type.ClassOrInterfaceType.ClassOrInterfaceType(int, int, int, int, ClassOrInterfaceType, String, List<Type>) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.type.WildcardType.WildcardType(int, int, int, int, ReferenceType, ReferenceType) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.type.ReferenceType.ReferenceType(int, int, int, int, Type, int) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.PackageDeclaration.PackageDeclaration(int, int, int, int, List<AnnotationExpr>, NameExpr) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.expr.FieldAccessExpr.FieldAccessExpr(int, int, int, int, Expression, List<Type>, String) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.expr.ArrayCreationExpr.ArrayCreationExpr(int, int, int, int, Type, int, ArrayInitializerExpr) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.expr.ArrayCreationExpr.ArrayCreationExpr(int, int, int, int, Type, List<Expression>, int) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.expr.VariableDeclarationExpr.VariableDeclarationExpr(int, int, int, int, int, List<AnnotationExpr>, Type, List<VariableDeclarator>) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.expr.SingleMemberAnnotationExpr.SingleMemberAnnotationExpr(int, int, int, int, NameExpr, Expression) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.expr.InstanceOfExpr.InstanceOfExpr(int, int, int, int, Expression, Type) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.expr.ObjectCreationExpr.ObjectCreationExpr(int, int, int, int, Expression, ClassOrInterfaceType, List<Type>, List<Expression>, List<BodyDeclaration>) : This constructor has too many parameters (9). Consider using the Builder pattern japa.parser.ast.expr.AssignExpr.AssignExpr(int, int, int, int, Expression, Expression, Operator) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.expr.NormalAnnotationExpr.NormalAnnotationExpr(int, int, int, int, NameExpr, List<MemberValuePair>) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.expr.QualifiedNameExpr.QualifiedNameExpr(int, int, int, int, NameExpr, String) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.expr.MemberValuePair.MemberValuePair(int, int, int, int, String, Expression) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.expr.ArrayAccessExpr.ArrayAccessExpr(int, int, int, int, Expression, Expression) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.expr.MethodCallExpr.MethodCallExpr(int, int, int, int, Expression, List<Type>, String, List<Expression>) : This constructor has too many parameters (8). Consider using the Builder pattern japa.parser.ast.expr.ConditionalExpr.ConditionalExpr(int, int, int, int, Expression, Expression, Expression) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.expr.CastExpr.CastExpr(int, int, int, int, Type, Expression) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.expr.BinaryExpr.BinaryExpr(int, int, int, int, Expression, Expression, Operator) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.expr.UnaryExpr.UnaryExpr(int, int, int, int, Expression, Operator) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.CompilationUnit.CompilationUnit(int, int, int, int, PackageDeclaration, List<ImportDeclaration>, List<TypeDeclaration>) : This constructor has too many parameters (7). Consider using the Builder pattern japa.parser.ast.TypeParameter.TypeParameter(int, int, int, int, String, List<ClassOrInterfaceType>) : This constructor has too many parameters (6). Consider using the Builder pattern japa.parser.ast.ImportDeclaration.ImportDeclaration(int, int, int, int, NameExpr, boolean, boolean) : This constructor has too many parameters (7). Consider using the Builder pattern
Please note that EffectiveJava does not do syntax highlighting: it is the widget I am using to show the code (what a dirty trick, eh?).
How It Is implemented
The various queries are implemented as Operation:
(defrecord Operation [query params headers])
An operation takes the actual query to execute on the model of the code, the params (such as the thresholds to be used), and the headers of the table to be produced.
So the operation can be used with printOperation:
(defn printOperation [operation cus threshold] (let [headers (.headers operation), results ((.query operation) {:cus cus :threshold threshold})] (printTable headers results)))
printTable contains all the logic to produce a (sort of) nice looking map. Note that different kind of objects could be contained in the result of a query: classes, methods, constructors, fields, etc. I need a way to transform all of them in strings. For that I am using a Clojure multi-method:
(defmulti toString class) (defmethod toString :default [x] (str x)) (defmethod toString String [x] x) (defmethod toString clojure.lang.Keyword [x] (name x)) (defmethod toString ClassOrInterfaceDeclaration [x] (getQName x)) (defmethod toString ConstructorDeclaration [x] (getQName x))
For example, I print the qualified name (getQName) for different elements. The way it is calculated it is using a Clojure protocol. This is a part of the implementation:
(defprotocol withPackageName (packageName [this])) (extend-protocol withPackageName CompilationUnit (packageName [this] (let [p (.getPackage this)] (if (nil? p) "" (.toString (.getName p)))))) (extend-protocol withPackageName Node (packageName [this] (let [pn (.getParentNode this)] (packageName pn)))) (extend-protocol withPackageName SingleFieldDeclaration (packageName [this] (packageName (.variable this)))) (defprotocol Named (getName [this]) (getQName [this])) (extend-protocol Named TypeDeclaration (getName [this] (.getName this)) (getQName [this] (let [pn (packageName this), cn (getName this)] (if (.isEmpty pn) cn (str pn "." cn))))) (extend-protocol Named EnumConstantDeclaration (getName [this] (.getName this)) (getQName [this] (let [pn (.getParentNode this)] (str (getQName pn) "." (getName this))))) (extend-protocol Named MethodDeclaration (getName [this] (.getName this)) (getQName [this] (let [pn (.getParentNode this)] (str (getQName pn) "." (getName this)))))
For the interactive mode we use insta-parser to parse the commands:
(def command-parser (insta/parser (str "<COMMAND> = HELP | EXIT | LOAD | LIST | MC \n" "HELP = 'help' | 'h' \n" "EXIT = 'exit' | 'quit' | 'q' \n" "LOAD = 'load' <WS> STR \n" "LIST = 'list' \n" "MC = ('mc'|'many-constructors') <WS> 'th' <WS> NUM \n" "WS = #'[\t ]+' \n" "NUM = #'[0-9]+' \n" "STR = #'\"[^\"]*\"' \n")))
We have then just a loop passing in the state the list of loaded classes.
To parse the options obtained from the command line we just use parse opts:
(def cliOpts [ ["-h" "--help" "Show help" :flag true :default false] ["-i" "--interactive" "launch interactive mode" :flag true :default false] ["-l" "--linter" "launch linter mode" :flag true :default false] ["-d" "--dir DIRNAME" "REQUIRED: Directory containing the code to check"] ["-q" "--query QUERYNAME" "REQUIRED: Query to perform: mc=many constructors, mcp=many constructor parameters, st=singleton type"] ["-t" "--threshold VALUE" "Threshold to be used in the query" :default 0 :parse-fn #(Integer/parseInt %) :validate [#(>= % 0) "Must be a number equal or greater to 0"]] ]) (defn name2operation [name] (cond (= "mc" name) classesWithManyConstructorsOp (= "mcp" name) constructorsWithManyParametersOp (= "st" name) classesAndSingletonTypeOp :else nil)) (defn -main "What I do" [& args] (let [optsMap (parse-opts args cliOpts) opts (:options optsMap) banner (:summary optsMap)] ... ... ... the rest of the good stuff
There is then some code to interface the tool with Javaparser but you are lucky: I am not going to bore you with that. If you are interested feel free to write to me: I like helping out and I like Javaparser so I am very happy to answer all of your questions about it.
Conclusions
The idea of building tools to explore codebases is something has fascinated me for a while. During my PhD I have built some tools in that general area, for example CodeModels, which wraps several parsers for several languages and permit to perform cross-language analysis.
What do you think? Is it worthy to spend some more time on this tool? Would you like to use something like this? What are your strategy for static analysis?