Posts

Resolve method calls in Java code using the JavaSymbolSolver

Why did I create the java-symbol-solver?

A few years ago I started using JavaParser, and then I started contributing to it. After a while I realized that many operations we want to perform on Java code cannot be done using only the Abstract Syntax Tree produced by a parser: we also need to resolve types, symbols and method calls. For this reason I created the JavaSymbolSolver. It is now being used by Coati to build static analysis tools.

One thing that is missing is documentation: people open issues on JavaParser asking how to answer a certain question and the answer is often “for this you need to use JavaSymbolSolver”. Starting from these issues I will show a few examples.

Inspired by this issue I will show how to produce a list of all calls to a specific method.

How can we resolve method calls in Java using the java-symbol-solver?

It can be done in two steps:

  1. You use JavaParser on the source code to build your ASTs
  2. You call JavaSymbolSolver on the nodes of the ASTs representing method calls and get the answer

We are going to write a short example. At the end we will get an application that given a source file will produce this:

We are going to use Kotlin and Gradle. Our build file looks like this:

Building an AST is quite easy, you simply call this method:

What the hell is a Type Solver? It is the object which knows where to look for classes. When processing source code you will typically have references to code that is not yet compiled but is simply present in other source files. You could also use classes contained in JARs or classes from the Java standard library. You just have to tell your TypeSolver where to look for classes and it will figure it out.

In our example we will parse the source code of the JavaParser project (how meta?!). This project has source code in two different directories, one for the proper source code and one for the code generated by JavaCC (you can ignore what JavaCC is, it is not relevant here). We of course also use classes from the Java standard library. This is what our TypeSolver looks like:

This is where we invoke JavaParserFacade, one of the classes provided by JavaSymbolSolver. We just take one method call at a time and pass it to the solve method of JavaParserFacade. We get a MethodUsage (which is basically a method declaration plus the values of the parameter types for that specific invocation). From it we get the MethodDeclaration and we print the qualified signature, i.e., the qualified name of the class followed by the signature of the method. This is how we get the final output:

There is some plumbing to do, but basically JavaSymbolSolver does all the heavy work behind the scenes. Once you have a node of the AST you can throw it at the class JavaParserFacade and it will give you back all the information you may need: it will find the corresponding types, fields, methods, etc.

The problem is… we need more documentation and feedback from users. I hope some of you will start using JavaSymbolSolver and tell us how we can improve it.

Also, last week the JavaSymbolSolver was moved under the JavaParser organization. This means that in the future we will work more closely with the JavaParser project.

The code is available on GitHub: java-symbol-solver-examples

Functional programming for Java: getting started with Javaslang

Java is an old language and there are many new kids on the block who are challenging it on its own terrain (the JVM). However, Java 8 arrived and brought a couple of interesting features. Those features made it possible to write amazing new frameworks like the Spark web framework or Javaslang.

In this post we take a look at Javaslang which brings functional programming to Java.

Functional programming: what is that good for?

It seems that all the cool developers want to do some functional programming nowadays, just as they wanted to use object-oriented programming before. I personally think functional programming is great for tackling a certain set of problems, while other paradigms are better in other cases.

Functional programming is great when:

  • you can pair it with immutability: a pure function has no side effects and is easier to reason about. Pure functions go hand in hand with immutability, which drastically simplifies testing and debugging. However, not all solutions are nicely represented with immutability. Sometimes you just have a huge piece of data that is shared between several users and you want to change it in place. Mutability is the way to go in that case.
  • you have code which depends on inputs, not on state: if something depends on state instead of on input, it sounds more like a method than a function to me. Functional code should ideally make very explicit which information it is using (so it should use just parameters). That also means more generic and reusable functions.
  • you have independent logic, which is not highly coupled: functional code is great when it is organized in small, generic and reusable functions
  • you have streams of data that you want to transform: this is in my opinion the easiest place to see the value of functional programming. Indeed, streams received a lot of attention in Java 8.

Discuss the library

As you can read on javaslang.com:

Java 8 introduced λ which dramatically increases the expressiveness of our programs, but “Clearly, the JDK APIs won’t help you to write concise functional logic (…)” (jOOQ™ blog)

Javaslang™ is the missing part and the best solution to write comprehensive functional Java 8+ programs.

This is exactly how I see Javaslang: Java 8 gave us the enabling features to build more concise and composable code, but it did not take the last step. It opened a space and Javaslang arrived to fill it.

Javaslang brings to the table many features:

  • currying: currying can be used to implement the partial application of functions
  • pattern matching: let’s think of it as the dynamic dispatching for functional programming
  • failure handling: because exceptions are bad for function compositions
  • Either: this is another structure which is very common in functional programming. The typical example is a function which returns a value when things go well and an error message when things go not so well
  • tuples: tuples are a nice lightweight alternative to objects and perfect for returning multiple values. Just do not be lazy and use classes when it makes sense to do so
  • memoization: this is caching for functions
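To make the last item concrete, here is a minimal sketch of memoization written with the JDK only. It does not use Javaslang's own API; the names memoize, slowSquare and realCalls are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class MemoizationSketch {
    // Counts how often the underlying function really runs.
    public static int realCalls = 0;

    // Wraps a function so that each distinct argument is computed only once.
    // Not thread-safe: this is an illustration, not a production cache.
    public static <A, R> Function<A, R> memoize(Function<A, R> f) {
        Map<A, R> cache = new HashMap<>();
        return a -> {
            if (!cache.containsKey(a)) {
                cache.put(a, f.apply(a));
            }
            return cache.get(a);
        };
    }

    // Hypothetical "expensive" function used only for this illustration.
    public static int slowSquare(int x) {
        realCalls++;
        return x * x;
    }

    public static void main(String[] args) {
        Function<Integer, Integer> fastSquare = memoize(MemoizationSketch::slowSquare);
        System.out.println(fastSquare.apply(12)); // computed by slowSquare
        System.out.println(fastSquare.apply(12)); // served from the cache
        System.out.println("real calls: " + realCalls);
    }
}
```

The point is that the caller keeps using a plain function: the caching is added from the outside, by composition.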

For developers with experience in functional programming this will all sound very familiar. For the rest of us, let’s take a look at how we can use this stuff in practice.

Ok, but in practice how can we use this stuff?

Obviously, showing an example for each of the features of Javaslang is far beyond the scope of this post. Let’s just see how we could use some of them, and in particular let’s focus on the bread and butter of functional programming: function manipulation.

Given that I am obsessed with manipulation of Java code we are going to see how we can use Javaslang to examine the Abstract Syntax Tree (AST) of some Java code. The AST can be easily obtained using the beloved JavaParser.

If you are using Gradle your build.gradle file could look like this:

We are going to implement very simple queries: queries that can be answered just by looking at the AST, without solving symbols. If you want to play with Java ASTs and solve symbols you may want to take a look at this project of mine: java-symbol-solver.

For example:

  • find classes with a method with a given name
  • find classes with a method with a given number of parameters
  • find classes with a given name
  • combine the previous queries

Let’s start with a function which given a CompilationUnit and a method name returns a List of TypeDeclarations defining a method with that name. For people who never used JavaParser: a CompilationUnit represents an entire Java file, possibly containing several TypeDeclarations. A TypeDeclaration can be a class, an interface, an enum or an annotation declaration.

getTypesWithThisMethod is very simple: we take all the types in the CompilationUnit (cu.getTypes()) and we filter them, selecting only the types which have a method with that name. The real work is done in hasMethodNamed.

In hasMethodNamed we start by creating a javaslang.collection.List from our java.util.List (List.ofAll(typeDeclaration.getMembers())). Then we consider that we are only interested in the MethodDeclarations: we are not interested in field declarations or other stuff contained in the type declaration. So we map each method declaration to Option.of(true) if the name of the method matches the desired methodName, otherwise we map it to Option.of(false). Everything that is not a MethodDeclaration is mapped to Option.none(). Note that we do that in two steps: first the method is mapped to an Option<String>, then the Option<String> is mapped to an Option<Boolean>.

So for example, if we are looking for a method named “foo” in a class which has three fields, followed by methods named “bar”, “foo” and “baz”, we will get a list of:

Option.none(), Option.none(), Option.none(), Option.of(false), Option.of(true), Option.of(false)

The next step is to map both Option.none() and Option.of(false) to false and Option.of(true) to true. Note that we could have done that immediately instead of chaining two map operations. However, I prefer to do things in steps. Once we get a list of true and false we need to derive one single value out of it, which should be true if the list contains at least one true, and false otherwise. Obtaining a single value from a list is called a reduce operation. There are different variants of this kind of operation: I will let you look into the details 🙂
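The map-then-reduce steps described above can be sketched with the JDK alone. This is not the original Javaslang code: java.util.Optional and streams stand in for javaslang's Option and List, and the Member class is a hypothetical stand-in for JavaParser's AST nodes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class OptionReduceSketch {
    // Hypothetical stand-in for a member of a type declaration.
    // methodName is null when the member is not a method (for example a field).
    public static final class Member {
        final String methodName;
        private Member(String methodName) { this.methodName = methodName; }
        public static Member field() { return new Member(null); }
        public static Member method(String name) { return new Member(name); }
    }

    // Steps 1 and 2: member -> Optional<String> -> Optional<Boolean>.
    public static List<Optional<Boolean>> matches(List<Member> members, String wanted) {
        return members.stream()
                .map(m -> Optional.ofNullable(m.methodName))   // empty for non-methods
                .map(name -> name.map(n -> n.equals(wanted)))  // true/false for methods
                .collect(Collectors.toList());
    }

    // Steps 3 and 4: empty and false both become false, then reduce with "or".
    public static boolean hasMethodNamed(List<Member> members, String wanted) {
        return matches(members, wanted).stream()
                .map(o -> o.orElse(false))
                .reduce(false, (a, b) -> a || b);
    }

    public static void main(String[] args) {
        List<Member> members = Arrays.asList(
                Member.field(), Member.field(), Member.field(),
                Member.method("bar"), Member.method("foo"), Member.method("baz"));
        System.out.println(matches(members, "foo"));
        System.out.println(hasMethodNamed(members, "foo"));
    }
}
```

In real code the reduce could be replaced by a single anyMatch call; the steps are kept separate here only to mirror the explanation.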

We could rewrite the last method like this:

Why would we want to do so? It seems (and it is) much more complicated, but it shows us how we can manipulate functions, and this is an intermediate step towards code which is more flexible and powerful. So let’s try to understand what we are doing.

First a quick note: the class Function1 indicates a function taking one parameter. The first generic parameter is the type of the parameter accepted by the function, while the second one is the type of the value returned by the function. Function2 takes instead 2 parameters. You can understand how this goes on 🙂

We:

  • reverse the order in which parameters can be passed to a function
  • create a partially applied function: this is a function in which the first parameter is “fixed”

So we create our originalFunctionReversedAndCurriedAndAppliedToMethodName just by manipulating the original function hasMethodNamed. The original function took 2 parameters: a TypeDeclaration and the name of the method. Our elaborated function takes just a TypeDeclaration. It still returns a boolean.
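The same three manipulations (reverse, curry, apply the first argument) can be sketched with java.util.function only. This is not the Javaslang code from the post: the helpers reversed and curried are written by hand here, and a "type" is reduced to a hypothetical set of method names instead of a TypeDeclaration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.BiFunction;
import java.util.function.Function;

public class PartialApplicationSketch {
    // Swaps the order of the two parameters.
    public static <A, B, R> BiFunction<B, A, R> reversed(BiFunction<A, B, R> f) {
        return (b, a) -> f.apply(a, b);
    }

    // Turns a two-parameter function into a chain of one-parameter functions.
    public static <A, B, R> Function<A, Function<B, R>> curried(BiFunction<A, B, R> f) {
        return a -> b -> f.apply(a, b);
    }

    // Hypothetical stand-in for hasMethodNamed: a "type" is just its set of method names.
    public static boolean hasMethodNamed(Set<String> type, String methodName) {
        return type.contains(methodName);
    }

    // Reverse, curry, then fix the method name: what is left only needs the type.
    public static Function<Set<String>, Boolean> hasMethod(String methodName) {
        BiFunction<Set<String>, String, Boolean> original = PartialApplicationSketch::hasMethodNamed;
        return curried(reversed(original)).apply(methodName);
    }

    public static void main(String[] args) {
        Set<String> type = new HashSet<>(Arrays.asList("bar", "foo", "baz"));
        System.out.println(hasMethod("foo").apply(type));
        System.out.println(hasMethod("qux").apply(type));
    }
}
```

Reversing first matters: partial application fixes the first parameter, and the parameter we want to fix (the method name) is originally the second one.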

We then simply transform our function into a predicate with this tiny function which we could reuse over and over:
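Such an adapter could look like the following JDK-only sketch. The name asPredicate is an assumption made for this illustration, and java.util.function.Function stands in for Javaslang's Function1.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class PredicateSketch {
    // Adapts a Boolean-returning function to the Predicate interface expected by filter().
    public static <T> Predicate<T> asPredicate(Function<T, Boolean> f) {
        return f::apply;
    }

    // Usage example: keep only the names with at least minLength characters.
    public static List<String> longNames(List<String> names, int minLength) {
        Function<String, Boolean> isLong = name -> name.length() >= minLength;
        return names.stream()
                .filter(asPredicate(isLong))
                .collect(Collectors.toList());
    }
}
```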

Now, this is how we can make it more generic:

Ok, now we could generalize also hasMethodWithName:

After some refactoring we get this code:

Now let’s see how it can be used:

The source file we used in these tests is this one:

This is of course a very, very, very limited introduction to the potential of Javaslang. What I think is important for someone new to functional programming to grasp is the tendency to write very small functions which can be composed and manipulated to obtain very flexible and powerful code. Functional programming can seem obscure when we start using it, but if you look at the tests we wrote I think they are rather clear and descriptive.

Functional Programming: is all the hype justified?

I think there is a lot of interest in functional programming, but if that becomes hype it could lead to poor design decisions. Think about the time when OOP was the new rising star: the Java designers went all the way, forcing programmers to put every piece of code in a class, and now we have utility classes with a bunch of static methods. In other words we took functions and asked them to pretend to be a class to earn our OOP medal. Does it make sense? I do not think so. Perhaps it helped to be a bit extremist to strongly encourage people to learn OOP principles. That is why, if you want to learn functional programming, you may want to use functional-only languages like Haskell: because they really, really, really push you into functional programming. That way you can learn the principles and use them when it does make sense to do so.

Conclusions

I think functional programming is a powerful tool and it can lead to very expressive code. It is not the right tool for every kind of problem, of course. It is unfortunate that Java 8 comes without proper support for functional programming patterns in the standard library. However, some of the enabling features have been introduced in the language and Javaslang is making it possible to write great functional code right now. I think more libraries will come later, and perhaps they will help keep Java alive and healthy for a little longer.

 

Note: thanks to Lorenzo Bettini for pointing out a couple of mistakes

Effective Java: a tool to explore and measure your Java code written in Clojure

When I was working at TripAdvisor we had an internal book club. We started by reading Effective Java, the very famous book by Joshua Bloch. It is interesting and all, but I think the book was so successful that most of its advice is now part of the Java culture: most Java developers know it and sort of apply it. The point is that this advice, while very reasonable, is not consistently applied in large codebases. For this reason I decided to create a tool to verify that our codebase (which was… big) was actually adopting the suggestions from the book, so I started writing a tool named Effective Java, and I started writing it in Clojure.

Running queries from the command line

The basic idea is that you can run several queries against your code. The queries implemented are loosely based on the advice in the book. For example, one obvious piece of advice is that you should not have tons of constructors and should use factory methods instead. So let’s suppose we want to find which classes have 5 constructors or more; you run this:

And you get back something like this:

At this point you can look at this code and decide which parts you should refactor to improve the quality of your code. I think that the only approach that works when dealing with a large codebase is to consistently improve it with infinite patience and love. I find it very useful to have some way to narrow your focus to something actionable (e.g., one single class or one single method), because if you stop and stare at the whole codebase you will just leave in despair and become a monk somewhere far, far away from classes 10K lines long or constructors taking 20 parameters. When faced with such a daunting task you should not think; you should instead find one single problem and fix it. One way to focus is to have someone find issues for you.

Let a tool be your blinkers.

Queries implemented

The number of queries currently implemented is very limited:

  • number of non private constructors
  • number of arguments for the non private constructors
  • type of singleton implemented by a class (public field, static factory, enum)

The number of queries is limited because I focused more on building several ways of using the tool (listed below) and because I am working on far too many things 🙂

How the model is built

Another reason for the limited number of queries is that I am currently using JavaParser to build a model of the parsed code. While it is a great tool (not saying this because I am a contributor to the project…:D) it is not able to resolve symbols. On one hand this means less configuration for the user, but it limits the kind of analysis that is possible. Things could change in the future because JavaParser is evolving to support this kind of analysis. I could also switch to other tools like JaMoPP or MoDisco. Such tools are definitely more complex and less lightweight, but it could be worth taking a closer look at them. Ideally we want a tool which both parses source code and is able to build a model for compiled code (to consider our dependencies). Such a tool should be able to integrate the different types of models and perform analysis on the resulting megamodel. So the model obtained by parsing a Java file could have references to the model obtained by analyzing a JAR. This is the kind of thing which requires some work, and building a parser is just a fraction of the effort.

The interactive mode

I think that in some cases you do not know exactly what you are looking for. You just want to explore the code and figure things out as you go. So you could parse some source code, run some queries, then change the thresholds to limit your results and run some more queries. Then you fix something and run your query again. Something like this:

Screenshot from 2015-04-05 11:04:15

So I spent some time implementing the interactive mode instead of writing more queries. Stupid me, but I have to say it was fun to write that piece of Clojure code 😀

Using Effective Java as a linter for Java

This feature is still in its very early stages. It currently runs a couple of queries, for the number of constructors and the number of parameters of each constructor.

Please note that EffectiveJava does not do syntax highlighting: it is the widget I am using to show the code (what a dirty trick, eh?).

How it is implemented

The various queries are implemented as Operation:

An operation takes the actual query to execute on the model of the code, the params (such as the thresholds to be used), and the headers of the table to be produced.
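The tool itself is written in Clojure; purely to illustrate the shape of such an operation, here is a Java sketch. Everything in it is an assumption made for this illustration: the model is reduced to a map from class name to constructor count, the params are reduced to a single threshold, and renderOperation is a made-up helper, not the tool's own function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

public class OperationSketch {
    // An operation bundles the query to run and the headers of the resulting table.
    public static final class Operation {
        public final BiFunction<Map<String, Integer>, Integer, List<List<String>>> query;
        public final List<String> headers;

        public Operation(BiFunction<Map<String, Integer>, Integer, List<List<String>>> query,
                         List<String> headers) {
            this.query = query;
            this.headers = headers;
        }
    }

    // Hypothetical query: classes with at least `threshold` constructors.
    public static List<List<String>> manyConstructors(Map<String, Integer> model, int threshold) {
        List<List<String>> rows = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : model.entrySet()) {
            if (entry.getValue() >= threshold) {
                rows.add(Arrays.asList(entry.getKey(), String.valueOf(entry.getValue())));
            }
        }
        return rows;
    }

    // Runs the query with the given threshold and renders headers plus rows as text.
    public static String renderOperation(Operation op, Map<String, Integer> model, int threshold) {
        StringBuilder out = new StringBuilder(String.join(" | ", op.headers)).append('\n');
        for (List<String> row : op.query.apply(model, threshold)) {
            out.append(String.join(" | ", row)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> model = new LinkedHashMap<>();
        model.put("com.example.Small", 1);
        model.put("com.example.Big", 6);
        Operation op = new Operation(OperationSketch::manyConstructors,
                Arrays.asList("Class", "Constructors"));
        System.out.print(renderOperation(op, model, 5));
    }
}
```

Keeping the query, its parameters and its headers together is what lets the same operation be reused from the command line, the interactive mode and the linter.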

So the operation can be used with printOperation:

printTable contains all the logic to produce a (sort of) nice looking map. Note that different kinds of objects could be contained in the result of a query: classes, methods, constructors, fields, etc. I need a way to transform all of them into strings. For that I am using a Clojure multi-method:

For example, I print the qualified name (getQName) for different elements. It is calculated using a Clojure protocol. This is a part of the implementation:

For the interactive mode we use Instaparse to parse the commands:

We then just have a loop that passes the list of loaded classes along as its state.

To parse the options obtained from the command line we just use parse-opts:

There is then some code to interface the tool with JavaParser, but you are lucky: I am not going to bore you with that. If you are interested feel free to write to me: I like helping out and I like JavaParser, so I am very happy to answer all of your questions about it.

Conclusions

The idea of building tools to explore codebases is something that has fascinated me for a while. During my PhD I built some tools in that general area, for example CodeModels, which wraps several parsers for several languages and permits cross-language analysis.

What do you think? Is it worth spending some more time on this tool? Would you like to use something like this? What is your strategy for static analysis?

Mocking in Java: why mocking, why not mocking, mocking also those awful private static methods

Unit tests: there are people out there surviving without them, but in many cases you want to have this life insurance. Something to protect you from slipping in an error, something to accompany your software even when you have long forgotten it and someone else has to figure out how to maintain that legacy thing.

Why Mocking

If you want (need?) to write unit tests, by definition they have to test in isolation the behavior of a small unit (typically a class). The behavior of the tested method should not depend on the behavior of some other class because:

  • the other classes will change (impacting our test),
  • the other class is not predictable (random generator, user input),
  • it depends on external elements (network, databases, other processes),
  • the other classes could require a complex initialization, and you do not want to do that

The unit test should answer the question:

Does my unit work in an ideal world, surrounded by extremely nice and well behaving neighbours?

Mocks provide that ideal world. You can try the luck of your system in the real world using other kinds of tests (like integration tests).
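As a minimal illustration of that ideal world, here is a hand-written test double for an unpredictable collaborator. No mocking library is used; the class DiceGameSketch and its rule are made up for this example.

```java
import java.util.Random;
import java.util.function.IntSupplier;

public class DiceGameSketch {
    // The unpredictable neighbour: in production this is a real random die.
    private final IntSupplier die;

    public DiceGameSketch(IntSupplier die) {
        this.die = die;
    }

    // The unit under test: the player wins on a roll of 4 or more.
    public boolean playerWins() {
        return die.getAsInt() >= 4;
    }

    public static void main(String[] args) {
        // Production wiring: the outcome changes from run to run.
        Random random = new Random();
        DiceGameSketch real = new DiceGameSketch(() -> random.nextInt(6) + 1);
        System.out.println("real die: " + real.playerWins());

        // Test wiring: the die is replaced by a well-behaved neighbour that always rolls 6.
        DiceGameSketch rigged = new DiceGameSketch(() -> 6);
        System.out.println("rigged die: " + rigged.playerWins());
    }
}
```

A mocking library automates exactly this substitution, which matters once the collaborator is a class you cannot easily replace by hand.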

The argument “mocking is a bad thing”

Some people argue that you should not use mocks, because you should build your systems in such a way that mocking is not necessary.

I think this is partially true: if a system is well designed, with testability in mind from day 0, probably it will require much less mocking than otherwise. However you will still need mocking for three reasons:

  • you will inherit systems which either have no tests or have a low coverage. If tests are added as an afterthought the system is not designed to be testable and you will end up using a lot of mocking
  • it is true that in most situations you can avoid mocking if you build your system using a lot of interfaces and dependency injection. However, in most cases that means a fair amount of overengineering. You will end up having interfaces like FooBarable and BarFooator all over the place, and then classes like DummyFooBarable, HttpFooBarable, etc. Good, now you can avoid mocking, but your system has become one of the reasons why other programmers laugh at Java code
  • sometimes your unit is a method, not a class, so you want to mock the rest of the class (partial mocking). Suppose you want to test method foo of class Foo. This method invokes bar and baz of the same class. If these methods interact with a lot of other classes (Bazzator, Bazzinga, BazLoader) it could be easier to just mock the methods bar and baz instead of mocking these other classes. Another advantage is that it makes your tests more readable: you could write something like “when this.bar() returns 3 then this.foo() should return false” instead of building a complex test to create the conditions under which this.bar() returns 3

So, yes, you should not be mocking all the time, but many times you do not have an alternative, and in some cases the alternative is way worse.
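The partial mocking idea from the last item can be sketched without any mocking library by overriding bar and baz in a test-only subclass. This is a hand-rolled substitute for what a mocking framework would do; the body of foo and the helper fooWith are assumptions made for this illustration.

```java
public class PartialMockSketch {
    // Hypothetical class under test: foo() depends on bar() and baz() of the same class.
    public static class Foo {
        protected int bar() {
            throw new IllegalStateException("would talk to Bazzator, Bazzinga, BazLoader");
        }

        protected int baz() {
            throw new IllegalStateException("would talk to Bazzator, Bazzinga, BazLoader");
        }

        // The unit we actually want to test (its logic is made up for this sketch).
        public boolean foo() {
            return bar() + baz() > 10;
        }
    }

    // Builds a Foo whose bar() and baz() return canned values while foo() stays real.
    public static Foo fooWith(int barResult, int bazResult) {
        return new Foo() {
            @Override
            protected int bar() {
                return barResult;
            }

            @Override
            protected int baz() {
                return bazResult;
            }
        };
    }

    public static void main(String[] args) {
        // "when this.bar() returns 3 then this.foo() should return false"
        System.out.println(fooWith(3, 4).foo());
    }
}
```

This only works when the methods can be overridden; private or static methods need the heavier machinery discussed later in the post.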

Mocking basics

Ok, let’s start by specifying our dependencies. We will use both Mockito and EasyMock, together with PowerMock for extra power. PowerMock complements both Mockito and EasyMock.

My scenario

I had to work with a legacy application: you know, the kind of application that no one wants to touch, even with a pole, the one whose original authors have all disappeared (deported for their crimes?). You get the picture. Now we had to make a tiny change to this application and then run away like hell. Given that we are good professionals we wanted to write a test: the problem was that our change was inside a very large method which was private and static. The method is named dealsToDisplay. Given that it is private and static we invoke it through reflection (see invokeDealsToDisplay()). The actual tests are in dealsToDisplayNoBlacklistingTest, dealsToDisplaySomeBlacklistingTest, dealsToDisplayCompleteBlacklistingTest. All the rest is mocking and plumbing.

Mocking static methods

In my scenario I had to:

  • invoke a private static method
  • mock several static methods

The former is easy: you just have to use reflection.

We start by finding the method and we set it accessible (setting back the previous value when we are done). At this point we can invoke it. Nice. Sort of.
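A minimal sketch of those steps, using only java.lang.reflect, could look like this. Only the names dealsToDisplay and invokeDealsToDisplay come from the post; the LegacyDeals class, the method's parameters and its body are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

public class PrivateStaticInvocationSketch {
    // Finds the private static method, makes it accessible, invokes it, then restores the flag.
    public static int invokeDealsToDisplay(List<String> deals, String blacklisted) {
        try {
            Method method = LegacyDeals.class.getDeclaredMethod(
                    "dealsToDisplay", List.class, String.class);
            method.setAccessible(true);
            try {
                // The receiver is null because the method is static.
                return (Integer) method.invoke(null, deals, blacklisted);
            } finally {
                // A freshly looked-up Method starts out inaccessible, so this restores it.
                method.setAccessible(false);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not invoke dealsToDisplay", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeDealsToDisplay(Arrays.asList("a", "b", "c"), "b"));
    }
}

// Hypothetical legacy class: the method cannot be called directly from the class above.
final class LegacyDeals {
    private static int dealsToDisplay(List<String> deals, String blacklisted) {
        int count = 0;
        for (String deal : deals) {
            if (!deal.equals(blacklisted)) {
                count++;
            }
        }
        return count;
    }
}
```

Note that the method name and parameter types are plain strings and class literals here, so a rename in the legacy class will only be caught when the test runs.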

To mock static methods we instead have to use PowerMock, which does the trick by using a custom Classloader and doing bytecode rewriting on the fly. Yes, it does not sound safe. No, there are no alternatives that I am aware of.

So we need to do a few things. First of all we have to instruct PowerMock to take care of loading the class through its Classloader:

Then we have to declare which methods we intend to mock:

Finally we mock them, specifying what result we want when they are invoked:

Conclusions

Our solution required a considerable amount of plumbing and mocking, mocking and plumbing, to test a very limited piece of functionality. While we are happy with the result and are reasonably confident that this will not destroy that old piece of code, it is clear that it is not an ideal scenario. But sometimes you have to do what you gotta do.

Bonus: mocking singletons

A common issue is the necessity of mocking singletons. While you can write your own recipe reusing the code presented in this post, you can also take a look at this post:

Mocking a singleton with EasyMock and PowerMock

Happy mocking!

P.S. If you have suggestions, or corrections please let me know!

Update

Steve Bennett wrote some interesting comments about this post; it is worth taking a look at Steve Bennett’s blog.