Everything related to software development that every developer could find useful

The Simple Way to Find the Correct Syntax

It happens to the best of us: you are writing some code in a language that you use only sporadically, and you start asking yourself what the correct syntax is for this or that. You know many languages, and you start wondering: how do you iterate over the elements of a collection? Is it foreach, for .. in, or something else? You know the answer, you just need a second or two… it’s a matter of pride! You are a professional and you are going to remember it…

Let’s not kid ourselves: you are going to search for it, in a search engine or directly on Stack Overflow. Or maybe you are going to search for a cheat sheet on DuckDuckGo.

A Better Alternative (Sometimes)

It works, but it’s not always perfect. Sometimes you have to browse several questions, or sift through a long and erudite answer about the performance profile of the different ways of looping over the characters in a string. Which is cool, but you just need a refresher on how to do it, not on how to optimize a 40-line script to death. So now there is an interesting alternative: SyntaxDB. It’s both a search engine for syntax and a reference. Actually it’s called a reference, but it’s a bit better than a standard reference and a bit worse.

SyntaxDB Python Reference

As you can see, it’s better in the sense that it provides some guidance, something that is not available in all references, and a generic structure shared across many different languages. It’s also a bit worse because it’s not complete; in fact there are many levels of completeness. You can’t even say that it covers all the basics (like strings) for all the languages. It seems that the author inserted the constructs that came to his mind, then tried to give them a structure and chose some languages to cover. And that’s probably pretty much what happened, since it is the work of a single author. A very competent one, indeed, but it’s not a service from a company.

The Future is Bright

Don’t get me wrong, it’s a good idea that is implemented well, and it actually works if you really need to find the syntax of for in Python. Also, the author is working on letting everybody contribute:

Something I’m very happy and surprised with is the number of people asking how they can help. The most requested feature for SyntaxDB has been a way to let you developers contribute, and I’m happy to say that’s the next feature I’ll be working on!

from the About page

It’s just that not all the content is there. Plus it seems that only one version of each language is supported, which can really be a problem for languages such as Python which, for some reason, is still very much split between versions 2 and 3.

Another good reason to follow the service is that it is indeed well designed and thought out for the contemporary world. What I mean is that there is already an API, and its implementation follows the Swagger/OpenAPI Specification. There are already integrations with editors/IDEs such as Visual Studio Code and Atom. There is even one for the DuckDuckGo search engine, and a bot for Slack. I think it’s off to a good start, it just needs some time to develop.

 

A template system for Google Docs: Google Drive automation and PDF generation with Google Execution API

My consulting business is picking up steam and I am starting to be annoyed by the administrative steps. Basically, when I need to prepare a new invoice I have to:

  • copy a template I have in my Google Drive
  • fill the data
  • download a PDF
  • archive the PDF on Dropbox
  • send the PDF to the customer
  • if the customer is in the European Union (outside France) I need to fill a declaration for the “Douane”. This is what is called an “intrastat” declaration in some places

Now, this is not the most exciting and creative part of the job so I automated some of the process.

Right now I have a script that can create a copy of the template, fill it in, generate the PDF and download it. I still need to automate the part in which I upload the PDF to Dropbox, but for now I can just copy the PDF into my local Dropbox folder.

Google Developers Console

Now, this is the boring part and it is easy to miss something. I will try to recollect from memory what I did and wish you good luck. After all I do not want to make things too easy. That would be boring, wouldn’t it?

First, visit https://console.developers.google.com and create a project.

Then add permissions to that project, selecting the Google Drive API and the Google Apps Script Execution API.

Finally go to credentials and generate the “OAuth client ID” credentials. You should get a file to download. It is a JSON file containing the credentials for your project.

Good, enough with the boring bits, let’s start to program.

Loading the data

For now the data for the invoices is kept in a simple JSON file. Later I could store it in a Google Spreadsheet document. At that point I could trigger the invoice generation when I add some data to that file.

Right now, instead, my script takes two parameters:

  1. the name of the file containing the data
  2. the number of the invoice I want to generate

So the usage scenario is this: first of all I open the data file and add the data for the new invoice. Typically I copy the data from a previous invoice for the same customer and I adapt it. Then I close the file and run the script specifying the number of the new invoice to generate. If I need it I could also regenerate an old invoice just by running the script with the number of that invoice, without the need to touch the data file.

This is the code which parses the arguments and loads the data:

An example of data file:

 

Finding the template and cloning it

In my Google Drive I have a directory named DriveInvoicing which contains a Google Doc named Template. Here is the first page:

Screenshot from 2016-04-12 21-47-21

The second page contains uninteresting legalese in both French and English: French because I am supposed to write my invoices in French, given that I am located in France. English because most of my clients do not speak any French.

The code to locate the template file is this:

Copying the template and filling it

First of all we create a copy of the template:

Then we execute a Google Script on it:

Finally we download the document as a PDF:

The script which fills in the data is this:

This is created in the online editor for Google Scripts:

Screenshot from 2016-04-12 22-02-58

What we got

This is the final result:

Screenshot from 2016-04-12 21-49-22

ENOUGH, GIMME THE CODE!

Code is available on GitHub: https://github.com/ftomassetti/DriveInvoicing

Recognizing hand-written rectangles in an image

Machine learning for points classification?

Last time we saw how to identify key points in an image. I was then thinking of using machine learning techniques to recognize the role played by each point. I played for a while with Weka, a tool which makes it very easy to experiment with different machine learning algorithms. To identify the features to use in the classification I used this strategy:

  • I drew two concentric circles around each point of interest: one close and one further away
  • I identified the intersections between the concentric circles and the contour to which the key point belonged
  • I split each circle into 12 parts and counted how many intersections fell into each of those parts
  • I then used those 24 values for the classification
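
The feature extraction described in the list above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code from the repository: the class name SectorFeatures, the representation of points as {x, y} arrays, and the choice of 30° sectors counted from the positive x axis are all hypothetical.

```java
/** Sketch: builds the 24-value feature vector (2 circles x 12 sectors each). */
final class SectorFeatures {
    static final int SECTORS = 12;

    /** Index (0..11) of the 30-degree sector containing (x, y), seen from (cx, cy). */
    static int sectorOf(double cx, double cy, double x, double y) {
        double degrees = Math.toDegrees(Math.atan2(y - cy, x - cx)); // range (-180, 180]
        if (degrees < 0) {
            degrees += 360.0; // normalize to [0, 360]
        }
        int sector = (int) (degrees / (360.0 / SECTORS));
        return Math.min(sector, SECTORS - 1); // guard against rounding up to 12
    }

    /**
     * Counts the intersections per sector: entries 0..11 refer to the close
     * circle, entries 12..23 to the circle further away.
     */
    static int[] features(double cx, double cy, double[][] near, double[][] far) {
        int[] counts = new int[2 * SECTORS];
        for (double[] p : near) {
            counts[sectorOf(cx, cy, p[0], p[1])]++;
        }
        for (double[] p : far) {
            counts[SECTORS + sectorOf(cx, cy, p[0], p[1])]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[][] near = {{1, 0.1}, {0.1, 1}};
        double[][] far = {{-1, 0.1}};
        System.out.println(java.util.Arrays.toString(features(0, 0, near, far)));
    }
}
```

The resulting array can then be handed to any classifier as one row of numeric attributes.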

..if simple heuristics can do…

However, I realized that there was no actual need for machine learning. I could instead use very simple heuristics. After all, I was just looking for the corners of the rectangles, so I considered only points which had exactly two intersections with both the closest and the farthest circle. Then I considered the angle formed by these intersections, basically looking for something around 90°. Then, considering the orientation of the corner, I classified it as a top-left, top-right, bottom-left or bottom-right corner.
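
A minimal sketch of such a heuristic could look like the following. It is not the code from the repository: the class name CornerClassifier, the 20° tolerance, and the idea of reading the orientation from the bisector of the two directions are assumptions made for illustration. It also assumes image coordinates, with y growing downward.

```java
/** Sketch: classifies a key point from its two intersections with one circle. */
final class CornerClassifier {
    enum Corner { TOP_LEFT, TOP_RIGHT, BOTTOM_LEFT, BOTTOM_RIGHT, NOT_A_CORNER }

    /** Hypothetical tolerance around 90 degrees, to be tuned on real drawings. */
    static final double TOLERANCE_DEGREES = 20.0;

    /**
     * (px, py) is the key point; (ax, ay) and (bx, by) are the two points where
     * the contour crosses the circle. Coordinates are image coordinates (y down).
     */
    static Corner classify(double px, double py,
                           double ax, double ay,
                           double bx, double by) {
        double ux = ax - px, uy = ay - py;
        double vx = bx - px, vy = by - py;
        double lenU = Math.hypot(ux, uy);
        double lenV = Math.hypot(vx, vy);
        if (lenU == 0 || lenV == 0) {
            return Corner.NOT_A_CORNER;
        }
        // Angle between the two directions leaving the key point.
        double cos = (ux * vx + uy * vy) / (lenU * lenV);
        cos = Math.max(-1.0, Math.min(1.0, cos)); // protect acos from rounding errors
        double angle = Math.toDegrees(Math.acos(cos));
        if (Math.abs(angle - 90.0) > TOLERANCE_DEGREES) {
            return Corner.NOT_A_CORNER;
        }
        // The bisector points towards the inside of the rectangle.
        double insideX = ux / lenU + vx / lenV;
        double insideY = uy / lenU + vy / lenV;
        if (insideX > 0) {
            return insideY > 0 ? Corner.TOP_LEFT : Corner.BOTTOM_LEFT;
        }
        return insideY > 0 ? Corner.TOP_RIGHT : Corner.BOTTOM_RIGHT;
    }

    public static void main(String[] args) {
        System.out.println(classify(10, 10, 20, 10, 10, 20));
    }
}
```

The same check would be repeated for the farther circle, keeping only points on which both circles agree.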

Once I had classified the points, I started looking for top-left corners and considered the matching bottom-right corners. I just took the closest one in the right direction. Once I have a pair of top-left and bottom-right corners I know where to look for the missing corners: the top-right is supposed to have an x equal to that of the bottom-right corner and a y equal to that of the top-left corner, and vice versa for the bottom-left corner. If I can find these two points where I am looking for them, I consider the rectangle complete.
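
The matching step can be sketched as follows. Again, this is an illustration rather than the actual implementation: RectangleFinder, Point, Rect and the 10-pixel tolerance are hypothetical names and values, and the four input lists are assumed to come from a previous classification step.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: pairs top-left with bottom-right corners, then checks the other two. */
final class RectangleFinder {
    /** Hypothetical tolerance (in pixels) when looking for an expected corner. */
    static final double TOLERANCE = 10.0;

    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    static final class Rect {
        final Point topLeft, bottomRight;
        Rect(Point topLeft, Point bottomRight) {
            this.topLeft = topLeft;
            this.bottomRight = bottomRight;
        }
    }

    /** True if one of the points lies close enough to the expected position. */
    static boolean containsNear(List<Point> points, double x, double y) {
        for (Point p : points) {
            if (Math.abs(p.x - x) <= TOLERANCE && Math.abs(p.y - y) <= TOLERANCE) {
                return true;
            }
        }
        return false;
    }

    static List<Rect> find(List<Point> topLefts, List<Point> topRights,
                           List<Point> bottomLefts, List<Point> bottomRights) {
        List<Rect> result = new ArrayList<>();
        for (Point tl : topLefts) {
            // Closest bottom-right corner lying to the right of and below tl.
            Point best = null;
            double bestDistance = Double.MAX_VALUE;
            for (Point br : bottomRights) {
                if (br.x <= tl.x || br.y <= tl.y) {
                    continue; // wrong direction
                }
                double distance = Math.hypot(br.x - tl.x, br.y - tl.y);
                if (distance < bestDistance) {
                    bestDistance = distance;
                    best = br;
                }
            }
            // Top-right is expected at (br.x, tl.y), bottom-left at (tl.x, br.y).
            if (best != null
                    && containsNear(topRights, best.x, tl.y)
                    && containsNear(bottomLefts, tl.x, best.y)) {
                result.add(new Rect(tl, best));
            }
        }
        return result;
    }
}
```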

Finally, I just have to check whether I recognized overlapping rectangles: in that case I just throw away the smaller ones.

This algorithm is not perfect but I get decent results:

res_whiteboard1

Why I am not using OpenCV

When we are manipulating images OpenCV is the obvious answer; however, I did not get good results with it. It seems that the typical algorithms for detecting rectangles are confused by the fact that the contours I found are not rectangles. This is because of the connections between rectangles (the lines linking the rectangles). I tried a few things but I did not get any good results. In addition to that, OpenCV is written in C/C++, and that basically means that deploying it is much more cumbersome. My current solution is Java based, which means that I can easily run it on every possible platform without headaches. I will have another look at OpenCV and I am very open to suggestions. In fact a friend of mine just gave me a couple of nice ideas to try.

Code, where is the code?

You know, words are nice and all but the only thing that really matters is code. You can grab it on GitHub, here: https://github.com/ftomassetti/SketchModel

Functional programming for Java: getting started with Javaslang

Java is an old language and there are many new kids on the block who are challenging it on its own terrain (the JVM). However, Java 8 arrived and brought a couple of interesting features. Those features made it possible to write amazing new frameworks like the Spark web framework or Javaslang.

In this post we take a look at Javaslang which brings functional programming to Java.

Functional programming: what is that good for?

It seems that all the cool developers want to do some functional programming nowadays, just as they wanted to use object-oriented programming before. I personally think functional programming is great for tackling a certain set of problems, while other paradigms are better in other cases.

Functional programming is great when:

  • you can pair it with immutability: a pure function has no side effects and it is easier to reason about. Pure functions go hand in hand with immutability, which drastically simplifies testing and debugging. However, not all solutions are nicely represented with immutability. Sometimes you just have a huge piece of data that is shared between several users and you want to change it in place. Mutability is the way to go in that case.
  • you have code which depends on inputs, not on state: if something depends on state rather than on input it sounds more like a method than a function to me. Functional code should ideally make very explicit which information it is using (so it should use just parameters). That also means more generic and reusable functions.
  • you have independent logic, which is not highly coupled: functional code is great when it is organized in small, generic and reusable functions
  • you have streams of data that you want to transform: this is in my opinion the easiest place to see the value of functional programming. Indeed streams received a lot of attention in Java 8.

Discuss the library

As you can read on javaslang.com:

Java 8 introduced λ which dramatically increases the expressiveness of our programs, but “Clearly, the JDK APIs won’t help you to write concise functional logic (…)” (jOOQ™ blog)

Javaslang™ is the missing part and the best solution to write comprehensive functional Java 8+ programs.

This is exactly how I see Javaslang: Java 8 gave us the enabling features to build more concise and composable code. But it did not take the last step. It opened a space and Javaslang arrived to fill it.

Javaslang brings to the table many features:

  • currying: currying can be used to implement the partial application of functions
  • pattern matching: let’s think of it as the dynamic dispatching for functional programming
  • failure handling: because exceptions are bad for function compositions
  • Either: this is another structure which is very common in functional programming. The typical example is a function which returns a value when things go well and an error message when things go not so well
  • tuples: tuples are a nice lightweight alternative to objects and perfect for returning multiple values. Just do not be lazy: use classes when it makes sense to do so
  • memoization: this is caching for functions
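
To make the last item of the list concrete, here is a minimal sketch of memoization written with the plain JDK only. It does not use Javaslang's own API; the class name MemoizeSketch and the helper memoize are hypothetical, and the cache is a simple HashMap, so this version is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Sketch: wraps a function so that each distinct input is computed only once. */
final class MemoizeSketch {
    static <T, R> Function<T, R> memoize(Function<T, R> f) {
        Map<T, R> cache = new HashMap<>(); // not thread-safe, fine for a sketch
        return input -> {
            if (!cache.containsKey(input)) {
                cache.put(input, f.apply(input)); // first call: compute and remember
            }
            return cache.get(input);              // later calls: reuse the result
        };
    }

    public static void main(String[] args) {
        Function<Integer, Integer> slowSquare = n -> {
            System.out.println("computing " + n);
            return n * n;
        };
        Function<Integer, Integer> fastSquare = memoize(slowSquare);
        System.out.println(fastSquare.apply(4));
        System.out.println(fastSquare.apply(4)); // the wrapped function is not called again
    }
}
```

Caching like this is only safe for pure functions, which is one more reason why functional code and immutability go well together.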

For developers with experience in functional programming this will all sound very well known. For the rest of us let’s take a look at how we can use this stuff in practice.

Ok, but in practice how can we use this stuff?

Obviously, showing an example for each of the features of Javaslang is far beyond the scope of this post. Let’s just see how we could use some of them, and in particular let’s focus on the bread and butter of functional programming: function manipulation.

Given that I am obsessed with manipulation of Java code we are going to see how we can use Javaslang to examine the Abstract Syntax Tree (AST) of some Java code. The AST can be easily obtained using the beloved JavaParser.

If you are using gradle your build.gradle file could look like this:

We are going to implement very simple queries: queries that can be answered just by looking at the AST, without solving symbols. If you want to play with Java ASTs and solve symbols you may want to take a look at this project of mine: java-symbol-solver.

For example:

  • find classes with a method with a given name
  • find classes with a method with a given number of parameters
  • find classes with a given name
  • combine the previous queries

Let’s start with a function which given a CompilationUnit and a method name returns a List of TypeDeclarations defining a method with that name. For people who never used JavaParser: a CompilationUnit represents an entire Java file, possibly containing several TypeDeclarations. A TypeDeclaration can be a class, an interface, an enum or an annotation declaration.

getTypesWithThisMethod is very simple: we take all the types in the CompilationUnit (cu.getTypes()) and we filter them, selecting only the types which have a method with that name. The real work is done in hasMethodNamed.

In hasMethodNamed we start by creating a javaslang.collection.List from our java.util.List (List.ofAll(typeDeclaration.getMembers())). Then we consider that we are only interested in the MethodDeclarations: we are not interested in field declarations or other stuff contained in the type declaration. So we map each method declaration to Option.of(true) if the name of the method matches the desired methodName, otherwise we map it to Option.of(false). Everything that is not a MethodDeclaration is mapped to Option.none(). Note that we do that in two steps: first the method is mapped to an Option<String>, then the Option<String> is mapped to an Option<Boolean>.

So for example, if we are looking for a method named “foo” in a class which has three fields, followed by methods named “bar”, “foo” and “baz”, we will get a list of:

Option.none(), Option.none(), Option.none(), Option.of(false), Option.of(true), Option.of(false)

The next step is to map both Option.none() and Option.of(false) to false, and Option.of(true) to true. Note that we could have done that immediately instead of chaining two map operations. However, I prefer to do things in steps. Once we get a list of true and false values we need to derive one single value out of it, which should be true if the list contains at least one true, and false otherwise. Obtaining a single value from a list is called a reduce operation. There are different variants of this kind of operation: I will let you look into the details 🙂
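
The same sequence of steps can be sketched with the JDK alone, using java.util.Optional and streams in place of Javaslang's Option and List. This is an analogy, not the code of the post: the Member class is a hypothetical stand-in for the members that JavaParser would return, and the class name HasMethodNamedSketch is made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Sketch: map to Optional, map to Boolean, then reduce to a single value. */
final class HasMethodNamedSketch {
    /** Hypothetical stand-in for a member of a type: a method or a field. */
    static final class Member {
        final boolean isMethod;
        final String name;
        private Member(boolean isMethod, String name) {
            this.isMethod = isMethod;
            this.name = name;
        }
        static Member method(String name) { return new Member(true, name); }
        static Member field(String name) { return new Member(false, name); }
    }

    static boolean hasMethodNamed(List<Member> members, String methodName) {
        return members.stream()
                // methods become Optional.of(name), everything else Optional.empty()
                .map(m -> m.isMethod ? Optional.of(m.name) : Optional.<String>empty())
                // Optional<String> becomes Optional<Boolean>
                .map(name -> name.map(n -> n.equals(methodName)))
                // empty and of(false) both become false, of(true) becomes true
                .map(matches -> matches.orElse(false))
                // reduce: true if at least one element is true
                .reduce(false, (a, b) -> a || b);
    }

    public static void main(String[] args) {
        List<Member> members = Arrays.asList(
                Member.field("a"), Member.field("b"), Member.field("c"),
                Member.method("bar"), Member.method("foo"), Member.method("baz"));
        System.out.println(hasMethodNamed(members, "foo"));
    }
}
```

In everyday code the last two steps would usually collapse into anyMatch, but keeping them separate mirrors the explanation above.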

We could rewrite the last method like this:

Why would we want to do so? It seems (and it is) much more complicated, but it shows us how we can manipulate functions, and this is an intermediate step towards code which is more flexible and powerful. So let’s try to understand what we are doing.

First a quick note: the class Function1 indicates a function taking one parameter. The first generic parameter is the type of the parameter accepted by the function, while the second one is the type of the value returned by the function. Function2 instead takes 2 parameters. You can understand how this goes on 🙂

We:

  • reverse the order in which parameters can be passed to a function
  • create a partially applied function: this is a function in which the first parameter is “fixed”

So we create our originalFunctionReversedAndCurriedAndAppliedToMethodName just by manipulating the original function hasMethodNamed. The original function took 2 parameters: a TypeDeclaration and the name of the method. Our elaborated function takes just a TypeDeclaration. It still returns a boolean.

We then simply transform our function into a predicate with this tiny function, which we could reuse over and over:
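
For readers who want to try these manipulations without adding any dependency, here is a rough JDK-only sketch. It uses java.util.function.BiFunction and Function rather than Javaslang's Function1 and Function2; the helpers reversed, curried and toPredicate are hypothetical names written for this example, and a "type" is simplified to the list of its method names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Predicate;

/** Sketch: reverse the parameters, curry, fix the first argument, get a predicate. */
final class FunctionManipulationSketch {
    /** Swaps the two parameters of a two-argument function. */
    static <A, B, R> BiFunction<B, A, R> reversed(BiFunction<A, B, R> f) {
        return (b, a) -> f.apply(a, b);
    }

    /** Turns a two-argument function into a chain of one-argument functions. */
    static <A, B, R> Function<A, Function<B, R>> curried(BiFunction<A, B, R> f) {
        return a -> b -> f.apply(a, b);
    }

    /** Adapts a Boolean-returning function to a Predicate. */
    static <T> Predicate<T> toPredicate(Function<T, Boolean> f) {
        return f::apply;
    }

    /** Simplified original function: (method names of a type, name) -> boolean. */
    static boolean hasMethodNamed(List<String> methodNames, String name) {
        return methodNames.contains(name);
    }

    /** Builds a predicate on types with the method name already fixed. */
    static Predicate<List<String>> typeHasMethodNamed(String name) {
        BiFunction<List<String>, String, Boolean> original =
                FunctionManipulationSketch::hasMethodNamed;
        Function<List<String>, Boolean> applied =
                curried(reversed(original)).apply(name); // the name is now "fixed"
        return toPredicate(applied);
    }

    public static void main(String[] args) {
        Predicate<List<String>> hasFoo = typeHasMethodNamed("foo");
        System.out.println(hasFoo.test(Arrays.asList("bar", "foo", "baz")));
    }
}
```

The resulting predicate can be passed directly to a filter, which is what makes the partially applied form more convenient than the original two-parameter function.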

Now, this is how we can make it more generic:

Ok, now we could also generalize hasMethodWithName:

After some refactoring we get this code:

Now let’s see how it can be used:

The source file we used in these tests is this one:

This is of course a very, very, very limited introduction to the potential of Javaslang. What I think is important to get for someone new to functional programming is the tendency to write very small functions which can be composed and manipulated to obtain very flexible and powerful code. Functional programming can seem obscure when we start using it, but if you look at the tests we wrote I think they are rather clear and descriptive.

Functional Programming: is all the hype justified?

I think there is a lot of interest in functional programming, but if that becomes hype it could lead to poor design decisions. Think about the time when OOP was the new rising star: the Java designers went all the way, forcing programmers to put every piece of code in a class, and now we have utility classes with a bunch of static methods. In other words, we took functions and asked them to pretend to be a class to gain our OOP medal. Does it make sense? I do not think so. Perhaps it helped to be a bit extremist to strongly encourage people to learn OOP principles. That is why, if you want to learn functional programming, you may want to use functional-only languages like Haskell: because they really, really, really push you into functional programming, so that you can learn the principles and use them when it does make sense to do so.

Conclusions

I think functional programming is a powerful tool and it can lead to very expressive code. It is not the right tool for every kind of problem, of course. It is unfortunate that Java 8 comes without proper support for functional programming patterns in the standard library. However, some of the enabling features have been introduced in the language and Javaslang is making it possible to write great functional code right now. I think more libraries will come later, and perhaps they will help keep Java alive and healthy for a little longer.

 

Note: thanks to Lorenzo Bettini for pointing out a couple of mistakes

A tutorial on using Sql2o with Spark and other updates

A few weeks ago I wrote a tutorial on getting started with Spark (the Java web framework). A few readers appreciated it and it was linked by the Jetbrains blog, republished by DZone and republished by the new Spark tutorials blog.

After that David Åse and I chatted a bit and we decided to work together on a few tutorials to publish on the Spark tutorials blog. So today we publish the first of hopefully a long list: Spark and Databases: Configuring Spark to work with Sql2o in a testable way.

Content of the tutorial on Sql2o + Spark

  • see when to use an ORM and when not

  • how to organize the code that accesses the database and integrate it with the controllers

  • how to use Sql2o

  • we put everything together and improve the BlogService we have started in the first post on Spark.

At the end we will have something like this:

5069583_orig

Plans for the future

David is a great guy who, among other things, rewrote the Spark website (does look cool, eh?). I asked him how he got involved in Spark and we are working on a short interview, similar to the one I had with Luca Barbato: I think it is always inspiring to learn how people started giving back to the open-source community.

Reviewing, reviewing, reviewing

For the rest of the week I have been fairly busy doing technical reviews for two books from the Pragmatic Bookshelf (did I already mention that I love their books?). It required a fair amount of effort but I learned a few things on topics I would not normally have time for, so I am fairly happy.

Getting started with Spark: it is possible to create lightweight RESTful applications in Java too

Recently I have been writing a RESTful service using Spark, a web framework for Java (which is not related to Apache Spark). When we planned to write this I was ready for the unavoidable Javaesque avalanche of interfaces, boilerplate code and deep hierarchies. I was very surprised to find out that an alternative world exists even for developers confined to Java.

In this post we are going to see how to build a RESTful application for a blog, using JSON to transfer data. We will see:

  • how to create a simple Hello world in Spark
  • how to specify the layout of the JSON object expected in the request
  • how to send a post request to create a new post
  • how to send a get request to retrieve the list of posts

We are not going to see how to insert this data in a DB. We will just keep the list in memory (in my real service I have been using sql2o).

Note: I wrote a bunch of other tutorials on Spark. Take a look at the Spark tutorials website.

A few dependencies

We will be using Maven so I will start by creating a new pom.xml throwing in a few things. Basically:

  • Spark
  • Jackson
  • Lombok
  • Guava
  • Easymock (used only in tests, not presented in this post)
  • Gson

Spark hello world

Do you have all of this? Cool, let’s write some code then.

And now we can run it with something like:

Let’s open a browser and visit http://localhost:4567/posts. Here we want to do a simple GET. For performing POSTs you may want to use the Postman plugin for your browser or just run curl. Whatever works for you.

Using Jackson and Lombok for awesome descriptive exchange objects

In a typical RESTful application we expect to receive POST requests with JSON objects as part of the payload. Our job will be to check that the payload is well-formed JSON, that it corresponds to the expected structure, that the values are in the valid ranges, etc. Kind of boring and repetitive. We could do that in different ways. The most basic one is to use Gson:

We probably do not want to do that.

A more declarative way to specify what structure we expect is creating a specific class.

And then we could use Jackson:

In this way Jackson automatically checks for us whether the payload has the expected structure. We may also want to verify that additional constraints are respected. For example, we may want to check that the title is not empty and that at least one category is specified. We could create an interface just for validation:
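
A minimal sketch of such an interface, and of a payload class implementing it, might look like this. The names Validable and isValid, as well as the content field, are assumptions made for illustration; only the title and the categories are mentioned in the text. The Jackson mapping itself is left out so that the example needs nothing beyond the JDK.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical validation contract for objects received in a request. */
interface Validable {
    boolean isValid();
}

/** Sketch of the payload expected when creating a new post. */
final class NewPostPayload implements Validable {
    private String title;
    private List<String> categories = new ArrayList<>();
    private String content; // assumed field, not mentioned in the text

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    public List<String> getCategories() { return categories; }
    public void setCategories(List<String> categories) { this.categories = categories; }

    public String getContent() { return content; }
    public void setContent(String content) { this.content = content; }

    /** Valid when the title is not empty and at least one category is given. */
    @Override
    public boolean isValid() {
        return title != null && !title.isEmpty()
                && categories != null && !categories.isEmpty();
    }
}
```

A request handler would then reject the request with a 4xx status whenever isValid() returns false.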

Still, we have a bunch of boring getters and setters. They are not very informative and just pollute the code. We can get rid of them using Lombok. Lombok is an annotation processor that adds repetitive methods for you (getters, setters, equals, hashCode, etc.). You can think of it as a plugin for your compiler that looks for annotations (like @Data) and generates methods based on them. If you add it to your dependencies Maven will be fine, but your IDE may not give you auto-completion for the methods that Lombok adds. You may want to install a plugin. For IntelliJ IDEA I am using Lombok Plugin version 0.9.1 and it works great.

Now you can revise the class NewPostPayload as:

Much nicer, eh?

A complete example

We need to do basically two things:

  1. insert a new post
  2. retrieve the whole list of posts

The first operation should be implemented as a POST (it has side effects), while the second one as a GET. Both of them are operations on the posts collection, so we will use the endpoint /posts.
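
Since the posts are only kept in memory, the storage behind these two operations can be very small. The following is a sketch under stated assumptions, not the code of the application: the class InMemoryPosts and its methods createPost and getAllPosts are hypothetical names, and the wiring to the Spark routes is deliberately left out so that the example needs nothing beyond the JDK.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: in-memory storage backing POST /posts and GET /posts. */
final class InMemoryPosts {
    static final class Post {
        final int id;
        final String title;
        final String content;
        final List<String> categories;

        Post(int id, String title, String content, List<String> categories) {
            this.id = id;
            this.title = title;
            this.content = content;
            this.categories = categories;
        }
    }

    private final Map<Integer, Post> posts = new LinkedHashMap<>();
    private int nextId = 1;

    /** Would be called by the POST handler; returns the id of the new post. */
    synchronized int createPost(String title, String content, List<String> categories) {
        int id = nextId++;
        posts.put(id, new Post(id, title, content, new ArrayList<>(categories)));
        return id;
    }

    /** Would be called by the GET handler; returns the posts in insertion order. */
    synchronized List<Post> getAllPosts() {
        return Collections.unmodifiableList(new ArrayList<>(posts.values()));
    }
}
```

The methods are synchronized because a web server typically handles requests on several threads.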

Let’s start by inserting a post. First of all we will parse

And then see how to retrieve all the posts:

And the final code is:

 

Using PostMan to try the application

You may want to use curl instead, if you prefer the command line. I like not having to escape my JSON and having a basic editor so I use PostMan (a Chrome plugin).

Let’s insert a post. We specify all the fields as part of a JSON object inserted in the body of the request. We get back the ID of the post created.

Screen Shot 2015-03-30 at 17.25.22

Then we can get the list of the posts. In this case we use a GET (no body in the request) and we get the data of all the posts (just the one we inserted above).

Screen Shot 2015-03-30 at 17.30.33

Conclusions

I have to say that I was positively surprised by this project. I was ready for the worst: this is the kind of application that requires basic logic and a lot of plumbing. I found out that Python, Clojure and Ruby all do a great job for this kind of problem, while the times I wrote simple web applications in Java the logic was drowned in boilerplate code. Well, things can be different. The combination of Spark, Lombok, Jackson and Java 8 is really tempting. I am very grateful to the authors of these pieces of software, they are really improving the life of Java developers. I also consider it a lesson: great frameworks can frequently improve things much more than we think.

Edit: I received a suggestion to improve one of the examples from the good folks on reddit. Thanks! Please keep the good suggestions coming!

Getting started with Docker from a developer point of view: how to build an environment you can trust

Lately I have spent a lot of time thinking about building repeatable processes that can be trusted. I think that therein lies the difference between being a happy hacker cranking out code for the fun of it and a happy hacker delivering something you can count on. What makes you a professional is a process that is stable and safe and that permits you to evolve without regressions.

As part of this process I focused more on Continuous Integration and on testing techniques. I think a big part of having a good process is having an environment you can control, easily configure and replicate as you want. Have you ever updated something on your development machine and had all hell break loose? Well, I do not like that. Sure, there are a few tools we can use:

  • Virtualenv when working on Python, to isolate the libraries you want to access
  • RVM and Gemfiles to play with different versions of Ruby/JRuby + libraries for different projects
  • Cabal, which lets you specify project-specific sets of libraries for Haskell projects (and BTW good luck with that…)
  • Maven to specify which version of the Java compiler you want to use and which dependencies

These tools help a lot, but they are not nearly enough. Sometimes you have to access shared libraries, sometimes you need a certain tool (apache httpd? MySQL? Postgresql?) installed and configured in a certain way, for example:

  • you could need to have an apache httpd configured on a certain port, for a certain domain name
  • you could need a certain set of users for your DB, with specific permissions set
  • you could need to use a specific compiler, maybe even a specific version (C++11, anyone?)

There are many things that you could need to control to have a fully replicable environment. Sometimes you can just use some scripts to create that environment and distribute those scripts. Sometimes you can give instructions, listing all the steps to replicate that environment. The problem is that other contributors could fail to execute those steps, and your whole environment could be messed up when you update something in your system. When that happens you want a button to click to return to a known working state.

You can easily start having slightly different environments w.r.t. your other team members or the production environment, and inconsistencies start to creep in. Moreover, if you have a long setup process, it could take you a long time to recreate the environment on a new machine. When you need to start working on another laptop for whatever reason you want to be able to do that easily; when you want someone to start contributing to your open-source projects you want to lower the barriers.

It is for all these reasons that recently I started playing with Docker.

What is Docker and how to install it

Basically you can imagine Docker as a sort of lightweight alternative to VirtualBox or other similar hypervisors. Running on a Linux box, you can create different sort-of virtual machines, all using the kernel of the “real” machine. However, you can fully isolate those virtual machines, installing specific versions of the tools you need, specific libraries, etc.

Docker runs natively only on Linux. To use it under Mac OS X or Windows you need to create a lightweight virtual machine running Linux, and Docker will run on that virtual machine. However, the whole mess can be partially hidden using boot2docker. It means some additional headaches but you can survive that, if you have to. If I can, I prefer to ssh into a Linux box and run Docker there, but sometimes it is not the best solution.

To install docker on a Debian derivative just run:

Our example: creating two interacting docker containers

Let’s start with a simple example: let’s suppose you want to develop a PHP application (I am sorry…) and you want to use MySQL as your database (sorry again…).

We will create two docker containers: on the first one we will install PHP, on the second one MySQL. We will make the two containers communicate and access the application from the browser on our host machine. For simplicity we will run PhpMyAdmin instead of developing any sample application in PHP.

The first Docker container: PHP

Let’s start with something very simple: let’s configure a Docker image to run httpd under centos6. Let’s create a directory named phpmachine and create a file named Dockerfile.

Note that this is a very simple example: we are not specifying a certain version of httpd to be installed. When installing some other software we could want to do that.

From the directory containing the Dockerfile run:

This command will build an image as described by the instructions. As a first step it will download a CentOS 6 image to be used as the base of this machine.

Now running docker images you should find a line similar to this one:

You can now start this container and log into it with this command:

Once you are logged into the container you can start Apache and find out the IP of the docker machine running it:

Now, if you type that IP in a browser you should see something like this:

Screenshot from 2015-03-08 17:13:53

Cool, it is up and running!

Let’s improve the process so that 1) we can start the httpd server without having to use the console of the docker container 2) we do not have to figure out the IP of the container.

To solve the first issue just add this line to the Dockerfile:

Now rebuild the container and start it like this:

This way, port 80 of the docker container is mapped to port 80 of the host machine. You can now open a browser and use the localhost or 127.0.0.1 address.

Wonderful, now let’s get started with the MySQL server.

The second Docker container: MySQL server

We want to create a Dockerfile in another directory and add in the same directory a script named config_db.sh.

Note: we are not persisting the data of our MySQL DB in any way, so every time we restart the container we lose everything.

Now we can build the machine:

Then we can run it:

And we can connect from our “real box” to the mysql server running in the docker container:

Does everything work as expected so far? Cool, let’s move on.

Make the two docker containers communicate

Let’s assign a name to the mysql container:

Now let’s start the PHP container, telling it about the mysql container:

From the console of the phpmachine you should be able to ping dbhost (the name under which the phpmachine can reach the mysql container). Good!

In practice a line is added to the /etc/hosts file of the phpmachine, associating dbhost with the IP of our mysqlmachine.

Installing PHPMyAdmin

We are using PhpMyAdmin as a placeholder for some application that you might want to develop. When you develop an application you want to edit it on your development machine and make it available to the docker container. So, download PhpMyAdmin version 4.0.x (later versions require mysql 5.5, while centos 6 uses mysql 5.1) and unpack it in some directory; suppose it is in ~/Downloads/phpMyAdmin-4.0.10-all-languages. Now you can run the docker container with php like this:

This will mount the directory with the source code of PhpMyAdmin on /var/www/html in the phpmachine, which is the directory that Apache httpd is configured to serve.

At this point you need to rename config.sample.inc.php to config.inc.php and change this line:

In this way the phpmachine should use the db on the mysqlmachine.

Now you should be able to visit localhost and see a form. There, insert the credentials for the db (myuser, myuserpwd) and you should be all set!

Screenshot from 2015-03-09 19:59:07

How does Docker relate to Vagrant and Ansible, Chef, Puppet?

There are a few other tools that could help with managing virtual machines and sort-of-virtual machines. If you are a bit confused about the relations between the different tools, this is an oversimplified summary:

  • Vagrant is a command line utility to manage virtual machines, but we are talking about complete simulations of a machine, while Docker uses the kernel from the Docker host, resulting in much lighter “virtual machines” (our Docker containers)
  • Ansible, Chef and Puppet are ways to manage the configuration of these machines (operationalising processes); they could be used in conjunction with Docker. Ansible seems much lighter compared to Chef and Puppet (but slightly less powerful). It is gaining momentum among Docker users and I plan to learn more about it.

This post gives some more details about the relations between these tools.

Conclusions

In our small example we could play with a realistic simulation of the final production setup, which we suppose is composed of two machines running CentOS 6. By doing so we have figured out a few things (e.g., we have packages for MySQL 5.1 and that forces us not to use the latest version of PhpMyAdmin, we know the complete list of packages we need to install, etc.). In this way we can reasonably expect very few surprises when deploying to the production environment. I strongly believe that having fewer surprises is extremely good.

We could also just deploy the docker containers themselves if we wanted to (I have not tried that yet).

Update: I am happy the guys at Docker cited this article in their weekly newsletter, thanks!

Portability: stories of what can go wrong when you run your code on another machine

In the last year I faced many surprises when running some well tested code on my dev-servers or my laptops. It is curious (and scary) how code that has been widely used in production (sometimes for years) can still hide portability issues, so that the first time you try that piece of software in slightly different conditions the unexpected happens.

I have experienced that both when working on some open-source projects and in some very big companies. The difference probably is that such problems tend to emerge sooner in open-source projects, if there is an active userbase, while in companies that control their development environment these little time bombs can remain silent and strike long after being put in place. Below I list a few categories of portability issues that caused me problems.

Locale configuration

This is something we constantly overlook, but a lot of libraries make assumptions according to the locale configured on the current machine. If you are on a unix-ish box (linux, bsd, mac, etc.) open a console and run locale. You will get something similar to this:

Screen Shot 2015-02-09 at 10.08.54

These environment variables could affect the way dates are parsed or even the way numbers are parsed. For example, in Italian we use the comma instead of the dot to separate the integer part from the fractional part of numbers, so “12.14” might not be parsed if your locale is set to Italian and be parsed if it is set to UK English. Likewise, Americans expect the month to precede the day in dates. So:

02/01/2015

Could be the 1st of February for an American or the 2nd of January in most European countries. The way it is parsed could depend on the locale configuration.
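To make this concrete, here is a minimal sketch in Java (my choice of language for illustration; the class and method names are made up). It parses the same two strings under different conventions. I deliberately do not hard-code what the Italian number parse returns: on the JDKs I know, the dot is read as a grouping separator, so you silently get a different number rather than an error.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical demo class: same input, different locale conventions.
class LocaleParsingDemo {

    // Parses text as a number using the conventions of the given locale.
    static double parseNumber(String text, Locale locale) {
        try {
            return NumberFormat.getInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + text + "' for " + locale, e);
        }
    }

    // Parses text as a date with an explicit pattern, so the reading
    // (month first or day first) is decided by the code, not by the machine.
    static LocalDate parseDate(String text, String pattern) {
        return LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern, Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(parseNumber("12.14", Locale.UK));     // the dot is the decimal separator
        System.out.println(parseNumber("12.14", Locale.ITALY));  // not 12.14: the dot means something else here

        System.out.println(parseDate("02/01/2015", "MM/dd/yyyy")); // American reading: 1st of February
        System.out.println(parseDate("02/01/2015", "dd/MM/yyyy")); // European reading: 2nd of January
    }
}
```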

You will notice that the locale configuration also contains a default encoding (UTF-8 in my case), so I would imagine that encoding problems with text are also possible. I did not face them yet this year but I will keep an eye out for that.

Locale configuration… over SSH

A variant of the previous problem (or a multiplier of it) is that the locale configuration can be transferred when ssh-ing into a machine. By default, if you connect, let’s say, from a machine with an Irish locale to a machine with a US locale, the console opened will be configured with the Irish locale. Imagine how fun it is to try to debug this problem: a colleague of yours (with the American locale) sshes into that machine and does not see any problem, then you ssh into it and run into the problem, magically appearing just for Irish folks (should we suspect Leprechauns?).

How can you avoid that? Simple: you can solve it either by preventing the client from sending the environment configuration or by preventing the server from accepting it. To prevent the client from sending it, open your /etc/ssh_config (/etc/ssh/ssh_config on most Linux distributions) and look for these lines:

Now, remove these bad boys and save yourself some headaches. For preventing the server from accepting it you have to look for the configuration of the ssh daemon (sshd).

Bonus solution: fix your software to not depend on the locale configuration

Poor man’s solution: force the locale to the holy working value (typically en_US.UTF-8) before compiling or running the locale-dependent/buggy application
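As a rough illustration of the bonus solution, here is a small Java sketch (again with made-up names): the first method silently picks up whatever locale the JVM runs with, the second passes the locale explicitly and so gives the same result everywhere. The main method simulates an Italian machine by changing the JVM default, which is also, more or less, the in-process flavour of the poor man’s solution.

```java
import java.util.Locale;

// Hypothetical demo class: implicit versus explicit locale when formatting.
class LocaleIndependentFormatting {

    // Fragile: uses the JVM default locale, so the output changes between machines.
    static String formatWithDefaultLocale(double value) {
        return String.format("%.2f", value);
    }

    // Robust: the locale is part of the call, so the output is always the same.
    static String formatFixed(double value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    public static void main(String[] args) {
        Locale original = Locale.getDefault();
        try {
            Locale.setDefault(Locale.ITALY); // pretend we are on an Italian box
            System.out.println(formatWithDefaultLocale(12.14)); // 12,14
            System.out.println(formatFixed(12.14));             // 12.14
        } finally {
            Locale.setDefault(original); // do not leak the change to the rest of the program
        }
    }
}
```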

Timezone

I found out that some tests were passing only if they were run in a certain timezone… hint: it was not the timezone I was in

Why was it happening? Because some functions had a hard-coded timezone, while others had not. Now, it has been very confusing to solve this issue because a value obtained from parsing a date like 1/1/2015 ended up being transformed into 2/1/2015 (2nd of January) after a few steps. So, make sure you are not silently using the current timezone in some places and a hard-coded one (say, UTC) in others. Or be ready to deal with weird bugs. I wonder what happens when the summer time is enabled or disabled… fun time.
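The kind of shift I am describing can be reproduced with a few lines of Java. This is only a sketch under my own assumptions (the time of day and the Europe/Rome zone are picked for illustration, not taken from the code I was debugging): the same instant falls on two different calendar days depending on the zone used to read it.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical demo class: one instant, two calendar dates.
class TimezoneShiftDemo {

    // Returns the calendar date on which the given instant falls in the given zone.
    static LocalDate dateIn(Instant instant, ZoneId zone) {
        return instant.atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        // Late evening of 1 January 2015, pinned to UTC (the "hard-coded" side).
        Instant instant = LocalDateTime.of(2015, 1, 1, 23, 30).toInstant(ZoneOffset.UTC);

        System.out.println(dateIn(instant, ZoneOffset.UTC));            // 2015-01-01
        System.out.println(dateIn(instant, ZoneId.of("Europe/Rome")));  // 2015-01-02
        // The "silent" side: whatever zone the machine happens to be configured with.
        System.out.println(dateIn(instant, ZoneId.systemDefault()));
    }
}
```

Passing the zone explicitly everywhere, as dateIn does, is what makes the test outcome independent of where it runs.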

Version dependent implementations

Sometimes the problem is that you are doing something really stupid and do not realize it because it happens to work on a very specific configuration. Those are among my favourite bugs. Suppose for example that you write a test checking if a certain value is present as the first element of an array. So far so good. The problem is that this array is obtained by iterating over a Set which does not give any guarantees about the order of the iterated elements (they are not sorted by any known and sensible function and they are not necessarily in the same order they were inserted).
Now, as long as you run your tests on a machine with the same architecture and the same version of the standard libraries (the same JDK in this case) you do not notice any issue, and you will not notice it until a new version of the JDK is released which returns the values of that implementation of Set in a different order (absolutely legit). And now your tests do not pass. Have fun finding out the root cause.
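A minimal Java sketch of the trap, with names invented for the example: the first method is what the fragile test effectively relied on, the other two are ways to check the same thing without depending on iteration order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class: order-dependent versus order-independent checks.
class SetOrderDemo {

    // Fragile: for a HashSet the "first" element is an implementation detail
    // that may change with the JDK version.
    static String firstElement(Set<String> values) {
        return values.iterator().next();
    }

    // Robust: impose an order explicitly before asking for the first element.
    static String smallestElement(Set<String> values) {
        return new TreeSet<>(values).first();
    }

    // Robust: if the position does not matter, do not test the position.
    static boolean isPresent(Set<String> values, String expected) {
        return values.contains(expected);
    }

    public static void main(String[] args) {
        Set<String> values = new HashSet<>(Arrays.asList("b", "a", "c"));

        System.out.println(firstElement(values));    // no guarantee about which one you get
        System.out.println(smallestElement(values)); // a
        System.out.println(isPresent(values, "b"));  // true
    }
}
```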

Compilers

This would deserve a series of posts of its own. I experienced that while working on C++ code using some features from C++11. In particular I was trying to make the same codebase work on:

  • gcc
  • clang
  • mingw
  • Visual C++

I was very surprised by the warnings (and even errors) that some compilers report on code that other compilers are perfectly fine with. The worst thing was that one function (a pretty important one) of the standard library was not available on one particular platform. I figured it out only after I had started using that function all over the place: when I tried to port my application to a new compiler, I ended up making the feature using that function unavailable/crappy on that platform. Definitely not satisfying, but at least I remembered why I stopped programming in C++. The advantages of the JVM are easily overlooked. And everything in the end is easier to port than C++ code.

Conclusions

These sorts of issues make me wonder how software can work at all: the number of possible errors that can go unnoticed is simply mesmerising. I think the only answer is to release, test and stress your code in any way possible, and to be ready anyway to face all sorts of problems leading to interesting debugging sessions. If you have talented, well-educated and patient developers maybe your code will be working as desired a reasonable portion of the time. Maybe.

 

Getting started with Frege: Hello World and basic setup using Maven

I spent a couple of hours playing with Frege (Haskell on the JVM) and not many tutorials seem to be available. I am trying to help by writing this simple Hello World tutorial.

The code is available on Github: https://github.com/ftomassetti/frege-tutorial/tree/01_HelloWorld

Update: Frege has some very useful documentation at http://www.frege-lang.org/doc/… where … represents the package, or module, name. For example, if one needs some reference for the frege.java.util.Regex package, one looks at http://www.frege-lang.org/doc/frege/java/util/Regex.html

Frege source code

The code is very simple for our little hello world example. In this tutorial we focus mainly on configuring our environment.

We declare the name of the module to be HelloWorld. It will affect the name of the Java class produced.

The third line defines the type signature of the main function, while the fourth line defines main as a call to putStrLn using the IO monad. In practice, you have to do the operations which affect the real world (like reading from a file or writing to the screen) inside a do block. The reason is that the compiler treats them differently from pure functions, which can be optimized in several ways (laziness, memoization, etc.) while “real world operations” cannot.

Writing the POM (Maven configuration file)

First let’s take a look at the whole file:

The dependencies contain frege, no surprises here:

We then use two plugins, to compile Frege code and Java code:

Finally, we save the classpath used by Maven in a file (classpath.conf) by using the maven-dependency-plugin:

The classpath.conf file will be useful for running the application using the run.sh script.

Running the application, the run.sh script

To run the application we need the frege jar and the classes generated from our frege source code.

Compiling and running HelloWorld

After cloning the repository, you can simply run:

The result should be something like:

[federico@normandie frege-tutorial]$ sh run.sh
Hello world. Frege is a lot of fun!
runtime 0.001 wallclock seconds.

 

Exploring frege: Haskell for the JVM

Recently I played a lot with Clojure and as part of my playing I built a civilizations simulator named civs. I really love building applications at the REPL and my Clojure code is much clearer and easier to read than the Java code I could have written for it

but…

when you need to refactor your code significantly it is going to be painful. Probably you can limit the issue with appropriate design choices and there is the possibility to use different forms of optional typing in Clojure, but still, you will miss a compiler helping you figure out what functions you need to revisit because of your small change in a data structure. Sure, I should use different abstractions but…

So, I spent some time trying to figure out which statically typed functional language was worth spending some time on, just to know my options better. I considered:

OCaml: it seems that the adoption is declining more and more according to different metrics and the current compilers do not support multithreading, which seems a capital sin.

Scala: I have used it just to build an incremental parser for Java and I dislike it. It tries to be too many things at the same time and reminds me of C++, in the good old times when language designers thought that supporting more and more and more features in a language was a nice thing. That and the awful tool support made me decide that I do not want to touch it even with a pole, unless forced to.

Haskell: what else is left? SML?

Now, as part of my job at TripAdvisor I am learning to be more and more practical and to get things done. I have already made some investments in the JVM environment. For example my world generator (lands) can be used through Jython on the JVM, as well as the name generator (namegen) and a few other things that are written in Java or Clojure (no stuff left in JRuby, right?). So it would be great to have Haskell for the JVM…

…and that is when Frege comes to the rescue! It is just this: a port of Haskell to the JVM.

Basic toolchain

From the Github page you can download the jars you need. Visit this page: https://github.com/Frege/frege

You can play with the REPL and when you want to compile & run:

java -jar ~/tools/frege3.21.586-g026e8d7.jar main.fr
java -cp ~/tools/frege3.21.586-g026e8d7.jar:. Main

A small difference between Haskell and Frege

While in Haskell I would write:


data Name = String | Unnamed
    deriving Show

 

In Frege I need to write:


data Name = String | Unnamed
derive Show Name -- no spaces at the beginning of the line. Yes, it matters

What does derive mean, translated to Java? It means more or less: I am too lazy to write the toString method for this class (Name), so please do the usual thing auto-magically and implement it for me.
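For readers coming from Java, this is roughly the boilerplate I mean. It is a hand-written sketch of a Java class playing the role of Name, with one instance per constructor; it is an analogy of my own and not what the Frege compiler actually generates.

```java
// Hypothetical Java counterpart of: data Name = String | Unnamed
final class Name {

    // One instance for each constructor of the data type.
    static final Name STRING = new Name("String");
    static final Name UNNAMED = new Name("Unnamed");

    private final String constructorName;

    private Name(String constructorName) {
        this.constructorName = constructorName;
    }

    // This is the method that deriving Show saves us from writing by hand.
    @Override
    public String toString() {
        return constructorName;
    }

    public static void main(String[] args) {
        System.out.println(Name.UNNAMED); // Unnamed
    }
}
```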

A first example

So I started to create a few types to run the civilization simulator in Frege:

 

Running the first example

Screenshot from 2014-09-20 14:32:50

So far Frege seems stable, easy to install and very close to Haskell. Now I want to play more with the Java integration.