Resurrect a C++ codebase and create a proper open-source project out of it

Our interests are often the spark that starts a pet project. For example, I am interested in world generators, and because of that I created Lands: an application which simulates different physical phenomena and produces different maps as output (for elevation, rivers, biomes, etc.). After many experiments I finally understood that a critical component of a world generator is the plate tectonics simulation. Now, writing a plate tectonics simulation is not an easy task: it requires a lot of research and a lot of tuning to obtain realistic results. Moreover, it is not easy to achieve decent performance: generating a small world (let’s say 512 x 512 cells) could easily take several minutes even on a recent and powerful machine, so the code needs to be reasonably optimized.

Luckily, I found a very interesting project to use as a base for Lands. The project was built as part of a master thesis and it is called platec. I started by creating Python bindings for this project (pyplatec), given that Lands is written in Python and platec is written in C++. After a while I needed to improve a few things in that project. For example, platec can generate only square maps whose side is a power of 2 (e.g., a map of size 512 x 512 can be generated, but one of 511 x 511, 513 x 513 or 800 x 600 cannot). Initially I was just rescaling the maps generated by platec to the desired size, but this caused distortions (especially when the map had a width/height ratio very different from 1).
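To make the size constraint concrete, here is a minimal sketch in C++. It is not code from platec: the names `is_power_of_two`, `is_supported_size` and `rescale_nearest` are hypothetical, and nearest-neighbour rescaling is chosen here only for illustration (any rescaling method stretches cells unevenly once the target ratio moves away from 1).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True when n is a power of two (1, 2, 4, ..., 512, ...).
bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// The constraint described above: only square maps with a power-of-two side.
bool is_supported_size(std::size_t width, std::size_t height) {
    return width == height && is_power_of_two(width);
}

// Nearest-neighbour rescaling of a row-major square map of side `side`
// to a width x height map. Hypothetical helper, for illustration only.
std::vector<float> rescale_nearest(const std::vector<float>& src,
                                   std::size_t side,
                                   std::size_t width,
                                   std::size_t height) {
    assert(src.size() == side * side);
    std::vector<float> dst(width * height);
    for (std::size_t y = 0; y < height; ++y) {
        const std::size_t sy = y * side / height;  // source row
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t sx = x * side / width;  // source column
            dst[y * width + x] = src[sy * side + sx];
        }
    }
    return dst;
}
```

With these definitions, `is_supported_size(512, 512)` holds while `is_supported_size(800, 600)` does not, and rescaling a 512 x 512 map to 800 x 600 duplicates source columns more often than source rows, which is where the distortion comes from.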

The project did not seem to be maintained (no forum, no way to report issues, no recent releases), so I wrote to the original author and then created my fork. It may sound silly, but I also wanted to have it on GitHub instead of on SourceForge. My fork, plate-tectonics, simply started with the code from platec. I just threw away the code for the UI (I was interested only in using the code as a library for Lands).

While the code was doing most of what I wanted there were a few things missing to make it a “proper” open-source project:

  1. Build system: there was a Makefile, but I wanted a cross-platform build system
  2. Tests: the project did not contain tests at all
  3. Automated builds: I love them for two reasons: 1) they force me to create a completely repeatable build process 2) they verify that my code runs and my tests pass not only on my machine (ever forgot a git add?)
  4. Documenting: I wanted to spare the next contributors the trouble I had understanding the code, and especially myself, when I look at the code again in a few months

So I invested some time on these aspects before writing the few changes I needed.

Build system

I decided to use CMake, which supports almost all platforms and compilers (I got Mac OS X, Linux and Windows covered, so I am happy enough). I had never used CMake before, but it is sort of decent. What I miss is a better system for dependency management. Forget Maven or Gems: you have to obtain the code of your dependencies yourself. For example, the suggested way to include Google Test (a test framework) is to just copy its code into your project. It seems sub-optimal to me… There are still libraries that have to be installed by the user with platform-specific tools (apt-get, yum, brew, you name it); CMake can verify whether the libraries are installed, but it cannot provide any help in actually installing them. There is definitely a lot to be desired here. I would be very happy to hear about alternatives.

Tests

Tests were also very important because the code was very complex and C++ can behave in mysterious ways from time to time. It was a codebase with very few comments and with big methods and classes. I needed some sort of black-box tests to check that I was not going to break anything while refactoring. I started using Google Test, but I could also have used CppUnit. Currently I have decent tests for the code I added. I test the existing code just by checking that some generated worlds stay exactly the same as before. From time to time I have to break this absolute constraint; in that case I just run the tool, generate a new world, look at it and decide whether it can be used as the new reference. This approach was also necessary to test the generation of worlds which could not be generated before (like the non-square worlds).
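The "same world as before" check is essentially a golden-master test. Below is a minimal sketch of the idea in plain C++, with no dependency on Google Test or on the real platec API. Everything in it is an assumption made for illustration: `generate_world` is a hypothetical stand-in for a deterministic, seeded generator, and `fingerprint` reduces a heightmap to a single 64-bit FNV-1a hash that can be stored as the reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>
#include <vector>

// Hypothetical stand-in for the real generator: the same seed and size
// always produce the same map.
std::vector<float> generate_world(std::uint32_t seed,
                                  std::size_t width,
                                  std::size_t height) {
    assert(width > 0 && height > 0);
    std::mt19937 rng(seed);
    std::vector<float> map(width * height);
    for (float& cell : map) {
        cell = static_cast<float>(rng() % 1000) / 1000.0f;
    }
    return map;
}

// 64-bit FNV-1a hash over the raw bytes of the heightmap.
std::uint64_t fingerprint(const std::vector<float>& map) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (float cell : map) {
        unsigned char bytes[sizeof(float)];
        std::memcpy(bytes, &cell, sizeof(float));
        for (unsigned char b : bytes) {
            hash ^= b;
            hash *= 1099511628211ULL;  // FNV prime
        }
    }
    return hash;
}

// The regression check: the world generated now must have the same
// fingerprint as the stored reference.
bool matches_reference(std::uint32_t seed,
                       std::size_t width,
                       std::size_t height,
                       std::uint64_t reference) {
    return fingerprint(generate_world(seed, width, height)) == reference;
}
```

When a change intentionally alters the output, the stored reference is regenerated only after looking at the new world and accepting it. Comparing the full maps instead of a hash is also possible and gives better diagnostics, at the cost of storing larger reference files.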

Automated builds

I love Travis. Unfortunately it uses only Linux machines (so I cannot verify that my code works on Mac or Windows), but it is still a very good sentinel. I can use it to build my application with a couple of different compilers (clang and gcc), and it does not hurt. Unfortunately, when I tried to compile the code on Windows I faced errors that I could not have discovered using Travis, so there is room for improvement. Any suggestions?

Documenting

On this side there is still a lot to be done. I just added a few comments here and there, removed comments written in Finnish (unfortunately I do not speak the language :D) and wrote a minimal README. I also added a link to the original master thesis, which helped me a lot in understanding the codebase.

Conclusions

I am happy about the status of plate-tectonics: it is far from perfect, but I managed to make the main changes I wanted. I definitely need to document and test the project better. However, in the last few months I thought about all the things I wanted to see in the open-source projects I interacted with, and I tried to put them in plate-tectonics. I realized that code is fundamental, but there are still many things to build around it which transform it from a bunch of files in a directory into a proper open-source project.

I would love to hear about similar experiences with resurrecting open-source projects and, in general, about which aspects you think are the most relevant for the success of such a project.