Let’s say you need to automatically port some code from one language to another. How are you going to do it? Is it even possible? Maybe you have already seen a conversion between similar languages, such as Java to C#. That sounds much simpler in comparison.

In this article we are going to discuss some broad strategies to translate JavaScript to a very different language, such as C#. We will discuss the issues involved and plan some possible solutions. It is an accessible overview of the main issues you will encounter. We will not arrive at a complete converter: that would be far too complicated for an introduction to the topic. Let’s avoid putting together something terribly hacky just for the sake of typing some code.

Having said that, we are going to see the problems you may find in converting one real JavaScript project: fuzzysearch, a tiny but very successful library that checks whether the characters of one string (the needle) appear, in the same order, inside another string (the haystack).

'use strict';

function fuzzysearch (needle, haystack) {
  var hlen = haystack.length;
  var nlen = needle.length;
  if (nlen > hlen) {
    return false;
  }
  if (nlen === hlen) {
    return needle === haystack;
  }
  outer: for (var i = 0, j = 0; i < nlen; i++) {
    var nch = needle.charCodeAt(i);
    while (j < hlen) {
      if (haystack.charCodeAt(j++) === nch) {
        continue outer;
      }
    }
    return false;
  }
  return true;
}

module.exports = fuzzysearch;

When it is Worth the Effort

First of all, you should ask yourself whether the conversion is worth the effort. Even if you were able to successfully obtain some runnable C#, you have to consider that the style and the architecture will probably be unnatural. The result would not follow the best practices and patterns adopted by the rest of your code. As a consequence, the project could be harder to maintain than if you had written it from scratch in C#.

This is a common problem even in carefully planned conversions, such as the one that produced Lucene.NET, which started as a conversion from Java to C#. Furthermore, you will not be able to use the converter without manual work for every specific project, because even the standard libraries are different. Look at the example: while you could simply capitalize length in haystack.length to get the C# Length property, you cannot just capitalize charCodeAt, because you have to map it to a different construct in the destination language.
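The mapping between standard libraries can be kept in a lookup table that the converter consults for every member access. The following is a minimal sketch, assuming the converter already knows that the target expression is a string; the names stringMemberMap and translateMember are hypothetical, and the table covers only the two members used in the example.

```javascript
// Hypothetical table: JavaScript string member -> function building the C# text.
const stringMemberMap = new Map([
  // Property: only the capitalization changes.
  ['length', (target) => `${target}.Length`],
  // Method: C# strings have no charCodeAt, but indexing a string yields a
  // char, which is enough for the equality test done in fuzzysearch.
  ['charCodeAt', (target, args) => `${target}[${args[0]}]`],
]);

function translateMember(target, member, args = []) {
  const rule = stringMemberMap.get(member);
  if (!rule) {
    // Unknown members are flagged for manual work instead of being guessed.
    return `/* TODO: map ${member} manually */ ${target}.${member}`;
  }
  return rule(target, args);
}

console.log(translateMember('haystack', 'length'));              // haystack.Length
console.log(translateMember('haystack', 'charCodeAt', ['j++'])); // haystack[j++]
```

The two constructs are not perfectly equivalent: charCodeAt returns NaN for an index that is out of range, while the C# indexer throws an exception. This is exactly the kind of difference that still needs a manual review.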

On the other hand, every language has an area of specialization that may interest you, such as Natural Language Processing in Python. And if you accept the fact that you will have to do some manual work, and you are very interested in one project, then creating an automatic conversion will give you a huge head start. If you are interested in having a well-tested generic tool, you may want to concentrate on small libraries, such as the JavaScript one in the example.

Parse it with ANTLR

The first step is parsing, and for that you should just use ANTLR. There are already many grammars available. They may not necessarily be up to date, but they are much better than starting from scratch, and they will give you an idea of the scale of the project. You should use visitors instead of listeners, because they allow you to control the flow more easily. You should parse the different elements into custom classes that can manage the small problems that arise. Once you have done this, generating C# should be easier.
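To give an idea of why a visitor makes it easy to control the flow, here is a minimal sketch of the pattern. It does not use ANTLR at all: the tree is written by hand and the class CSharpEmitter, with its node types, is an assumption made for illustration. A visitor generated by ANTLR would work on the parse tree of the real grammar, but the shape of the code would be similar.

```javascript
// Hand-written visitor: each visitX method decides if and when to visit
// the children of a node, and returns the C# text for that node.
class CSharpEmitter {
  visit(node) {
    const method = this[`visit${node.type}`];
    if (!method) throw new Error(`No visitor for node type ${node.type}`);
    return method.call(this, node);
  }
  visitIf(node) {
    return `if (${this.visit(node.test)}) { ${this.visit(node.body)} }`;
  }
  visitBinary(node) {
    return `${this.visit(node.left)} ${node.op} ${this.visit(node.right)}`;
  }
  visitReturn(node) {
    return `return ${this.visit(node.value)};`;
  }
  visitName(node) {
    return node.name;
  }
  visitLiteral(node) {
    return String(node.value);
  }
}

// Hand-made tree for the first check of fuzzysearch.
const tree = {
  type: 'If',
  test: {
    type: 'Binary',
    op: '>',
    left: { type: 'Name', name: 'nlen' },
    right: { type: 'Name', name: 'hlen' },
  },
  body: { type: 'Return', value: { type: 'Literal', value: false } },
};

console.log(new CSharpEmitter().visit(tree)); // if (nlen > hlen) { return false; }
```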

The Small Differences

There are things that you could just skip, such as the first and last lines: they most probably do not apply to your C# project. But you must pay attention to the small differences: the var keyword has a different meaning in JavaScript and C#. By coincidence it would work most of the time, and it would be quite useful to work around the lack of static typing in JavaScript. But it is not magic: you are just hoping that the compiler will figure out the type. And sometimes it is not a one-to-one conversion. For instance, you cannot use var in C# the way it is used in the initialization of the for loop on line 12, because an implicitly typed declaration can only declare one variable at a time.
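A converter could handle the declaration problem with a rule like the one sketched below. It assumes that an earlier step has already attached a C# type to each declared variable; the function emitDeclaration and the shape of its input are hypothetical.

```javascript
// declarators: [{ name, init, type }], where type is a C# type name that
// a previous (hypothetical) inference step has already chosen.
function emitDeclaration(declarators) {
  if (declarators.length === 1) {
    // A single variable can rely on the C# var keyword.
    const [d] = declarators;
    return `var ${d.name} = ${d.init};`;
  }
  const types = new Set(declarators.map((d) => d.type));
  if (types.size === 1) {
    // Same type: C# accepts several variables after an explicit type.
    const [type] = types;
    const list = declarators.map((d) => `${d.name} = ${d.init}`).join(', ');
    return `${type} ${list};`;
  }
  // Mixed types: one statement per variable. Inside a for header these
  // statements would have to be moved before the loop.
  return declarators.map((d) => `${d.type} ${d.name} = ${d.init};`).join(' ');
}

console.log(emitDeclaration([{ name: 'hlen', init: 'haystack.Length', type: 'int' }]));
// var hlen = haystack.Length;
console.log(emitDeclaration([
  { name: 'i', init: '0', type: 'int' },
  { name: 'j', init: '0', type: 'int' },
]));
// int i = 0, j = 0;
```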

The continue outer statement on line 16 has to be transformed into a goto, because C# has no labeled continue; a plain continue, on the other hand, works just as in C#. A difference that could be fixed quite brutally is the strict equality comparison ===/!==, which could be replaced with ==/!= in most cases, since it exists to avoid problems caused by the dynamic typing of JavaScript. In general, you can do a pre-parse check and transform the original source code to avoid some problems, or even comment out some things that cannot be easily managed.
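A pre-parse transformation of this kind can be sketched with a couple of regular expressions. This is a deliberately brutal approach: the function preprocess is hypothetical, it would also rewrite matches found inside string literals and comments, and it only reports which labels need a goto target. Placing the target label at the end of the right loop body requires the parse tree and is not shown.

```javascript
function preprocess(source) {
  const labels = new Set();
  const rewritten = source
    // === becomes == and !== becomes != (plain == and != are not touched).
    .replace(/([=!])==/g, '$1=')
    // `continue label;` becomes `goto label_continue;`.
    .replace(/\bcontinue\s+([A-Za-z_$][\w$]*)\s*;/g, (match, label) => {
      labels.add(label);
      return `goto ${label}_continue;`;
    });
  // The caller still has to emit `label_continue: ;` at the end of each
  // labeled loop body.
  return { rewritten, labels: [...labels] };
}

const result = preprocess('if (haystack.charCodeAt(j++) === nch) { continue outer; }');
console.log(result.rewritten);
// if (haystack.charCodeAt(j++) == nch) { goto outer_continue; }
console.log(result.labels);
// [ 'outer' ]
```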

I Present You Thy Enemy: Dynamic Typing

The real problem is that JavaScript uses dynamic typing, while C# uses static typing. In JavaScript any variable could be of any type, which leads to certain issues, such as the need for the aforementioned strict equality operator, but it is very easy to use. In C#, instead, you need to know the type of your variables, because there are checks to be made at compile time. And this information is simply not available in the JavaScript source code. You might think that you could just use the var keyword, but you cannot rely on it everywhere. The compiler must be able to determine the real type at compile time, something that will not always be possible. For example, you cannot use it to declare function arguments.

You can use the dynamic keyword, which makes the type be determined at execution time. Still, this does not fix all the problems, such as initialization. You may check the source code for literal initializations or, in theory, even execute the original JavaScript from C# and find a way to determine the correct type. But that would be quite convoluted. You might get lucky, and in a small project, such as our example, you will, but not always.
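Checking the source code for literal initializations could look like the following sketch. The function inferCSharpType is hypothetical and the rules are assumptions: in particular, mapping an integer literal to int is a guess, since every JavaScript number is a floating-point value and the variable might receive a fraction later. Anything that is not a recognizable literal falls back to dynamic.

```javascript
function inferCSharpType(initializer) {
  const text = initializer.trim();
  if (text === 'true' || text === 'false') return 'bool';
  // Integer literal: a guess, and values beyond 32 bits are ignored here.
  if (/^-?\d+$/.test(text)) return 'int';
  // Decimal or exponent notation.
  if (/^-?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$/.test(text)) return 'double';
  // A single quoted string literal, with escapes allowed inside.
  if (/^(['"])(?:\\.|(?!\1)[^\\])*\1$/.test(text)) return 'string';
  // Not a literal: leave the decision to the C# runtime.
  return 'dynamic';
}

console.log(inferCSharpType('0'));           // int
console.log(inferCSharpType('3.14'));        // double
console.log(inferCSharpType("'needle'"));    // string
console.log(inferCSharpType('someCall()'));  // dynamic
```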

A technique that can work is to create a custom class that can manage your specific problem at runtime. For example, imagine that you have a function that sometimes returns a floating-point number and sometimes an integer. The trouble is that the calling code performs one operation when it gets a floating-point number, and a different one when it gets an integer. You can create a custom class in C# that determines at runtime which actual type the current result is and behaves accordingly.

There are also problems that can be easier to manage than you imagined. For instance, assigning a function to a variable is not something that you usually do as explicitly in C# as you do in JavaScript. But it is easy using delegate types, such as Func. Of course you still have to deal with determining the correct types of the arguments, if any are present, but it does not add any other difficulty per se.
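Once the types of the arguments and of the result are known, picking the delegate type is mechanical. Here is a minimal sketch; the function delegateType is hypothetical, while Func and Action are the standard C# delegate types, with Action used for functions that return nothing.

```javascript
// paramTypes: array of C# type names; returnType: C# type name or 'void'.
function delegateType(paramTypes, returnType) {
  if (returnType === 'void') {
    // Action has no type argument for the result.
    return paramTypes.length ? `Action<${paramTypes.join(', ')}>` : 'Action';
  }
  // Func lists the parameter types first and the result type last.
  return `Func<${[...paramTypes, returnType].join(', ')}>`;
}

console.log(delegateType(['string', 'string'], 'bool')); // Func<string, string, bool>
console.log(delegateType(['int'], 'void'));              // Action<int>
```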

Not Everything is an Object and Other Issues

In JavaScript a string is a primitive value, not an object, and a function can stand on its own, while in C# everything is an object and every method has to belong to a class. This is a relevant issue, but it is less problematic than dynamic typing. For instance, to convert our example we just have to wrap the function in a custom class, which is not really hard.
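The wrapping step can be as simple as the sketch below, which assumes that the body of the method has already been translated. The function wrapInClass and the class name FuzzySearcher are hypothetical; a static class is just one reasonable choice for a library made of a single free function.

```javascript
// Wraps already translated C# method text in a static class,
// indenting every non-empty line by four spaces.
function wrapInClass(className, methodSource) {
  const indented = methodSource
    .split('\n')
    .map((line) => (line ? `    ${line}` : line))
    .join('\n');
  return `public static class ${className}\n{\n${indented}\n}`;
}

const method = [
  'public static bool Fuzzysearch(string needle, string haystack)',
  '{',
  '    // translated body goes here',
  '    return true;',
  '}',
].join('\n');

console.log(wrapInClass('FuzzySearcher', method));
```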

One obvious problem is that there are different libraries in different languages. Some will not be available in the destination language. On the other hand, some parts of the project might not be needed in the destination language, because there are already better alternatives. Of course you still have to actually change all the related code, or wrap the real library of the destination language in a custom class that mimics the original one.

Conclusion

There are indeed major difficulties in transforming even a small project from one language to another, especially when the two are as different as JavaScript and C#. But let’s imagine that you are interested in something very specific, such as a very successful library and its plugins. You want to port the main library and to give the developers of the plugins a simpler way to port their work. There are probably many similarities in the code, so you can do most of the work to manage the typical problems and provide guidance for the remaining ones.

Converting code between languages so different in nature is not easy, that is for sure. However, you can apply a mixed automatic/manual approach, converting a large amount of code automatically and fixing the corner cases manually. If you can also translate the tests, you can later refactor the code, once it is in C#, and improve its quality over time.