Understand WebAssembly: Why It Will Change the Web

There is a new weapon in ~~the great war against JavaScript~~ the struggle to allow developers to choose their favorite style of programming while improving performance and productivity. That weapon is WebAssembly, which will revolutionize client-side web development.

WebAssembly, or wasm, is a low-level bytecode format for in-browser client-side scripting. If you are writing a compiler for a programming language, one option is to target a platform, like the JVM or .NET, and compile your language to that platform's bytecode. WebAssembly occupies the same role: when you compile to WebAssembly, you make your software available on all platforms where it is supported, in other words all browsers.

In practical terms, browser developers implement WebAssembly on top of the existing JavaScript engine. Essentially, it is designed to replace JavaScript as the target of compilers and transpilers on the web. For instance, instead of compiling TypeScript to JavaScript, its developers could now compile to WebAssembly. In short, it is not a new virtual machine; it is a new format for the same JavaScript VM that is included in every browser. This makes it possible to take advantage of the existing JavaScript infrastructure without using JavaScript.

The design of the Minimum Viable Product was completed in March 2017 and now there are implementations ready for every major browser.
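
If you want to check whether the browser (or Node.js) you are running actually ships such an implementation, a small feature test is enough. The following is a minimal sketch: the function name supportsWebAssembly is our own invention, and the eight bytes are just the header of an empty module (the magic number plus version 1), which any engine that implements the MVP should accept.

```javascript
// Returns true when the host exposes a usable WebAssembly API.
// supportsWebAssembly is a hypothetical helper name, not a standard function.
function supportsWebAssembly() {
  if (typeof WebAssembly !== "object" || typeof WebAssembly.validate !== "function") {
    return false;
  }
  // The smallest valid module: the "\0asm" magic number followed by version 1.
  const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
  return WebAssembly.validate(emptyModule);
}

console.log(supportsWebAssembly() ? "WebAssembly is available" : "WebAssembly is not available");
```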

Why it Matters?

For starters, the new WebAssembly format promises significant gains in terms of parsing performance:

The kind of binary format being considered for WebAssembly can be natively decoded much faster than JavaScript can be parsed (experiments show more than 20× faster). On mobile, large compiled codes can easily take 20–40 seconds just to parse, so native decoding (especially when combined with other techniques like streaming for better-than-gzip compression) is critical to providing a good cold-load user experience.
– from the FAQ of WebAssembly

Note that we are talking about parsing performance, not necessarily execution performance, because in many cases the code will run on the existing JavaScript engine. However, this increase in parsing performance alone will make it possible to put on the web software that would have been impractical to develop before, for instance virtual machines, virtual reality and image recognition.

The first production users are probably going to be game engine developers, because they are always in search of the best performance. Before WebAssembly, the best they could hope for was asm.js (a simplified JavaScript, optimized for speed), which is a cool technology, but it is not really usable for many games. I remember trying the famous demo Epic Citadel (now offline) by Unreal Technology. It actually ran smoothly, but it took around 15 minutes to download and parse the code, which is obviously not good enough for a quick game.

In fact, Autodesk plans to support WebAssembly for its Stingray game engine, and Unity Technologies, the creators of the Unity game engine, started experimenting with WebAssembly back in 2015. The Rust developers are also already working to support WebAssembly in order to run Rust code on the web.

What Can it Do For You

In the greater scheme of things, the arrival of WebAssembly means that you will no longer be forced to use JavaScript for the web just because it is the only thing that runs in the browser. JavaScript has a bad reputation, but in reality it is a good language for what it was designed for: quickly writing small scripts. The problem is that currently you are forced to use it for everything else you need to run on the web, and this is an issue for many large projects.

It is true that you can use better versions of JavaScript, like TypeScript, or even new languages like Kotlin. But in the end they all have to compile down to JavaScript. In turn, this has created issues for the developers of JavaScript, who have to support essentially all scenarios and all programming styles. WebAssembly will change that and allow everybody to concentrate on what they do best.

That is not all: it will be possible to port WebAssembly to other platforms. This means that if you write software in a language that compiles to WebAssembly you might be able to run it on .NET. The fact that it is based on the already existing JavaScript infrastructure on the web means that you can already use it in production.

However, this is not the only option. You can create your own specific implementation for your needs. You could create an optimized compiler for your language, either from scratch or by adding WebAssembly support to an existing compiler. By doing so, you could take advantage of all the other WebAssembly modules.

For instance, you could create a WebAssembly compiler for a DSL that you use internally at your company and make it run on the web, client-side, without the use of custom plugins like the Oracle Java Plug-in or Adobe Flash.

How it Works

A founding principle of WebAssembly is to integrate well with the existing JavaScript world. This ranges from technical things, such as interoperability and sharing the security policies (same-origin), to tooling integration, such as supporting the View Source functionality of web browsers.

To accomplish this goal WebAssembly defines both a binary format and an equivalent text format, for tools and human readers. Technically the textual format uses S-expressions, so it will look like the following.

(func (param i32) (local f64)
  get_local 0
  get_local 1)

However, the tools will probably show something more similar to this representation (example from the documentation).

C++

int factorial(int n) {
  if (n == 0)
    return 1;
  else
    return n * factorial(n-1);
}

Binary

20 00
42 00
51
04 7e
42 01
05
20 00
20 00
42 01
7d
10 00
7e
0b

Text

get_local 0
i64.const 0
i64.eq
if i64
    i64.const 1
else
    get_local 0
    get_local 0
    i64.const 1
    i64.sub
    call 0
    i64.mul
end
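
To see that the bytes listed under Binary above really are executable code, you can wrap them in a complete module and run it from JavaScript. This is only a sketch: the header, the type, function and export sections, and the final 0x0b byte that closes the function are our own additions and not part of the documentation example, and the export name factorial is an assumption. Notice that the function works on i64 values, so an engine that maps i64 to BigInt is assumed.

```javascript
// Hand-assembled module. Only the 21 bytes marked below come from the table;
// everything else is an assumption added to turn them into a complete module.
const factorialModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,        // "\0asm" magic number, version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7e, 0x01, 0x7e,        // type section: (i64) -> i64
  0x03, 0x02, 0x01, 0x00,                                // function section: function 0 has type 0
  0x07, 0x0d, 0x01, 0x09,                                // export section: one export, 9-byte name
  0x66, 0x61, 0x63, 0x74, 0x6f, 0x72, 0x69, 0x61, 0x6c,  // "factorial"
  0x00, 0x00,                                            // the export is function 0
  0x0a, 0x19, 0x01, 0x17, 0x00,                          // code section: one body, no extra locals
  // the 21 bytes from the Binary column
  0x20, 0x00, 0x42, 0x00, 0x51, 0x04, 0x7e, 0x42, 0x01, 0x05,
  0x20, 0x00, 0x20, 0x00, 0x42, 0x01, 0x7d, 0x10, 0x00, 0x7e, 0x0b,
  0x0b,                                                  // end of the function body
]);

// Synchronous compilation and instantiation, fine for a module this small.
const factorialInstance = new WebAssembly.Instance(new WebAssembly.Module(factorialModule));
const factorial = factorialInstance.exports.factorial;

// i64 values cross the JavaScript boundary as BigInt.
console.log(factorial(5n)); // 120n
```

In a real page you would normally get the bytes from a .wasm file and use the asynchronous WebAssembly.instantiate function instead of the synchronous constructors, which are used here only for brevity.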

You may wonder why C++ is used as an example. That is because the objective of the initial release (MVP) of WebAssembly was to support C/C++. Other languages will come later; at the moment they are in development. C/C++ was chosen for a couple of technical and practical reasons:

The MVP of WebAssembly does not support garbage collection (it is in the works)
The implementation of the C/C++ to WebAssembly compiler can rely on a ready-to-use and battle-tested tool like LLVM (one of the most widely used sets of compiler tools)

The WebAssembly developers used LLVM to cut the amount of work necessary to get a working product. Furthermore, this allowed them to easily integrate with other tools that work with LLVM, like Emscripten.

With an MVP, real programmers can test and use WebAssembly, which allows it to be improved accordingly.

WebAssembly Tools

At the moment, the official WebAssembly tools can compile only C/C++ to WebAssembly, although other developers have already started working on adding support for other languages and platforms (previously we mentioned .NET and Rust). However, even with the official tools you can already use it for web development in two ways:

write WebAssembly in the text format and convert it to binary using the provided tools
use a third-party tool that builds upon these tools

The first choice is not really practical for common use, but it is the way to go if you want to get a feel for the format or start integrating it into your own tools. There are actually two toolkits: the WebAssembly Binary Toolkit (WABT) and Binaryen.

WABT includes tools for development and for use in other tools meant to work with WebAssembly:

it perfectly supports the format specification
it can convert from/to the textual format
it includes an interpreter

In short, it ensures clean and easy access to the data in the WebAssembly format so that you can work with it.

On the other hand, Binaryen is an industrial-strength toolkit meant for usage in a compiler infrastructure:

it can work with WebAssembly code or a control flow graph form meant for compilers
it optimizes the code with many passes, both standard optimization and specific ones for WebAssembly
it can compile from/to asm.js (a subset of JavaScript), Rust MIR (an intermediate language for Rust) and LLVM

So, this toolkit is ready to be integrated in your production backend.

Essentially, both allow you to create tools that manipulate WebAssembly, but WABT is designed for tools that are used during development (e.g., static analysis), while Binaryen is made to create production WebAssembly (e.g., compilers).

These tools are great for whoever develops tools and compiler-related products: they provide both development tools and ready-to-use production tools. However, they are not ideal for normal developers; for them there is an easier way.

If you need to build such tools, there is also a second option: build everything from scratch. This is the way to go if you need a custom compiler or interpreter for your own language, or if you need the best performance or a lightweight tool. Since we have already worked with WebAssembly, we have made a library to help you (and us) work with WebAssembly: the WasmCompilerKit. It is basically a Kotlin library that you can use to load WASM files, modify them and generate them. Given that it is written in Kotlin, you can also use it from Java and Scala. We are considering transforming it into a multi-platform Kotlin project that also targets JavaScript.
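
To give an idea of what starting from scratch involves, this is a minimal sketch of the very first step: loading a WASM binary and listing its sections. It is plain JavaScript and it is not the API of WasmCompilerKit, WABT or Binaryen. The names readVarUint and listSections and the tiny sample module are our own, made up for illustration.

```javascript
// Decode an unsigned LEB128 integer (the encoding used for sizes) at `offset`.
function readVarUint(bytes, offset) {
  let value = 0;
  let shift = 0;
  let byte;
  do {
    byte = bytes[offset++];
    value |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return { value: value >>> 0, next: offset };
}

// Section names indexed by section id, as defined by the binary format.
const SECTION_NAMES = ["custom", "type", "import", "function", "table", "memory",
                       "global", "export", "start", "element", "code", "data"];

// Walk the binary and return the name and payload size of every section.
function listSections(bytes) {
  const magic = [0x00, 0x61, 0x73, 0x6d];
  if (!magic.every((b, i) => bytes[i] === b)) {
    throw new Error("not a WebAssembly binary");
  }
  const sections = [];
  let offset = 8; // skip the magic number and the version
  while (offset < bytes.length) {
    const id = bytes[offset];
    const { value: size, next } = readVarUint(bytes, offset + 1);
    sections.push({ name: SECTION_NAMES[id] || "unknown(" + id + ")", size: size });
    offset = next + size; // jump over the section payload
  }
  return sections;
}

// Hypothetical sample: a module with a single empty function.
const sampleModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic number, version 1
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,             // type section: () -> ()
  0x03, 0x02, 0x01, 0x00,                         // function section
  0x0a, 0x04, 0x01, 0x02, 0x00, 0x0b,             // code section: one empty body
]);

console.log(listSections(sampleModule).map((s) => s.name + ": " + s.size + " bytes").join(", "));
// type: 4 bytes, function: 2 bytes, code: 4 bytes
```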

Using WebAssembly

If you are just a developer interested in using WebAssembly, the suggested way to start is to use Emscripten (SDK). Emscripten is a toolchain already used to compile C/C++ to asm.js, a subset of JavaScript invented with goals similar to those of WebAssembly. With Emscripten you can more easily use the previously mentioned Binaryen and integrate it with Emscripten's own toolchain.

Once you have installed Emscripten, or compiled it from source, you must install Binaryen.

# these commands should be executed inside the emsdk folder
# on Linux or Mac OS X
# this step might take a while
./emsdk install --build=Release sdk-incoming-64bit binaryen-master-64bit
./emsdk activate --global --build=Release sdk-incoming-64bit binaryen-master-64bit

# on Windows
# this step might take a while
# if you are using Visual Studio 2017, append --vs2017
emsdk install --build=Release sdk-incoming-64bit binaryen-master-64bit
emsdk activate --global --build=Release sdk-incoming-64bit binaryen-master-64bit

Then, each time you open a new terminal, run the following commands to activate the environment for compilation and ensure that the proper paths and variables are set.

# the command should be executed inside the emsdk folder
# on Linux or Mac OS X
source ./emsdk_env.sh --build=Release

# on Windows
emsdk_env.bat --build=Release

Finally, you can write your code.

#include <stdio.h>

int factorial(int n) {
  if (n == 0)
    return 1;
  else
    return n * factorial(n-1);
}

int main(int argc, char ** argv) {
  int number = 5;
  int fact = factorial(number);
  // the trailing newline makes sure the line-buffered output is flushed
  printf("The factorial of %d is %d\n", number, fact);
  return 0;
}

And then compile to WebAssembly and see it in your browser.

emcc WebAssemblyExample.c -s WASM=1 -o WebAssemblyExample.html
# launch the local web server included with emscripten
emrun --no_browser --port 8080 .

The first command will generate three files: a WASM module, an HTML file that shows the code in action and a JS file that sets up the Module and takes care of everything needed to run it. The WASM=1 setting tells Emscripten that we want to generate a WASM module instead of an asm.js file.

You can also output your code inside a custom template, using the --shell-file option. The Emscripten SDK installation contains the basic template in this location: (emsdk-folder)/emscripten/incoming/src/shell_minimal.html. Copy that file into your project and adapt it to your needs (e.g., add the rest of your JS code).

# We renamed it WebAssemblyTemplate.html
emcc -o WebAssemblyExample.html WebAssemblyExample.c -O3 -s WASM=1 --shell-file WebAssemblyTemplate.html

You could also output a JS file directly, but this is not recommended at the moment, because you would need to write the code that takes care of low-level issues like memory allocation, memory leaks, etc.
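
As an illustration of the kind of low-level work involved, consider passing a string: WebAssembly code only sees a flat block of bytes, so the text has to be copied in and out by hand. The following sketch is not the code generated by Emscripten. The helpers writeCString and readCString are hypothetical, and the address 16 is an arbitrary choice standing in for what a real allocator (such as malloc) would return.

```javascript
// One page (64 KiB) of linear memory, like the memory a module imports or exports.
const memory = new WebAssembly.Memory({ initial: 1 });

// Copy a JavaScript string into linear memory as NUL-terminated UTF-8.
// Returns the number of bytes written, terminator included.
function writeCString(memory, address, text) {
  const encoded = new TextEncoder().encode(text);
  const view = new Uint8Array(memory.buffer, address, encoded.length + 1);
  view.set(encoded);
  view[encoded.length] = 0;
  return encoded.length + 1;
}

// Read a NUL-terminated UTF-8 string back out of linear memory.
function readCString(memory, address) {
  const bytes = new Uint8Array(memory.buffer);
  let end = address;
  while (bytes[end] !== 0) {
    end++;
  }
  return new TextDecoder().decode(bytes.subarray(address, end));
}

const address = 16; // arbitrary; real glue code would ask the module's allocator
writeCString(memory, address, "hello");
console.log(readCString(memory, address)); // hello
```

Nothing in this sketch ever frees the bytes at that address, which is exactly the kind of bookkeeping that the generated JS file handles for you.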

The final objective is to make loading a WebAssembly module as easy as loading JavaScript code, with the <script type='module'> HTML tag, but we are not there yet.

Interoperability Between C and JavaScript

The interoperability between C and JavaScript is handled by Emscripten. The first thing you have to do is to include the Emscripten header file.

#include <emscripten.h>

The easiest way to call JavaScript is simply to call the function emscripten_run_script:

// it is equivalent to call eval() in JavaScript
emscripten_run_script("alert('hello')");

Calling a C function from JavaScript, instead, is slightly more complicated. First, you have to make it available from the C/C++ code, because by default Emscripten makes all C functions unavailable, except for the main one. So, you have to add the modifier EMSCRIPTEN_KEEPALIVE to all functions you want to use in JavaScript.

int EMSCRIPTEN_KEEPALIVE factorial(int n) {
  if (n == 0)
    return 1;
  else
    return n * factorial(n-1);
}

If you write in C++, remember to put any function you want to make available inside an extern "C" block, to avoid C++ mangling the name of the function (that is something that C++ does; it is not a WebAssembly or Emscripten bug).

Second, you have to compile the WebAssembly module with the option NO_EXIT_RUNTIME. This prevents the runtime from shutting down when the main function exits, which would make it impossible to call C code from JavaScript.

emcc -o WebAssemblyExample.html WebAssemblyExample.c -O3 -s WASM=1 -s NO_EXIT_RUNTIME=1 --shell-file WebAssemblyTemplate.html

Third, you cannot call your C function directly, but you have to use the following syntax.

Module.ccall('factorial', // name of C function
             'number', // return type
             ['number'], // argument types
             [4] // arguments
);

The type could be one of three: number, string and array.

If you need to use the function multiple times in your JavaScript code, you can wrap it with a cwrap function.

const factorial = Module.cwrap('factorial', 'number', ['number']);
factorial(4);
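
The difference between the two calls can be modelled in a few lines. The following is a simplified sketch and not how Emscripten implements them: callOnce and wrap are hypothetical stand-ins for ccall and cwrap that ignore type conversion entirely, and the module is a hand-assembled i32 version of factorial that exports the function under the assumed name factorial.

```javascript
// Hand-assembled stand-in for the compiled C function (an assumption, not Emscripten output).
const i32FactorialModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,        // magic number, version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f,        // type section: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                // function section
  0x07, 0x0d, 0x01, 0x09,                                // export section: one export, 9-byte name
  0x66, 0x61, 0x63, 0x74, 0x6f, 0x72, 0x69, 0x61, 0x6c,  // "factorial"
  0x00, 0x00,                                            // the export is function 0
  0x0a, 0x17, 0x01, 0x15, 0x00,                          // code section: one body, no extra locals
  0x20, 0x00, 0x45, 0x04, 0x7f,                          // get_local 0, i32.eqz, if i32
  0x41, 0x01, 0x05,                                      // i32.const 1, else
  0x20, 0x00, 0x20, 0x00, 0x41, 0x01, 0x6b,              // n, n, 1, i32.sub
  0x10, 0x00, 0x6c, 0x0b, 0x0b,                          // call 0, i32.mul, end (if), end (function)
]);
const wasmExports = new WebAssembly.Instance(new WebAssembly.Module(i32FactorialModule)).exports;

// Like ccall: look the function up by name on every single call.
function callOnce(name, args) {
  return wasmExports[name](...args);
}

// Like cwrap: look the function up once and return a reusable JavaScript function.
function wrap(name) {
  const target = wasmExports[name];
  return (...args) => target(...args);
}

console.log(callOnce("factorial", [4])); // 24

const factorialOf = wrap("factorial");
console.log(factorialOf(5)); // 120
```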

You can easily try it in the console.

[Screenshot: calling WebAssembly code from the console]

Summary

We have seen a short introduction to WebAssembly: what it is, why you should care and how you can use it. It is going to be a great platform for the further evolution of the web: it will make developing for the web easier and more efficient.

Its development is backed by people at Mozilla, Microsoft, Google and Apple. The attention to the tooling is another proof of the importance of WebAssembly: it is going to change the web fast.

You can use WebAssembly today and look at more in-depth documentation on the MDN website. If you are interested in knowing more details about the format, you can read them on the official website. You could also look at the Emscripten documentation to understand issues related to interoperability between JavaScript and the C/C++ code.

Finally, if you are ready to use it in production you may want to check a book: we wrote a review of WebAssembly In Action, which is a great book to work with WebAssembly from C/C++.
