Create a simple parser in C# with Sprache

Create a simple parser in C# with Sprache

You can find the code for this article on github

Everybody loves ANTLR, but sometimes it may be overkill. On the other hand, a regular expression just doesn’t cut it or it may be too complicated to maintain. What a developer can do in such cases ? He uses Sprache. As its creators say:

Sprache is a simple, lightweight library for constructing parsers directly in C# code.

It doesn’t compete with “industrial strength” language workbenches – it fits somewhere in between regular expressions and a full-featured toolset like ANTLR.

It is a simple but effective tool, whose main limitation is being character-based. In other words, it works on characters and not on tokens. The advantage is that you can work directly with code and you don’t have to use external tools to generate the parser.

The guessing game

You can see the project website if you want to see specific real uses, let’s just say that its even credited by ReSharper and it was created more than six years ago, so it’s stable and quite good. It’s ideal to manage things like error messages created by other tools that you have to deal with, to read logs, to parse queries like the ones you would uses for a simple search library or to read simple formats like Json. In this article we will create a parser for a simple guessing game, we will use .NET Core and xUnit for the unit tests, so it will work also on Linux and Mac.

The objective of the game is to guess a number, and to do that you can ask if the number is greater than a number, less than a number or between two numbers. When you are ready to guess you simply ask if it’s equal to a certain number.

Setup the project

We will use VSCode, instead of Visual Studio, but in the github project you would find two projects, one for each: this because there are still some compatibility quirks relative to project.json and the different .NET Core tools versions used by Visual Studio or the standalone command line version. To clarify, the project.json generated by the .NET Core standalone command line will work also with Visual Studio, but not viceversa (this might be changed when you will read this). Also, with two projects you can easily see how Visual Studio integrates xUnit tests. The C# code itself is the same.

Create the file global.json in the directory of your project, in our case SpracheGame, then create another SpracheGame folder inside src and a SpracheGame.Tests folder inside test. Inside the nested SpracheGame folder you can create a new .NET core program with the usual:

While you are nside the SpracheGame.Tests folder you can create a xUnit test project with:

You can see the final structure here.

SpracheGame folder structure

Change both project.json, adding sprache as a dependency to the main project:

…and add the main project as a dependency for the xUnit test project.

If you are using Visual Studio you may need to add a runtimes section to both of your project.json:

See the .NET documentation for .NET Core Runtime IDentifier (RID) catalog if you need to know other platform IDs.

Create GameParser

Let’s start by creating a class called GameParser and by recognizing numbers and commands.

On line 3 there is the code to parse a number: we start with Sprache.Parse followed by a digit, of which there must be at least one, then we convert from IEnumerable<char> to string, with Text(), and finally we discard whitespace with Token(). So first we choose the type of character we need, in this case Digit, then we set a quantity modifier and trasform the result in something more manageable. Notice that we return Parser<string> and not an int.

On the lines 5-6 we order to the parser to find a character ‘<‘  followed by  one ‘>’, using  Then(). We return an enum instead of a simple string. We can easily check for the presence of different options with the Or(), but it’s important to remember that, just as for ANTLR, the order matters. We have to put the more specific case first, otherwise it would match the generic one instead of reaching the correct case.

Now we have to combine this two simple parser in one Play, and thanks to the LINQ-like syntax the task is very simple. Most commands require only a number, but there is one that requires two, because we have to check if the number to guess is between two given numbers. It also has a different structure, first there is a number, then the symbol, and finally the second number. This is a more natural syntax for the user than using a ‘<>’ symbol followed by two numbers. As you can see, the code is quite simple, we gather the elements with from .. in .. and then we create a new object with select.

It’s time for Play

The only interesting things in the Play class are on the lines 27-51, the Evaluate function, where the “magic” happens, and I use the term magic extremely loosely. The number to guess is provided to the function, then it’s properly checked with the command and the numbers of the specific play that we are evaluating.

Unit Tests are easy

There are basically no disadvantages in using xUnit for our unit tests: it’s compatible with many platforms, it’s still integrated with the Visual Studio Test Explorer and it also have a special feature: theory. Theory is a special kind of test that allow you to supply multiple inputs with one test. Lines 3-6 shows exactly how you can do it. In our case we are testing that our parser can parse numbers with many digits.

The following test is a typical one, we are checking that the symbol ‘>’ is correctly parsed as a Command.Greater. On Line 27 we are making sure that an Exception is raised if we encounter an incorrect Play. Sprache allows also to use TryParse, instead of Parse, if you don’t want to throw an exception. As you can see the simplicity of tool make very easy to test it.

Let’s put everything together

The main function doesn’t contain anything shocking, on the lines 27-28 we parse the input and execute the proper command, then, on 31, we check whether we guessed the correct number and if so we prepare to exit the cycle. Notice that we provide a way to exit the game even without guessing the number correctly, but we check for ‘q’ before trying to parse, because it would be an illegal command for GameParser.

Conclusions

This blog talks much about Language Engineering, which is a fascinating topic, but it is not always used in the everyday life of the average developer. Sprache, instead, is one tool that any developer could find a use for. When a RegEx wasn’t good enough you probably have simply redesigned your application, making your life more complicated. Now you don’t need to, when you meet the mortal enemy of regular expressions, that is to say nested expression, you can just use Sprache, right in your code.

Download the guide with 68 resources on Creating Programming Languages

68resources

Receive the guide to your inbox to read it on all your devices when you have time

Powered by ConvertKit

Do You Need a Parser?

We can design parsers for new languages, or rewrite parsers for existing languages built in house.

On top of parsers we can then help building interpreters, compilers, code generators, documentation generators, or translators (code converters) to other languages.

2 replies

Trackbacks & Pingbacks

  1. […] going to see how to create a more complex example of code generation with Roslyn and parsing with Sprache. We are going to create a skeleton class from a PlantUML file. In short, we are doing the inverse […]

Comments are closed.