Designing a DSL for accounting: use a DSL to describe taxes, pension contributions, and general financial calculations


Let’s see how we can design a DSL.

Many readers appreciated the tutorials we wrote on implementing Domain Specific Languages but kept asking us for resources on designing these languages.

On this blog we have already discussed how to write parsers, interpreters, and compilers, but we have not written much on what comes before, when we collect requirements, talk to users, and iterate over design ideas. We start doing that in this article. Here we present a concrete case for a DSL: we look into the problem and design a solution. We do not discuss the implementation in detail. In the future we may present several possible implementations, if you readers are interested.

The problem we selected is also one that could be of interest for many of you: the calculation of financial values (i.e., accounting), and of taxes in particular.

Goals of the project

In a nutshell we want to build a system for:

  • Being able to predict taxes
  • Doing simulations: what if we raise compensation? What is the gross revenue we need to obtain a certain amount of net income?

As someone who had a business in France and who currently has one in Italy, I had to spend a lot of time understanding how taxes are calculated. As an engineer and a pragmatist I am always mesmerized by how quickly this sort of thing can get very, very complicated. This is particularly true in certain countries but the reality is that accounting is complicated in every single country.

This topic is particularly complex for small business owners because they have to consider the combined effects of taxes on the company and personal taxes on them. The two kinds interact: do you raise your salary? This means lower profits for the company, therefore you will pay lower taxes on the company. At the same time however, you will pay higher personal taxes. So what is the best balance? This is typically a question that is not easy to answer. We intend to build a system to help answer this and similar questions.

Context: who is going to use this DSL and to achieve what?

Let’s dive a bit more into the context of the project to understand what the problems related to accounting are, who we can help, and how.

We should ask a few questions:

  • what are the goals of the DSL?
  • who is going to use the DSL?
  • what are the current processes?
  • what are the main issues with the current processes?
  • why would a DSL improve the situation?

The goals of the DSL

A DSL is built to support someone in performing some processes, so a core question is: which processes are future users interested in?

  • The users would like to be able to calculate how much tax the company and the owners are going to pay

This is the main goal of the system

  • [Optional] The users would like to understand how much tax the employees are going to pay

This would be useful to understand which net pay will correspond to which gross pay but it is not fundamental

  • The users would like to be able to calculate how much the company and the owners are going to pay in pension contributions

Mandatory pension contributions could cost almost as much as taxes in certain countries so they are very relevant

  • [Optional] The users would also like to know when each specific amount will be due (e.g., certain taxes can be paid in multiple tranches) so that they could plan their cashflow

It would be an additional benefit to have some visibility on the cashflow but that would not be strictly needed

  • The users would like to be able to execute simulations to see how variations in revenues or expenses would affect the results

This is also very important: during the year, as the owners see variations in revenues or expenses, they would like to understand how much money will end up in their pocket so that they can plan their business and personal expenses accordingly

  • The users would like to be able to execute simulations to compare different strategies (e.g., pay higher salaries to the owners or distribute higher profits)

This feature is very important to support decisions

  • [Optional] The users would like to insert ranges for values and see ranges as outputs (e.g., inserting forecasted revenues as 100K-120K and obtaining net profits as a range)

There is uncertainty, especially when forecasting revenues or expenses so the ability to deal with this uncertainty would be useful. However this can be emulated simply by inserting different values so it is not fundamental

  • [Optional] When a new tax is discussed, the users would like to be able to quickly describe it using the DSL to verify its impact with respect to the current taxes

This could be useful in countries where taxes vary frequently. In most countries the rates are adjusted yearly so it would be nice to have this feature but it is not strictly needed in the initial version of the system

We believe that at this stage it is very important to identify which are the core processes.

There are typically a few activities that bring most of the benefits and they should be supported very nicely by the DSL in order to make the project a success. Great support for the core activities is typically what convinces organizations to switch to the new, DSL-based, system.

Clients or users in general tend to list too many activities and processes they would like to be supported by the system under development. Trying to accommodate too many different processes in the first iteration of a language would lead to a DSL that is too complex and not specific enough. If many processes have to be supported from the beginning it would be important to at least identify which ones are the core ones.

Who is going to use the DSL

Business partners and managers with no specific knowledge in accounting.

What are the current processes

Currently these calculations are performed by tax consultants on demand.

What are the main issues with the current processes

The process to get an update on the forecasts is long and custom. Users have to contact the tax consultant with their specific question and wait for the answer. This does not encourage them to do these simulations often. Also, typically the results are not explained step by step so it is difficult to reconstruct why something is happening.

What happens in practice is that business owners spend thousands of euros per year on tax consultants. Still, they have no clear answers and take some business decisions without supporting data because of how long and frustrating the process of getting a quick calculation from the tax consultant can be.

How a DSL would improve the situation

  • A DSL would make it possible to define new taxes and calculations in a comprehensible way
  • A DSL would make it possible to run simulations and explain the results by showing intermediate steps
  • A DSL would make it possible to run calculations on real data and forecasts very quickly at no additional cost

Existing solutions

We are not the first to see the need for a DSL for accounting. It could be useful to take a look at what already exists.

An internal DSL for accounting built at Gusto

For example, at Gusto they wrote an internal DSL for accounting in Ruby. Yes, I know that internal DSLs are much less interesting than external DSLs. Still, on the Gusto Engineering Blog we can find interesting considerations on the need for a DSL, which make sense (I would say more sense) if we consider external DSLs and not just internal DSLs as they did.

In particular they underline one problem they typically have in the USA: the need to support different rules for each state:

We satisfy a bevvy of unique requirements for each state. Initially, we found ourselves writing a lot of boilerplate code by hand, instead of concentrating on what made each state a snowflake. We soon realized that this was a problem that could reap enormous benefits from tooling–namely, creating a domain specific language to accelerate and streamline the development process.

Of course in Europe the problem is one order of magnitude bigger as the differences between European countries are much bigger than between US states.

From what we can see from this post they focus on calculating pay slips. I think that the intended users of their systems are large organizations with many employees. So their focus is different from ours, as we focus on small business owners.

An internal DSL for testing income tax calculations built at ClearTax

At ClearTax, instead, they focused on building an internal DSL just for testing income tax calculations. Now, that is being very specific. It also gives an idea of how complex performing the calculations for just one tax in one country can be. Imagine how complex the accounting domain is if we consider multiple taxes (which could affect one another) and multiple countries.

From what we understand their system is intended for individuals who need to calculate their own income tax.

OpenFisca France

OpenFisca is an engine for tax calculations and simulations. It permits the definition of new taxes in a simple format based on Python and YAML. In the future it aims to adopt DSLs to describe these rules. While it is not yet using a DSL, it contains a repository of tax models that could be interesting to study to see the abstractions needed to define taxes.


While OpenFisca was created in France, it now has descriptions of the taxes used in several countries.

The Dutch Tax and Customs office

The Dutch Tax and Customs office is describing all of the laws that affect their work through a series of DSLs built using Jetbrains MPS.

They gave a very nice presentation about this at the MPS Community Meetup held recently in Munich and also at the LangDev Meeting held in Amsterdam (here a recap of that great event).


Here the intended user is… the tax office itself! Because of this their focus is different. They do not need to focus on simulations or supporting business decisions; instead they need a process to ensure they are correctly representing the law and a way to update their system regularly as the tax code is amended every year.

Other financial DSLs

Here can be found a list of financial DSLs. Many of those are internal DSLs: in other words they are not real languages, just some sort of fluent interface. Still, some ideas could be taken and used to create external DSLs, you know, the real stuff.

Most of these DSLs have been built for traders or for other professional figures and are not intended to be used by small business owners, like this DSL we want to build.

Considering a situation: taxes of an Italian limited company

We will apply this DSL to the calculation of taxes for a small Italian limited company (SRL in Italy, which stands for Società a Responsabilità Limitata).

Now, for a small business it makes sense to consider the cost of all the taxes and other fiscal obligations, both at the company level and at the personal level.

The fundamental question a business owner wants answered is: how much of the revenues will end up in my pocket after we have paid all we are supposed to pay?

All we are supposed to pay is composed of:

  • business expenses: travel, hardware, software, office, tax consultants, employee salaries, etc. These are typically reasonably predictable for a business owner
  • company taxes: the company has to pay taxes on the profits
  • personal taxes: the owner can receive money from the company as a salary or as distributed profits. In both cases these are taxed at the personal level
  • pension contributions: in some countries they are mandatory and they could have to be paid by the company, the receiver, or split between the two. In any case this is money that does not end up in the pocket of the business owner

This system will not focus on calculating the revenues or the business expenses: they will need to be inserted as parameters. However, given these parameters, the system should help in calculating:

  • company taxes
  • personal taxes for the owners
  • pension contributions (to be paid by the employees or the company)

Defining the main taxes using the DSL

Now that we have clarified the goals of our system let’s look at the logic we have to represent.

In Italy a small business has to pay two main taxes: IRES and IRAP. Let’s examine them separately.

Company taxes: IRES

IRES was introduced in 2004 to replace a previous tax named IRPEG. Its rate changed over time. In recent years it was set to 27.5% until it was reduced in 2017 to 24%.

This rate is applied to the profits of the company. However, the profits on which it is calculated are not obtained simply by subtracting the expenses from the revenues.

This is the case mainly because of these two aspects:

  • deductibility: not all expenses can be considered to reduce the profit. Among the expenses that can be deducted, not all of them can be deducted for their entire value but only for a fraction (e.g., 80%). The rate of deductibility typically depends on the nature of the expense and on its relevance to the business of the company
  • depreciation: when a company incurs a large expense its deduction has to be divided among many years. Consider the acquisition of a small office for 100.000 Euro. Such an expense could be distributed among 30 years. That means that every year for 30 years the business will be able to subtract about 3.333 Euro from its revenues in order to calculate the profits relevant for the IRES. Depending on the cost and type of goods the corresponding expense could be distributed among a different number of years
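The depreciation example can be checked with a couple of lines of Python. This is only a sketch of straight-line depreciation, where the same amount is deducted every year; the function name yearly_depreciation is ours and not part of the DSL.

```python
def yearly_depreciation(cost: float, years: int) -> float:
    """Straight-line depreciation: an equal share of the cost is deducted each year."""
    if years <= 0:
        raise ValueError("years must be a positive number")
    return cost / years


# The small office from the example: 100.000 Euro spread over 30 years.
print(round(yearly_depreciation(100_000, 30), 2))  # 3333.33
```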

If we wanted to represent these rules precisely we would need the users to input into the system the type of each expense and then properly calculate the profits. These calculations will be performed by the tax consultant once per year but it is not very practical for the users to input the data at the necessary level of detail when they want to do a rough simulation.

In our case it could be enough to do a simple approximation. For example we could consider that typically 80% of the expenses can be subtracted from the revenues to obtain the fiscal profits. This percentage could be adapted by the users depending on their businesses. For example, in some sectors certain expenses that are not deductible could have a strong impact so that percentage could be reduced, while in others it could be increased.
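To make the approximation concrete, here is a minimal Python sketch. It uses the 80% share and the 24% rate mentioned in the text; the function names, the choice to treat a loss as producing no tax, and the figures in the example are our assumptions.

```python
def fiscal_profit(revenues: float, expenses: float,
                  deductible_share: float = 0.80) -> float:
    """Approximate fiscal profit: only a share of the expenses reduces the revenues."""
    return revenues - expenses * deductible_share


def ires(revenues: float, expenses: float, rate: float = 0.24,
         deductible_share: float = 0.80) -> float:
    """Approximate IRES. A loss is assumed to produce no tax (our simplification)."""
    taxable = max(fiscal_profit(revenues, expenses, deductible_share), 0.0)
    return taxable * rate


# Made-up figures: 300.000 Euro of revenues and 100.000 Euro of expenses.
print(round(fiscal_profit(300_000, 100_000), 2))  # 220000.0
print(round(ires(300_000, 100_000), 2))           # 52800.0
```

The deductible share is a parameter precisely because users are expected to adapt it to their own business.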

Company taxes: IRAP

IRAP is a regional tax that is calculated on a different value with respect to the IRES. As a simplification we could say that it is calculated on the profits (as intended for the IRES) plus the costs of the personnel. The rate depends on the kind of activities done and it could vary from region to region. The typical rate, which is applied to consulting businesses like ours in Piedmont, is 3.9%.
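The simplified IRAP rule could be sketched in Python as follows. The table of rates contains only the Piedmont value given in the text; raising an error for any other region is our choice, and the names and example figures are hypothetical.

```python
# Rates per region. Only Piedmont is filled in, following the text;
# the other regions are deliberately left undefined.
IRAP_RATES = {"Piedmont": 0.039}


def irap(fiscal_profit: float, personnel_costs: float, region: str) -> float:
    """Simplified IRAP: the rate is applied to the fiscal profit plus the personnel costs."""
    if region not in IRAP_RATES:
        raise KeyError(f"IRAP rate not defined for region {region!r}")
    return (fiscal_profit + personnel_costs) * IRAP_RATES[region]


# Made-up figures: 220.000 Euro of fiscal profit and 60.000 Euro of personnel costs.
print(round(irap(220_000, 60_000, "Piedmont"), 2))  # 10920.0
```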

Personal taxes

The main tax paid on the personal level in Italy is called IRPEF. The amount is calculated on the total personal income considering different brackets, so that the rate is progressive. In addition to the main tax there are increments that depend on the region (addizionale regionale) and on the town where the taxpayer is living (addizionale comunale).

The brackets for IRPEF are:

  • 0-15.000 Euro: 23%
  • 15.000-28.000 Euro: 27%
  • 28.000-55.000 Euro: 38%
  • 55.000-75.000 Euro: 41%
  • above 75.000 Euro: 43%

Someone living in Piedmont also has to consider these values for the addizionale regionale:

  • 0-15.000 Euro: 1.62%
  • 15.000-28.000 Euro: 2.13%
  • 28.000-55.000 Euro: 2.75%
  • 55.000-75.000 Euro: 3.32%
  • above 75.000 Euro: 3.33%

Someone living in Turin will have to pay the addizionale comunale according to these brackets:

  • 0-11.670 Euro: 0%
  • above 11.670 Euro: 0.8%

There are deductions and no-tax areas that apply under certain conditions. We are ignoring them for simplicity. We obviously could not do that if our intended user was the tax office. However, we can reasonably do that given that our intended users are business owners who want a reasonable estimate obtained through understandable logic.
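To show how the three sets of brackets combine, here is a small Python sketch. It assumes that every set of brackets is marginal, meaning that each rate applies only to the part of the income that falls inside its bracket; this follows the way the lists are presented above and ignores deductions, as stated. The helper apply_brackets and the constant names are ours.

```python
import math

# (upper limit of the bracket, rate); math.inf marks the open-ended last bracket.
IRPEF = [(15_000, 0.23), (28_000, 0.27), (55_000, 0.38),
         (75_000, 0.41), (math.inf, 0.43)]
ADDIZIONALE_PIEDMONT = [(15_000, 0.0162), (28_000, 0.0213), (55_000, 0.0275),
                        (75_000, 0.0332), (math.inf, 0.0333)]
ADDIZIONALE_TURIN = [(11_670, 0.0), (math.inf, 0.008)]


def apply_brackets(amount, brackets):
    """Each rate applies only to the slice of the amount inside its bracket."""
    total, lower = 0.0, 0.0
    for upper, rate in brackets:
        if amount <= lower:
            break
        total += (min(amount, upper) - lower) * rate
        lower = upper
    return total


def personal_tax(income):
    """National tax plus the regional and town increments."""
    return sum(apply_brackets(income, brackets)
               for brackets in (IRPEF, ADDIZIONALE_PIEDMONT, ADDIZIONALE_TURIN))


# A resident of Turin with 50.000 Euro of taxable income (made-up figure).
print(round(apply_brackets(50_000, IRPEF), 2))  # 15320.0
print(round(personal_tax(50_000), 2))           # 16751.54
```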

Pension contributions

A small business owner who is working on his own business has to pay two different pension contributions:

  • one on his income as business owner (i.e., on the distributed profits)
  • one for the work he is doing in the company (i.e., on the salary he receives)

Pension contributions on income as business owner

As a business owner he has to pay a contribution on the whole profit produced by the company, even if the profit is not distributed to shareholders but kept in the company. In this case we consider the net profit, in other words the profits on which the IRES is calculated, from which we subtract the amounts paid for the IRES and IRAP taxes. Of these profits we assign to him an amount proportional to the shares he owns. Depending on the field in which the company operates he has to pay a different contribution.

For the tertiary sector:

  • 0-46.123 Euro: 22.74%
  • 46.123-76.872 Euro: 23.74%
  • above 76.872 Euro: 0%

There is a minimum contribution of 3.535,61 Euro. That means that even if the profits are zero or negative, the owner still needs to pay this amount for his pension contribution.
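A possible Python sketch of this rule follows. It assumes that the brackets are marginal and that 76.872 Euro is the ceiling above which nothing more is due, as in the second bracket; the function name, the 66% share, and the profit figures are illustrative.

```python
import math

# Brackets for the tertiary sector: (upper limit, rate).
INPS_TERZIARIO = [(46_123, 0.2274), (76_872, 0.2374), (math.inf, 0.0)]
MINIMUM_CONTRIBUTION = 3_535.61


def owner_contribution(company_net_profit: float, ownership_share: float) -> float:
    """Contribution of one owner on their share of the net profit of the company."""
    base = max(company_net_profit * ownership_share, 0.0)
    total, lower = 0.0, 0.0
    for upper, rate in INPS_TERZIARIO:
        if base <= lower:
            break
        total += (min(base, upper) - lower) * rate
        lower = upper
    # The minimum is due even when the profits are zero or negative.
    return max(total, MINIMUM_CONTRIBUTION)


print(round(owner_contribution(150_000, 0.66), 2))  # 17788.18
print(owner_contribution(-20_000, 0.66))            # 3535.61
```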

Pension contributions on compensation received by the company

Pension contributions have to be paid on the compensation given to employees or owners for their work for the company. Two thirds of these contributions are paid by the company (they reduce the profits as calculated for the IRES, but not the profits as calculated for the IRAP) and one third by the receiver.

The rate depends on whether the receiver is already receiving a pension or is contributing to other pension systems. However, typically it is calculated as:

  • 0-100.324 Euro: 27.72%
  • above 100.324 Euro: 0%
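The split between company and receiver could be sketched like this in Python. We assume the brackets refer to the yearly compensation and use the typical rate only; the names and the example figure are ours.

```python
CEILING = 100_324   # no contribution is due on compensation above this amount
RATE = 0.2772


def compensation_contribution(annual_compensation: float):
    """Return the pair (paid by the company, paid by the receiver)."""
    total = min(max(annual_compensation, 0.0), CEILING) * RATE
    return total * 2 / 3, total / 3


# A yearly compensation of 36.000 Euro (made-up figure).
company_part, receiver_part = compensation_contribution(36_000)
print(round(company_part, 2), round(receiver_part, 2))  # 6652.8 3326.4
```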

Conceptual modeling

Let’s see which concepts we are going to need in our system:

  • We will need a way to define the entities we are considering (companies and persons), specifying some of their characteristics (for example in which town they are based)
  • We will need a way to provide values such as total revenues and business expenses
  • In the DSL we should define which taxes exist, to whom they apply (companies or persons) and how they are calculated
  • Similarly we should define pension contributions, defining when they apply and who pays for them
  • We will need to specify how some values vary over time. For example, many tax rates could vary from one year to the next

Entities: companies and persons

We will need a way in the system to list all the entities that we are interested in.

Should that be done using the DSL or as a configuration step?

Now, it would probably be more intuitive to have the configuration step done outside the DSL, for example by passing parameters to the simulation engine that will interpret the DSL or through JSON files. However, doing this would require building a configuration mechanism for this purpose, so it could be reasonable to have the logic and the configuration both specified through a single DSL.

Also, by having the definition of entities and the calculation rules in the same environment we will facilitate quick adaptations of the model. Consider this: the calculations done by business owners will typically be approximations, as representing exactly how taxes work would be way too difficult. Some taxes are described in hundreds of pages of rules, exceptions, and caveats. What we are typically using are simplifications of the correct calculations. In other words we use a simplified model that is good enough for our calculations even if not 100% precise.

Now, depending on the context, we could need to slightly simplify our model or make it more detailed. Making the model simpler or more detailed would require changing the calculations and the information specified for the different entities. For example, if we want to consider additional income a business owner could have, besides the income he obtains from the company, we would need to consider this extra field in the calculation and to specify the new value for each person. If this configuration was done through an external interface, we would need to adapt it. By having it in the same DSL we make this process much more agile.

When defining entities we will need to be able to detail a few pieces of data:

  • the name of the entity, so that we can identify it and refer to it in the DSL
  • where the entity is based, as certain rules depend on the town or region where the entity is based
  • the type of the entity: is it a person or a company? If it is a company which type of company is it?
  • some values like the monthly salary of a person or the fixed costs of a company. These values could change as we change the calculations
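As a rough picture of the data listed above, this is how an entity could look once loaded by an interpreter written in Python. The class and field names are hypothetical. AcmeInc, Mario, and the gross profit anticipate the example used later in the article, while the salary is a made-up figure.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Entity:
    name: str       # used to identify the entity and refer to it
    kind: str       # "person", or a company type such as "SRL"
    based_in: str   # town; the region can be derived from it
    # Free-form values, so that fields can be added as the calculations change.
    values: Dict[str, float] = field(default_factory=dict)


acme = Entity("AcmeInc", "SRL", "Torino", {"gross_profit": 213_000})
mario = Entity("Mario", "person", "Torino", {"monthly_salary": 2_000})  # made-up salary

print(acme.values["gross_profit"])  # 213000
```

Keeping the values in a dictionary rather than in fixed fields reflects the point made above: the data we need for each entity changes whenever the model is simplified or made more detailed.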

Defining taxes and pension contributions

We should define who is paying for them. In the case of pension contributions we have seen that in some cases they could be split between the employee and the employer. We should also specify under which conditions they have to be paid.

We should define the amount to be paid. Generally we either have a fixed rate or the usage of brackets. The rate or the brackets are applied to a certain value, the taxable amount. The way we calculate this taxable amount could require the definition of some intermediate values.

Expressions

We will need expressions that support:

  • currencies: We could probably consider just one currency for each model. In more advanced versions of the DSL we may want to support multiple currencies
  • percentages: it seems a recurring concept used in calculations
  • brackets: it would be nice to have first-class support for brackets as they are used in several calculations
  • boolean logic: to specify conditions. This means supporting boolean literals and boolean operators (logic and, logic or, etc)
  • arithmetic operations: they will be necessary to calculate the taxable amounts
  • literals: they will necessarily be present in any language using expressions

Geographic model

It should be enough to say that a company and a person are based in Turin to imply that they are also in the Piedmont region (and in Italy, but countries could be ignored for the moment). To do so we should have a list of regions and towns, plus the associations between the two.
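A minimal way to represent these associations, sketched in Python, is a pair of lookup tables. The table names and the function is_based_in are ours, and the second town is added only to make the example less trivial.

```python
REGION_OF_TOWN = {"Turin": "Piedmont", "Milan": "Lombardy"}
COUNTRY_OF_REGION = {"Piedmont": "Italy", "Lombardy": "Italy"}


def is_based_in(town: str, place: str) -> bool:
    """True if the place is the town itself, its region, or its country."""
    region = REGION_OF_TOWN[town]
    return place in (town, region, COUNTRY_OF_REGION[region])


print(is_based_in("Turin", "Piedmont"))  # True
print(is_based_in("Turin", "Lombardy"))  # False
```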

Time

It is important to consider that some rules, and in particular some taxes, vary over time. So we may want to specify the period of validity of a given rule. Suppose for example you want to do a simulation of the cashflow across the next five years and you know a certain tax will change two years from now. For that calculation you will need to apply the earlier version of the tax for the first two years and the future one for the other three years.
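One way to picture a period of validity is a rate paired with the years in which it applies. The Python sketch below uses the IRES rates mentioned earlier (27.5% until 2016, 24% from 2017); the class and function names are hypothetical, and the open start of the older rate is a simplification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DatedRate:
    rate: float
    first_year: Optional[int] = None  # None: no start specified
    last_year: Optional[int] = None   # None: still valid


IRES_RATES = [DatedRate(0.275, last_year=2016), DatedRate(0.24, first_year=2017)]


def rate_for(year: int, rates: List[DatedRate]) -> float:
    """Return the rate whose period of validity contains the given year."""
    for r in rates:
        started = r.first_year is None or r.first_year <= year
        not_ended = r.last_year is None or year <= r.last_year
        if started and not_ended:
            return r.rate
    raise LookupError(f"no rate defined for {year}")


print([rate_for(y, IRES_RATES) for y in (2015, 2016, 2017, 2018)])
# [0.275, 0.275, 0.24, 0.24]
```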

We will also need to consider that some values are periodic. For example, we want to specify salaries and compensations as monthly amounts.

Sketching the DSL

Let’s sketch the DSL by considering some use cases and imagine how they could be solved by using our DSL

Calculate the IRES tax

We will start by imagining how we could define the logic to calculate the IRES tax:

Our code tells us that the tax is applied to SRL companies and it is calculated on 120% of the gross profit of the company. The rate used changed in 2017, being reduced from 27.5% to 24%.

Now, to use this we will need to be able to define the SRL company type and the companies to which to apply it.

In this case we simply declare that SRL is one type of company in Italy. We then define an instance of this type of company: AcmeInc.

AcmeInc is based in Torino. It is owned by Mario (owning the 66%) and Alberto (owning the 34%).

The gross profit is 213.000 Euro. The currency is not specified as for now we plan to use a single currency in each file.

We decided to support K as a suffix for numbers to indicate thousands, so instead of writing 213000 we can write 213K.
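Handling such a suffix is simple on the implementation side. As an illustration only, this is how a hand-written Python helper could turn the literal into a number; in a real parser this would live in the lexer or in the code that builds the model. Accepting decimals such as 1.5K is our assumption.

```python
import re
from decimal import Decimal

# Digits, an optional decimal part, and an optional K suffix.
_AMOUNT = re.compile(r"(\d+(?:\.\d+)?)(K?)")


def parse_amount(text: str) -> Decimal:
    """Parse a number literal, multiplying by one thousand when it ends with K."""
    match = _AMOUNT.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid amount: {text!r}")
    digits, suffix = match.groups()
    value = Decimal(digits)
    return value * 1000 if suffix else value


print(parse_amount("213K"))    # 213000
print(parse_amount("213000"))  # 213000
```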

It would be possible to make the company type definition richer, for example by listing the required fields an instance should have and including some calculations that are valid for each instance:

We may also want to declare Mario and Alberto (the owners of AcmeInc) as two persons. For persons we could imagine a definition very similar to the one we have seen for company instances. For persons we will obviously remove the owners field.

Also, we may want to specify the geographical model in the DSL:

In this case we list the countries we want to consider and for each country we specify if it is part of the European Union or not. This could be useful as many regulations are different for countries inside and outside the EU. We then list the regions for each country and the cities for each region.

This geographical model could be perhaps defined separately and imported in the relevant DSL files.

If we wanted to be really precise we could consider that cities could move to different regions over time. But let’s ignore that for our own mental sanity 🙂

Calculate the IRAP tax

Let’s now see how to calculate the IRAP tax:

Here we see how certain calculations could depend on where the entity is based. In this case the rate is 3.9% for companies based in Piedmont, while it is undefined for other regions. So if we tried to calculate the IRAP for a company based in Lombardy or Tuscany we would get an error. We expect users to build incomplete models, specifying the information they need for the cases they are interested in and completing them as new needs arise.

Now, we could see that to calculate the IRAP we need to know the pre-tax profit of the company and its personnel costs. For this reason we will need to update the definition of the AcmeInc company we considered.

Because of the way the IRAP tax is calculated we need information on the costs of personnel.

The cost of personnel will depend on the values of the salaries of Mario and Alberto. They are the owners of the company and they are also paid by the company. Note that to represent a field access we decided to use the keyword of instead of the classic dot used in many programming languages. We expect this to be more intuitive for our users.

Also, note that the profit is calculated considering the IRES and IRAP amounts. In this context it is clear that we are referring to the amounts calculated for the company considered (AcmeInc).

Calculate personal taxes

Let’s see now how we can calculate IRPEF, the personal tax we have in Italy:

In this case we do not specify simply a rate but we define directly the amount with a more complex rule. In particular we sum three components based on the national, regional, and town rates. All these rates are based on brackets.

Calculate pension contributions

Our system also has to consider pension contributions, as they are very relevant to forecast how much disposable income a small business owner will be able to keep.

Looking at these examples we can realize that so far we have specified who owns a company, so we are able to determine to whom to apply the first pension contribution (InpsTerziario). However, to calculate the second (InpsGLA) we need to know who is employed by whom. So far in our DSL we have defined just salaries. We should instead define employments as a relation between employer and employee, for example like this:

We can also see that the considered_salary of InpsTerziario requires some new concepts:

  • taxable of IRES for employer: we need to be able to calculate the IRES amount for the same company we are considering in this pension contribution. From that calculation we need to access the taxable field
  • amount of IRES for employer, amount of IRAP for employer: similar to previous point, we just need to access a different field
  • by ownership share: in this case we need to consider the portion of shares the owner owns in the company considered

Considerations

So far we have sketched a DSL. We will need to refine the language incrementally considering different aspects:

  • Do intended users understand it intuitively or after reasonable training?
  • Is it general enough to be usable for other problems we want to solve in the field?
  • Are there technical difficulties in implementing it as it is? Is it parseable? Is the performance of the resulting system too horrible?

Use cases

Let’s see how the DSL we define could be used, supposing we have built a parser and an interpreter for this language. The interpreter could simply calculate the values of any field for any entity and the amount of taxes for each entity affected and print them on the screen.

Simulate taxes and pension contributions on real revenues and expenses

Using this DSL the owners of AcmeInc could simply insert the values of revenues, costs, and their salaries to calculate the IRES and IRAP to be paid.

Simulate taxes and pension contributions on forecasted revenues and expenses

Using this DSL the owners of AcmeInc could simply insert the forecasted values of revenues, costs, and their salaries to calculate the IRES and IRAP to be paid if the forecasts prove to be correct. In this case we are imagining the users simply changing the data specified directly in the DSL and running the interpreter. This is not the approach a software engineer would like to use in his own code but it could be adequate for simple usages and it would have the benefit of requiring no effort to support, besides writing the interpreter.
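One of the goals stated at the beginning was to find the gross revenue needed to obtain a certain net income. That is the same forecast run in reverse: we try different revenues until the result matches the target. The Python sketch below does this by bisection on a deliberately reduced model that considers the IRES only (24% on revenues minus 80% of the expenses); all names and figures are illustrative, and a real interpreter would plug in the full calculation instead.

```python
def net_profit(revenues, expenses, rate=0.24, deductible_share=0.80):
    """Profit left in the company after the IRES only (other taxes are ignored here)."""
    taxable = max(revenues - expenses * deductible_share, 0.0)
    return revenues - expenses - taxable * rate


def revenues_needed(target_net, expenses, low=0.0, high=10_000_000.0, tolerance=0.01):
    """Bisection: net_profit grows with the revenues, so we can halve the interval."""
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_profit(mid, expenses) < target_net:
            low = mid
        else:
            high = mid
    return high


# Which revenues leave 100.000 Euro of net profit with 100.000 Euro of expenses?
needed = revenues_needed(100_000, expenses=100_000)
print(round(needed))  # 237895
```

The search works only because the net profit never decreases when the revenues grow, and only if the answer lies between the two bounds; both are assumptions of this sketch.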

Simulate taxes and pension contributions for alternative scenarios: increased salary vs increased distributed profits

In the same way the users can modify the code to simulate alternative scenarios, for example by reducing the salaries to obtain higher distributed profits or vice versa.

Considerations on the implementation

How should we go about implementing this DSL?

We could consider different approaches:

  • Create a textual DSL with an editor running on the desktop, for example using Xtext
  • Create a projectional editor using Jetbrains MPS
  • Create a textual DSL with an editor running on the browser, for example based on ANTLR

For a more detailed discussion you can read our complete guide to DSLs.

In this case:

  • We would discard Xtext as it works better when proposing solutions for developers
  • MPS could be interesting as it would make it possible to add complex simulators and tabular notations, which could help, for example, with brackets
  • An editor running in the browser would allow business owners to use the system without having to install anything. It could also make it easy to share models online

At this stage we should probably consider the budget and the timeline for the project before deciding one way or the other.

Conclusions

One thing we realized is that different users can be interested in very different models of the same phenomena. For example, a tax office or a tax consultant would need a complete model of every single tax. We spoke with a contributor to OpenFisca Italy, Lorenzo Stacchio. He told us that simply to model IRPEF (the income tax on individuals) they used 1.500 variables and around 400 parameters. This is the level of precision some users would need. Other users, like business owners, could prefer a much simpler model so that they have to insert far fewer parameters and they can get a system they understand fully.

As is always the case, we should consider carefully who we want to serve and what they really need. As a business owner and as someone who has to pay taxes, I would be happy to have a DSL to describe models of my financial situation. I think I am not the only one, from what I have heard. I am looking forward to hearing what you readers think of this DSL and the simple design overview we presented. It is fun to create DSLs, isn’t it?
