Key takeaway: There is no silver bullet. This article describes how static analysis and behavior testing can be used to achieve real confidence in a migration. Combined with experience and sufficient effort, this approach means you won’t have to just hope for the best.

Language migration—the process of converting a codebase from a legacy language like RPG to a modern one like Java, Python, or C#—is a high-stakes endeavor. Many companies undertake this journey to escape the constraints of legacy technology, such as a shrinking talent pool of RPG developers or the high operational costs of platforms like IBM i.

Validating a migration means ensuring that the new code preserves the behavior that matters—not replicating every low-level detail, but maintaining domain-relevant semantics. How do you prove that the new, migrated system works correctly and build enough trust to decommission a system that has reliably run your business for decades?

What Is a Language Migration?

Language migration means taking a software system that works, that often has years (or decades) of embedded value, and re-expressing it in a different technology — not because it’s broken, but because the current language or platform has become a liability. Maybe it’s hard to find developers who still understand it. Maybe productivity is low, integrations are painful, or it forces you to depend on expensive hardware or licenses. Whatever the reason, the original technology is no longer a good fit for your business goals.

Now, rewriting a large system from scratch is almost always a bad idea (see the related article “Why rewriting software from scratch is generally a bad idea”). It’s very expensive, terribly slow, and risky. You lose institutional knowledge, create gaps in functionality, and introduce bugs that weren’t there before.

Instead, language migration offers an alternative. Take the existing codebase and translate it into a modern language — say, from RPG to Python, PL/SQL to Java, or SAS to PySpark — in a way that preserves its external behavior. The new system should offer the same results and functionalities to users, but be expressed in a maintainable, future-proof technology.

At Strumenta, we approach this by building transpilers: automated tools that convert code reliably from one language to another. But regardless of whether the migration is manual, AI-assisted, or automated, the hard requirement remains the same – you must validate that the new system behaves like the original. This article outlines a practical, two-pronged approach to achieving just that.

Why Validation Is Critical in Language Migrations

Language migration is rarely done for trivial systems. Usually, what’s at stake is critical business software — complex, battle-tested, and often tailored over years.

These systems are the backbone of your business, and without proper validation you risk missing:

  • Errors that appear in rarely used parts of the system
  • Errors that have no obvious consequences

Errors in rarely used parts of the system

The system being translated will probably contain workflows that only emerge in specific contexts: the logic behind annual financial reports, bi-annual compliance procedures, or routines triggered when opening a new office.

If you don’t validate your migration carefully, these issues may emerge months or years after the migration has been completed. You don’t want to live with this sword of Damocles hanging over your head for years to come.

Non-obvious errors

Without proper validation, small errors can slip through unnoticed — and they don’t explode immediately. A rounding discrepancy, a job silently not added to a queue, a permission check failing in edge cases. The most dangerous bugs are often the quiet ones. They can come from semantic mismatches between languages — different arithmetic rules, string encoding behavior, or exception handling. These differences don’t always throw errors, but they can corrupt behavior subtly and silently, and their consequences may surface only far down the road.
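
To make one such mismatch concrete, consider rounding. Legacy business languages such as RPG typically round half away from zero (RPG’s “half-adjust”), while Python 3’s built-in round() rounds half to even. A minimal sketch of the difference, and of how a migrated codebase can preserve the legacy rule explicitly (legacy_round is an illustrative helper, not a standard function):

from decimal import Decimal, ROUND_HALF_UP

# Python 3's round() uses banker's rounding (round half to even):
print(round(2.5))        # 2, not the 3 a legacy half-adjust would give
print(round(2.125, 2))   # 2.12 (2.125 is exactly representable in binary)

# A faithful translation makes the legacy rounding rule explicit:
def legacy_round(value: str, places: str = "0.01") -> Decimal:
    """Round half away from zero, matching typical fixed-decimal legacy semantics."""
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_UP)

print(legacy_round("2.125"))  # 2.13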

So validation is about defining a method that ensures you catch as many errors as possible before they cause damage.

Another point to consider – confidence

There’s another layer – confidence. Even if the system “works,” a migration project can still fail if it doesn’t inspire trust. If developers, stakeholders, or users suspect hidden problems — or worse, don’t know how correctness was verified — the project is at risk. Teams may refuse to adopt the new system. Management may pull the plug halfway through. And the business may stick with the old, costly platform simply because no one proved it was safe to move on.

Validation, therefore, isn’t an afterthought. It’s a precondition for success. It enables the migration to proceed, so that you can reach your objectives: easier hiring, better integrations, improved tooling. To get that, we need to avoid having the project cancelled after 12 or 24 months of effort because the organization doesn’t trust the results being produced.

The Two-Pronged Approach to Validation

When we say we want to validate a migration, we mean something very specific. The new system must behave, from the user’s point of view, like the original. We assume the original system is correct and we want to preserve its externally visible behavior.

This does not mean the two systems, the legacy and the new one, must be identical at a low level. Quite the opposite. Internal differences in architecture, file formats, or code structure are expected and often necessary for the new system to be idiomatic and maintainable. One core challenge is understanding what must remain the same (external behavior). All the rest can (and possibly should) change (internal structure).

We suggest a structured validation approach that can be broken down into two distinct, complementary activities: first, ensuring the technical quality of the new code, and second, verifying that its business behavior is identical to the original.

1. Quality Validation: Ensuring the New Code is Clean

Before verifying complex business logic, the first step is to ensure the generated code is technically sound and free of common defects. This provides a quality baseline.

Use Automated Static Analysis: Modern languages have a rich ecosystem of tools that can automatically scan code for bugs, style issues, and security vulnerabilities. This is low-hanging fruit that offers great value with minimal effort. For example:

  • For Python, tools like ruff (a fast linter) and mypy (a static type checker) are indispensable; a short sketch after this list shows the kind of defect they catch.
  • For Java, consider using Error Prone — a static analysis tool developed by Google that catches common and subtle Java mistakes at compile time. It integrates with javac and works well with Maven and Gradle. You can also look at NullAway, a fast and practical nullability checker built on top of Error Prone. If you don’t mind drowning in false positives, there are also tools like SpotBugs or SonarQube.
  • For C#, tools like dotnet format (for code style and formatting) and Roslyn Analyzers (for enforcing coding conventions and detecting issues) can be run via the CLI. You can also integrate tools like StyleCop.Analyzers and FxCopAnalyzers into your project for additional checks during builds.
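
As an illustration of the kind of defect these tools catch in freshly generated code, here is a small Python sketch (the migrated functions are hypothetical); mypy flags the optional lookup result being used without a None check:

from typing import Optional

def find_customer_discount(customer_id: str) -> Optional[float]:
    """Hypothetical migrated lookup: returns None when no discount exists."""
    discounts = {"C001": 0.05, "C002": 0.10}
    return discounts.get(customer_id)

def apply_discount(price: float, customer_id: str) -> float:
    discount = find_customer_discount(customer_id)
    # mypy reports an error here, because discount may be None;
    # the legacy code may have relied on an implicit default instead.
    return price * (1 - discount)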

Validate the Transpiler: When adopting an automated approach to code migration, the quality and correctness of the transpiler become absolutely critical. The core assumption is that if the transpiler is correct, then the output code it produces will also be correct. Therefore, validating the transpiler is not just a best practice — it’s a foundational requirement.

This validation is typically achieved through a combination of unit tests and targeted scenario tests that verify the behavior of individual components within the transpiler. Since the transpiler is usually composed of several transformation units — each responsible for handling a specific construct or pattern — we can write tests that focus on each unit in isolation. If each of these building blocks behaves correctly on its own, we can be reasonably confident that their composition will also behave correctly when generating complete output programs.
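
As a sketch of what such tests can look like, assuming a hypothetical transpile_expression function that translates a single RPG expression into Python source:

import pytest

# Hypothetical API: one transformation unit of the transpiler,
# translating a single RPG expression into Python source.
from mytranspiler import transpile_expression

@pytest.mark.parametrize("rpg_source, expected_python", [
    # Each case exercises one construct in isolation.
    ("%TRIM(name)", "name.strip()"),
    ("%SUBST(code:1:3)", "code[0:3]"),
    ("qty * price", "qty * price"),
])
def test_expression_translation(rpg_source, expected_python):
    assert transpile_expression(rpg_source) == expected_python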

Moreover, one of the strengths of automated transpilation is its consistency. The same inputs will always produce the same outputs, following well-defined and repeatable rules. This consistency gives us an additional layer of confidence: if we inspect a representative set of generated outputs and find them correct, we can infer that other outputs — even ones we haven’t manually reviewed — will also be correct, as they are produced by the same logic and transformations.

This approach doesn’t eliminate the need for careful review or testing, but it does mean that by investing effort into validating the transpiler’s components and transformations upfront, we gain confidence across the entire space of possible outputs — including those we can’t explicitly anticipate.

2. Equivalence Verification: Matching Behavior Against an Oracle

This is the heart of the challenge: proving that the new system behaves correctly in all relevant scenarios. This corresponds to creating an oracle (a source of truth for expected behavior) and then comparing the migrated system against it.

Step 1: Defining the Oracle (The Expected Behavior)

In most real-world legacy systems, detailed functional tests or formal specifications don’t exist, so we need to define the expected behavior ourselves. There are two main ways to do this:

  • A) Write Explicit Behavioral Scenarios: This method involves using a structured, plain-text language to describe what the system should do. We recommend using Gherkin, which is designed to be readable by domain experts and technical stakeholders alike. A typical Gherkin test follows a Given-When-Then format:
Feature: Order processing

  Scenario: Valid order reduces warehouse stock
    Given the warehouse contains 10 units of item A
    When I place an order for 3 units of item A
    Then the warehouse should contain 7 units of item A
    And a shipment label should be generated
    And the customer should be charged

While this takes effort, it forces clarity. These scenarios become executable tests by linking each step to a function in your testing code (a “step definition”) that performs the described action or verification.
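
For instance, with a Python BDD framework such as behave, the step definitions for the scenario above could look like the following sketch (the warehouse and orders objects are hypothetical test fixtures):

# Step definitions for the scenario above, using the behave framework.
# context.warehouse and context.orders are hypothetical test fixtures.
from behave import given, when, then

@given("the warehouse contains {count:d} units of item {item}")
def step_stock(context, count, item):
    context.warehouse.set_stock(item, count)

@when("I place an order for {count:d} units of item {item}")
def step_order(context, count, item):
    context.result = context.orders.place(item, count)

@then("the warehouse should contain {count:d} units of item {item}")
def step_check_stock(context, count, item):
    assert context.warehouse.stock(item) == count

@then("a shipment label should be generated")
def step_check_label(context):
    assert context.result.shipment_label is not None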

Once you’ve written a few Gherkin scenarios, you could experiment with using LLMs to generate more—provided you review and validate them carefully. This could accelerate the creation of a reliable oracle.

  • B) Record Actual System Executions: For highly complex or poorly understood “black box” programs, the most reliable way to define the oracle is to observe and record the legacy system in action. This empirical method follows this process:
    1. Set up an isolated test environment with a perfect copy of the legacy system to avoid affecting production.
    2. Prepare an initial state by populating the database with specific, carefully selected test data. It is important that this data set is limited and contains no confidential information.
    3. Execute a sequence of operations on the legacy system, mimicking real user actions.
    4. Capture all relevant outputs, including not just database changes but also any domain-relevant “side effects,” such as the content of a printed label or data added to a business process, while discarding low-level artifacts like temporary files.
    This recording—the initial state, the actions, and all resulting outputs—becomes a highly accurate test case that the new system must replicate, as sketched below.
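
A minimal sketch of what one such recording could look like when serialized as a “golden file” (the structure and names are illustrative, not a fixed format):

import json

# Illustrative structure for one recorded execution of the legacy system.
recording = {
    "initial_state": {
        "warehouse": [{"item": "A", "units": 10}],
    },
    "actions": [
        {"operation": "place_order", "item": "A", "units": 3},
    ],
    "expected_outputs": {
        "warehouse": [{"item": "A", "units": 7}],
        "shipment_labels": 1,
        "customer_charged": True,
    },
}

with open("recordings/order_basic.json", "w") as f:
    json.dump(recording, f, indent=2)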

Step 2: Comparing the New System to the Oracle

Once you’ve clearly defined what the original system is supposed to do — whether through expected outputs, behavioral specifications, or a set of regression tests — the next step is to validate that the migrated system behaves identically. In this context, the original system (or its well-defined behavior) serves as the oracle: the trusted source of truth against which the new system is evaluated.

To perform this comparison, you run the migrated system with the same inputs as the original and observe its outputs, side effects, and behavior. You’re looking to ensure that the new system replicates the functional behavior of the original system.
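
Under the assumptions above, this comparison can be automated: replay each recording against the migrated system and diff the captured outputs against the golden file. A hypothetical sketch, assuming a small test adapter around the new system:

import json
from pathlib import Path

# Hypothetical adapter around the migrated system; it must be able to
# load an initial state, execute recorded actions, and dump its outputs.
from new_system_test_adapter import load_state, execute, capture_outputs

def replay(recording_path: Path) -> None:
    recording = json.loads(recording_path.read_text())
    load_state(recording["initial_state"])
    for action in recording["actions"]:
        execute(action)
    actual = capture_outputs()
    expected = recording["expected_outputs"]
    assert actual == expected, f"{recording_path.name}: {expected} != {actual}"

for path in Path("recordings").glob("*.json"):
    replay(path)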

Side Effect of Validation: Building a Key Asset

The significant effort poured into validation is not a one-off cost but a strategic investment in the future health of your system. The test suite you build becomes a permanent, high-value asset that delivers benefits long after the migration is complete:

  • It Becomes a Living Specification. Unlike static documents that quickly become outdated, your Gherkin scenarios serve as a perpetually accurate and executable description of what your system does. This provides lasting clarity and can accelerate the onboarding of new developers.
  • It Creates a Safety Net for Future Changes. The test suite catches regressions automatically, giving your team the confidence to refactor code, add new features, and evolve the system without the constant fear of introducing bugs.

So this becomes another fundamental contribution towards the migration’s goal – making the new system better to work with than the legacy one.

Summary

There is no silver bullet for validating a language migration. The process is a disciplined engineering effort that requires structure, planning, and a clear understanding of the goal: to build a bridge of trust from the old system to the new. By adopting a two-pronged approach—verifying technical quality through static analysis and ensuring behavioral correctness against a well-defined oracle—it is possible to move forward with confidence and not just hope for the best.

A migration isn’t just a technical operation. It’s a way to carry forward the embedded knowledge and business value in your legacy system, while shedding the technical debt. Done right, it transforms risk into opportunity and uncertainty into confidence. There’s no magic, but with discipline and the right validation techniques, there is a clear path forward.

 