The challenge is transforming RPG code into clear sequence diagrams. In the dynamic landscape of software development, legacy languages like RPG continue to play a crucial role, particularly in sectors like finance and banking. Navigating and maintaining legacy code, however, presents a unique set of challenges. To address them, Strumenta has developed an RPG parser that can serve as the foundation for many tools, such as transpilers, language servers, and code analysis tools. In this article, we present a simple Java implementation of a transpiler that converts RPG code into a PlantUML sequence diagram.
RPG parser
Before jumping into the implementation let’s take a look at the features of the Strumenta RPG parser. The parser supports fixed and free format RPG code and it can also parse physical and logical file definitions (DDS). A symbol resolution module is available as an extension of the parser. The parser is written in Kotlin and is based on the open source Kolasu AST library. The code in this article uses the Java programming language to illustrate that Kotlin is not a requirement to run the parser. The parser produces as output an Abstract Syntax Tree (AST) that can be traversed and processed programmatically using the Kolasu API.
The Language Engineering Pipeline
The Java code presented is a basic implementation of the Language Engineering Pipeline architecture, a structured approach to language processing and translation. This approach is rooted in Model Driven Development, where models (in this case the Abstract Syntax Trees) play a crucial role.
In this architecture, the output of a component is used as input for the next component in the pipeline. The diagram below illustrates the conceptual architecture of the pipeline.

The pipeline has 3 main components:
- SourceToModel: This component focuses on parsing the RPG source code using the RPG parser and generating the model (AST).
- ModelToModel: This component is responsible for the core conversion logic. It walks the RPG AST, analyzing each statement or expression type. Based on the type, it applies the matching conversion logic to generate the corresponding PlantUML AST node.
- ModelToSource: This component bridges the gap between the internal PlantUML model and the textual PlantUML code.
The Java implementation of the pipeline looks like the following code.
The source file is available in the repository: RPGtoPUML.java
Pipeline pipeline = new Pipeline(
    new SourceToModel(inputFile, outputFile),
    new ModelToModel(inputFile, outputFile),
    new ModelToSource(inputFile, outputFile)
);
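To make the chaining idea concrete, here is a minimal sketch of how such a pipeline could pass the output of one stage to the next. The `Stage` interface and the `PipelineSketch` class are hypothetical stand-ins written for this illustration; the repository's own `Pipeline` class works on Kolasu `Node` objects, declares `throws Exception`, and may be organized differently.

```java
import java.util.List;

/** Hypothetical stand-in for one pipeline stage (the article's classes use transform(Node) instead). */
interface Stage {
    Object transform(Object model);
}

/** Runs the stages in order, feeding each stage the output of the previous one. */
class PipelineSketch {
    private final List<Stage> stages;

    PipelineSketch(Stage... stages) {
        this.stages = List.of(stages);
    }

    Object run() {
        Object model = null; // the first stage reads its own input, so it ignores this value
        for (Stage stage : stages) {
            model = stage.transform(model);
        }
        return model;
    }

    public static void main(String[] args) {
        PipelineSketch pipeline = new PipelineSketch(
            // Stands in for SourceToModel: produces the "source model"
            ignored -> "EXSR clrsum",
            // Stands in for ModelToModel: extracts the caller and the subroutine name
            source -> List.of("CUS300.rpgle", ((String) source).substring(5)),
            // Stands in for ModelToSource: prints one PlantUML arrow
            model -> {
                List<?> parts = (List<?>) model;
                return parts.get(0) + " -> " + parts.get(1) + " : " + parts.get(1);
            }
        );
        System.out.println(pipeline.run()); // prints "CUS300.rpgle -> clrsum : clrsum"
    }
}
```

Keeping each stage behind the same small interface is what lets a stage be swapped (for example, a different code generator) without touching the others.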
Each component of the pipeline is implemented in a specific Java class that performs a specific transformation. For example, the SourceToModel class makes use of the Strumenta RPG parser to transform the RPGLE source code into the corresponding AST.
The source file is available in the repository: SourceToModel.java
/**
 * Performs the transformation of RPG source code into an AST.
 * Reads the source code from the input file, parses it using
 * RPGKolasuParser and returns the resulting AST.
 *
 * @param model Not used in this implementation as the transformation is
 *              from source code to AST.
 * @return The root node of the generated AST representing the
 *         parsed RPG code.
 * @throws Exception if the input file is invalid or the parsing fails.
 */
@Override
public Node transform(Node model) throws Exception {
    if (inputFile.isFile() && inputFile.exists()) {
        RPGKolasuParser rpgParser = RPGKolasuParser.parserFromExtension(inputFile);
        ParsingResult result = rpgParser.parse(inputFile);
        if (result.getCorrect()) {
            return result.getRoot();
        }
    }
    throw new Exception(String.format("Invalid file '%s'", inputFile.toPath()));
}
The code shows how the RPGKolasuParser is instantiated, in this case detecting the extension of the input file, and then the instance is used to parse the file and create the AST. Although the RPG parser is implemented in Kotlin, the integration in Java happens seamlessly.
The second step, ModelToModel, transforms the RPG AST into the PUML (PlantUML) AST, mapping each supported RPG statement to its corresponding PUML node. Below is the class diagram of the PlantUML AST.
The source file is available in the repository: PUMLDiagram.java

The abstract syntax tree implements just a few of the constructs available in PlantUML, but it is enough for this example. The class PUMLNode extends the Node class provided by the Kolasu AST library. Below is the Java code of the ModelToModel transformation implementation.
The source file is available in the repository: ModelToModel.java
/**
 * Transforms a given RPG AST into a PUML diagram.
 *
 * @param model the RPG AST to be transformed.
 * @return the AST representing the PUML diagram.
 * @throws Exception if the transformation fails.
 */
@Override
public Node transform(Node model) throws Exception {
    PUMLDiagram target = new PUMLDiagram();
    String file = inputFile.getName();
    if (model instanceof CompilationUnit) {
        CompilationUnit cu = (CompilationUnit) model;
        // If the RPG code contains an initialization subroutine, it is executed first
        for (Subroutine s : cu.getSubroutines()) {
            if (s.isInitializationSubroutine()) {
                target.add(new PUMInvoke(file, file, "inzsr", List.of()));
            }
        }
        // Process the main statements
        for (Statement s : cu.getMainStatements()) {
            target.add(transformStatement(cu, s));
        }
        return target;
    }
    throw new Exception(String.format("Invalid input Model: %s", model.getClass().getName()));
}
The code loops over the Subroutine collection to check whether an initialization subroutine (INZSR in RPG) is present. If so, it adds it as the first invocation. The transformStatement method then processes the AST nodes of the main program statements. The implementation in this example is designed to identify some common patterns in RPG, such as loops and file operations.
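The repository contains the real transformStatement; the sketch below only illustrates the general shape of a type-based dispatch from statements to diagram nodes. All classes here (`Stmt`, `ExsrStmt`, `FileOpStmt`, `DouStmt`, `PumlNode`, `Invoke`, `Loop`, `StatementMapper`) are hypothetical stand-ins invented for this example; they are not the parser's AST classes or the article's PUML classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for RPG AST statements (not the parser's real classes).
abstract class Stmt {}

class ExsrStmt extends Stmt {
    final String name;
    ExsrStmt(String name) { this.name = name; }
}

class FileOpStmt extends Stmt {
    final String opcode;
    final String file;
    FileOpStmt(String opcode, String file) { this.opcode = opcode; this.file = file; }
}

class DouStmt extends Stmt {
    final String condition;
    final List<Stmt> body;
    DouStmt(String condition, List<Stmt> body) { this.condition = condition; this.body = body; }
}

// Hypothetical stand-ins for the PlantUML AST nodes.
abstract class PumlNode {}

class Invoke extends PumlNode {
    final String source;
    final String target;
    final String message;
    Invoke(String source, String target, String message) {
        this.source = source; this.target = target; this.message = message;
    }
}

class Loop extends PumlNode {
    final String condition;
    final List<PumlNode> children = new ArrayList<>();
    Loop(String condition) { this.condition = condition; }
}

class StatementMapper {
    /** Maps one statement to a diagram node; 'caller' is the participant issuing it. */
    static PumlNode transformStatement(String caller, Stmt s) {
        if (s instanceof ExsrStmt) {
            // A subroutine call becomes an arrow from the caller to the subroutine
            String name = ((ExsrStmt) s).name;
            return new Invoke(caller, name, name);
        }
        if (s instanceof FileOpStmt) {
            // A file operation becomes an arrow from the caller to the file
            FileOpStmt op = (FileOpStmt) s;
            return new Invoke(caller, op.file, op.opcode + " " + op.file);
        }
        if (s instanceof DouStmt) {
            // A DOU loop becomes a loop block whose children are mapped recursively
            DouStmt dou = (DouStmt) s;
            Loop loop = new Loop("UNTIL " + dou.condition);
            for (Stmt child : dou.body) {
                loop.children.add(transformStatement(caller, child));
            }
            return loop;
        }
        throw new IllegalArgumentException("Unsupported statement: " + s.getClass().getSimpleName());
    }
}
```

For instance, mapping a `DouStmt` with condition `NOT %EOF(CUSTOMER)` and a single `READ CUSTOMER` child yields a `Loop` labelled `UNTIL NOT %EOF(CUSTOMER)` that contains one `Invoke` targeting `CUSTOMER`.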
The last step of the pipeline, ModelToSource, consists of the PlantUML code generation. It utilizes a template-based approach, where placeholders are filled with the generated PlantUML statements based on the processed AST. Additionally, it employs a strategy pattern (using a nodePrinters map) to associate specific PUML node types with their corresponding printing functions, ensuring proper code generation for different elements.
Check out PUMLCodeGenerator.java source code on GitHub.
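As a rough illustration of the two ideas just described (a template with a placeholder, and a map from node type to printing function), consider the sketch below. It is not the code of PUMLCodeGenerator.java: the `Arrow`, `LoopBlock`, and `PrinterSketch` classes, the `${body}` placeholder, and the shape of the `nodePrinters` map are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical diagram nodes used only in this sketch.
class Arrow {
    final String source;
    final String target;
    final String message;
    Arrow(String source, String target, String message) {
        this.source = source; this.target = target; this.message = message;
    }
}

class LoopBlock {
    final String condition;
    final List<?> body;
    LoopBlock(String condition, List<?> body) { this.condition = condition; this.body = body; }
}

class PrinterSketch {
    // Assumed template: the placeholder is replaced with the printed statements
    private static final String TEMPLATE = "@startuml\n${body}@enduml\n";

    // Strategy map: each node type is associated with its printing function
    private final Map<Class<?>, Function<Object, String>> nodePrinters = new HashMap<>();

    PrinterSketch() {
        nodePrinters.put(Arrow.class, node -> {
            Arrow a = (Arrow) node;
            return a.source + " -> " + a.target + " : " + a.message + "\n";
        });
        nodePrinters.put(LoopBlock.class, node -> {
            LoopBlock loop = (LoopBlock) node;
            StringBuilder sb = new StringBuilder("loop " + loop.condition + "\n");
            for (Object child : loop.body) {
                sb.append(print(child)); // nested nodes go through the same map
            }
            return sb.append("end\n").toString();
        });
    }

    /** Looks up the printer registered for the node's class and applies it. */
    String print(Object node) {
        Function<Object, String> printer = nodePrinters.get(node.getClass());
        if (printer == null) {
            throw new IllegalArgumentException("No printer for " + node.getClass().getSimpleName());
        }
        return printer.apply(node);
    }

    /** Prints all nodes and fills the template placeholder with the result. */
    String generate(List<?> nodes) {
        StringBuilder body = new StringBuilder();
        for (Object node : nodes) {
            body.append(print(node));
        }
        return TEMPLATE.replace("${body}", body);
    }
}
```

With this arrangement, supporting a new PlantUML construct means registering one more entry in the map rather than extending a long conditional.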
The table below summarizes some of the various constructs of the RPG code that are transformed into PlantUML code and how they are displayed. CUS300.rpgle is a sample program.
RPG code | PlantUML code | Diagram |
INZSR | CUS300.rpgle -> CUS300.rpgle : inzsr | |
SETLL *LOVAL CUSTOMER | CUS300.rpgle -> CUSTOMER : SETLL *LOVAL CUSTOMER | |
READ CUSTOMER | CUS300.rpgle -> CUSTOMER : READ CUSTOMER | |
DOU NOT %EOF(CUSTOMER) | loop UNTIL NOT %EOF(CUSTOMER) end | |
EXSR clrsum | CUS300.rpgle -> clrsum : clrsum | |
In this implementation, in addition to the calls to subroutines, we also wanted to highlight the access to the files so that by examining the sequence diagram it would be possible to identify what files are involved and their mode of operation (Read/Write).
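Because file operations appear as arrows whose target is the file, the same information can also be extracted from the generated text. The sketch below scans PlantUML arrow lines and reports how each file is accessed. It is not part of the article's transpiler: the `FileAccessSummary` class and the split of opcodes into read and write groups are assumptions made for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class FileAccessSummary {
    // Assumed classification of RPG file opcodes; adjust it to your own conventions.
    private static final Set<String> READ_OPS = Set.of("SETLL", "READ", "CHAIN");
    private static final Set<String> WRITE_OPS = Set.of("WRITE", "UPDATE", "DELETE");

    /**
     * Scans arrow lines such as "clrsum -> ORDSUM : DELETE ORDSUM" and returns,
     * for each file, "Read", "Write", or "Read/Write".
     */
    static Map<String, String> summarize(List<String> plantUmlLines) {
        Map<String, String> modes = new TreeMap<>();
        for (String line : plantUmlLines) {
            int arrow = line.indexOf(" -> ");
            int colon = line.indexOf(" : ");
            if (arrow < 0 || colon < arrow) {
                continue; // not an arrow line with a message
            }
            String target = line.substring(arrow + 4, colon).trim();
            String opcode = line.substring(colon + 3).trim().split("\\s+")[0].toUpperCase();
            String mode;
            if (WRITE_OPS.contains(opcode)) {
                mode = "Write";
            } else if (READ_OPS.contains(opcode)) {
                mode = "Read";
            } else {
                continue; // subroutine call or another kind of message
            }
            // Combine with any mode already recorded for this file
            modes.merge(target, mode, (old, now) -> old.equals(now) ? old : "Read/Write");
        }
        return modes;
    }
}
```

Applied to arrow lines like those in the table above, a file that is only positioned and read is reported as `Read`, while a file that is both read and deleted from is reported as `Read/Write`.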
Generating the Diagram
Let’s examine the example RPG program used in this article and the corresponding PlantUML sequence diagram. The CUS300.rpgle program uses 3 files (CUSTOMER, ORDERS, and ORDSUM), defines the INZSR initialization subroutine, and defines 3 further subroutines: clrsum, dspcus, and calctotal.
The file and data definitions and the INZSR subroutine use fixed format, while the main calculations are written in free format and contain some classic RPG constructs to read/write data files.
F**********************************************************************
F* *
F* PROGRAM ID : CUS300 *
F* PROGRAM NAME: SAMPLE PROGRAM *
F* *
F**********************************************************************
D DSP S 50 INZ('CUSTOMER')
D TOTAL S 9P 2 INZ(0)
D NUM S 9P 2 INZ(1)
D CNT S 9P 2 INZ(0)
FCUSTOMER UF E K Disk
FORDERS IF E K Disk
FORDSUM IF E K Disk
C *INZSR BEGSR
C EVAL DSP='CUSTOMER REPORT'
C DSPLY DSP
C ENDSR
/free
CNT = 0;
TOTAL = 0;
EXSR clrsum;
DSPLY '------ Forward ------';
Setll *Loval CUSTOMER;
Dou NOT %EOF(CUSTOMER);
  Read CUSTOMER;
  If NOT %EOF(CUSTOMER);
    EXSR calctotal;
    EXSR dspcus;
    If TOTAL > 0;
      OSCUID = CUID;
      TOTAL *= (TOTAL / CNT +1) * 0.1;
      OSTOT = TOTAL;
      OSCUNM = CUSTNM;
      Write ORDSUM;
    EndIf;
  EndIf;
EndDO;
Begsr calctotal;
  CNT = 0;
  TOTAL = 0;
  Setll *Loval ORDERS;
  Dou NOT %EOF(ORDERS);
    Read ORDERS;
    If NOT %EOF(ORDERS);
      If CUID = ORCUID;
        TOTAL += ORTOT;
        CNT += 1;
        Update ORDERS;
      EndIf;
    EndIf;
  EndDo;
EndSr;
Begsr dspcus;
  If TOTAL > 0;
    eval DSP='CUSTOMER: ' + CUSTNM + ' $' + TOTAL;
    DSPLY DSP;
  EndIf;
EndSr;
Begsr clrsum;
  CNT = 0;
  DSPLY '------ Delete ------';
  Setll *Loval ORDSUM;
  Dou NOT %EOF(ORDSUM);
    Read ORDSUM;
    If NOT %EOF(ORDSUM);
      delete ORDSUM;
      CNT+=1;
    EndIf;
  EndDO;
  DSPLY 'DELETED: ' + CNT + ' RECORDS';
EndSr;
Running the transpiler on the CUS300.rpgle file will generate the PlantUML diagram file named CUS300.rpgle.puml.
@startuml
'https://plantuml.com/sequence-diagram
!pragma teoz true
hide footbox
skinparam sequence {
  ArrowColor Black
  LifeLineBorderColor #000000
  LifeLineBackgroundColor #FFFFFF
  ParticipantBorderColor #000000
  ParticipantBackgroundColor #FFFFFF
  ParticipantFontColor #000000
}
client -> CUS300.rpgle :
CUS300.rpgle -> CUS300.rpgle : inzsr
CUS300.rpgle -> clrsum : clrsum
clrsum -> ORDSUM : SETLL *LOVAL ORDSUM
loop UNTIL NOT %EOF(ORDSUM)
  clrsum -> ORDSUM : READ ORDSUM
  group IF NOT %EOF(ORDSUM)
    clrsum -> ORDSUM : DELETE ORDSUM
  end
end
CUS300.rpgle -> CUSTOMER : SETLL *LOVAL CUSTOMER
loop UNTIL NOT %EOF(CUSTOMER)
  CUS300.rpgle -> CUSTOMER : READ CUSTOMER
  group IF NOT %EOF(CUSTOMER)
    CUS300.rpgle -> calctotal : calctotal
    calctotal -> ORDERS : SETLL *LOVAL ORDERS
    loop UNTIL NOT %EOF(ORDERS)
      calctotal -> ORDERS : READ ORDERS
      group IF NOT %EOF(ORDERS)
        group IF CUID = ORCUID
          calctotal -> ORDERS : UPDATE ORDERS
        end
      end
    end
    CUS300.rpgle -> dspcus : dspcus
    group IF TOTAL > 0
    end
    group IF TOTAL > 0
      CUS300.rpgle -> ORDSUM : WRITE ORDSUM
    end
  end
end
@enduml
Below is the diagram corresponding to the PlantUML code.
The diagram presents the sequence of operations performed by the RPG program. It starts with the execution of INZSR, then executes the clrsum subroutine, which loops over the records of the ORDSUM file and deletes each one. The program then continues by processing the records of the CUSTOMER file, and so forth.
Transforming RPG code into PlantUML sequence diagrams offers several advantages:
- Improved Code Comprehension: Visualizing the program flow through a PlantUML diagram makes it easier to understand the relationships between different parts of the code, especially for complex logic or interactions between subroutines.
- Enhanced Collaboration: PlantUML diagrams provide a universal language for developers to discuss and document the program’s functionality. This can improve communication and collaboration within a team.
- Efficient Debugging: By visualizing the execution flow, pinpointing errors or unexpected behavior in the code becomes more straightforward. Developers can identify issues by tracing the diagram and analyzing the interactions between elements.
The Java code presented in this example can be easily customized by implementing specific logic for selected RPG statements. The same pipeline could be used to generate code for other text-to-diagram tools such as d2lang.
Once your diagrams are in text form, you can feed them to your preferred Large Language Model (LLM) to generate detailed documentation, obtain best practices, and explore optimization strategies tailored to your architecture. You can take it a step further by vectorizing these insights and storing them in a vector database.
Source code available on GitHub: https://github.com/Strumenta/rpg-puml-sequence
Summary
In this article, we have used the Strumenta RPG parser in a configurable pipeline to build a transpiler from RPG to PlantUML, an approach that applies to language processing tools in general.
With its support for ASTs, traversing, transformations, and cross-referencing, the RPG parser offers a comprehensive set of tools for working with source code.
By leveraging these features, developers can build powerful language processing tools to explore things that simply could not be seen before.