This article tackles the challenge of transforming RPG code into clear sequence diagrams. In the dynamic landscape of software development, legacy languages like RPG continue to play a crucial role, particularly in sectors like finance and banking. Navigating and maintaining legacy code, however, presents a unique set of challenges. To address them, Strumenta has developed an RPG parser that can be used to build many tools, such as transpilers, language servers, or code analysis tools. In this article, we present a simple Java implementation of a transpiler that converts RPG code into a PlantUML sequence diagram.

RPG parser

Before jumping into the implementation, let’s take a look at the features of the Strumenta RPG parser. The parser supports fixed and free format RPG code, and it can also parse physical and logical file definitions (DDS). A symbol resolution module is available as an extension of the parser. The parser is written in Kotlin and is based on the open source Kolasu AST library. The code in this article uses the Java programming language to illustrate that Kotlin is not a requirement to run the parser. The parser produces as output an Abstract Syntax Tree (AST) that can be traversed and processed programmatically using the Kolasu API.

The Language Engineering Pipeline

The Java code presented is a basic implementation of the Language Engineering Pipeline architecture, a structured approach to language processing and translation. This approach is rooted in Model Driven Development, where models (in this case the Abstract Syntax Trees) play a crucial role.

In this architecture, the output of a component is used as input for the next component in the pipeline. The diagram below illustrates the conceptual architecture of the pipeline.

The pipeline has 3 main components:

  • SourceToModel: This component focuses on parsing the RPG source code using the RPG parser and generating the model (AST).
  • ModelToModel: This component is responsible for the core conversion logic. It walks the RPG AST, analyzing each statement or expression type. Based on the type, it applies the corresponding transformation logic to generate the matching PlantUML AST node.
  • ModelToSource: This component bridges the gap between the internal PlantUML model and the textual PlantUML code. 

The Java implementation of the pipeline looks like the following code.

The full class is available in the repository: RPGtoPUML.java

Pipeline pipeline = new Pipeline(
    new SourceToModel(inputFile, outputFile),
    new ModelToModel(inputFile, outputFile),
    new ModelToSource(inputFile, outputFile)
);
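
The Pipeline class itself is not listed in this article. As a rough sketch of the idea, and not the actual implementation, a pipeline can simply feed the output of each component into the next one. The Stage interface and SketchPipeline class below are hypothetical stand-ins; plain Object is used in place of the Kolasu Node type, and checked exceptions are left out, to keep the example self-contained.

```java
import java.util.List;

// Hypothetical stand-in for a pipeline component: takes a model, returns a model.
interface Stage {
    Object transform(Object model);
}

// Hypothetical sketch of a pipeline that chains its stages in order.
final class SketchPipeline {
    private final List<Stage> stages;

    SketchPipeline(Stage... stages) {
        this.stages = List.of(stages);
    }

    // Runs every stage, passing the output of one stage as the input of the next.
    Object run() {
        Object model = null; // the first stage reads its input from a file, so it ignores this value
        for (Stage stage : stages) {
            model = stage.transform(model);
        }
        return model;
    }
}
```

With this shape, adding, removing, or replacing a step only changes the list of stages passed to the constructor; the loop that drives the pipeline stays the same.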

Each component of the pipeline is implemented in its own Java class that performs a specific transformation. For example, the SourceToModel class makes use of the Strumenta RPG parser to transform the RPGLE source code into the corresponding AST.

The full class is available in the repository: SourceToModel.java

/**
 * Performs the transformation of RPG source code into an AST.
 * Reads the source code from the input file, parses it using
 * RPGKolasuParser and returns the resulting AST.
 *
 * @param model Not used in this implementation as the transformation is
 * from source code to AST.
 * @return The root node of the generated AST representing the 
 * parsed RPG code.
 * @throws Exception if the input file is invalid or the parsing fails.
 */
@Override
public Node transform(Node model) throws Exception {
   if (inputFile.isFile() && inputFile.exists()) {
      RPGKolasuParser rpgParser = RPGKolasuParser.parserFromExtension(inputFile);
      ParsingResult result = rpgParser.parse(inputFile);
      if (result.getCorrect()) {
         return result.getRoot();
      }
   }
   throw new Exception(String.format("Invalid file '%s'", inputFile.toPath()));
}

The code shows how the RPGKolasuParser is instantiated, in this case detecting the extension of the input file, and then the instance is used to parse the file and create the AST. Although the RPG parser is implemented in Kotlin, the integration in Java happens seamlessly. 

The second step ModelToModel transforms an RPG AST into the PUML (PlantUML) AST and provides the implementation for transforming various RPG statements into their corresponding PUML AST node representations. Below is the class diagram of the PlantUML AST. 

The root class of the PlantUML AST is available in the repository: PUMLDiagram.java


The abstract syntax tree implements just a few of the constructs available in PlantUML, but it is enough for this example. The class PUMLNode extends the Node class provided by the Kolasu AST library. Below is the Java code of the ModelToModel transformation implementation.

The full class is available in the repository: ModelToModel.java

/**
  * Transforms a given RPG AST into a PUML diagram.
  *
  * @param model the RPG AST to be transformed.
  * @return the AST representing the PUML diagram.
  * @throws Exception if the transformation fails.
  */
@Override
public Node transform(Node model) throws Exception {
   PUMLDiagram target = new PUMLDiagram();
   String file = inputFile.getName();

   if (model instanceof CompilationUnit) {
      CompilationUnit cu = (CompilationUnit) model;
      // If the RPG code contains an initialization subroutine, it is executed first
      for (Subroutine s : cu.getSubroutines()) {
         if (s.isInitializationSubroutine()) {
            target.add(new PUMInvoke(file, file, "inzsr", List.of()));
         }
      }
      // Process the main statements
      for (Statement s : cu.getMainStatements()) {
         target.add(transformStatement(cu, s));
      }
      return target;
   }
   throw new Exception(String.format("Invalid input Model: %s",
                       model.getClass().getName()));
}

The code loops over the Subroutine collection to check whether there is an initialization subroutine (INZSR in RPG). If one is present, it is added as the first invocation. The transformStatement method then processes the AST nodes of the main program statements. The implementation in this example is designed to identify some common patterns in RPG, such as loops and file operations.
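
The body of transformStatement is not listed here. The following is a minimal sketch of the kind of type-based dispatch described above, not the actual implementation. The Stmt, FileOp, Exsr, and DoUntil types are simplified, hypothetical stand-ins for the RPG AST classes, and the result is kept as plain PlantUML lines instead of PlantUML AST nodes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for RPG AST statement nodes.
interface Stmt {}

final class FileOp implements Stmt {          // e.g. READ CUSTOMER
    final String opcode, file;
    FileOp(String opcode, String file) { this.opcode = opcode; this.file = file; }
}

final class Exsr implements Stmt {            // e.g. EXSR clrsum
    final String subroutine;
    Exsr(String subroutine) { this.subroutine = subroutine; }
}

final class DoUntil implements Stmt {         // e.g. DOU NOT %EOF(CUSTOMER)
    final String condition;
    final List<Stmt> body;
    DoUntil(String condition, List<Stmt> body) { this.condition = condition; this.body = body; }
}

final class StatementSketch {
    // Maps one statement to PlantUML lines; 'caller' is the current participant.
    static List<String> transform(String caller, Stmt s) {
        List<String> out = new ArrayList<>();
        if (s instanceof FileOp) {
            // File access becomes a message from the caller to the file
            FileOp op = (FileOp) s;
            out.add(caller + " -> " + op.file + " : " + op.opcode + " " + op.file);
        } else if (s instanceof Exsr) {
            // Subroutine call becomes a message from the caller to the subroutine
            Exsr call = (Exsr) s;
            out.add(caller + " -> " + call.subroutine + " : " + call.subroutine);
        } else if (s instanceof DoUntil) {
            // Loop becomes a loop/end block wrapping the transformed body
            DoUntil loop = (DoUntil) s;
            out.add("loop UNTIL " + loop.condition);
            for (Stmt child : loop.body) {
                out.addAll(transform(caller, child));
            }
            out.add("end");
        }
        // Statements with no diagram counterpart produce no lines.
        return out;
    }
}
```

Supporting a new RPG statement in a design like this means adding one more branch that recognizes the node type and emits the matching diagram element.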

The last step of the pipeline, ModelToSource, consists of the PlantUML code generation. It utilizes a template-based approach, where placeholders are filled with the generated PlantUML statements based on the processed AST. Additionally, it employs a strategy pattern (using a nodePrinters map) to associate specific PUML node types with their corresponding printing functions, ensuring proper code generation for different elements. 
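
To make the nodePrinters idea more concrete, here is a minimal sketch of a printer table combined with a very small template. It is an illustration under simplifying assumptions, not the code of PUMLCodeGenerator.java: the PNode, Invoke, and Loop classes and the %BODY% placeholder are hypothetical, and only two node types are handled.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified PlantUML AST nodes.
abstract class PNode {}

final class Invoke extends PNode {
    final String from, to, label;
    Invoke(String from, String to, String label) { this.from = from; this.to = to; this.label = label; }
}

final class Loop extends PNode {
    final String condition;
    final List<PNode> body;
    Loop(String condition, List<PNode> body) { this.condition = condition; this.body = body; }
}

final class PrinterSketch {
    // Strategy table: node type -> function that renders that node as text.
    private final Map<Class<? extends PNode>, Function<PNode, String>> nodePrinters = new HashMap<>();

    PrinterSketch() {
        nodePrinters.put(Invoke.class, n -> {
            Invoke i = (Invoke) n;
            return i.from + " -> " + i.to + " : " + i.label + "\n";
        });
        nodePrinters.put(Loop.class, n -> {
            Loop l = (Loop) n;
            StringBuilder sb = new StringBuilder("loop UNTIL " + l.condition + "\n");
            for (PNode child : l.body) {
                sb.append(print(child)); // nested nodes go through the same table
            }
            return sb.append("end\n").toString();
        });
    }

    // Looks up the printer registered for the node's concrete class.
    String print(PNode node) {
        Function<PNode, String> printer = nodePrinters.get(node.getClass());
        if (printer == null) {
            throw new IllegalArgumentException("No printer for " + node.getClass().getSimpleName());
        }
        return printer.apply(node);
    }

    // Template step: the rendered statements replace a placeholder in a fixed skeleton.
    String generate(List<PNode> nodes) {
        StringBuilder body = new StringBuilder();
        for (PNode n : nodes) {
            body.append(print(n));
        }
        return "@startuml\n%BODY%@enduml\n".replace("%BODY%", body.toString());
    }
}
```

Keeping the printers in a map means that the traversal code never needs to know which node types exist; registering a new printer is enough to support a new PlantUML element.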

Check out PUMLCodeGenerator.java source code on GitHub.

The table below summarizes some of the various constructs of the RPG code that are transformed into PlantUML code and how they are displayed. CUS300.rpgle is a sample program.

RPG code | PlantUML code | Diagram
--- | --- | ---
INZSR | CUS300.rpgle -> CUS300.rpgle : inzsr | [image]
SETLL *LOVAL CUSTOMER | CUS300.rpgle -> CUSTOMER : SETLL *LOVAL CUSTOMER | [image]
READ CUSTOMER | CUS300.rpgle -> CUSTOMER : READ CUSTOMER | [image]
DOU NOT %EOF(CUSTOMER) | loop UNTIL NOT %EOF(CUSTOMER) / end | [image]
EXSR clrsum | CUS300.rpgle -> clrsum : clrsum | [image]

In this implementation, in addition to the calls to subroutines, we also wanted to highlight the access to the files so that by examining the sequence diagram it would be possible to identify what files are involved and their mode of operation (Read/Write).

Generating the Diagram

Let’s examine the example RPG program used in this article and the corresponding PlantUML sequence diagram. The CUS300.rpgle program uses 3 files (CUSTOMER, ORDERS, and ORDSUM), defines the INZSR initialization subroutine, and defines 3 further subroutines: clrsum, dspcus, and calctotal.

The code mixes fixed-format specifications (F, D, and C specs) with free-format calculations, and contains some classic RPG constructs to read/write data files.

CUS300.rpgle

F**********************************************************************
F*                                                                    *
F* PROGRAM ID  : CUS300                                               *
F* PROGRAM NAME: SAMPLE PROGRAM                                       *
F*                                                                    *
F**********************************************************************
D DSP             S             50    INZ('CUSTOMER')
D TOTAL           S              9P 2 INZ(0)
D NUM             S              9P 2 INZ(1)
D CNT             S              9P 2 INZ(0)
FCUSTOMER  UF   E           K Disk
FORDERS    IF   E           K Disk
FORDSUM    IF   E           K Disk
C     *INZSR        BEGSR
C                   EVAL      DSP='CUSTOMER REPORT'
C                   DSPLY     DSP
C                   ENDSR
/free
      CNT = 0;
      TOTAL = 0;
      EXSR clrsum;
      DSPLY '------  Forward  ------';
      Setll *Loval CUSTOMER;
      Dou NOT %EOF(CUSTOMER);
          Read CUSTOMER;
          If NOT %EOF(CUSTOMER);
             EXSR calctotal;
             EXSR dspcus;
             If TOTAL > 0;
                 OSCUID = CUID;
                 TOTAL *=  (TOTAL / CNT +1) * 0.1;
                 OSTOT = TOTAL;
                 OSCUNM = CUSTNM;
                 Write  ORDSUM;
              EndIf;
          EndIf;
      EndDO;


      Begsr  calctotal;
          CNT = 0;
          TOTAL = 0;
          Setll *Loval ORDERS;
          Dou NOT %EOF(ORDERS);
              Read ORDERS;
              If NOT %EOF(ORDERS);
                  If CUID = ORCUID;
                      TOTAL += ORTOT;
                      CNT += 1;
                      Update ORDERS;
                  EndIf;
              EndIf;
          EndDo;
      EndSr;


      Begsr  dspcus;
          If TOTAL > 0;
              eval DSP='CUSTOMER: ' + CUSTNM + ' $' + TOTAL;
              DSPLY     DSP;
          EndIf;
      EndSr;


      Begsr  clrsum;
          CNT = 0;
          DSPLY '------  Delete  ------';
          Setll *Loval ORDSUM;
          Dou NOT %EOF(ORDSUM);
              Read ORDSUM;
              If NOT %EOF(ORDSUM);
                  delete ORDSUM;
                  CNT+=1;
              EndIf;
          EndDO;
          DSPLY 'DELETED: ' + CNT +  ' RECORDS';
      EndSr;

Running the transpiler on the CUS300.rpgle file will generate the PlantUML diagram file named CUS300.rpgle.puml.

@startuml
'https://plantuml.com/sequence-diagram


!pragma teoz true


hide footbox


skinparam sequence {
   ArrowColor Black
   LifeLineBorderColor #000000
   LifeLineBackgroundColor #FFFFFF


   ParticipantBorderColor #000000
   ParticipantBackgroundColor #FFFFFF


   ParticipantFontColor #000000
}
client -> CUS300.rpgle :
CUS300.rpgle -> CUS300.rpgle : inzsr
CUS300.rpgle -> clrsum : clrsum




clrsum -> ORDSUM : SETLL *LOVAL ORDSUM
loop UNTIL NOT %EOF(ORDSUM)
clrsum -> ORDSUM : READ ORDSUM
   group IF NOT %EOF(ORDSUM)
clrsum -> ORDSUM : DELETE ORDSUM


end
end


CUS300.rpgle -> CUSTOMER : SETLL *LOVAL CUSTOMER
loop UNTIL NOT %EOF(CUSTOMER)
CUS300.rpgle -> CUSTOMER : READ CUSTOMER
group IF NOT %EOF(CUSTOMER)
CUS300.rpgle -> calctotal : calctotal




calctotal -> ORDERS : SETLL *LOVAL ORDERS
loop UNTIL NOT %EOF(ORDERS)
calctotal -> ORDERS : READ ORDERS
group IF NOT %EOF(ORDERS)
group IF CUID = ORCUID




calctotal -> ORDERS : UPDATE ORDERS
end
end
end
CUS300.rpgle -> dspcus : dspcus
group IF TOTAL > 0




end
group IF TOTAL > 0








CUS300.rpgle -> ORDSUM : WRITE ORDSUM
end
end
end
@enduml

Below is the diagram corresponding to the PlantUML code.

The diagram presents the sequence of operations performed by the RPG program. It starts with the execution of INZSR, then calls the clrsum subroutine, which loops over the records of the ORDSUM file and deletes each one. The program then processes the records of the CUSTOMER file, and so forth.

Transforming RPG code into PlantUML sequence diagrams offers several advantages:

  • Improved Code Comprehension: Visualizing the program flow through a PlantUML diagram makes it easier to understand the relationships between different parts of the code, especially for complex logic or interactions between subroutines.
  • Enhanced Collaboration: PlantUML diagrams provide a universal language for developers to discuss and document the program’s functionality. This can improve communication and collaboration within a team.
  • Efficient Debugging: By visualizing the execution flow, pinpointing errors or unexpected behavior in the code becomes more straightforward. Developers can identify issues by tracing the diagram and analyzing the interactions between elements.

The Java code presented in this example can be easily customized by implementing specific logic for selected RPG statements. The same pipeline could be used to generate code for other text-to-diagram tools such as D2 (d2lang).
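
As an illustration of how little would need to change, the sketch below emits D2 text instead of PlantUML for a list of messages. It is not part of the repository. It assumes D2's sequence diagram syntax (a top-level shape: sequence_diagram line followed by from -> to: label connections) and quotes every name and label so that dots and symbols such as those in CUS300.rpgle or *LOVAL are taken literally. The Message and D2Sketch classes are hypothetical.

```java
import java.util.List;

// Hypothetical message between two participants (stand-in for a diagram AST node).
final class Message {
    final String from, to, label;
    Message(String from, String to, String label) { this.from = from; this.to = to; this.label = label; }
}

final class D2Sketch {
    // Wraps a name or label in double quotes
    // (assumes the text itself contains no double quotes).
    static String q(String s) {
        return "\"" + s + "\"";
    }

    // Renders the messages as a D2 sequence diagram, one connection per line.
    static String generate(List<Message> messages) {
        StringBuilder sb = new StringBuilder("shape: sequence_diagram\n");
        for (Message m : messages) {
            sb.append(q(m.from)).append(" -> ").append(q(m.to))
              .append(": ").append(q(m.label)).append("\n");
        }
        return sb.toString();
    }
}
```

Only the last stage of the pipeline would differ in such a setup; the parsing and model-to-model steps could be reused as they are.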

Once your diagrams are expressed as text, you can also feed them to your preferred Large Language Model (LLM) to help generate documentation, suggest best practices, or point out possible optimizations for your architecture. You can take it a step further by vectorizing the results and storing them in a vector database for later retrieval.

Source code available on GitHub: https://github.com/Strumenta/rpg-puml-sequence

Summary

In this article, we have used the Strumenta RPG parser to build a configurable pipeline for language processing tools, and shown how that pipeline makes it straightforward to create a transpiler from RPG to PlantUML.

With its support for ASTs, traversing, transformations, and cross-referencing, the RPG parser offers a comprehensive set of tools for working with source code.

By leveraging these features, developers can build powerful language processing tools to explore things that simply could not be seen before.