We interviewed Michael Frank, Software Engineer at QAware, an independent consulting and project house for software technology. QAware specializes in the analysis, renovation, invention, and implementation of software systems.

Federico Tomassetti: Okay, so, hello, let’s get started. I will ask right away: can you explain to us what you do at QAware?

Michael Frank: So, at QAware, I’m in the position of a software architect with ten years of experience, seven of them at QAware. Last year I did a lot of stuff, from team lead to technical sales. So a double role, and at least in the team lead role I used the Strumenta parser. And I guess that’s the main focus.

Federico Tomassetti: Yes, indeed. And exactly. I wanted to ask, why did you decide to work with Strumenta?

Michael Frank: The answer is simple. Google told us you’re the only one who has an RPG parser, so this helps, I guess. We found the Git repository with, I guess, the previous version, and we really had a need for an AST parser. That’s very specific to the project. But I guess that’s another question.

Federico Tomassetti: Yes, exactly. So my next question was, what did we build together? But I think we share the fact that you’ve been using an RPG parser we provided. Is there anything you can say about the goal for which you use an RPG parser?

Michael Frank: Yeah, I can give you the context. Last year we had contact with a company that has a big legacy code base on an IBM AS400 host, and they don’t have many host developers left. Exactly three. So there’s a pressing need to move away from that technology, because they don’t really have the resources to maintain it anymore. They had also been in a program to migrate away from the host for several years, but they got stuck and they had no really solid estimate for the timeline or for how to approach it. That was one part. The second part is that, in order to really retire some components, you need to know: what are my components? And you need to answer the really important question: if I touch this database table or this program, if I delete it, what will happen? If you can’t answer this question, you can’t migrate it and you can’t turn it off. So that’s really crucial to answer. But there is no easy answer, because it’s an old system and it was not really organized around a domain model. It was more organized as, well, put all the RPG files in this folder. So there was no real component model visible. To move forward with this project, the first goal was to establish transparency, because that was lacking the most. Without it you have no solid base for any estimate, for any planning, for identifying the first migration candidates. You cannot do a timeline, you cannot budget, you cannot plan. That is what we mean when we say you need to get transparency in the migration project; that is really important. We do that with an approach we call establishing a single source of truth, which basically means you take the thing you want to migrate, the source code, because that’s the only reliable documentation. But you need to compress it, you need to extract the important parts, because not every line of code is important at this stage of the project.
But what’s important is the connections between programs, the read-write relations to databases, scheduling components, external calls, stuff like that. So you need to extract this information and compress it down to be able to work with it, and we put that in a graph database. So how do we get to this information? Well, code is the only truth. Yes, there is metadata on AS400 machines and we use that as well, so the single source of truth incorporates a lot of information sources. But the most important one is the source code, because that’s the only real truth there is; documentation is lying to you. We needed something to parse RPG files, because it’s mainly an RPG code base, but there’s also a lot of CL script. And, as always, there’s not that much budget. If three of us work for several months developing a parser, that’s not going to work out. So we were searching for a parser that is readily available and good enough. For CL script we had to write our own. We also parsed out embedded stuff and fed it between the parsers, and for the RPG stuff we used your parser. So basically that’s it: it’s part of a single source of truth pipeline that scrapes AS400 machines for source code and feeds it into the single source of truth. So it’s updatable, it’s queryable, it can show information at the level currently needed to plan a migration project, and it is shared across the team. It’s not something only we use; we share it with our customer as well. We run it on customer hardware and we publish it on an endpoint, so that everyone can access the database and run queries against it to find out what parts have to be migrated, what has to be done and planned, and how it’s working, and to reverse engineer the whole system.
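
To make the "if I delete this table, what will happen?" question concrete, here is a minimal sketch of the idea in Python. It is not QAware’s pipeline and uses no real graph database: the program and table names, the relation labels and the `impacted_by` helper are all hypothetical, chosen only to show how extracted call and read/write relations can answer an impact query.

```python
from collections import defaultdict, deque

# Hypothetical relations extracted from source code: (source, relation, target).
# All names are invented for illustration.
EDGES = [
    ("ORDERPGM", "CALLS", "PRICEPGM"),
    ("ORDERPGM", "WRITES", "ORDERS"),
    ("INVOICEPGM", "READS", "ORDERS"),
    ("NIGHTJOB", "CALLS", "INVOICEPGM"),
    ("PRICEPGM", "READS", "PRICES"),
]


def build_reverse_index(edges):
    """Map each target to the set of (source, relation) pairs that point at it."""
    dependents = defaultdict(set)
    for source, relation, target in edges:
        dependents[target].add((source, relation))
    return dependents


def impacted_by(node, edges):
    """Return every component that directly or transitively depends on `node`."""
    dependents = build_reverse_index(edges)
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for source, _relation in dependents[current]:
            if source not in seen:
                seen.add(source)
                queue.append(source)
    return seen


if __name__ == "__main__":
    # Which components are affected if the ORDERS table is removed?
    print(sorted(impacted_by("ORDERS", EDGES)))
```

In a real setup the same question would be a query against the graph database; the breadth-first walk over reversed edges above only illustrates what such a query computes.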

Federico Tomassetti: Seems a very interesting project, building this big model representing all the information present in the code and considering also different languages like CL and RPG. So nice project. Okay, and can I ask you if you are satisfied with the results that you got?

Michael Frank: Yeah, it did exactly what we needed it to do, and for the few things that were lacking you readily provided an updated version. For example, we needed to find all call sites: which programs get called from where? That’s important. And there were some smaller things, where we suggested introducing an interface for it, so we can just grab the interface and get the calls out. So yeah, it did what it should. And you also provided methods for traversing the AST. That really helps; I think it was a depth-first search, I don’t know anymore, but it works really well. That would have been a pain to write ourselves.
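
As an illustration of what finding call sites by walking an AST depth-first looks like, here is a small self-contained Python sketch. The `Node` class, the `"CallStmt"` kind and the program names are assumptions made up for this example; they are not the API of the Strumenta RPG parser.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """A hypothetical AST node: a kind, an optional name, and child nodes."""
    kind: str
    name: str = ""
    children: List["Node"] = field(default_factory=list)


def walk_depth_first(node: Node) -> Iterator[Node]:
    """Yield the node itself, then each subtree in order (pre-order DFS)."""
    yield node
    for child in node.children:
        yield from walk_depth_first(child)


def find_call_sites(root: Node) -> List[str]:
    """Collect the names of all called programs found anywhere in the tree."""
    return [n.name for n in walk_depth_first(root) if n.kind == "CallStmt"]


# A tiny invented program with two calls nested at different depths.
program = Node("Program", "ORDERPGM", [
    Node("Subroutine", "CHECK", [Node("CallStmt", "PRICEPGM")]),
    Node("If", "", [Node("CallStmt", "INVOICEPGM"), Node("Assign", "TOTAL")]),
])

if __name__ == "__main__":
    print(find_call_sites(program))
```

The point of a shared interface for call nodes, as mentioned in the answer above, is that the filter becomes a single type check instead of a list of node kinds.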

Federico Tomassetti: Yeah, you can configure it if you want, breadth-first for example, but yeah. And can I ask you how it was to work with us, and whether there was any particular challenge?

Michael Frank: Challenge? No, quite the contrary. The contact was always very nice and you provided fixes readily, so it was never a blocker for us in the project. That worked really well, and it was really, really important for us and our customer to get that running.

Federico Tomassetti: Good. And is there anything that you think we could improve?

Michael Frank: Tough question. Maybe think about the licensing model for the transpiler. Unfortunately, the project decided not to use this part. I’m no longer in that project, and ultimately it’s a decision for the customer whether they want to take on the fee, but I guess pay-per-use and feeding code to some external service is a no-go for many customers. They’re very keen on confidentiality, so it needed to run on customer hardware and stuff like that. You’re not getting anything out. As a side note, at first we tried to feed RPG code into ChatGPT or another large language model to make sense of the stuff, but that didn’t work out, because it’s so ancient and they don’t really have much about it. Well, that’s not entirely true. It knows it, it gets results, but there are too many failures in them. And unless you’re an expert, you will not notice. But if you’re an expert, you don’t get any advantage from the GPT. But we are diverging from the topic. So that transpiler would be really interesting. However, the project does not use it, and if we used it, we would need to use it as a library. We thought about integrating it into an IDE plugin. So: hey, give me this snippet and translate it to something I can understand. Because maybe I’m a Java developer or a .NET developer, I can’t read RPG, I don’t want to read RPG. My task is to migrate away from it. So I only need to understand it to a certain level. So I’m not really interested in the result as such: it doesn’t need to compile; it just needs to run. But there’s also the issue of the usefulness of such results. I need to understand the inner workings of an AS400 and RPG quite a bit to make sense of the code, or else I will not figure out how it works. So that’s the blocker, and that’s ultimately why the team decided: okay, no, just translating it makes no sense.
You still need to know how this context stuff works, how all this overwriting of variables and piping them in and out works. There’s no real Java construct for that.

Federico Tomassetti: It’s interesting that the main challenge is about finding the right licensing model, more than the technical challenges. Maybe that means that for the technical ones we found a solution. But when lawyers get involved, okay, things get difficult. And now we go on with easier questions. So I wanted to ask, how did you find out about Strumenta?

Michael Frank: As I said in the introduction, basically we did go the route of ChatGPT and so on, and then we decided: okay, no, we want to do it properly. We need to achieve some degree of perfection in parsing. I’m diverging from the question again, but it’s important for our approach to work: we need to get as much information as possible, because in such a project there are a lot of unknowns. For the code, you need to get that to a high enough degree, so you really need to parse it. And so we decided we need a good AST parser that does not miss anything, with all the special cases that are in RPG. And then we googled and we found some GitHub projects, which are basically, I guess, the preliminary version of your ANTLR grammar? Probably, I guess. Yes. And that was not sufficient. So we found you, where you said: okay, getting that to work reliably, we make a company out of it, because it takes a lot of time to get that right. And I completely understand that getting the last nuance to work is important. It was important for us too, because we needed that kind of perfection to get the list of known unknowns down as much as possible. All the planning of this migration project relies on that data. If we miss a call site, it’s not there, and it will haunt us when going into production. And if you kill something, then that will haunt us. So doing the work now reduces bugs later. It’s really important to get a good parser for that. And that’s how we ultimately found you.

Federico Tomassetti: Yeah, good. If I understand correctly, if you have a solid foundation to build on top of, then you have a good start. Otherwise, if you have to quickly put together a parser and then you’re not really sure about the results, you increase the complexity.

Michael Frank: You really need that. It wasn’t the first migration project at our customer. They did it before and they were more or less wrong by about 10x. So the project is 10x more complex than they initially thought. That’s what happens when you say, yeah, look at this small part, and there’s this whole heap of things you don’t know. Then you get totally unreliable numbers. What if the thing you thought you knew turns out to be pretty simple in contrast to the other stuff? That’s why it’s so important to cover as much as possible of what you can see. There can only be about 3% of things that surprise you, because you can handle 3% of surprises; you can’t handle 90% of surprises, that doesn’t work. In such a big project, the 3% of the stuff you missed can’t amount to 90% of the effort, but at 10% there might be something. So you really need to get that to the 95th, 96th, 97th, 98th percentile of completeness in order to really plan this stuff. It’s just like with response times: the 50th percentile response time is shit. You really need to optimize for the 99th percentile, and it’s the same with a migration project. If you miss it in the technical analysis, which the computer can do, then you’ve already lost. And it paid off. We can now produce forecasts. You can go to management with that. You can tell them exactly: that’s the effort. And we are pretty sure this time that we haven’t missed anything. That’s really, really good. And also having this technical approach, so not just some analysts analyzing everything manually, because the migration project will span three or four years. You have to redo this analysis periodically, because code gets changed and code gets newly written, so it doesn’t help to do it once for the whole migration project. So getting that baked into the pipeline, doing it with a technical approach, the parser approach, is really, really helpful. You also need to add the stuff the experts found out into the model. But that doesn’t concern your parser.
That only concerns the single source of truth model, which we feed in this step. So that’s the motivation for all of this.

Federico Tomassetti: And would you recommend working with Strumenta to someone else? And can I ask you to explain why, in one case or the other?

Michael Frank: For one, it was a pleasure working with you. And second, there are not many alternatives.

Federico Tomassetti: Yeah. Good. So thank you. I would like to ask you if you can share any resource or any way to find out more about what QAware is doing and what services you are providing.

Michael Frank: You can always look at our website, and I could also share some marketing material about the single source of truth. Unfortunately, our customer hasn’t had time yet to write an article about the project. Sorry. We depend on them for that. But yeah, they have more pressing matters. Like a dying host.

Federico Tomassetti: Yeah, seems urgent.

Michael Frank: But I can share with you that it’s a big logistics company, and if something is missing in the migration, they really, really, really get slapped on the wrist. No, not really a slap on the wrist. They are done in the market if they make an error in the migration. For example, there’s an anti-terror list, and if you do business with someone on this list, your company is done; no one is allowed to trade with you. So if you messed up that part in the migration, it would be really bad. It has some potential for ending the company. So yeah, they currently have some pressing matters on their hands.

Federico Tomassetti: Nice example of why it is important to get the migration exactly right. Yeah, yeah.

Michael Frank: That’s really the main thing they are worried about, and they are right about it. The core logistics system is written on this AS400 machine. If you want to migrate that, you have to rip out the core. If you fuck that up, you have a problem. You don’t have a company anymore. You can’t recover from that. You’re on the no-trade list of the US and the international community. No one will trade with you, which may well be the death of a trading company.

Federico Tomassetti: So, yeah, so that’s pretty interesting. And this is time for my last question. I wanted just to ask you if there is anything that you want to add. Maybe anything that I forgot to ask you.

Michael Frank: Not in particular. I guess maybe I can turn the question around. How was it for you working with us? Is it a common use case? What use cases do you generally have, and what are the other customers using the parser for? That would be something I’m interested in.

Federico Tomassetti: Yeah. I would say that there are people licensing the parsers for different reasons. Some people want to build their own transpiler. Nowadays we offer our transpilation services, but as you said, we do not give access to the transpiler. And there are companies, maybe working in financial services or other fields, that are really afraid of their algorithms leaving the company. So they license the parser and then build a transpiler on their own. Some others do some sort of analysis on the RPG code for all sorts of purposes. This is a little broader and varies from case to case, and unfortunately they do not always share with us the details of what kind of analysis they’re doing on the code. So maybe we just hear about possible issues that they find in the parser, and that is all. About working with you: I think from our point of view it worked great, because you learned how to use the parser very quickly and you reported issues very precisely. So from our point of view, everything worked as well as we could hope for.

Michael Frank: Yeah, that’s my strong suit. Reverse engineering libraries, like with hosts, that’s the main shot of.

Federico Tomassetti: So I think that was the last question. So thank you very much.