For those not familiar, a mainframe is just a large server. Older applications need to run on one of a small number of proprietary operating systems, but on most mainframes you can also install UNIX or Linux. If you want to create a lot of virtual Linux servers for your cloud environment, the best and cheapest solution might be a mainframe.

Back in the late 90s, I helped a large IBM customer set up a website on their mainframe, with Java servlets accessing SQL data in DB2, and it performed very well. So the problem is not writing new systems to run on the mainframe; the problem comes when you want to get rid of the mainframe, ostensibly to save money, but you still have those pesky older applications which require the proprietary operating systems.

Mostly when people refer to the mainframe or mainframe code or even COBOL programs, they are really referring to applications that run in one of the proprietary systems on a mainframe, such as CICS, IMS, or batch jobs using JCL.

So what is the difficulty in migrating mainframe applications to Linux to run in the cloud? Let me be clear: it's not COBOL.

If you have a batch application written in COBOL which processes DB2 SQL data, it is relatively simple to migrate it to a Linux server. There are some minor incompatibilities between DB2 and MySQL, for example, but no worse than between MySQL and Postgres, and there are COBOL compilers available on Linux, or you can rewrite the code. I admit that rewriting a huge procedural COBOL program would not be fun, but it could be done, and there are translation tools available. One issue would be the lack of test suites - TDD was unheard of when this code was written - but such a system can be tested by running the same workloads through the old and new systems and comparing the results.
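As a minimal sketch of how that comparison-based testing might look, assuming both the old and the new system can dump their batch results to plain-text files with one record per line (the file names here are hypothetical):

```python
# compare_runs.py - a minimal sketch of comparison-based regression testing,
# assuming the old and new systems each dump their output to a plain-text
# file with one record per line (file names are hypothetical).
import sys
from itertools import zip_longest

def compare(old_path: str, new_path: str) -> int:
    mismatches = 0
    with open(old_path) as old, open(new_path) as new:
        for line_no, (o, n) in enumerate(zip_longest(old, new), start=1):
            if o != n:
                mismatches += 1
                print(f"record {line_no}: old={o!r} new={n!r}")
    return mismatches

if __name__ == "__main__":
    sys.exit(1 if compare("old_run.out", "new_run.out") else 0)
```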

Notice that I said 'batch application' above? That's kind of a giveaway. Mainframes generally process huge batch runs every night, and they have advanced job-scheduling systems, a job control language to define flows of jobs - which often pass temporary files from the output of one step into the input of the next - and system utilities to sort those temp files any which way. Even if your entire organization's data is stored in relational databases like DB2 or Oracle, migrating the complete batch cycle is not an easy job, but again, it is a well-trodden path, so there are tools and techniques available.
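To make the shape of such a job flow concrete, here is a rough Python sketch of the step-to-step pattern on Linux; the step names, record layout, and file handling are all invented for illustration, not taken from any real job:

```python
# batch_flow.py - a rough sketch of a JCL-style job flow on Linux: each step
# writes a temporary work file that the next step reads, with a sort in the
# middle standing in for a SORT utility step.
import os
import tempfile

def extract(out_path):
    # Step 1: stand-in for an extract step that pulls records into a work file.
    with open(out_path, "w") as f:
        f.write("00342,SMITH,199.50\n00017,JONES,12.00\n00251,BROWN,87.25\n")

def sort_step(in_path, out_path):
    # Step 2: sort the work file on its key, as a sort utility step would.
    with open(in_path) as f:
        records = sorted(f, key=lambda rec: rec.split(",")[0])
    with open(out_path, "w") as f:
        f.writelines(records)

def report(in_path):
    # Step 3: consume the sorted work file and print a report.
    with open(in_path) as f:
        for rec in f:
            key, name, amount = rec.strip().split(",")
            print(f"{key} {name:<8} {amount:>8}")

fd1, work1 = tempfile.mkstemp(); os.close(fd1)
fd2, work2 = tempfile.mkstemp(); os.close(fd2)
try:
    extract(work1)
    sort_step(work1, work2)
    report(work2)
finally:
    os.remove(work1)
    os.remove(work2)
```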

For interactive systems, text-only mainframe terminals (typically 80 columns by 24 rows) are connected to a transaction-processing system such as CICS or IMS/TM. In the simple case, where all the data is in relational databases, you're probably using CICS. The interactive programs are usually complex, with terminal-control and validation code intermingled with database access. They did not have to be written that way - my company was one of many that delivered well-structured systems in the 90s, keeping a strict separation between front-end code and back-end business logic - but most of the CICS code that I have seen was not done this way. This is not a criticism of mainframe developers, though - if you have been around long enough you'll recall some very tangled C or BASIC code from the 80s as well - the mainframe code is unique only in that it is still used regularly by big companies.

You could also attempt to run the terminal programs in a CICS emulation on *nix - IBM still sells TXSeries, which is a capable product on Linux or Windows, but you will almost certainly have to make some modifications to the code, and you would still end up with a text UI. Most migration projects are looking for a browser interface, so unless you are happy with one of the many screen-scraping systems, you are going to have to rewrite the code.

You could attempt to split the COBOL code into purely front-end code and purely back-end code, and put it back into production before migrating, and sometimes that would be a worthwhile step. Most times though, you would bite the bullet and handle the functionality split while rewriting.

So the interactive systems could get rewritten into whatever tooling you feel like, splitting the logic into front-end and back-end code. You would probably want to target a browser, so maybe React or Angular on the front end, and Node or Java/Spring on the back end. You would need someone on the team with a good understanding of CICS to tell you what the legacy code does, but this is again a fairly straightforward task. Others have done this, and I've worked on a successful project like this myself.
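As a very rough sketch of what that split looks like, here is a toy back end using only Python's standard library in place of the Node or Java/Spring stack mentioned above; the business rule, function name, and URL scheme are purely hypothetical:

```python
# backend_sketch.py - a minimal sketch of the front-end/back-end split:
# business logic lives in a plain function, and a thin HTTP layer exposes it
# as JSON for a browser front end. Everything here is invented for illustration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def account_balance(account_id: str) -> dict:
    # Business logic extracted from the legacy CICS program would live here,
    # with no knowledge of terminals, screens, or HTTP.
    return {"account": account_id, "balance": 125.40}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Presentation layer: translate the HTTP request, call the logic,
        # and return JSON for a browser front end (React, Angular, etc.).
        account_id = self.path.rstrip("/").split("/")[-1]
        body = json.dumps(account_balance(account_id)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), Handler).serve_forever()
```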

Now, I have cheated a bit in the examples above by saying that the data was in a DB2 relational/SQL database. DB2 became common around 1990 and has been heavily used in new systems since then, but if you have older systems, you will be working with data in VSAM indexed files, or in flat files read sequentially. Indexed files are not usually considered a database, but if you are using CICS, it supports ACID transactions for updates across multiple indexed files, and (through two-phase commit) it even extends those transactions to DB2. Mixed forms of data are not uncommon, with older applications built on indexed files having newer functionality added using DB2.

Flat files are the modern incarnation of magnetic tapes - those spinning reels you see in old TV shows. You would see the tapes moving in both directions, because it was common for a job to read a record, then back up and rewrite it with new values. The same functionality still exists today in many batch applications, often without the programs ever having been rewritten, but now working against flat files on disk that are processed sequentially. This can be handled in much the same way on *nix servers, but not many programmers are familiar with working with files like this.
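Here is a small Python sketch of that read-back-up-rewrite pattern against a flat file of fixed-length records; the 80-byte record length, field positions, and file name are invented for illustration:

```python
# update_in_place.py - a small sketch of the tape-style "read a record, back
# up, rewrite it in place" pattern, using fixed-length records in a flat file.
# The record length, field positions, and file name are hypothetical.
RECLEN = 80

# Create a tiny sample master file so the sketch is self-contained.
with open("master.dat", "wb") as f:
    f.write(b"UPDATE" + b"X" * (RECLEN - 6))
    f.write(b"KEEP  " + b"Y" * (RECLEN - 6))

with open("master.dat", "r+b") as f:
    pos = 0
    while True:
        f.seek(pos)
        record = f.read(RECLEN)
        if len(record) < RECLEN:
            break                              # end of file
        if record[0:6] == b"UPDATE":           # some condition on the record
            updated = b"DONE  " + record[6:]   # new values, same record length
            f.seek(pos)                        # back up to the record...
            f.write(updated)                   # ...and rewrite it in place
        pos += RECLEN
```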

When I've worked on migrations like this, we've generally gone for a complete conversion of the indexed-file data to relational as part of the migration, rather than trying to come up with ways to handle transactions that span multiple indexed files. However, let me just say that converting indexed-file data to a SQL DB is not for the faint of heart.
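As a toy illustration of the mechanics (not the hard part), here is a sketch that slices fixed-position fields out of indexed-file-style records and loads them into a SQL table; the record layout is entirely hypothetical, and real VSAM data adds packed decimals, REDEFINES, and EBCDIC conversion on top of this:

```python
# vsam_to_sql.py - a toy sketch of converting fixed-layout indexed-file records
# into a SQL table. The layout (account number, name, balance in cents) is
# invented; real conversions are far messier.
import sqlite3

SAMPLE_RECORDS = [
    "00017" + "JONES".ljust(20) + "0001200",
    "00251" + "BROWN".ljust(20) + "0008725",
    "00342" + "SMITH".ljust(20) + "0019950",
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acct_no TEXT PRIMARY KEY, name TEXT, balance_cents INTEGER)"
)

for rec in SAMPLE_RECORDS:
    # Slice the fixed-position fields, much as a COBOL copybook defines them.
    acct_no = rec[0:5]
    name = rec[5:25].strip()
    balance_cents = int(rec[25:32])
    conn.execute("INSERT INTO account VALUES (?, ?, ?)", (acct_no, name, balance_cents))

conn.commit()
print(conn.execute("SELECT * FROM account ORDER BY acct_no").fetchall())
```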

Now, if a mainframe developer is reading this, they would say "hold on, you are ignoring IMS DB". That's quite true, I am ignoring IMS DB. IMS DB is a hierarchical database that is blazingly fast, and there is nothing like it in the *nix or Windows world. I worked on a small IMS DB to DB2 migration once, and it was complex. "Get the next child record" or "get the parent record" can be simulated in SQL using clever keys, but the result is error-prone, brittle, and hard to maintain. So I would not consider migrating an existing IMS DB application - I would look at rewriting it from scratch. Of course, IMS DB is used in very high-performance applications, so the rewrite would have to be designed for high performance.
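To show what I mean by clever keys, here is a toy sketch of simulating "get the parent" and "get the next child" over a relational table; the schema and data are invented, and real IMS segment navigation is far richer than this:

```python
# ims_keys_sketch.py - a toy illustration of simulating hierarchical
# parent/child navigation over a relational table. Schema and data are
# invented for illustration only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE segment (
    seg_key  TEXT PRIMARY KEY,   -- concatenated key, e.g. 'CUST01/ORD02'
    parent   TEXT,               -- parent segment key, NULL at the root
    seq      INTEGER,            -- twin sequence within the parent
    payload  TEXT)""")
rows = [
    ("CUST01",       None,     1, "customer root"),
    ("CUST01/ORD01", "CUST01", 1, "first order"),
    ("CUST01/ORD02", "CUST01", 2, "second order"),
]
conn.executemany("INSERT INTO segment VALUES (?, ?, ?, ?)", rows)

def get_parent(child_key):
    # "Get the parent record": look up the parent key, then fetch that row.
    return conn.execute(
        "SELECT * FROM segment WHERE seg_key = "
        "(SELECT parent FROM segment WHERE seg_key = ?)",
        (child_key,)).fetchone()

def get_next_child(parent_key, after_seq):
    # "Get the next child record": next twin under the parent, by sequence.
    return conn.execute(
        "SELECT * FROM segment WHERE parent = ? AND seq > ? ORDER BY seq LIMIT 1",
        (parent_key, after_seq)).fetchone()

print(get_parent("CUST01/ORD02"))   # -> the CUST01 root segment
print(get_next_child("CUST01", 1))  # -> the second order
```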

I hope I've made it clear that the problems I have seen with migrating mainframe applications to *nix have nothing to do with the source code - the database is the key to the problems. I have seen projects that got underway with great enthusiasm, and then simply ground to a halt once they had uncovered all the issues around the data. Other projects kept going and migrated the subsystems that could be completed, but left large parts of the overall system intact. Some projects kept going until the funding ran out, and then stopped, having delivered nothing useful. I have also heard of projects that were completely successful, but very expensive.

Successful projects generally involve a mixed approach: replacing some applications with purchased equivalents, discovering that some applications are no longer needed, rewriting some, and migrating others. Any successful project, though, has to deal with large amounts of data in various file formats and decide what to do with it.

The least of the worries is whether the programs were written in COBOL.
