Comments on: On Optimization
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/

By: Luke Maciak | Wed, 29 Jul 2009 14:03:25 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12880

@ Animesh Sarkar: Hey, thanks for the perspective. Most of us automatically assume that COBOL runs on dusty old, steam-powered behemoths that were built in the Bronze Age, require the souls of orphans to boot up, and can only be found in the deepest, darkest corner of the company basement, in a room with a “BEWARE OF THE LEOPARD” sign on the door. But, as we can see, it can also be found on a server rack in a modern data center. :)

By: Animesh Sarkar | Wed, 29 Jul 2009 11:16:14 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12878

Believe it or not, I am a COBOL programmer, working on a mainframe for a major financial institution. Just to clear up a few misconceptions, here are a few facts:
1. The latest mainframe OS from IBM, z/OS, is actually about the same age as Windows XP (first released in 2000), and its latest stable release (2008) is younger than Vista, if Wikipedia is to be believed: http://en.wikipedia.org/wiki/Z/OS
2. IBM (and most other manufacturers) still manufactures mainframes and sells hardware upgrades for existing ones.

So just because it has been around for a long, long time doesn’t mean that it is old.

3. COBOL is extremely, and I repeat extremely, good at processing fixed-length records in fixed formats. Give it data in a predefined format and it is parsed as it is read, automatically, with no extra coding required (see the sketch after this list).
4. Almost all COBOL programs will have to access a database of some kind, and DB2 uses SQL.
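
To illustrate point 3, here is a minimal Python sketch of the kind of fixed-width handling a COBOL data division gives you for free; the record layout (id, name, amount) is made up for the example:

```python
import struct

# Hypothetical fixed-length layout: an 8-digit customer id, a
# 30-character name, and a 9-digit amount in cents. A COBOL data
# division declares a layout like this once; every READ then fills
# the named fields with no parsing code at all.
RECORD = struct.Struct("8s30s9s")

def parse(line: bytes):
    cust_id, name, cents = RECORD.unpack(line[:RECORD.size])
    return int(cust_id.decode()), name.decode().rstrip(), int(cents.decode())

sample = b"00001234" + b"John Q Public".ljust(30) + b"000004250"
print(parse(sample))  # (1234, 'John Q Public', 4250)
```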

In general, the data that needs to be processed has largely repetitive portions, and those are most likely the values used to query the DB.

So the only optimization that a COBOL programmer has to do is make sure that:
a. the data being fed to the program is sorted on the most likely criteria for the DB query (done in JCL just before the program itself), and
b. the DB is queried only when the values change (sketched below).

I learned the second lesson the hard way: I wrote a program that processed around eight million records in ~200 minutes, and just adding an ‘if’ statement immediately before the DB query reduced the execution time to 3 minutes.
Yes COBOL is fast! Blindingly so!
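
The pattern behind (a) and (b) is classic control-break processing. A minimal Python sketch, with hypothetical table and column names (sqlite3 stands in for DB2):

```python
import sqlite3

# Input arrives pre-sorted on the query key, so the database is hit
# only when the key changes: one query per run of equal keys.
def process(records, conn):
    last_key, rate = None, None
    for key, amount in records:
        if key != last_key:            # the "if" in front of the query
            rate = conn.execute(
                "SELECT rate FROM tariffs WHERE tariff_id = ?",
                (key,)).fetchone()[0]
            last_key = key
        yield key, amount * rate       # reuse the cached rate for the run

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tariffs (tariff_id INTEGER PRIMARY KEY, rate REAL)")
conn.executemany("INSERT INTO tariffs VALUES (?, ?)", [(1, 0.10), (2, 0.25)])
batch = [(1, 100.0), (1, 250.0), (2, 80.0)]   # already sorted on key
print(list(process(batch, conn)))             # two queries, not three
```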

On the other hand, make COBOL do much of anything else, for example parsing a variable-length web address, and you would be looking at more than 1000 (yes, that’s three zeros) lines of coding.

Then again, there are other tools better suited for that kind of job on mainframes too :)

By: Ivan Voras | Fri, 24 Jul 2009 21:12:02 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12825

I have also heard stories about mainframes and COBOL from old graybeards, to the tune of millions of transactions processed by some ancient code in some $timeframe, but surprisingly, I’ve never heard anyone analyze why that is so, how it is possible, and why newer systems can’t do the same.

Let’s consider databases, for example a database of utility bills to be sent out. You go through the records, calculate the amount to be charged, and mark each record as “sent”. For consistency’s sake (the ACID rules) you need to update the database synchronously, to avoid duplicate processing, for resilience to power failures, etc.
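
A rough Python sketch of that loop (hypothetical schema and tariff; sqlite3 stands in for whatever the real database is):

```python
import sqlite3

# One committed transaction per bill, so a crash can never
# double-charge anyone. Schema and the 0.12 tariff are made up.
conn = sqlite3.connect("bills.db")
conn.execute("CREATE TABLE IF NOT EXISTS bills (id INTEGER PRIMARY KEY,"
             " usage REAL, amount REAL, sent INTEGER DEFAULT 0)")

for bill_id, usage in conn.execute(
        "SELECT id, usage FROM bills WHERE sent = 0").fetchall():
    with conn:   # commit (and flush to disk) once per record
        conn.execute("UPDATE bills SET amount = ?, sent = 1 WHERE id = ?",
                     (usage * 0.12, bill_id))
# Every iteration costs at least one random write; the disk, not the
# CPU, sets the pace.
```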

For example, a 10-drive modern disk array can be relied upon to perform about 2000 random I/O transactions per second (a grossly oversimplified number, as there are a lot of specifics I’m leaving out for the sake of brevity – RAID type, etc. – Google will enlighten curious readers). Think about it: 2000 transactions per second. That’s modern mechanical drives (SSDs are only just arriving in data centers). What those mainframes had to work with was certainly nowhere near this kind of performance. The *only* ways to increase this performance, and the ones routinely used, are to make everything sequential or to apply various caching methods.
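
To put numbers on this (illustrative figures only, borrowing the eight-million-record batch from the comment above):

```python
# Back-of-envelope comparison; all figures are assumptions, not
# measurements.
RECORDS       = 8_000_000    # batch size from the comment above
RANDOM_IOPS   = 2_000        # 10-drive array, random access
RECORD_BYTES  = 200          # hypothetical fixed-length record
SEQ_BANDWIDTH = 100e6        # bytes/s, sequential streaming

random_hours = RECORDS / RANDOM_IOPS / 3600
seq_minutes  = RECORDS * RECORD_BYTES / SEQ_BANDWIDTH / 60

print(f"one random IO per record: {random_hours:.1f} hours")   # ~1.1 hours
print(f"pure sequential scan:     {seq_minutes:.1f} minutes")  # ~0.3 minutes
```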

If those old applications are to approach the performance claimed for them, they either need to process the drive sequentially or they need to rely on caching. But any kind of in-memory caching is dangerous because of power-loss risks and the like. On the other hand, those mainframes are often built in underground bunkers with their own diesel generators, so the power never needs to go down.

Hard-coding everything could make it possible to do most of the work sequentially (and thus get the 100+ MB/s raw data IO numbers everyone is familiar with from benchmarks) in a way that’s simply not possible with SQL. It could be “emulated” in SQL, though: instead of doing “select *”, group the data by some criterion, street name for example (“select * … where street=x”), and process it batch by batch; on error, the entire street batch is redone.
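
The same toy schema, processed street by street as one retriable batch per street (again just a sketch, with a hypothetical street column):

```python
import sqlite3

# "Emulated" sequential processing: group bills by street and treat
# each street as one transaction. A failure rolls back only that
# street's batch, which is then redone.
conn = sqlite3.connect("bills.db")
failed = []
streets = [s for (s,) in conn.execute("SELECT DISTINCT street FROM bills")]

for street in streets:
    try:
        with conn:   # one transaction per street, not per record
            for bill_id, usage in conn.execute(
                    "SELECT id, usage FROM bills WHERE street = ? AND sent = 0",
                    (street,)).fetchall():
                conn.execute(
                    "UPDATE bills SET amount = ?, sent = 1 WHERE id = ?",
                    (usage * 0.12, bill_id))
    except sqlite3.Error:
        failed.append(street)   # whole street rolled back; redo later
```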

I suspect it’s a combination of two things: modern programmers have forgotten how to not rely on SQL and simply scrape the data off the drive platters, processing it as it comes; and, on the other hand, management refuses to ever again build monumental system rooms housing data centers resilient to bomb strikes (which is what made caching safe).

By: Luke Maciak | Wed, 22 Jul 2009 00:07:53 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12812

@ astine: Yep, if you are writing a framework you are likely “doin it rong”… Unless of course you actually set out to write a framework.

“Let’s build a framework” seems to be the descendant of the much older “let’s create a domain specific language” problem solving strategy.

@ stan geller: In any case, .NET is probably the last thing you would want to migrate to, being a Linux shop and all. You’d probably want Java instead, but Java would probably keel over and die on your hardware and software stack. So what you are doing right now is probably the most cost-efficient thing that can be done.

Do you guys have any plans to replace the COBOL core with something else eventually? Because, let’s face it, the number of people who are actually good COBOL hackers (is there such a thing?) is slowly approaching zero. I mean, you could probably train new guys on the job, but you know how that is. I’m curious whether your company is at all worried about stuff like that.

I know that many are not, and I’m sure we will see COBOL code in production environments for the next 20-30 years if not more.

By: stan geller | Tue, 21 Jul 2009 23:07:34 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12810

Luke Maciak wrote:

4 mil – ouch! I imagine most of that is hardware upgrades (and, if you want .NET, Windows licenses). Sometimes it just doesn’t make sense to upgrade while the old system is working.

We don’t have anything .NET or MS-server related here at all, just Linux servers; that is why our rack is worth only about $5k in hardware. The only upgrades/updates would be COBOL-to-web/MySQL connections and services.

By: astine | Tue, 21 Jul 2009 22:32:14 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12809

Luke Maciak wrote:

There is really no reason why the new code on new hardware shouldn’t be faster …

Or if you had absolutely no understanding of the problem with which you were dealing. Or, more importantly, if you tried to overgeneralize the code. It sounds to me like a combination of two things happened. One, the greenhorns either didn’t or couldn’t read the code of the older application and had no idea how it worked or was supposed to do its job. And two, being straight out of college, they probably over-engineered their new application ridiculously.

The old COBOL program was likely tailored very exactly to the problem at hand, taking hardware and business rules into account in its optimizations. The ASP.NET program, on the other hand, was probably engineered as a meta-solution, with business logic and hardware cleanly abstracted out of the core for the sake of flexibility. Too many layers can cause problems, and it’s likely that a lot of the overhead of the .NET app had to do with dispatching on objects that shouldn’t be objects and parsing XML that shouldn’t be XML. (I once worked on a project that was slow for a very similar reason.)

Lesson being: make sure your code is modular and flexible so that it can adapt, but don’t go too far or it will turn into a different kind of spaghetti code. If your app can be described as a ‘framework’ of any kind, it is too general.

By: Luke Maciak | Tue, 21 Jul 2009 20:26:17 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12808

@ Steve: Yeah, back in the times of COBOL security was likely not a primary concern. :)

@ Naum: Hey, you can’t say you have many stories and then not share even a single one! Tell the stories!

@ stan geller: 4 mil – ouch! I imagine most of that is hardware upgrades (and, if you want .NET, Windows licenses). Sometimes it just doesn’t make sense to upgrade while the old system is working.

Zel wrote:

Still, it’s strange that current hardware can’t match a 20-year-old system. The op/s count has been multiplied about a thousandfold or more, and you could probably fit the whole HD in a RAM disk, so even poorly written code using a similar algorithm (I don’t know COBOL; can it do things other languages, like C, can’t?) should be faster.

I suspect this was due to Daily-WTF-quality code from the ASP.NET team. I don’t really know the exact details, but that’s what it sounded like. There is really no reason why new code on new hardware shouldn’t be faster – unless of course you are using bubble sort for everything, or your whole database is a single table with no keys, constraints, or indexes. Or if you use MS Access as your DB backend. :P

By: Zel | Tue, 21 Jul 2009 18:48:03 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12807

Some large banks also still use COBOL, probably for the same reasons you mention. I’ve heard quite a few stories about attempted migrations; none that I know of were successful.

There are some tools to detect bottlenecks in .NET, though. Back when I was toying with C# and XNA, I used Microsoft’s CLR Profiler to find the biggest memory consumers and the lengthiest functions, and it worked pretty well.

Still, it’s strange that current hardware can’t match a 20-year-old system. The op/s count has been multiplied about a thousandfold or more, and you could probably fit the whole HD in a RAM disk, so even poorly written code using a similar algorithm (I don’t know COBOL; can it do things other languages, like C, can’t?) should be faster.

By: stan geller | Tue, 21 Jul 2009 18:09:25 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12806

Here in Canada as well…
We run our COBOL platform on a $500 home-built Celeron machine running a 5-year-old Linux Mandrake!!! It serves 10 branches all across the US and Canada.
Switching to SAP/.NET/whatever would easily cost us $3-4 million.
I would rather add some COBOL-to-Python/PHP/MySQL wrappers for our web services, such as stock checks and so on (something like the sketch below), and spend another $20 or so on a memory upgrade… cheers
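
For what it’s worth, a wrapper of that kind can be very small. Here is a hypothetical Python sketch: the stockchk binary, its arguments, and its output format are all made up, and only standard-library modules are used.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Tiny HTTP service that shells out to an existing COBOL batch
# program for a stock check and passes its output straight through.
class StockCheck(BaseHTTPRequestHandler):
    def do_GET(self):
        part_no = self.path.lstrip("/")           # e.g. GET /A1234
        result = subprocess.run(
            ["./stockchk", part_no],              # hypothetical COBOL binary
            capture_output=True, text=True, timeout=10)
        self.send_response(200 if result.returncode == 0 else 500)
        self.end_headers()
        self.wfile.write(result.stdout.encode())

if __name__ == "__main__":
    HTTPServer(("", 8080), StockCheck).serve_forever()
```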

By: Naum | Tue, 21 Jul 2009 17:00:50 +0000
http://www.terminally-incoherent.com/blog/2009/07/21/on-optimization/#comment-12805

Oh, on COBOL and legacy mainframe systems, I have some stories… I don’t have time to dive in ATM, but I could spin many a yarn in the theme of your post, only even more comical…
