Here is an interesting story that I got from one of the old-timers in our industry. The guy who told it to me used to be a COBOL developer back in the day when Cobol was the “bleeding edge” technology. He is no longer working in the field nowadays, and he sort of lost track of the technology train.
He told me that he recently was working on some deal with the first company that hired him out of college. They gave him a brief tour and talked about the upcoming upgrade of their billing/accounting/everything else system. Apparently they were finally moving from their old cryptic, COBOL application to a brand new one written in ASP.NET. Few more prodding questions confirmed his suspicion. The COBOL app was the same exact system he helped to design 20 something years ago. This was some of the worst, buggiest and most unreadable code he has ever wrote in his life (being green and fresh out of school) and yet it was still in operation.
That’s not all though. He asked them how come they have never replaced the system with something more modern up until now. It turns out that they had. This was actually the third attempt at migration to a new technology. Previous 2 have failed miserably. Their development teams did produce viable code which fared pretty well in small scale tests. But when they actually tried to run full scale operations, the ASP app would just grind to a halt.
I chuckled. I was not surprised. “Back in the day, people knew how to write code. We are so spoiled by the Moores Law that we forgot how to do it these days”, I mused. He nodded in agreement.
You see, the COBOL system processed millions of records every day. Even though it was old, and running on an ancient hardware each batch would only take seconds to crank out. It was stable, reliable and the COBOL old-timers optimized the shit out of it over the last 20 years using every known trick in the book. The app was maintained by a shrinking cadre of wise bearded fellows who scoffed at new fangled concepts such as objects, polymorphism or encapsulation. They, however knew exactly how to shave few seconds off an operation by writing intricate spaghetti code.
The ASP and then ASP.NET code on the other hand was written by groups of greenhorns fresh out of college. They were idealistic, and excited about their project. They wrote great object oriented code, split into clearly defined modules. They leveraged open source libraries. More importantly their code was run on top of the line servers – best the money could buy.
And yet, each time they put it to real life stress test the ancient COBOL kludge would run laps around them. It would process 10 thousand records before their code even finished initializing. What took the old app 10 minutes would take 2 days on the .NET platform despite running on a hardware that was at least 10 times as fast.
They could not take such a major hit in speed as it would hurt their productivity. So twice they have shelved the project waiting until hardware finally catches up. Yep, they were waiting for hardware to catch up so that they could even hope to match the performance of a 20 year old application. Every once in a while they would brush of the code install it on new, juiced up servers and have another crack at it. They didn’t bother rewriting it, because so much money was sunk into it in the first place. So each consecutive team assigned to it would just do some minor re-factoring. This time however they were sure of success. The initial tests revealed that the newest incarnation of the ASP app was only 30% slower than the COBOL app which was considered an overwhelming success.
Non only that, they explained, but in 2-3 years the hardware will become twice as fast, which means that they might actually be able to match or even exceed the COBOL performance. Imagine that.
I’m not taking a crack at .NET or modern programming paradigms here. There is nothing wrong with either. Someone could argue that choosing to write this code in C would be a better idea from optimization point of view. Then again, modern JIT compilers can often optimize the executed code much better at run time than a C guru could ever do it by hand.
In fact, there is no for why COBOL code running on old hardware would really outperform .NET running on a modern rig. None besides crappy coding on the .NET platform. Back in the day when memory and disk space was scarce and each CPU cycle was important people knew how to optimize code. They knew how to write programs that will scale well, under limited resources. They had to learn these tricks because there was simply no way to shove hundreds of megabytes of data items in and out of memory like it is today. When they wrote code, they had to think about how it will work with large data sets.
Over the years we have sort of become lazy and complacent. I’m as guilty of it as everyone else. When I write code I hardly ever consider large data sets. I just make sure the important columns in the database are indexed, and that my query is not retarded. I hardly ever look at the actual logic within my program. I write deeply nested loops without thinking about scalability. It became sort of a pathology.
I became painfully aware of this while working on my thesis. When I was forced to do operations on big data matrices over and over again, I had to go down to basics – get rid of fancy objects, iterators and all that jazz, and just use simple loops and arrays. And even then I was struggling because no one ever taught us how to really approach practical optimization. I mean we talked about it in theory – and we were taught about algorithms. But no one really bothered to teach us practical things such as good ways to identify bottlenecks in your code, or practical optimization tricks.
I guess everyone assumed you will pick up stuff like that on your own. Or it will get drummed into you at your first job. Or people simply forget about it. After all, parallelism is the sexy thing to talk about these days. So instead of finding bottlenecks and eliminating them let’s parallelize the code and make it run on a cluster. Which is a valid approach, but not in every situation. Every once in a while you run into situation like the one I described above. There is an ancient COBOL app running on an ancient hardware – and it cranks out results faster than your code written in a modern language, running on a modern computer.
Does it means that we lost an edge? Does it mean we forgot how to write efficient code that will run fast even with very limited resources. Not really. There are still people out there who can do this sort of thing well. And of course over optimization can be harmful too.
This is just something to think about. Situations like this one happen in real life, and are quite ironic. I wonder how did the .NET development team justify their poor performance to the management.
/dev/random