The thing about being an IT professional or a sysadmin is that your workload comes and goes in waves. Some days are just slow and lazy, and there is not much for you to do. You are all caught up on your current projects, all of which are pending review, waiting for approval or at a standstill. You are done with all regular maintenance tasks, and anything requiring serious work can’t be done during business hours anyway because it would require taking down a crucial server or two. Most of your day is spent on silly paperwork, answering random calls from marketers trying to sell you enterprise business solutions, and browsing the web. On such days even the users relax a bit.
You can hear it in their typing. On normal days the clickety-clack of the keyboards is an angry, hate-filled sound. They are not typing; they are spitefully inflicting punishment on their machines in the form of text. Work is a struggle between man and the disobedient machine. But on the slow days, their touch becomes gentler. You could almost imagine that they don’t hate and fear their computers, but have grown to accept them as inanimate objects that are merely tools of their trade. This of course is merely wishful thinking, as you can still smell their fear and disdain in the air. But on those days when nothing breaks, there are no upgrades and everything is sailing smoothly, they are lulled into a brief, temporary state of comfort and relaxation.
Then, there are the other days. We call them “Fan Days” – days where the shit hits the fan hard, and everything breakable decides to break at the same instant. This is a story about one such day.
The first indication that you are about to have a fan day is a curious convergence of vacation days, sick days and personal leave requests among your coworkers. Nothing bad usually happens when your IT team is at full strength, and in good shape to swiftly respond to major issues. It is only when you are running with a bare-bones skeleton crew that things start to crumble. On this fateful day, I was more or less flying solo, using the intern as the storm shield against the wrath of the users calling the help desk.
The first issue of the day came early in the morning and concerned our Barracuda SSL VPN box. Granted, we get anywhere from 5 to 10 calls about that thing every day, but that’s just because around 10 of our users just can’t wrap their heads around the concept of two-factor authentication. The help desk is sort of a fourth factor in their authentication process, walking them through the extremely difficult task of plugging in the USB dongle and typing a password into a login box. This was an altogether different pair of shoes – not a PEBKAC authentication problem, but an actual functionality issue.
The nice thing about using the Barracuda is that it allows us to give remote users access to our intranet web apps without actually exposing any of their servers to the internet. The communication is proxied by the SSL VPN box, encrypted and hidden behind a two-factor authentication scheme. One of our users tried uploading a large zip archive through that very service, but the Barracuda proxy server would not have it. We had never noticed this in testing, because none of us reasonable IT folks even dared to assume someone would be silly enough to try uploading over 100 MB of crap. The average size of the files uploaded through that form was 5-6 MB, so we collectively figured that setting the upload size limit to exactly 100 MB would be more than enough. But alas, here was a user with a 115 MB file that needed to be submitted by yesterday.
Fortunately, it was not a huge issue. I simply logged into the admin panel and increased the file size limit, applied the changes and… inadvertently broke the SSL VPN box somehow. It basically fell off the internet. One minute it was there, the next it was completely gone, not responding to any HTTP requests. I could still ping it, and nmap could see that it was listening on the usual ports, but there was no one answering when you knocked on the door.
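For the curious, that "port open, nobody home" state is easy to tell apart from a dead box with a few lines of code. What follows is only a minimal sketch, not what I ran that day: the function name `probe`, its three return labels, and the plain-HTTP request are all assumptions made for illustration, and a real SSL VPN appliance would be speaking TLS on port 443, which this sketch does not handle.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP service as 'closed', 'open-but-silent' or 'answering'.

    Hypothetical helper for illustration: it only speaks plain HTTP.
    """
    try:
        # Step 1: can we complete a TCP handshake at all?
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or timed out while connecting.
        return "closed"

    with sock:
        sock.settimeout(timeout)
        try:
            # Step 2: knock on the door with a minimal HTTP request.
            request = b"HEAD / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n"
            sock.sendall(request)
            data = sock.recv(64)
        except OSError:
            # The port accepted the connection but nothing ever replied.
            return "open-but-silent"

    # An empty or non-HTTP reply also counts as nobody being home.
    return "answering" if data.startswith(b"HTTP/") else "open-but-silent"
```

Calling something like `probe("vpn.example.internal", 80)` (a made-up hostname) against a box in the state described above would be expected to come back as `open-but-silent`: the handshake succeeds, which is all a port scan looks at, but no HTTP response ever arrives.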
I promptly fell out of my chair, and scrambled towards the server room to check whether or not the device had bricked itself somehow. With shaking hands I fumbled for my access card, and proceeded to swipe it exactly four times, until I finally got the right alignment of the magnetic stripe to the card reader. Then I dropped my server rack keys on the floor a few times, before I managed to claw my way to the silently humming boxes inside.
When I switched the KVM to the Barracuda box, the screen lit up and I saw the familiar login prompt, and the maintenance menus. A sudden rush of relief caused my lungs to exhale stale air, probably for the first time in the last 10 minutes. The box was not bricked, but the internal web server was down.
Rebooting the device would probably have been the logical choice at this point, but I was hesitant to try it. After all, my tiny, insignificant change had somehow put it into a very weird state, and who knows what it did to the internal data store. I wasn’t about to risk doing even more damage to it than I already had, so I got the Barracuda support team on the phone.
Have you ever dealt with them? They have a great, high quality team that speaks fluent English (which is rare these days) and you can usually get a live person on the phone in under 10 minutes. Unfortunately, the product specialist I talked with wasn’t much help. To diagnose the issue he had to gain access to the device, and since the web-based admin panel went to hell I could not give him the permissions to do that. I tried enabling the SSH tunnel from the physical console, but that did not work either, though for an entirely different reason.
You see, I never specified that port 22 needed to be open for this box, so our over-eager network admin Andy likely locked it down. So of course I called him next, and he was not very happy to hear from me, seeing how this was his vacation day.
“Dude, I’m on the beach right now. What did you break?”
“SSL VPN box. Barracuda support needs to ssh into it to un-fuck it,” I replied without missing a beat.
Andy grumbled something about dealing with n00bs while on vacation, but eventually agreed to give me the login information for the firewall so I could open port 22.
“Ok, so the password is my last name, then 123”
“Yeah, all lowercase.”
“I… I have no words for this Andy…”
“No one can spell my last name anyway…”
“Yeah, but your last name is on the website.”
“Oh, yeah… Well, you can change it to something else while you’re in there.”
So of course I did – I changed it to “Fuck You 4ndy” followed by a long random string of characters. I assumed he would appreciate that upon coming back after a full day of beach bumming.
Needless to say, once I was armed with Andy’s terribad password, which was about 3 times worse than the kind of shit we yell at our users for, I managed to open the right port and call Barracuda back. Sadly, it turned out to be a massive waste of time. Whatever took down the web server also seemed to have blown away ssh. For an instant I felt bad for even calling Andy on his off day, but then I got over it. I mean, who does he think he is, taking a vacation while I’m in here breaking mission-critical systems left and right? Screw him.
Since we had run out of options, the Barracuda help desk recommended power-cycling the appliance. The idea was not to do a graceful shutdown but just kill it, and bring it back up without letting it write any permanent changes to disk (unless it already had, but we would worry about that later).
So that’s what I did. I mechanically unlocked the face plate on the device, took it off and placed it on the floor. Still chatting with the support guy on the other line, I depressed the power button.
At that very instant something clicked in my brain and with a start, I realized three things:
- The Barracuda SSL VPN box has no face plate
- The box on which I was pressing power had a little plate next to the button that said Dell, and a small sticker labeled “Firewall”
- This entire story was taking place before The Firewall Saga
In case you haven’t read my Firewall Saga, let me explain: at that time, our Checkpoint firewall had a weird glitch that caused it to “forget” the license keys each time you rebooted it, and subsequently close off all network ports, fall off the internet, and bring the entire network to a halt. The only way around it was to log in via the physical console, and manually type in the long license key strings to re-activate it.
While I had the power button depressed, the server kept chugging along just fine. I briefly considered just staying in the server room till 5pm holding that button, but that wasn’t really an option. So I attempted to delicately slide my finger off of it, hoping the machine would forget that I pressed it in the first place. That did not work.
Three point seven seconds later, The Intern appeared in the doorway giving me a questioning look:
“Dude, did you reboot the internet?”
“What? No. Of course not!” I lied, discreetly pushing the power button again to bring the firewall back up.
Two seconds later, Jay from accounting materialized behind the Intern and told me the internet had gone down.
“I know,” I replied. “That’s why I’m here. I’m working on it.”
In about a quarter of a second a third person appeared in the doorway, forming the beginnings of an impromptu conga line. This was one of our supervisors, who had also noticed the lack of internet – or as he described it, “the email erroring out on him”.
“He knows,” said Jay.
“He is working on it. That’s why he is here,” added the Intern.
This seemed to placate the supervisory entity. He nodded, and wandered away. As his footsteps faded out in the distance, I heard him repeat “he knows, he’s working on it” to at least three people who were making a beeline for the server room.
Meanwhile I was already dialing Andy, hoping he could walk me through the manual application of the license keys for the firewall.
So to summarize – I single-handedly bricked the SSL VPN box, temporarily took down the firewall and disconnected the entire office from the internet – all before 11am. Amusingly enough, this was not the last crazy thing that happened that day. Not by a long shot.