mostly true – Terminally Incoherent
http://www.terminally-incoherent.com/blog
I will not fix your computer.
Wed, 05 Jan 2022 03:54:09 +0000

Pam’s Peculiar Printer Problems
http://www.terminally-incoherent.com/blog/2012/10/19/pams-peculiar-printer-problems/
Fri, 19 Oct 2012 14:18:30 +0000

Every IT department has “regulars” – users who submit tickets so frequently they ought to be issued customer loyalty cards. Users whose computers spend more time on the help desk bench than on their desks. Users who use the Ash Ketchum approach when dealing with viruses, worms and trojans (“Gotta catch’em all!”). Our regular was Pam, and these days she is mostly remembered because of her printer problem. Pam’s Printer became part of the office parlance – a colloquialism that to this day crops up in tech support tickets. But before the IT department immortalized her by coining a new expression, she was already infamous for issuing verbose and descriptive help desk tickets along the lines of “see me” or “computer broke”.

I’m mentioning this so you can get an idea of the type of user Pam was. Absolutely clueless, uncooperative, needy, insignificant in the corporate structure, but at the same time privileged by association due to a position as an administrative assistant under a local VIP. Her problems were mostly trivial and low priority, but very frequently escalated to critical status via directorial fiat because Pam’s frustration and inability to deal with technological issues were immediately visible to the powers that be.

One day Pam’s printer broke. At least that’s what she claimed. The ticket she submitted in our system read verbatim: “new printer”. Pam never really grasped the purpose of the ticketing system, but then again 80% of our users had the same problem so we never really held it against her. In fact, our DBA was actually quite fond of Pam’s succinct messages. Even though Pam was in the top 1% of the most frequent users of said system, her submissions used so few characters that they took almost no space on disk. In this particular case, Pam had decided to spare us the needless troubleshooting steps and jumped straight to the resolution stage. The ticket body was what she decided was the best solution for her problem: namely a new printer. Why did she need one? Well, explaining that was obviously a waste of her precious time.

Naturally, as was the custom, the help desk engaged in the ritual cat and mouse game of trying to get the user to update the ticket with the proper information. If you have ever worked on the front lines of a support queue, you have probably seen this conversation play out a thousand times over email or a ticketing system of some sort. It goes a little bit like this:

Updated by Help-Desk at 9:00am:
     What is the exact reason for this request?
Updated by PAM at 9:05am:
     printer broke
Updated by Help-Desk at 9:15am:
     What is exactly wrong with the device?
Updated by PAM at 9:17am:
     it doesn't work

In theory, this sort of exchange can be carried on indefinitely without actual information exchange taking place. For some strange reason, users are very reluctant to volunteer troubleshooting information – their usual tactic is avoidance and misdirection. The best way to get information out of the user is to ask directly, and repeatedly. It works best during face to face confrontations or phone conversations because you can just keep repeating your question over, and over, and over again for 10-15 minutes straight, until the user realizes that attempts to change the subject are futile and reluctantly agrees to read the error message off their screen. This tactic is not as effective via email. In the best case scenario, the user subjected to the “let me repeat my initial question one more time” treatment gets frustrated by lack of resolution, and actually calls the help desk, at which point he or she can be properly interrogated. At worst the email/ticket conversation keeps on going in this manner for days, or even weeks. Our usual policy is to c-c-c-ombo break these things whenever we see it happening, call the user on the phone and Spanish Inquisition the shit out of them.

Upon interviewing Pam, we determined that her printer wasn’t broken. It was streaking – as in producing unwanted vertical lines on printed pages. This was hardly a reason for a new printer. Nothing that a good cleaning and replacing the toner wouldn’t fix. We dispatched a drone with a can of air, a non-linting, static-free cloth, and a toner cartridge to Pam’s lair, closed the ticket and moved on to bigger and better things. Streaks were gone, and Pam was mollified for the time being.

Two days later, the streaking issue returned with a vengeance. We re-applied the same resolution as before, hoping for the best. The astute drone we sent out dragged the replaced toner cartridge back to the IT cave and pointed out an interesting irregularity. The normally smooth, blueish transfer roll inside of the cartridge was now speckled with black clumps. Said clumps were toner deposits that fused to the surface of the roll and calcified into solid matter. Wiping the roll with soft anti-static cloth had almost no effect. The only way to remove these deposits was to apply a good deal of force and abrasive friction to the affected areas (or in layman’s terms, rubbing it really hard) which was likely to scratch or damage the surface of the roll.

What could be causing such toner deposits? Well, desktop laser printers are really not that complicated. Maintenance-wise, there are really just two major “consumable” components that are likely to affect print quality: the toner and the fusing assembly. The fusers are only guaranteed to function properly for N pages but most consumers don’t give a fuck because N tends to be larger than their predicted lifetime print usage. The only places that usually bother to replace fusers are corporate shops that kill half a forest every day. Here is the brilliant part:

Vendors such as HP like to sell their fusing assembly kits for older printer models at about the same price as brand new printers with comparable specs. So sometimes it is actually more cost-effective to just replace the printer… Unless of course you happen to work in IT. Then the fuser is usually the safer option. Why? Well, if you put a dozen new printers on your expense report, it is almost certain someone up the chain of command is going to have a conniption. To an average pointy haired manager, a printer is a robust commodity that once purchased is supposed to last at least one hundred million years. The need to periodically replace desktop and laptop machines, and even monitors, is readily understood by most. Printers on the other hand… That’s a different matter. Put a dozen fusers on your expense report, and it gets rubber stamped and approved with no questions asked.

Considering that Pam’s printer was cleaned, and re-tonered just two days ago, I figured the culprit had to be a failing fuser. So I placed an order for a full-on maintenance kit that includes the fuser, and a brand new set of rollers, springs and other junk. I broke the printer apart, and went through the entire kit, replacing just about every movable part. As I did this, I noticed a lot of weird gunk on some of the rollers and paper leads but I paid it no heed. After all, gunk is not an unusual thing to find inside of a printer. I made sure to wipe, clean and brush every surface and get the machine into a pristine condition. I installed the new fuser and a new toner. I printed some test pages and the quality was nothing but immaculate – the white areas were as bright and clear as fresh snow, the black was as dark as the heart of our marketing director. The machine worked perfectly. I dropped it off at Pam’s desk and all was well… For about three days.

The same problem with streaking and weird deposits of toner on the transfer roll returned without a warning. At this point Pam decided that we were straight up trolling her and obstructing her work. So she logged a heartfelt plea with her supervisor, and her case got promptly escalated to the exclusive special status of “just get the damn woman a new fucking printer already”.

Powers that be command, we execute. Pam got a new printer post haste. It had all bells and whistles and it was delivered with no expense spared using the normally verboten next business day delivery method, wrapped in a bow and with a motherfucking cherry on top. We installed, tested and configured it, and set it free. We even sent The Intern to polish the front LED panel so that it would glow an acceptable shade of blue for her. Perhaps for the first time in her life Pam was truly happy. Not necessarily because she liked the new printer (oh no, she hated it because the buttons on the front were in different order and complained about it at length). No, she was happy because she could gloat, and brag to her friends how she won. Because apparently tech support is a battle of wits of some sort. Apparently our job in the IT department is not to resolve their issues and help them to do their jobs – no, the users consider us mortal enemies, whose only mission in life is to obstruct their work and ruin their productivity. But such is life in the IT department. Every time a user sends you a personalized email thanking you for your help, you should make a note of it. That user has remembered your name, and he or she will now blame their next late report or other fuck-up on you. Why? Because you were the last person to touch their computer and therefore you are responsible for all technical issues with that machine from that point on, until the next person fixes it. You get used to it. In fact, you take some comfort in it. If the user is happy, who cares what the reasons are. As long as Pam and her boss were content we could concentrate our efforts on supporting actual mission critical infrastructure.

I took Pam’s old printer and deposited it in the Magical Repository of Useless Junk also known as that empty cubicle next to my cubicle. Don’t go in there by the way, I hung up that avalanche warning sign there for a reason.

Two days went by and Pam’s new printer started to streak just like the old one. It was the same exact issue – vertical streaks down the page caused by toner buildup on the transfer roll. Pam and her boss were absolutely baffled, whereas I had a sudden epiphany. Lo and behold, I had identified the root of the problem. It was not the printer – we had definitely ruled it out by replacing it almost twice. The only part of the equation that did not change was Pam.

To confirm my suspicion I cracked open Pam’s new printer, and as expected I saw the now familiar gunk everywhere inside. It was a weird rubbery adhesive substance that was distributed evenly along the entire paper path. It looked clear near the manual feed tray inputs, and black near the back of the printer where it got dusted by the toner. Something that Pam had fed into the printer had to be depositing this gunk on the rollers. Since said gunk was sticky, it could easily be transferred piece-meal onto the paper, which would carry it into the guts of the printer, where it got heated up by the fuser, coated with the toner dust and flash-baked into the surface of the transfer roll.

I cleaned the rollers, replaced the toner, made sure the streaking was gone and then went to have a chat with Pam. A novice IT worker would probably try to interrogate Pam and ask her what had changed in her routine. Obviously Pam had to start putting something new into this printer about a week ago, seeing how she never had that issue before. But I knew better than that. Even if she could conceptually make a connection between a change of printing materials and print quality issues (which is doubtful, because Pam subscribed to the “technology is arcane magic that is not to be understood by mere mortals” school of thought), she would surely not want to let me pin the blame for this issue on her. So I decided to Sherlock Holmes it. It was a complete shot in the dark, but it was worth a try. After some misdirection and small talk to throw her off the track, I launched my probe:

“So, how are the new self-adhesive labels you got last week?”

I had no clue if what changed was the type of labels she used, but it was a reasonable conjecture. I suspected something with an adhesive layer had to be involved, so I just made a wild guess. Pam immediately lit up:

“Oh, they are working just fine. They are same size and same everything as the old ones, but like half the price. I’m really glad I found them. Now if you guys could only figure out those printer issues we would be all set…”

She swallowed my bait, hook, line and sinker and incriminated herself without even realizing it. Problem solved!

We don’t actually provide Pam’s department with office supplies. They have their own budget for that stuff, and they purchase stuff like paper and labels directly from Staples or wherever. Apparently Pam decided to try a new cheaper brand of self-adhesive folder labels. I examined her bargain bin stickers and verified them as a likely source of the printer clogging gunk. The issue was not immediately apparent though. If you printed a full page of labels at a time, nothing would happen. Pam, however, didn’t usually need to print a full page. Usually she would print 3-4 labels at a time, out of a page of 20. She would peel these stickers off and re-use the page for the next batch – as you should.

The problem was that the cheap new labels would often leave some glue behind once you peeled a sticker off. This was most likely a design flaw, or just a quirk of the cheap adhesive used in the manufacturing process. When you fed a half-finished page into the printer, the glue from the “empty” areas would rub off onto the rollers and gum up the works. I tried relating this to Pam but of course she refused to believe me. This was to be expected – in her eyes, I was not only trying to pin the blame for the printer issue on her, but also undermining her thrifty cost-saving purchase decision for which she already gathered some accolades with her supervisors. Fortunately her boss proved to be more susceptible to facts, common sense and logic. When presented with an orgy of evidence (Pam’s old printer, gummed up rollers, label sheets with gooey back-sides, etc.) he caved and told Pam to STFU and buy the more expensive labels. The streaking problem has never returned.

Since that incident Pam’s Printer has become a colloquialism we use to signify a Red Herring PEBKAC issue that ends up costing the company more money and time than it should, because it is being resolved from the wrong end. In Pam’s case, properly interrogating and vetting the user could have saved us a lot of effort. Pam’s problem was astonishingly easy to fix once we knew the actual source, but obtaining that knowledge was tricky. Sometimes it is easier to just make educated guesses about the nature of the problem and throw replacement parts at it until something sticks than it is to try getting the exact details out of an uncooperative user. That approach can be a costly gamble though. If you are lucky, you get it right on the first try. If you guessed wrong, the issue can quickly spiral out of control and become Pam’s Printer.

A little Spanish Inquisition-ing and Sherlock Holmes-ing early on can pay off big time.

We are out of space: Part 2
http://www.terminally-incoherent.com/blog/2012/02/22/we-are-out-of-space-part-2/
Wed, 22 Feb 2012 15:09:16 +0000

In the previous installment of this story, I learned that the company somehow managed to fill 200GB of free space on the network shares, overnight. I was more baffled than surprised, as this sort of thing was not new. Our company was known to go on huge data digitization binges without ever bothering to tell IT about it, or purchase appropriate hardware to store it. In fact, most of such projects were closely guarded secrets, hidden away from IT because the people who came up with them did not want to be blamed for additional tech related expenses. I got my buddy Larry up to speed on this case, and gave him a mission:

“I need you to use your contacts upstairs to get some intel for us. See if the bright heads over there got any new genius ideas like the scanning project.”

This was the kind of task that required some finesse, charm and social agility. It was a delicate social engineering hack that warranted a light touch. Larry was about as sneaky as a beached walrus, and as socially agile as a wounded elephant in a china shop. He was perfect for the job. His investigation would be brash, abrasive, accusatory and it would instill fear in the legions of the luserati. He would be the bad cop, to my… Lawful evil, but less threatening cop. Or something like that. Someone would eventually squeal under the pressure, or call me with the information hoping I can shield him/her from the wrath of Larry.

While my trusty minion went to ruffle up some feathers and throw his weight around, I decided to investigate the file shares themselves. My plan was to identify large folders, and ask Jeremy why they grew so big, and/or if they can be archived to a different share to make space. Thankfully I was running Linux so this would be somewhat easy. I mounted the problem directory as a Samba share, then did:

du -skh *

If you are Unix illiterate, this command prints out disk usage stats. The -s makes it print out only top level directories and files, but recurses into them to calculate the size, while -h makes it display the sizes in human readable format (i.e. KB, MB and GB rather than in raw bytes). The -k is redundant here, since the -h that follows it takes precedence. The result of this command was eye opening. Most of the folders on the drive were tiny. Only a few took more than a gig of storage. There was nothing outrageously big there, except one entry:

257G DfsrPrivate
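
If you are stuck on a box without du, the same top-level tally can be roughly sketched in Python. Treat this as an approximation under stated assumptions rather than a du clone: it sums apparent file sizes instead of allocated disk blocks, it silently skips files it cannot read, and the function names are made up for illustration.

```python
import os
from pathlib import Path


def tree_size(path: Path) -> int:
    """Total apparent size in bytes of a file, or of all files under a directory."""
    if path.is_file() or path.is_symlink():
        return path.lstat().st_size
    total = 0
    # os.walk does not follow symlinked directories by default
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # unreadable or vanished file: skip it
    return total


def human(num_bytes: int) -> str:
    """Format a byte count with binary units, loosely in the style of du -h."""
    size = float(num_bytes)
    units = ["B", "K", "M", "G", "T"]
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return f"{size:.0f}B" if index == 0 else f"{size:.1f}{units[index]}"


def top_level_usage(share: Path) -> list[tuple[int, str]]:
    """Size of every top-level entry in share, biggest first."""
    usage = [(tree_size(entry), entry.name) for entry in share.iterdir()]
    return sorted(usage, reverse=True)
```

Looping over `top_level_usage(Path("/mnt/share"))` and printing `human(size)` next to each name gives the same kind of listing, biggest offender first. The mount point is a placeholder for wherever the share is mounted.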

DfsrPrivate is a hidden system folder created and maintained by the Microsoft DFS Replication Service. Without getting into too much technical explanation, this service keeps file shares on different servers in sync with each other. We set up most of our file sharing servers this way as a means for rapid disaster recovery – always in working pairs, and if one of them fails, you can immediately fail over to the second one while you gut and restore the first.

The DfsrPrivate folder is used for staging the files that are to be replicated, and for storing copies of files on conflict. That conflict folder turned out to be the actual culprit. Normally, the contents of DfsrPrivate\ConflictAndDeleted are supposed to be kept under 600MB in size. Every once in a while though, the DFSR service decides to ignore the quota and starts dumping huge amounts of data there, without ever deleting anything.

A quick Google search revealed that one of the Microsoft TechNet blog posts described the exact same problem I was having and outlined a solution. In case that blog ever goes down, here is what you do.

First run the following command:

WMIC.EXE /namespace:\\root\microsoftdfs path dfsrreplicatedfolderconfig get replicatedfolderguid,replicatedfoldername

Yes, this is Windows administration, and we are using a shell. Is your mind blown yet? Anyways, the output of the above should give you the GUIDs for all the network shares you have on the affected server. If you only have one share, then you will get one entry that will look something like: 70bebd41-d5ae-4524-b7df-4eadb89e511e. If you have more shares, make sure you pick the right one, and copy the GUID.

Then you run the following command:

WMIC.EXE /namespace:\\root\microsoftdfs path dfsrreplicatedfolderinfo where "replicatedfolderguid='70bebd41-d5ae-4524-b7df-4eadb89e511e'" call cleanupconflictdirectory

Make sure you substitute the GUID I posted with the one that was generated for your share, otherwise this won’t work. The output will actually look like the script bugged out and dumped a weird error message like this:

[Screenshot: DFSR Cleanup]

This is actually what you want to see. It means it’s working. Once you do it, you wait a few minutes and check your DfsrPrivate\ConflictAndDeleted folder. It should be either empty, or significantly reduced in size. If it’s not, you can go to Plan B, which is manual deletion.
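
Since pasting the wrong GUID is the easiest way to botch the second step, here is a small Python sketch that only assembles the two command lines quoted above and refuses anything that does not look like a GUID. It does not run WMIC or talk to DFSR; the function names and the validation regex are my own assumptions, and the command text is copied verbatim from the steps above.

```python
import re

# WMI namespace used by both commands quoted above
NAMESPACE = r"\\root\microsoftdfs"

# Plain 8-4-4-4-12 hex GUID, without braces (an assumed input format)
GUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def list_folders_command() -> str:
    """Command line that lists replicated folder GUIDs and names."""
    return (
        f"WMIC.EXE /namespace:{NAMESPACE} path dfsrreplicatedfolderconfig "
        "get replicatedfolderguid,replicatedfoldername"
    )


def cleanup_command(folder_guid: str) -> str:
    """Command line that asks DFSR to clean up ConflictAndDeleted for one folder."""
    guid = folder_guid.strip()
    if not GUID_RE.fullmatch(guid):
        raise ValueError(f"not a GUID: {folder_guid!r}")
    return (
        f"WMIC.EXE /namespace:{NAMESPACE} path dfsrreplicatedfolderinfo "
        f"where \"replicatedfolderguid='{guid}'\" call cleanupconflictdirectory"
    )
```

Print the result of `cleanup_command(...)` with the GUID from the first step and paste it into an elevated prompt on the affected server.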

I run through this solution, run the WMIC scripts and get virtually no results. Plan B it is then. In case you didn’t know, the B in Plan B stands for “Brute Force” or “Brace Yourself”. I take a deep breath, kill the DFSR service, and then manually delete EVERY-FUCKING-THING in that folder. I restart the service and all is well. Replication continues as normal, quota is once again respected, and the users have more than 250 GB to fill before we have a storage problem again. We are back on schedule for Peak Storage 2014. Crisis has been temporarily averted. Now it’s back in its place – hanging above our heads along with 113 other critical problems that management refuses to address until they start threatening to shut down the company, or interfere with the browsing of Facebook.

A few hours later, I get the following email:

Luke, buddy! I hear there were some problems with downloading or something. Larry was here in Marketing asking all kins of weird questions about network sharewares and what not. It made me nervous. I hope I don’t get in trouble for this, but about a week ago I downloaded my iTunes music onto the G: drive on my computer (that’s the one with lots of free space) cause I like to listen to my music as I work. That’s ok, right? It’s all legit stuff I paid for, and it’s on my computer so it should not matter. I didn’t tell Larry cause he would probably write me up or something. Let me know if I could get in trouble for this. I put the music under In Process\Marketing\Legal cause I knew no one would ever look there.

Apparently my Larry “the bad cop” gambit worked like a charm, and spooked at least one dude who was making unauthorized use of the network resources to store his pirated music. My response was along the lines of:

No worries, I got your back – I deleted all that music so that you don’t get in trouble. Make sure you don’t bring any more music on the G:, H: or I: drives because those are shared network resources and it will get you in trouble. I promise not to tell Larry.

Email CC’d to Larry and that guy’s supervisor of course. Pity that his collection was less than a gig – I was hoping for more space savings.

The moral of the story: if something eats hundreds of gigabytes of storage over the weekend, don’t automatically blame the users. Chances are it’s a shitty Microsoft service instead. Actually, scratch that – blame the users anyway. They deserve it.

We are out of space: Part 1
http://www.terminally-incoherent.com/blog/2012/02/20/we-are-out-of-space-part-1/
Mon, 20 Feb 2012 15:10:34 +0000

It’s Monday morning, and I’m sick. I thought that I was smart when I got my flu shot a few months ago. I figured that I could cock-block the influenza virus, and skate through the winter unscathed. What I did not anticipate was that being free of flu I was a prime target for things that are much worse. One of these super-bugs that makes you congested, and makes swallowing feel like gargling with sandpaper got the best of me. I would have stayed home, but I don’t have sick days.

In their infinite wisdom, the powers that be decided that staying home sick counts as a “vacation” so I wisely opted to drag my sneezing, germ producing carcass to work, making it my mission to spread the malignant disease as far and wide as possible. My hope was that I was going to be able to burrow myself in my cubicle, put some code on the monitor so it looks like I’m programming and then just survive till 5pm battling fever and sneezing fits.

Alas, this was not to be. As soon as I sit down and open my email, support ticket notifications start streaming in:

  • Ticket #6451 [High]: CAN”T SAVE AANYTHING!!1
  • Ticket #6452 [Medium]: G Drive not working
  • Ticket #6453 [Critical]: facbook.com saiz page not fund – pls unblock!
  • Ticket #6454 [High]: Can’t save to LAN.
  • Ticket #6455 [High]: When trying to save to G: drive it says “Drive full or right protected”
  • Ticket #6456 [Medium]: Is the public network share full? Can’t save anything.
  • Ticket #6457 [Low]: Can’t save to network share. Probably over quota. Please extend.

It goes on like this for at least 20 more messages. The emerging pattern is somewhat clear – something is terribly wrong with the network shares. I’m fraught with a sense of déjà vu. We already had this problem last week, and we narrowly avoided the catastrophe by installing some extra hard drives and moving a bunch of archival data off the main network shares. The whole operation took several hours, but I managed to reclaim close to 200 GB of space by deleting logs, dumping out garbage files and moving really, really old files that no one has touched in ages to a separate drive. Then I sent out an email to some of the administrative staff telling them what had been done, and advising them to do some more archiving. Last Thursday things seemed to be mostly under control.

I quickly log into one of the servers, to check how much of that rescued space was devoured over the weekend. Apparently all of it. The drive used for one of the network shares has exactly 57KB of free space.

At that very moment, Jeremy bursts into the IT cave system. I have no clue what Jeremy does at our company, because I personally do not care. I am sure he has told me what his important job functions are on more than one occasion, but I have never considered that information to be relevant enough to actually commit neurons to it.

“Thank God you’re in! I know you just got here, but we are having some serious problems with the LAN so if you could look into that…”

I lift a finger to stop him from rambling.

“I know. I was just reading the tickets… Listen Jodie…”

One of the other things I have never committed to memory is Jeremy’s name. There is a limited number of neurons in my head, and now that I have shaved my beard I can no longer offload Unix skills to facial hair follicles so I tend to locally optimize storage this way. I just call him girl names for consistency.

“Jeremy.”

“Julie…”

“Jeremy.”

“Genevieve…”

“Oh, come on – that’s not even close!”

“Whatever. Listen, Sally Ann Margaret…” I pause for a second to see if he approves this new handle I just made up. He gives up, and allows me to continue.

“Remember last week when we moved the 1997 through 2009 folders off the main network share to the ‘Archive’ one?”

He nods.

“That freed up around 200 GB of space. How the hell did you guys fill out 200GB over the weekend?”

He shrugs. “I don’t know. We’ve been scanning a lot of invoices, and other stuff…”

Last summer the corporate deities that bestow the blessed health insurance benefits upon us decided that it would be a good idea to scan everything in sight. This development coincided with a batch of new “all-in-one” printers we had installed near High Hrothgar – also known as the area where all the directorial critters have their offices. Someone discovered that you can scan a document and have it emailed to you as a PDF. So naturally, their first reaction was to scan everything in sight. The initial feature shock soon became a company policy, but no one told anything to the IT department. So we have been oblivious that a few floors above there was a major digitization project going on, armed only with a few aging Windows servers and replicated file shares to support it.

We only discovered the scanning project by accident, when we noticed the servers were low on drive space. After shaking up a few users we managed to get them to spill the beans. Apparently they were not supposed to tell us, because their supervisors were afraid we would start making a ruckus about buying new servers and shit.

So we went and put in a proposal to build a dedicated architecture for this project. Preferably some sort of a NAS to which we could slowly add new drives as the project grew in size. That got rejected immediately because it would actually cost money, versus the current solution of not doing shit that was completely free. Eventually we did get an approval to buy some new drives, and beef up the Windows servers, but that was a temporary fix.

A fellow NOC denizen, Larry, actually made a chart. He measured the amount of storage we consume every day, adjusted it for possible growth and cross referenced that with the amount of internal storage we could possibly add to these servers. According to his math, we would get starved for space, and the system would become unsustainable within 3-4 years if we were lucky.

“Pshh, 5-7 years?” said the omnipotent directorial oracles, who always think we are low-balling sustainability figures. “In 7-8 years we might have flying cars, and colonies on Mars. Don’t you worry yourself about things that will happen 8-10 years from now.”

And so we were left with a teetering wreck just waiting to happen. In other words, business as usual.
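
Larry’s chart boils down to simple arithmetic: take the space you can still add, subtract what gets eaten each day, and let the daily appetite grow over time. I never saw his spreadsheet, so the Python sketch below is only my guess at the shape of the calculation. It assumes usage compounds at a fixed annual growth rate, and the function name and every number you feed it are hypothetical.

```python
def days_until_full(free_gb: float, daily_gb: float, annual_growth: float = 0.0) -> int:
    """Days until free_gb is used up, with daily usage compounding at annual_growth."""
    if daily_gb <= 0:
        raise ValueError("daily_gb must be positive")
    if annual_growth < 0:
        raise ValueError("annual_growth must not be negative")
    if free_gb <= 0:
        return 0

    # Spread the yearly growth evenly over 365 daily compounding steps
    daily_factor = (1.0 + annual_growth) ** (1.0 / 365.0)

    days = 0
    remaining = free_gb
    rate = daily_gb
    while remaining > 0:
        remaining -= rate      # today's consumption
        rate *= daily_factor   # tomorrow's appetite is slightly bigger
        days += 1
    return days
```

Something like `days_until_full(2000, 1.5, annual_growth=0.25) / 365` turns made-up inputs into a number of years. What a projection like this cannot see coming is a one-off event that eats 200GB over a single weekend.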

I wander off to Larry’s desk. He is redditing like a pro, and ignoring the morning commotion surrounding the storage shortage. I don’t blame him. I would too, but they have found me first.

“Hey asshole, your math skills are shit.” I give him a friendly greeting.

Larry, a rotund gentleman, swivels around in his chair and gives me a shit eating grin:

“I will have you know, that my math skills are impeccable, you abominable cunt.”

Pleasantries out of the way, I give him the basic rundown of our current crisis. Something just ate 200GB over the weekend. Whatever it was, it was not included in our “Impending Storage Disaster of 2014” projections (also known as “Peak Storage 2014” for short). Now we need to identify it, and figure out how to deal with it. I had a very specific mission for Larry…

But more on that next time.

Why Dell Hardware is Shit
http://www.terminally-incoherent.com/blog/2012/02/13/why-dell-hardware-is-shit/
Mon, 13 Feb 2012 15:06:47 +0000

Before I begin this rant I want to make it clear I am not a Dell hater. I am a rather happy owner of a Dell XPS desktop and I work on a team that professionally maintains a fleet of Dell laptops. Most of them are fine. Some are shit, straight out of the box.

Recently my boss wandered into the IT cave demanding a new laptop. This was quite unexpected because he has been overly protective of his old HP tablet with the swivel display. Even after he lost the stylus, damaged the cradle port, and had the sound card fail he refused to let it go. Each of us took turns trying to coax him into releasing the old machine and getting a new system. He would not hear about it. At least not until now.

He barged into our inner sanctum, woke up the slumbering intern and bellowed an order: he wanted something thin and cool like them Apple laptops. Stupidly I asked if he mayhaps wanted a MacBook Pro or Air cause we could hook him up with that, seeing how he owns the company and all. It would probably cause some support headaches, but it is not like he is not getting preferential treatment, admin privileges, etc. already. Apparently this was the wrong thing to say:

“What? Are you crazy? I don’t want an Apple computer. I just want something like Apple. But cheap. And from Dell, cause that’s where we get corporate discounts.”

So essentially he wanted a cheap MacBook knock-off. Fair enough. I put together some options and pricing for him and we ended up with the following machine: XPS 14z.

Dell XPS 14z

It had a thin, glass to the edge display, a back-lit chiclet keyboard, an aluminum case, a flush touch-pad and a slot loaded DVD. I guess it was just Apple-y enough to fit the bill.

Fast forward about a week. The machine is delivered, configured and ready to go. All the files are transferred from the old wreck of the machine and we are good to go. I hand it off to the boss and he happily takes possession of his new toy. He decides to keep the ancient HP as a “backup” which is fine, cause I would just retire and junk it otherwise.

A few days pass, and he reports the first problem with the machine. Apparently it “jumps off the network”. As I read that ticket, I have the mental image of the poor laptop getting fed up with the abuse, and leaping off a cliff, AssCreed style. The poor thing probably never anticipated the things that would be done to it.

I look around at my peers to see if any of them wants to pick up this ticket. All of them flash the universal sign of “not it” which is made by extending the middle finger of your dominant hand, palm facing inwards. Unfortunately I have to field this one on my own. So I take the journey to the “oval office” as we call it. The big boss is sitting on his leather couch (yes, he has one in his office, he owns the place) with the legs on his coffee table, the machine on his lap.

“Ah, good thing you’re here. It jumped off the network again.”

He swivels the machine so that I can see it. The machine is still on his lap, and I’m quite unsure what I am supposed to do. Does he expect me to work on it while it’s still on his lap? I sort of reach for it, but then hesitate for a second. At that point he realizes that there are probably more optimal configurations for laptop support, and hands me the machine telling me to sit at his desk and figure it out.

I take the laptop, and unplug the Ethernet cable. You see, we don’t use wireless at any of our offices because of the security implications. No matter how much encryption you use, a wired network is always more secure. It is way easier to control access to available ports and cables than to try and secure the airways. I am really glad the expensive external consultants were able to sell the management on this idea, that we, the Internal IT, were advocating for years.

Anyway, I plug his machine in at his desk, and it works. I pick up his by-the-couch cable, plug it in and it works. I hand the machine back to him, and it drops again. I take it back, unplug, plug it back in – it works. Hand it back – it drops. I replace the Ethernet cable with the one he had by his desk – same scenario. As soon as the machine leaves my hands, the connection drops.

Now, techno-muggles often joke around that we, the Denizens of the NOC have special technomancer powers that allow us to temporarily coax machines into a working state by mere physical proximity. I never paid any heed to such folk tales but this was quickly becoming a live demonstration of just that sort of a phenomenon.

After troubleshooting some more, I managed to identify the problem. It was the Ethernet port. When you plug an RJ45 plug into a port, it is supposed to lock into place with a little click. Once locked, the pins should be in full contact and there should be no play on the wire – jiggling it, or pulling back on it should not cause a disconnection, unless the release latch on top is pressed. On this particular laptop however, the RJ45 would be loose in the port. Plugging it in would produce the familiar click, but it would still be possible to push the plug about two millimeters deeper into the port. In fact, pushing it deeper would be crucial to establish a connection. A slight tug on the wire, once it was locked in, would make it slide back a few millimeters, and the pins would lose contact causing disconnection. The cable would still be locked in place, and it would be impossible to pull it out without disengaging the latch – but the connection could be easily broken and re-established by very subtle movements in and out. It’s almost as if the pins were nested too deep in the port. You can probably see how sitting on the couch, with the computer on your lap could produce tension on the wire, and thus cause intermittent disconnects.

RJ45 is a standard that has not changed in years. To actually make an Ethernet port that fails this way would require actual effort because all these parts are machine made to a standard spec. I figured it must have been a one-off manufacturing defect, and decided to call Warranty support and have it replaced.

I got Dell on the horn, and explained the issue to their support drone. He gave me an attitude right off the bat:

“Sir, I can assure you we have never had this sort of an issue reported for this particular model.”

That’s great buddy! I’m really glad to hear that, but it does not change the fact that my machine has a fucked port that does not work. This reinforces my guess about a factory defect, but then again, you are a worker at a call center in India, so what the fuck do you know. It’s a brand new machine, the port is broken and I would like it not to be. Eventually he reluctantly agrees to dispatch a technician and replace both the motherboard (with the integrated Ethernet card) and the bottom baseboard casing. Fair enough.

Next day the Dell technician comes in, I set him up in the conference room and let him do the work. About 20 minutes later, he calls me and says he is done. I pop into the room as he is packing up his stuff, and try to power up the machine. It’s dead. I ask him if he tried to power it up.

“Sure thing, all works now.”

I show him that it doesn’t. He starts packing faster…

“Oh, that’s just the battery. If you plug it in, it’ll be fine.”

I grab the spare AC adapter that’s underneath the conference room table for just this occasion and plug it in. The machine is still dead. I look at him, he looks at me and goes:

“Listen, I’m already late to my next appointment…”

I slowly inch the laptop towards him then point at it and I do my best impression of Liam Neeson from Taken and slowly and calmly explain the situation to him:

“This was booting up fine. You fucked it. Now I need you to un-fuck it. And if it’s un-fucked by the time I come back, I might be nice enough not to call your supervisors and complain about this shit.”

Then I leave, slam the door and summon the artist formerly known as Intern.

“INTERN, YOU ARE BEING SUMMONED!”

Intern immediately materializes at my side as is in his nature and greets me in his internly fashion:

“What’s up, bro?”

I pull him aside, show him the conference room with the trapped technician.

“See that guy? Watch him and don’t let him leave, until I come back. If he tries to make a run for it, call security and tell them he is stealing our shit, trespassing and that he raped you or something.”

“For real bro?”

“Yes, he is raping, pillaging and infringing on our intellectual prosperity or something. I don’t care. He’s fixing the big boss’ laptop, and if it’s not working then shit will get really ugly around here.”

Intern salutes, and stands guard at the door, armed with a lukewarm cup of coffee and Nintendo DS. From his expression I know he is not afraid to use any of these weapons to amuse and caffeinate himself while he performs this sacred duty.

I come back 20 minutes later, and the Intern is dead asleep, out of coffee and his Pokemon is apparently losing a battle with a Charizard. The Dell technician is still there, slightly more sweaty than before. This time the machine boots up fine. I grab an Ethernet cable, plug it in and verify that the connection works. Then I use one finger and lightly tug on the cable. The RJ45 slides a few millimeters and the connection drops. I show him the problem.

“Did you actually replace the motherboard?”

He swears up and down that he did. Both the mobo and the bottom plastics have been replaced with brand new parts. He calls his Dell buddies, they exchange secret codes, case numbers and pleasantries. Then he puts me on the phone. They give me two choices:

  1. Dispatch another mobo and bottom plastics and try another swap
  2. Have the machine sent in to their repair depot

I choose option #1 because I’m almost certain that this oaf somehow destroyed the replacement mobo while installing it (which is why the damn thing wouldn’t boot), and then swapped the old part back in to cover his ass. I also ask them to send a different technician.

Next day another technician arrives. This one looks slightly more competent, but then again, you never know with these guys. To be sure, I install the Intern in the conference room as my spy. I make sure to show the technician the issue we are trying to correct and he gets to work. A little bit later he calls me over.

“Man, I don’t know what’s wrong with this thing. I replaced both the motherboard and the bottom plastics, right…” he looks to the Intern.

The Intern nods vigorously.

“I made sure everything is aligned perfectly, and it is screwed in tightly. But the cable is still loose in the jack though… You might have to send this one to the depot or something.”

He gets Dell on the phone again, and they give me the repair depot spiel again. You know what? Fuck the depot. If I wanted to send it to the depot, I would not pay for the “Next Business Day, On Site” warranty. I would pick the cheapo “we will spend a week looking through your shit, and copying your pr0n at the depot” option. I politely explain to them they can suck my proverbial dick, and get stuffed. Eventually they agree to replace the machine instead. They are sending us a replacement laptop and it should be here in a few days. If that one has a shitty Ethernet port, I will probably have The Intern punch himself in the face or do something even more drastic.

TLDR: Dell sold us a laptop with a broken Ethernet port. They tried to repair it twice, but each time it was equally shitty. Now they are replacing the entire laptop. I have a feeling the new machine will be equally shitty. Don’t buy the XPS 14z.

]]>
http://www.terminally-incoherent.com/blog/2012/02/13/why-dell-hardware-is-shit/feed/ 31
Fan Day: Part 1 http://www.terminally-incoherent.com/blog/2011/11/02/fan-day-part-1/ http://www.terminally-incoherent.com/blog/2011/11/02/fan-day-part-1/#comments Wed, 02 Nov 2011 14:07:37 +0000 http://www.terminally-incoherent.com/blog/?p=10045 Continue reading ]]> The thing about being an IT professional or a sysadmin is that your workload comes and goes in waves. Some days are just slow and lazy, and there is not much for you to do. You are all caught up on your current projects, all of which are pending review, waiting for approval or at a standstill. You are done with all regular maintenance tasks, and anything requiring serious work can’t be done during business hours anyway because it would require taking down a crucial server, or two. Most of your day is spent with silly paperwork, answering random calls from marketers trying to sell you enterprise business solutions, and browsing the web. On such days even the users relax a bit.

You can hear it in their typing. On normal days the clickety-clack of the keyboards is an angry, hate filled sound. They are not typing; they are spitefully inflicting punishments onto their machines in the form of text. Work is a struggle between man and the disobedient machine. But on the slow days, their touch becomes gentler. You could almost imagine that they don’t hate and fear their computers, but have grown to accept them as inanimate objects that are merely tools of their trade. This of course is merely wishful thinking, as you can still smell their fear and disdain in the air. But on those days when nothing breaks, there are no upgrades and everything is sailing smoothly they are lulled into a brief and temporary state of comfort and relaxation.

Then, there are the other days. We call them “Fan Days” – days where the shit hits the fan hard, and everything breakable decides to break at the same instant. This is a story about one such day.

The first indication that you are about to have a fan day is a curious convergence of vacation days, sick days and personal leave requests among your coworkers. Nothing bad usually happens when your IT team is at full strength, and in good shape to swiftly respond to major issues. It is only when you are running with a bare bones skeleton crew that things start to crumble. On this fateful day, I was more or less flying solo using the intern as the storm shield against the wrath of the users calling the help desk.

The first issue of the day came early in the morning and concerned our Barracuda SSL VPN box. Granted, we get anywhere between 5 and 10 calls for that thing every day, but that’s just because around 10 of our users just can’t wrap their heads around the concept of two factor authentication. The help desk is sort of a fourth factor in their authentication process, walking them through the extremely difficult task of plugging in the USB dongle and typing a password into a login box. This was an altogether different pair of shoes – not a PEBKAC authentication problem, but an actual functionality issue.

The nice thing about using the Barracuda is that it allows us to give remote users access to our intranet web apps without actually exposing any of their servers to the internet. The communication is proxied by the SSL VPN box, encrypted and hidden behind a two factor authentication scheme. One of our users was trying to upload a large zip archive to one of those web apps, but the Barracuda proxy server would not have it. We had never noticed this in testing, because none of us reasonable IT folks even dared to assume someone would be silly enough to try uploading over 100 MB of crap. The average size of the attached files uploaded through that form was 5-6MB so we collectively figured that setting the upload size limit to exactly 100MB would be more than enough. But alas, here was a user with a 115MB file that needed to be submitted by yesterday.

Fortunately, it was not a huge issue. I simply logged into the admin panel and increased the file size limit, applied the changes and… Inadvertently broke the SSL VPN box somehow. It basically fell off the internet. One minute it was there, the next it was completely gone, not responding to any HTTP requests. I could still ping it, and nmap could see that it was listening on the usual ports, but there was no one answering when you knocked on the door.
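That particular state (the box answers ping, the ports are open, but nothing speaks HTTP back) can be told apart from a truly dead box with a few lines of script. What follows is only a minimal Python sketch of the idea, not the tooling used that day, which was plain ping and nmap. The host name, port and URL are hypothetical placeholders, and a device with a self-signed certificate would need extra TLS handling that is left out here.

```python
import http.client
import socket
import urllib.error
import urllib.request


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_answers(url, timeout=5.0):
    """Return True if the server sends back any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # A 4xx/5xx status still means something answered the request.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused, reset, timed out, or garbage instead of HTTP.
        return False


def diagnose(port_open, http_ok):
    """Turn the two probe results into a one-line verdict."""
    if not port_open:
        return "port closed or filtered"
    if not http_ok:
        return "port open, but no HTTP response"
    return "web server responding"


if __name__ == "__main__":
    host = "vpn.example.com"  # hypothetical placeholder, not a real appliance
    verdict = diagnose(tcp_port_open(host, 443),
                       http_answers("https://" + host + "/"))
    print(verdict)
```

The middle verdict is the one that matches the story: the listener is still bound to the port, but whatever was supposed to answer behind it is gone.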

I promptly fell out of my chair, and scrambled towards the server room to check whether or not the device bricked itself somehow. With shaking hands I fumbled for my access card, and proceeded to swipe it exactly four times, until I finally got the right alignment of magnetic stripe to the card reader. Then I dropped my server rack keys on the floor a few times, before I managed to claw my way to the silently humming boxes inside.

When I switched the KVM to the Barracuda box, the screen lit up and I saw the familiar login prompt, and the maintenance menus. A sudden rush of relief caused my lungs to exhale stale air, probably for the first time in the last 10 minutes. The box was not bricked, but the internal web server was down.

Rebooting the device would probably be a logical choice at this point, but I was hesitant to try it. After all my tiny, insignificant change somehow put it into a very weird state and who knows what it did to the internal data store. I wasn’t about to risk doing even more damage to it than I already did, so I got Barracuda support team on the phone.

Have you ever dealt with them? They have a great, high quality team that speaks fluent English (which is rare these days) and you can usually get a live person on the phone in under 10 minutes. Unfortunately, the product specialist I talked with wasn’t much help. To diagnose the issue he had to gain access to the device, and since the web based admin panel went to hell I could not give him the permissions to do that. I tried enabling the SSH tunnel from the physical console, but that did not work either, though for an entirely different reason.

You see, I never specified that port 22 needed to be open for this box, so our over-eager network admin Andy likely locked it down. So of course I called him next, and he was not very happy to hear from me, seeing how this was his vacation day.

“Dude, I’m on the beach right now. What did you break?”

“SSL VPN box. Barracuda support needs to ssh into it to un-fuck it,” I replied without missing a beat.

Andy grumbled something about dealing with n00bs while on vacation but eventually agreed to give me the log in information to the firewall so I could open port 22.

“Ok, so the password is my last name, then 123”

“Really?”

“Yeah, all lowercase.”

“I… I have no words for this Andy…”

“No one can spell my last name anyway…”

“Yeah, but your last name is on the website.”

“Oh, yeah… Well, you can change it to something else while you’re in there.”

So of course I did – I changed it to “Fuck You 4ndy” followed by a long random string of characters. I assumed he would appreciate that upon coming back after a full day of beach bumming.

Needless to say, once I was armed with Andy’s terribad password, which was about 3 times worse than the kind of shit we yell at our users for, I managed to open the right port and call Barracuda back. Sadly, it turned out to be a massive waste of time. Whatever took down the web server, also seemed to blow away ssh. For an instant I felt bad for even calling Andy on his off day, but then I got over it. I mean, who does he think he is, taking a vacation while I’m in here breaking mission critical systems left and right. Screw him.

Since we ran out of options the Barracuda help desk recommended power-cycling the appliance. The idea was not to do a graceful shutdown but just kill it, and bring it back up without letting it write any permanent changes to disk (unless it already did but we would worry about that later).

So that’s what I did. I mechanically unlocked the face plate on the device, took it off and placed it on the floor. Still chatting with the support guy on the other line, I depressed the power button.

At that very instant something clicked in my brain and with a start, I realized three things:

  1. The Barracuda SSL VPN box has no face plate
  2. The box on which I was pressing power had a little plate next to the button that said Dell, and a small sticker labeled “Firewall”
  3. This entire story was taking place before The Firewall Saga

In case you haven’t read my Firewall Saga, let me explain: at that time, our Checkpoint firewall had a weird glitch that caused it to “forget” the license keys and subsequently close off all network ports, fall off the internet, and bring the entire network to a halt each time you rebooted it. The only way around it was to log in via the physical console, and manually type in the long license key strings to re-activate it.

While I had the power button depressed, the server kept chugging along just fine. I briefly considered just staying in the server room till 5pm holding that button, but that wasn’t really an option. So I attempted to delicately slide my finger off of it, hoping the machine would forget that I pressed it in the first place. That did not work.

Three point seven seconds later, The Intern appeared in the doorway giving me a questioning look:

“Dude, did you reboot the internet?”

“What? No. Of course not!” I lied, discreetly pushing the power button again to bring the firewall back up.

Two seconds later, Jay from accounting materialized behind the Intern and told me the internet went down.

“I know.” I replied “That’s why I’m here. I’m working on it”

In about a quarter of a second a third person appeared in the doorway, forming the beginnings of an impromptu conga line. This was one of our supervisors who also noticed the lack of internet – or as he described it “the email erroring out on him”.

“He knows.” said Jay.

“He is working on it. That’s why he is here” added the Intern.

This seemed to placate the supervisory entity. He nodded, and wandered away. As his footsteps faded out in the distance, I heard him repeat “he knows, he’s working on it” to at least three people who were making a bee line for the server room.

Meanwhile I was already dialing Andy, hoping he could walk me through the manual application of the license keys for the Firewall.

So to summarize – I single handedly bricked the SSL VPN box, temporarily took down the firewall and disconnected the entire office from the internet – all before 11am. Amusingly enough, this was not the last crazy thing that happened that day. Not by a long shot.

]]>
http://www.terminally-incoherent.com/blog/2011/11/02/fan-day-part-1/feed/ 15
The Firewall Saga: Part 7 http://www.terminally-incoherent.com/blog/2011/09/19/the-firewall-saga-part-7/ http://www.terminally-incoherent.com/blog/2011/09/19/the-firewall-saga-part-7/#comments Mon, 19 Sep 2011 14:28:47 +0000 http://www.terminally-incoherent.com/blog/?p=10036 Continue reading ]]> It has been almost a week since Steve’s visit. My daily interactions with Verizon have settled into a very predictable pattern. Every afternoon I get an “unexpected” visit from an on-site tech. I explain the problem is not local, and they leave. I call Verizon and complain. I bitch, moan, threaten to change service, etc. I talk to a floor manager at the call center who does not have any power or resources to help me. All he has is an online chat tool and ticketing system for escalating things to tier 2 – the exact tools the phone drones use. His boss can’t help me because he is in no way, shape or form affiliated with Verizon, but rather manages the call center company. After a while they placate me with solemn promises of swift resolution, reimbursement, etc. The following morning I get a call, notifying me that my issue was marked as resolved. I call again, bitch, yell and complain some more. The issue gets escalated. Tier 2 picks it up, and bumps it back down with a note to dispatch a technician to check all the connections, and jiggle all the wires. And the cycle repeats.

Every time I call, I give them the same speech. We had this issue once before. Somehow you resolved it. All you need to do is to look about a year back in your case history and figure out what was done back then. Unfortunately, I get the impression that the people I’m dealing with either do not have access to, or do not keep, case notes that would reach that far back. I’m documenting all the dates, and failed attempts to rectify the issue, because I fully intend to ask them to refund us all this lost time.

In the meantime a brand new version of a certain crappy, proprietary web app comes out, and the angry shoes brigade gets annoyed. I can’t upgrade that server, because if you recall from previous installments of this series, it is located in another data center, and the VPN tunnel is broken.

I figure that the Verizon issue does not look like it is going to get resolved anytime soon. The lack of connectivity with the data center is becoming a nuisance, so I decide to call up Barry from the network team. I figure that I will get him, and Charlie from the data center and we’ll just keep rebooting the damn machines and tweaking Firewall rules until they sync up and establish a viable connection.

I couldn’t have picked better timing to call about this. It turns out that Barry and Toby are going to be in our data center the next day installing and configuring some new hardware. Toby is apparently in charge of hauling all the equipment onto the site, while Barry will be bringing his networking skills. Even better, Barry has to drive past my office on his way to the data center so he agrees to just stop by in the morning. This way we will have one of these guys on each end of this conundrum, and we won’t have to rely on people like Agent Beef to be our hands, eyes and ears in the rack-space. Our plan is to get things done in the early morning, before the managers and directors start slowly trickling in after 9am.

Next morning I arrive at the office extra early. When I leave the house it is still dark outside. When I arrive at the office, my car is the third vehicle on the completely empty lot. As I grab my laptop bag from the trunk, the day star crests over the horizon and vomits painful bright orange light onto the deserted sea of concrete. People talk about dawn as if it was something beautiful and romantic – but every single time I see one, it is like getting stabbed in the face with a condensed beam of fatigue.

The malnourished, dirty hobo birds that stupidly picked the parking lot as their feeding ground are woken by the sun’s slow upward creep and decide it is time to scream their fucking beaks off like it’s some big event. I flip them off, and curmudgeonly drag my sleep deprived carcass into the building.

I bump into one of my early rising coworkers – he is on the super early shift, to field those 6am phone calls from our workaholic clients whose morning routine includes 3 important business calls, shower and coffee. Some people jog in the morning – these folks make work related calls for sport I guess. Never understood this attitude, but then again who am I to judge.

My coworker inquires about my unusually and uncharacteristically early arrival. I attempt to tell him that I have an appointment with Barry to fix some outstanding issues with our network but what comes out of my mouth is:

“Hhhnngrrr brrrry ntfffff!”

Somehow I manage to convince my stiff and unresponsive body to make a zombie shuffle up to the coffee machine. I suck on it for about 20 minutes, and then collapse at my desk. I try calling Barry, but he is incommunicado.

So I wait… And wait… And wait some more.

9am rolls around and Barry is still MIA. Finally he calls me around 9:30 to let me know he is about 15 minutes from the office. He arrives at 10:45. Barry lives out of sync with the normal time-space continuum, and within his personal field of influence time works differently. Of course Toby is not at the data center yet, so all we can do is to run some local checks, and make sure the firewall rules are correct as we wait. I use this time to fill Barry in on my dealings with Verizon. He is amused, appalled but not very surprised. He offers to do a tag-team call with me so we can take turns yelling at Verizon. I doubt that we will accomplish anything new, but I am willing to try anything at this point. Hell, I don’t even want to strangle him for making me wake up so early and then failing to show up till almost 11am. That would require too much energy, and in my sleep deprived state I am all about energy conservation.

Eventually Toby gets to the data center around noon and we get to do some troubleshooting. It appears that firewalls on both ends see each other, but for some reason can’t establish a tunnel. Unfortunately Toby’s end uses a very dumb dedicated network appliance which is not giving us any good diagnostic data or meaningful error messages. After a few reboots of the appliance Barry gets an idea.

“Toby, what is the time on that appliance?”

Toby scrambles to find the information in the web interface. You can hear him click on a dozen tabs and/or links before he finds a status page. Finally he goes:

“It’s showing time as 12:25pm EST”

I watch Barry run the date command on the firewall’s console. It spits out 12:13pm EST. Way off! He quickly resets the time on our end, tries to re-establish connection and we see the VPN tunnel snap to life. Apparently the authentication algorithms were thrown off by the time discrepancy on the two systems. When Toby and Barry set up this replacement firewall in Part 2 they probably did not bother syncing it with an NTP server. Most likely Toby just glanced at the wall clock when setting up the date – one of those cheap, unreliable battery powered things that tend to drift a lot. That was the reason why we got cut off from the data center.
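The arithmetic behind that “way off” is simple enough to script. Below is a minimal Python sketch that compares the two clock readings from the story and flags a skew above a threshold. The 5 minute tolerance is purely an assumption picked for illustration, as is the calendar date; the story does not say what margin the firewall or the appliance actually allowed.

```python
from datetime import datetime, timedelta

# Hypothetical tolerance, chosen only for illustration. The real margin
# allowed by the firewall and the appliance is not known from the story.
ASSUMED_TOLERANCE = timedelta(minutes=5)


def clock_skew(local, remote):
    """Return the absolute difference between two clock readings."""
    return abs(remote - local)


def within_tolerance(local, remote, tolerance=ASSUMED_TOLERANCE):
    """Return True if the two clocks agree closely enough."""
    return clock_skew(local, remote) <= tolerance


# The two readings from the story; the calendar date is arbitrary.
our_firewall = datetime(2011, 1, 1, 12, 13)
their_appliance = datetime(2011, 1, 1, 12, 25)

print(clock_skew(our_firewall, their_appliance))        # 0:12:00
print(within_tolerance(our_firewall, their_appliance))  # False
```

Pointing both ends at the same NTP source removes the need for this kind of manual comparison in the first place.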

Now if we could only fix the non-routable IP issue that quickly. Since Barry is already logged into the firewall he decides to poke around a bit. We more or less exhausted all the possibilities last time, but he figures we can perhaps take screenshots, and logs and use them to support our claims. We get Toby to plug his laptop into an external line (not the VPN one) and send packets to the non-routing IP, while we watch the activity on the screen. Something weird happens – we see text scrolling down the screen. Packets are coming in.

I jump off my stool and scramble to boot up that laptop we set up for Steve. It is running IIS and a simple test webpage, and the firewall is set to route all the inbound traffic to its internal IP. When it’s up, we ask Toby to try hitting that IP with his web browser.

There is a short pause and he goes:

“Oh shit! I see an animal!”

I whip out my phone, and sure enough – there is my test page:

The title attribute for this page was Mushroom, Mushroom.

“Barry… What the fuck just happened?”

Barry is just as stumped as me. The only thing we changed on the firewall today was the time. It is impossible that a 10 minute system clock drift could have any effect on the routability of one of our 5 IP addresses. Nothing we did today could have possibly resolved our issue. And yet, our insurmountable problem somehow fixed itself, literally overnight (when I checked it last night it was still broken). How did this happen? I have no clue. Barry has a few hypotheses:

  • It is possible (but unlikely) that my complaints somehow got forwarded to the right department
  • Perhaps some network engineer noticed this issue during regular maintenance and fixed it
  • It is also possible that Verizon routers have some self healing protocols that cause them to refresh their routing tables every once in a while

I feel relieved, but also a bit cheated. I sort of wanted Verizon to acknowledge this problem and resolve it. If it happens again (and it may) we will be back to square one, dispatching useless technicians to fix a routing issue. On the other hand, the thought of no longer having to deal with Verizon made me extremely happy. I was just sick and tired of the entire ordeal – especially the brain dead Verizon tech support drones and on-site technicians.

You would think that this is the end of the story, but it is not. There is still one event of note that I haven’t mentioned. But to get to it, we have to advance the clock by about a month or two. I am finally free of grief and residual pain from this ordeal, and I’m turning it into a long running multi-part series on Terminally Incoherent. Around the time the part of the story where I introduce Steve hits the web, I suddenly get a text message from an old friend:

Do you want me to file an internal escalation for you?

Note that this is completely out of the blue, and out of context for me. By now I’m done with this whole issue. It is ancient history that made for a funny series of articles. So my response is along the lines of “Huh? An escalation for what? Why?”. Then he explains:

For your nonroutable ip from your firewall saga

This, ladies and gentlemen, is the exact moment when I punched myself in the face. Somehow I managed to completely forget that I know someone on the inside. I have insider contacts within the bowels of Verizon. And apparently while I was sitting here contemplating firebombing their headquarters, this person could have filed an internal ticket for me. A ticket that could have potentially helped to fast track this entire ordeal.

Thus concludes The Firewall Saga. I have some more stuff like this in the pipeline – though probably not as long. Now that the series is over, I went back and added a little navigation table at the end of each post. This way, if you decide to share this story with a friend, they can just click through to the end and read the entire thing without too much hunting around.

The Firewall Saga
<< Prev Next >>
]]>
http://www.terminally-incoherent.com/blog/2011/09/19/the-firewall-saga-part-7/feed/ 6
The Firewall Saga: Part 6 http://www.terminally-incoherent.com/blog/2011/09/14/firewall-saga-part-6/ http://www.terminally-incoherent.com/blog/2011/09/14/firewall-saga-part-6/#comments Wed, 14 Sep 2011 14:18:44 +0000 http://www.terminally-incoherent.com/blog/?p=10023 Continue reading ]]> On the last episode of The Firewall Saga we met Steve – a peculiar Verizon technician who turned out not to be the “Network Specialist” we were promised. I managed to co-opt him into a crazy plan of getting the Verizon tier 2 techs to escalate my issue. Last time I saw him, he was making tomato stains on the wall of my server room. So I relocated him to the lunch area.

Now, Steve is starting to get on my nerves. He has been sitting in the lunch room for about 40 minutes, devouring pretty much everything. I think someone offered him a drink, and he eagerly cleaned out four cans of soda from the communal fridge, devoured two snack sized bags of potato chips and dipped into the basket of assorted sweets we had on the table. His ravenous appetite reminds me of Shaggy from Scooby-Doo, that is if Shaggy was a graying gentleman in a trucker hat and with a handlebar mustache.

Around the two hour mark, he finally gets connected to a live person. He hurries to the server room, logs into his laptop and begins troubleshooting. I help him plug it into the network, and give him a piece of paper with the IP address and the default gateway he needs to use. Then I grab my cell and dial Barry in case we need to do something on the firewall side. Steve is busy mumbling into his phone, and typing up a storm on his machine. Suddenly he turns to me and goes:

“Hey pal, I think this cable you gave me is not live. I can’t get out on the internet.”

I know for sure that the cable is live, because I used it to test the dummy laptop this morning. I ask him if he used the IP address I gave him. His eyes glaze over, and his expression betrays that he has not the faintest clue what I’m talking about. I attempt to explain, but I realize that everything that comes out of my mouth sounds like some abstract moon language to this guy. So I give up, and decide to let him use the dummy machine we configured. I connect it, boot it up and let him at it.

This is Steve’s browsing session in an itemized list form:

  1. He clicks on Internet Explorer
  2. He smiles as the MSN page comes up
  3. He uses the mouse to click on the search box even though it already has focus
  4. He types ‘google.com’ into the search box
  5. He uses the mouse to click on the Search button
  6. He clicks on the first link
  7. He clicks inside of the Google search box even though it already has focus
  8. He goes “Ok, so what’s that address you want me to go to?” into his phone
  9. He types in 197.168.1.1
  10. He uses his mouse to click on the “Search” button
  11. He scrolls around the results page by using the scroll bar, completely ignoring the scroll wheel on the mouse
  12. He goes: “Ummmmm… I am not seeing that…”

I look at The Intern, and he looks at me. We are both in a state of shock, and at a complete loss for words. We just stand there for a solid minute unable to say anything. Finally he breaks the silence and whispers to me:

“Dude, what the fuck is he doing?”

The worst part is that I know exactly what he is doing. He is trying to log into the Actiontech router, that we are not using. But he is failing at it so hard, that he ought to receive some sort of award for it. I suddenly realize that Steve knows less about computers than most of my users. On its own, this would be quite an accomplishment. But the fact he is actually working as a Verizon on-site support technician makes his technological illiteracy quite ironic.

Eventually Steve’s counterpart on the other end of the line manages to explain to him how to use an address box. They contemplate the inability to bring up the Actiontech login page for a bit, and conclude it is time to power-cycle the router. Steve gets up, looks at the server rack, looks at the walls, scans the entire room and becomes confused.

“Buddy, where do you keep the Actiontech router?” he asks.

I explain we are not using it, but Steve refuses to accept that as an answer. He can’t comprehend how we could possibly connect to the internet without the router. He decides that it has to be somewhere and starts snooping around the server room. He looks behind the server rack, then tries to open the other rack next to it, all the while trying to explain to us what the device would look like. I look at The Intern and go:

“I think I have made a huge mistake…”

Steve’s exploratory search for truth brings him to the shelves where we store spare parts, cables, assorted cable control devices. It just so happens that that’s where we left the poor Actiontech router. It’s been sitting on that shelf for years now, gathering dust and acting as a paperweight. Steve spots it, yells “Aha!” and slides it from underneath all the crap that was on top, and triumphantly waves it at me. Bereft of their support, some boxes and trays that were above the forgotten router topple down, spilling wire clips, Velcro fasteners and papers all across the floor. The Intern dives in to rescue falling equipment while I just stand there staring in astonishment.

Steve does a happy dance, as if he solved the issue. Here is the root of your problem gentlemen. Your router was not connected, and I, the great Sherlock Holmes, found it on this dusty shelf.

Steve triumphantly brings the router to where we set up the laptop, plugs it into the electric socket, then unplugs his Ethernet cable from the main switch, and connects it to the router. He types something in and goes:

“There we go!”

Apparently he finally got to the Actiontech login page. Unfortunately his happiness is short lived. I watch his smile turn into a frown as he is obviously unable to get an internet connection. He and his buddy on the phone go into this intense troubleshooting session of a router that is not connected to anything other than the laptop. After about 5 minutes of this, I interrupt them and go:

“Steve, that router is not connected to anything!”

Steve looks at me befuddled, wiggles the Ethernet cable between his laptop and the appliance: “Sure it is!”

When I try to explain, Steve just asks me to “Sit tight” and assures me that they “will get to the bottom of this”. I am not so sure of that. I think we reached rock bottom when Steve found the router. Now he is diligently digging himself into a hole that is getting deeper and deeper every minute.

After some more troubleshooting, Steve hangs up the phone, gets up, waves me over and goes: “Good news buddy. We figured out what was wrong with your connection.”

“Oh, really?”

Apparently Steve completely forgot that he was supposed to be a pawn in my clever ploy to get tier 2 support to fuck off, and get a network engineer on the case. Apparently he managed to resolve the issue all on his own. What a hero!

“Yep. That router…” he points at the still-disconnected device “…there is something wrong with it. The good news is, that I have a spare router in the truck. So I’m gonna go, have a smoke, grab something to eat and bring it up here.”

He gives me a friendly pat on the shoulder.

“We’ll get you all patched up, and back online in no time.”

At that point, I politely thank Steve for his help, ask him to gather up his stuff and not come back. There is just no point in continuing this charade past this point. The Intern seems to be having a blast watching this unfold, but I’m just annoyed. And it’s not really Steve’s fault. I’m sure he would do fine as a residential support tech. His cluelessness wouldn’t really hold him back that much if all he had to do was to power-cycle and/or replace appliances and crimp wires. This issue is just way above his head. In fact, it seems to be way above the heads of most of the on-site techs that Verizon likes to send out to their customers. The whole ordeal is just a monumental waste of my time.

Next morning I arrive at my desk, only to field an early morning call from an old friend:

“Hi, this is Bob from Verizon and I’m just doing a follow up courtesy call about your recent issue. I see here that yesterday we have sent an on-site technician to your location. He has marked the issue as resolved. I wanted to make sure that everything is working correctly and see if there is anything else we can do for you.”

Well, Bob… Since you have asked, let me tell you a story about a guy named Steve.

Next time on The Firewall Saga: the long awaited resolution.

The Firewall Saga
<< Prev Next >>
]]>
http://www.terminally-incoherent.com/blog/2011/09/14/firewall-saga-part-6/feed/ 14
The Firewall Saga: Part 5 http://www.terminally-incoherent.com/blog/2011/09/12/firewall-saga-part-5/ http://www.terminally-incoherent.com/blog/2011/09/12/firewall-saga-part-5/#comments Mon, 12 Sep 2011 14:32:19 +0000 http://www.terminally-incoherent.com/blog/?p=10008 Continue reading ]]> Welcome to yet another installment of the Firewall Saga (it was supposed to be the penultimate one, but it did not work out that way). If you haven’t been following it, please try to catch up. It will make more sense that way.

When we left off last time, a firewall replacement somehow left me with a non-routable IP address – a problem that, beyond any shadow of a doubt, was my ISP’s fault. I have called Verizon, only to realize their outsourced tech support call center was entirely incapable of dealing with problems of this complexity. I needed to talk to a network engineer to resolve a router configuration issue but they misunderstood and sent me a repair monkey to jiggle the cables and power-cycle the local appliances. I called them back, and ranted for about 20 minutes, heavily over-using words such as incompetence, outrage, lack of professionalism, dropping the ball, disrespecting the customer, etc… It made me feel a bit better, and they did promise to definitely escalate my issue to second tier.

I walk into work the following morning, wake up The Intern who has dozed off at his workstation, acquire coffee and start sifting through all the spam in my inbox. A few years ago, my inbox was pristine – mostly untouched by the filth of spam messages. My co-workers used to marvel at this phenomenon, and asked how I managed to stave off the avalanches of crap that flooded their email daily. Unfortunately I did not have a secret technique. I was simply careful not to give out my email on the internets, and vigilant about deleting and flagging anything that seemed suspicious. Then I went on vacation, and my boss told me to put up an auto-reply “out of office” message. Nowadays it seems like my email is on the list of every single deposed Nigerian prince, penis enlargement specialist and Viagra salesman. Also, apparently I have money in 50+ different banks, who constantly threaten to close my account if I don’t give them my PIN and passwords. It has gotten so bad that it is actually easier to white-list internal company correspondence and emails from known clients and partners. I currently have close to 100 filtering rules that help me fight the sea of unwanted spam, and make the important and urgent emails instantly visible by application of labels and priority folders.

Email filters – a forgotten arcane art, mastered only by the chosen few. I know for sure that the only client-side filters used in my company have been set up by me. No one else even knows such things exist.

I’m in the middle of fiddling with my tangled web of email filtering rules when I hear my phone ringing. I’m expecting to hear yet another complaint about the time sheet app. If you recall, the whole firewall bonanza somehow broke the VPN tunnel to that remotely hosted server. So for the time being, I am unable to reboot it or tinker with it. But since the non-routable IP issue is more pressing I have pushed the VPN problems aside. Especially after what happened the last time I attempted to fix it. Surprisingly, it was not an internal call. It was a Verizon representative doing a courtesy call. It went a little bit like this:

“Hi, this is Bob from Verizon and I’m just doing a follow up courtesy call about your recent issue. I see here that yesterday we have sent an on-site technician to your location. He has marked the issue as resolved. I wanted to make sure that everything is working correctly and see if there is anything else we can do for you.”

Here are some of the emotions I’m feeling at that exact moment: anger, annoyance, disbelief, rage, befuddlement and hunger… The last one, because I didn’t have a chance to get anything to eat that morning. To help you visualize my reaction, here is a two panel re-enactment of that event. Just imagine that the iPhone is a big clunky office phone, and the coffee mug is a paper cup, and that my shirt has a collar:

He marked it as what?

So it turns out that the field technician that visited us the other day decided to say he fixed the issue. In retrospect, I guess I can understand how it happened. It is very likely that this guy does not work directly for Verizon. He probably works for some local company that Verizon uses to outsource all the cable wiggling, wire snipping and power cycling it needs to do at customer locations. They are likely set up to receive work orders from up high. When they fail to resolve a customer facing issue (for whatever reason) it probably counts against them. So this guy’s manager probably just said “fuck it, since it was not a local problem we will just put it in the system as resolved”.

But that only occurred to me much, much later. As I’m sitting there on the phone my driving, logic-clouding emotion is anger. The upside is that “Bob from Verizon” seems to be speaking perfect English. This is something new. All the support drones I dealt with up until now had very heavy accents. So chances are I’m actually talking to someone physically located in the States. Probably still not an employee of Verizon, but perhaps his call center/department can get me what I need.

So I recount my long and sad story, sparing him no gruesome details. When I’m done, he apologizes profusely then promises to get my issue resolved. He gets a support drone on the phone and together we relay the issue, and its importance, to him. The thick-accented drone gets in touch with tier 2 support. Tier 2 support insists on sending a network specialist to our location. I try to protest, and try to make a compelling case against it but it seems like there is no use. Apparently they have to make absolutely sure the issue is not local, before they escalate it to the network people. So we make an appointment. I call Barry and let him know we have this guy coming. Together we set up a spare laptop, plug it into our network, assign it a static IP and set the firewall to pretend it is our server. The guy will be able to jump onto it and verify that no packets are coming in. I also print out a network diagram, and Barry sends me a document that contains all the relevant firewall rules. When our Network Specialist comes for a visit, we ought to have enough evidence to show him the problem is definitely not on our end.

The next day, our “Network Specialist” Steve arrives at the office. Only, he doesn’t look like a specialist. Of course, looks can be deceiving – and geeky guys can sometimes look peculiar. But this guy just does not look like a networking dude. He is a middle-aged man, wearing a trucker hat, shorts and a crumpled up t-shirt. The large coffee stain on the front seems to be locked in territorial combat with the armpit sweat stains. His gray handlebar mustache gives me an impression that he would be much more comfortable rebuilding motorcycles than troubleshooting network issues. But I decide to give him the benefit of the doubt.

I take him to the server room, where we set up our perfect trap. Next to the rack, there is a little stool, and on it there is the orgy of evidence. The network diagrams, the firewall rules, the trace route logs and the little test laptop ready to be fired up and tested. His eyes glaze over a bit as I talk so I ask him what tests he needs to run, and explain how we rigged the test laptop. He goes:

“Son, no offense but I have no clue what any of what you just said means. I was under impression yous guys had no internet connection…”

I have a sinking feeling in the pit of my stomach.

“You are not a ‘Network Specialist’, are you?”

“What? Hell no! Kid, I was retired up until last week. This is my first day on the job. I sure ain’t no specialist!”

Well, fuck.

I explain my predicament to him. I was promised a specialist, but I got him. Tier 2 refuses to move forward until they have someone on-site run the checks they require. So I hatch a crazy plan. I have now a physical Verizon representative on the premises. Well, more like a trained monkey really – I don’t think he knows anything about anything, but he should be able to follow simple instructions. If we can get the tier 2 assholes on the phone, they can walk him through the required tests. Then we can move on.

Steve agrees to this crazy plan, but says he will probably need his company laptop which he left in the truck. Fair enough. I escort him out of the office and let the front desk know he will be coming right back and to send him right to me.

Steve is gone for about an hour and a half. When he finally shows up, I notice his shirt has acquired ketchup and mustard stains and is beginning to look like a genuine abstract painting. I ask him what happened and he launches into a long winded explanation of how he first decided to have a smoke, then he realized he was hungry, and how he has low blood sugar, etc. I let it go. The sooner we can do these tests, the faster I can get him out of my hair. I show him where to set up, and watch him pull out his ancient flip phone and dial a number.

Then he gets to a voice menu. Then they put him on hold. I shake my head in disbelief:

“Wow, they put you guys on hold too?”

“Of course.” he gives me a wide, gap-toothed smile “People think we have some special, internal number but we don’t. We call the same tech-support number as you do, when you have a problem”

That sinking feeling I mentioned before – it’s back with a vengeance. Steve patiently waits for “the next available representative” while I contemplate suicide for the twentieth time this week. Eventually I get bored watching Steve, and I excuse myself figuring I might as well get some work done. I interrupt The Intern’s intense game of tower defense and tell him to go keep Steve company, and make sure he does not try to mess with the equipment, or walk out with any of our servers. Oh, and to call me when Steve finally gets a live person on the phone.

After about 20 minutes I get a phone call on my desk. It’s not The Intern, but one of my other coworkers.

“Luke, I think we have a problem…”

Oh, God… As if I didn’t have enough problems.

“Damn it… What did you break this time…”

“No, this is more of a Human Resources problem.”

Oh, sweet relief! At least I won’t have to deal with this.

“And you are calling me about it because…”

“Well, it involves your server room. I think we have a hobo infestation.”

I chuckle, and explain that he was actually sent by Verizon.

“Ah, that’s what they all say. Next thing you know they start breeding and you get like a dozen homeless people living in your server room. Mark my words man.”

“What do you suggest, oh wise one?”

“Nuke it from the orbit. That’s the only way to make sure.”

“Well, I have The Intern babysitting him…”

“Yes… And? I don’t follow..”

“Right, good point. I’ll see what I can do about it, but you have to submit a ticket for it first”.

About an hour later, I go check up on my server room buddies. I find The Intern intently watching Steve munch on a sandwich. I give him a disapproving look and tell Steve he can eat in the lunch area because we do not want food in the server room. He is apologetic:

“Sorry about that. I just got hungry, and with my low blood sugar… You know how it is. I’m still on hold, and they can pick up any time. I figured there is no harm in a little snack…”

To emphasize his point, Steve emphatically waves the sandwich around as he talks. On one of the swings a tomato slice gets dislodged and soars through the sky, hitting the opposite wall with a loud smack. In astonishment Steve slightly releases his grip, and a slice of ham and some lettuce slither out from between the bread and land on his laptop keyboard. He grabs them, stuffs them in his mouth and then shakes the laptop off, sending crumbs, lettuce shreds and other unidentified bits of food onto the floor.

I ask The Intern to clean it up before anyone notices, and sternly march Steve to the lunch room, trying to decide whether I should kill Steve or myself first.

Next time, more fun with Steve, and hopefully the climactic resolution. Well, maybe.

The Firewall Saga
<< Prev Next >>
]]>
http://www.terminally-incoherent.com/blog/2011/09/12/firewall-saga-part-5/feed/ 6
Advice for new Helpdesk Analyst http://www.terminally-incoherent.com/blog/2011/08/31/advice-for-new-helpdesk-analyst/ http://www.terminally-incoherent.com/blog/2011/08/31/advice-for-new-helpdesk-analyst/#comments Wed, 31 Aug 2011 14:03:39 +0000 http://www.terminally-incoherent.com/blog/?p=9864 Continue reading ]]> My morning ritual at work is as follows: I roll into the office, scamper into the IT cave, drop my bag down, turn my computer on and while it boots I make a bee line for the local coffee dispenser. Unfortunately, the closest caffeine distribution device is located in the lunch room area, outside the confines of the technological inner sanctum.

Some of us old timers faintly remember “the golden age” when our jolly brigade of tech nerds was allowed to host a dedicated productivity pot in our poorly lit bunker. An unfortunate accident involving a fresh pot landing liquid first into some relatively new equipment put an end to that glorious era. But that’s a story for another time. I will just say we discovered an interesting double standard that day: when users spill coffee on their laptops the management treats them like little cuddly puppies – they get a light slap on the wrist, a wagging of the finger followed by “Awwwww, you poor thing – you know we can’t stay mad at you. We will get you a brand, spanking new laptop right away”. On the other hand when we accidentally spill coffee onto inessential spare parts that were just gathering dust, we get coffee privileges revoked. There is no justice in the corporate world. No justice at all.

But that was a long time ago, and probably not true. All tales spun by seasoned sysadmins and grizzled hackers tend to get embellished over time. How do you know when one of us starts making stuff up? It’s easy – just look for the part when the protagonist does something cool, or says something witty. That part is usually hindsight-driven embellishment. Oh, and the parts where users are being ridiculously stupid, and do unbelievably dumb things? Those tend to be 100% true. The sad part is that you just can’t make up this shit.

Since our dedicated coffee pot was taken away we must tank up in the lunch room along with all the other poor souls damned to toil at our company. As I made my way to our glorious and thrice blessed wake-up machine I completely missed a minor crowd of the NOC dwellers conspiring in the corner. The powers that be have delivered unto us a new “Help Desk Analyst”. A fresh, uncorrupted and optimistic looking application support technician was parked in front of the IT cave, and we were ordered to get him trained, plug him into the phone system and prevent him from jamming sharp objects into his eyeballs after a few hours of interaction with our users. My fellow geeks gathered together, desperately trying to avoid the responsibility of babysitting the newb all day.

When I returned, coffee cup in hand, I was greeted by conspiratorial grins from the corner of the room. They beckoned me to come closer, and filled me in on the new guy situation. Apparently someone had to train him.

So I did the only thing I could do in that situation. I loudly screeched “NOT IT!” and watched them grin, unfazed.

“Oh, fuck… You guys already did the ‘not-it’ thing, didn’t you?”

They nodded in unison.

“Well, why can’t The Intern do it?” I stabbed my finger in the direction of The Intern who, like me, just finished his pilgrimage to the land of Lunch Room. The sudden movement startled the poor fellow, making him yelp like a frightened animal and splatter some of his freshly brewed coffee on the floor.

“First off, The Intern has a name…” one of my coworkers explained “…though I can’t remember it at this time. Secondly, he is an intern.”

So it was up to me to teach the poor new fellow some of the basics. I will spare you the gory details, but here are some of the tidbits of knowledge I shared with him that day. I figured some of you guys may find these amusing and/or helpful. The new guy sure did.

  1. First off, when you are working help desk in a bigger or smaller office you have to remember that your users are going to be white collar, educated professionals. Their computer is their primary work tool. One they use for 8+ hours every single day. Most people who start in this line of business assume that this would make the users at least somewhat proficient at using their tool. This is a mistake. When you pick up this phone, assume the person on the other end of the line is basically Brendan Fraser from the movie Blast from the Past. By that I mean a person who spent the last 35 years in some atomic bunker deep underground, and just emerged from there yesterday. They know what a computer is in theory, and can probably turn it on and off and type stuff into it, but that’s about the extent of their knowledge.

  2. Never assume the user is actually at the computer or that the program they are having problems with is actually open. For all you know, the user just stepped out for a smoke or is standing in line at a local burger joint. Why would they call you when not at their computer? Productivity I guess. Plus they assume you can just fix it over the internet. You know, log into the laptop that is turned off and in the trunk of their car and fix it. This happens more than you think.

  3. Most users only know 3 applications: Word, Excel and Outlook. If you are lucky, they might have also heard about PowerPoint. Everything else is a mysterious land of confusion, populated by fox fires and other wild things. Users tame these scary and unfamiliar things by giving them nicknames – usually taken from a partial name of the company that made it. So they might call you about the Adobe thing, the Photo thing, the zips, the Microsofts, etc.

    Sometimes users get attached to a certain name, and refuse to let go of it even when you change the underlying system. For example, back in the ancient times our company used a custom made time and expense tracking system made by some company called Velocity Systems. It got replaced by a completely different web based system years ago. Old users never stopped referring to that system as “Velocity”, and new ones pick it up and repeat it without realizing why.

    If you ever talk to someone who is not confused about the applications they are running, and can communicate the problem directly without using some made-up or half-remembered buzzwords and brands, then you are talking to a power user. Enjoy it, because they don’t call often.

  4. Users never write down error messages. When they do write them down, they usually completely fixate on some random detail that looks cryptic and mysterious to them, but leave out the important things like the actual exception/error number, and the plain text message that was associated with it. If you ask a user for an error message expect an incomplete stack trace instead of a clear message.

  5. Never assume the user knows common keyboard shortcuts for copying and pasting. Make them use the context menus if available. If an application does not have a context menu for copy/paste you will have to explain in detail where the Ctrl button is located, and that it needs to be held down while pressing C and V.

  6. About once a week you will get a call from a user claiming their internet is upside down or sideways. Don’t get confused. They just have a laptop that lets them flip the screen orientation. Ctrl+Alt+Arrow keys will flip it back. Make sure they write this key combination down, because otherwise they will be calling back in about a week.

  7. When helping laptop users, try to find some high resolution pictures of the supported models. You will need them in order to explain where to find keys like PrtScn or to help them find the hardware WiFi switch. If you don’t point out the exact location or if you are unable to describe its icon/shape they probably won’t find it.

  8. Assume the user cannot handle multitasking. Always frame your instructions in such a way that could be handled while running one app at a time in full screen mode.

  9. Dragging and dropping is for power users only. Anything that requires a user to have more than a single maximized window on the screen at a time will be profoundly difficult to explain on the phone.

  10. Don’t assume the user actually understands the concept of a file system. Do not assume they store their work under “My Documents”. It is far more likely that they use Outlook as a file system proxy, saving all their documents as attachments. When a user asks you to help recover a deleted file, check the “Deleted Emails” folder in Outlook before the system Recycle Bin.

  11. Users will commonly “lose” files by accidentally saving them under “My Documents” or straight into the %TEMP% folder. Be prepared to recover them for them from these locations via remote desktop.

  12. Never assume the users understand what a zip archive is. To them it is just a funky looking folder. Most users will always edit zipped files in place without extracting them. This usually works, except in some particular situations when it does not.

  13. Keep in mind that most people you will be dealing with think that the words UPLOAD, DOWNLOAD, INSTALL, UPGRADE and UPDATE are synonyms of each other. If you are aiming for clarity try to avoid using these words. For example you will be “putting programs on their computer”, “pulling files from the website and onto the computer”, “bringing the application up to date”, “putting documents onto the website”, etc.

  14. Similarly, most of your users assume that terms such as WEB BROWSER, THE WEB, SEARCH ENGINE, SEARCH BAR and THE INTERNET are basically different words for that blue E icon on their desktop. It is usually a good idea to say “go to the internet” when you want them to open a web browser, and “Go to Google” if you want them to make a search.

  15. Most users will be unable to locate the “address bar” on their browser. If you give them a web address they will type it into the search box on their home page. For example, if you tell a user to go to google.com they will type “www.google.com” into the Bing search box, and then click the first result. If you are directing them to an internal, non-indexed site it is just easier to email them a link.

  16. Users will always put “www” in front of web addresses. No exceptions. Make sure you keep this in mind when you send them to a site hosted on a non www subdomain.

  17. TLDs that are not .com, .net, .org, .edu or .gov do not exist. If you need a user to visit a .me, .co.uk, .tv or perhaps .ly site, make it very, very clear that they should not type .com after it. Because they will do it.

  18. Do not assume users know which web browser they have open. In fact, do not assume they always use the same browser. For example, if Firefox was ever set up as a default browser it will automatically open when they click on links, but they will likely still habitually access the web by clicking the blue E on the desktop. They probably won’t know the difference. Use server side browser detection scripts to work around that.

  19. Get used to hearing “Fox Fire” a lot. Correcting them is pretty much a waste of time at this point.

  20. When sending out a mass email always use BCC. Never actually put emails or mailing list addresses in the TO or CC fields. Users will always Reply To All on company wide memos and announcements. No exceptions.
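For tip 18, the server-side detection does not have to be fancy. Below is a minimal Python sketch that guesses the browser family from the User-Agent header the browser sends with each request. The function name `guess_browser` and the token table are my own assumptions for illustration, not part of any framework, and User-Agent strings are messy enough that this will misfire on some clients. Treat it as a hint for your ticket notes, not as gospel.

```python
def guess_browser(user_agent: str) -> str:
    """Return a rough browser family name for a User-Agent header value."""
    ua = user_agent.lower()
    # Order matters: Chrome's User-Agent also contains "Safari", and Edge and
    # Opera contain "Chrome", so the more specific tokens are checked first.
    checks = [
        ("edge/", "Edge"),
        ("edg/", "Edge"),
        ("opr/", "Opera"),
        ("opera", "Opera"),
        ("firefox", "Firefox"),
        ("msie", "Internet Explorer"),
        ("trident/", "Internet Explorer"),
        ("chrome", "Chrome"),
        ("safari", "Safari"),
    ]
    for token, name in checks:
        if token in ua:
            return name
    return "Unknown"


if __name__ == "__main__":
    # Hypothetical header value, as it might appear in a web server log.
    sample = "Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20100101 Firefox/6.0"
    print(guess_browser(sample))
```

Log the result next to the user name on any internal page they hit, and you can skip the “which browser are you using?” question entirely.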

Can you add anything to this list? What are your tips on dealing with users? Let’s continue this in the comments.

]]>
http://www.terminally-incoherent.com/blog/2011/08/31/advice-for-new-helpdesk-analyst/feed/ 16
The Firewall Saga: Part 4 http://www.terminally-incoherent.com/blog/2011/08/29/the-firewall-saga-part-4/ http://www.terminally-incoherent.com/blog/2011/08/29/the-firewall-saga-part-4/#comments Mon, 29 Aug 2011 14:16:36 +0000 http://www.terminally-incoherent.com/blog/?p=9898 Continue reading ]]> The saga continues. If you haven’t been following this series, you can catch up to speed here. What follows might be funnier that way.

It is the day after the Beef Instrumentality Incident #631. We are finally chugging along on all cylinders, the users are mostly placated and I finally have some time to call Verizon about my one non-routing IP address. I’m not empty handed either. Barry was kind enough to arm me with traceroute logs for all our IPs, captured from two different outside locations. That is not much, but they show that the packets sent to that one problem address route fine until they hit the Verizon network. And then, boom, they get shunted into the depths of cybervoid instead of being safely delivered into our office.

Barry also dug out and sent me his impressively anal retentive case notes from a few years ago, when the same thing happened. These things are dense with tiresome detail: actual dates of calls he made, names of support drones he spoke with, etc. According to the notes it only took 5 days and about a dozen phone calls to get the IP routable again. I figure I can do it in half of that time considering the extensive documentation I am armed with. Hell, maybe I can even do it in a single phone call.

Around 10 am I let my coworkers know I will be calling Verizon, and that I probably will be on hold till closing time. I ask them to drag me out of there after 5pm, bid them farewell, and gather some necessities. Two cups of coffee, bottled water and snacks in case I am trapped by the phone for weeks. I do some mental preparation, then dial the number. I go through the byzantine labyrinth of voice menus and finally end up in the wait queue. I put the phone on speaker and proceed to do some busy work while listening to their horrible on-hold music.

20 minutes into the call, a coworker from the next cubicle over starts parroting the looped voice assuring me that “Your call is very important to us. Please hold for the next available representative.” After the third or fourth time, I join in and we both say it together.

45 minutes into the call, every single person in the IT cave is in on the joke. Every time that looped sound bite repeats, five voices rise up in unison. We are like a group of monks chanting an ancient prayer.

50 minutes into the call, someone decides to hit the Staples “That was Easy” button every time we chant our little chant.

An hour into the call, The Intern manages to perfectly imitate the elevator music with his mouth. By this time, my phone speaker volume is cranked all the way up and it’s a regular sing-along party.

At 65 minutes, someone wanders into the IT bunker with a question, hears us chanting, says “You guys are a bunch of nerds” and leaves. We decide we must do this more often. Anything that keeps users from dropping by with non-essential, non-work-related questions is worth working into our daily routine. Oh, and in case you were wondering, that user just had a question about home theater sound systems. Because, you know – system administrators and programmers know all about home multimedia setups.

Finally, after an hour and 20 minutes on hold someone picks up. There is some booing in the background as I disengage the speaker. Apparently everyone was having fun.

I jump through all the requisite identification hoops. Then I launch into a 15 minute detailed explanation about our routing issue. I explain to him the dozen or so local and remote tests we performed to verify this is not a local configuration problem. I offer to send him the traceroute logs so that he can see the problem happens only for a single IP address. I also tell him that this seems to be a recurring problem, and give him a quick rundown of Barry’s case history. The guy on the other end patiently listens to all of this, and once I’m done goes:

“Thank you for that information sir. It appears this is a router configuration issue. What we will need to do is to power-cycle the router. It is the little black box with the antenna that we have provided you when we set up your internet. What I want you to do is to unplug it from power, wait 60 seconds and then plug it back in…”

Granted, I sort of expected this to happen. I patiently explain that we do not use their cheap, off-the-shelf appliance with crippled custom Verizon software. I also reiterate that this is a routing issue, not a local configuration error. I ask him to escalate this call to the team that handles network problems, and point to the case history to support my claims. I even give him the exact date when the previous ticket was escalated to that department (thanks to Barry and his disturbingly obsessive note taking).

“I’m sorry sir, but I cannot help you if you are not using the ActionTech router we have provided you. I will need you to unplug your current router, and replace it with the ActionTech before we can continue this troubleshooting.”

Suddenly I realize I might have dialed the residential support line. I ask the guy to verify, but he claims I actually called the right number. He is a proud member of the Small and Medium Business department. To everyone’s surprise, including my own, this sets me off on a weird Socratic Method rant.

I tell him that we have this many machines, this many servers and this many persistent VPN tunnels that connect us to other offices and data centers. Then I ask him whether he would classify this as a small or a medium business. Not knowing where the hell I am going with this, he agrees that we are probably in a “medium” class.

Next I ask him whether or not the $20 off-the-shelf router they gave us can handle maintaining several persistent VPN tunnels. Does it have commercial grade firewall software that would allow us to do real time packet inspection and intrusion detection? I ask him if that router has any of the features that we get audited for, and that we are contractually obliged to have in place to protect our client data.

He hesitantly agrees that it probably does not have all these features. I have a hunch he doesn’t know I’m bluffing and that we never, ever get security audits (except for internal ones). Still he insists that I temporarily connect the ActionTech just for the sake of troubleshooting.

I ask him whether he would classify the ActionTech router as an enterprise level device for medium business users, or a personal use appliance for residential clients?

He agrees it is more on the residential side.

For my coup de grâce I go:

“Ok, so let me get this straight. We are paying for a business class FiOS connection, and business class support. Why then are you reading troubleshooting steps from the residential support checklist, asking me to dismantle my entire network architecture and connect via an off-the-shelf, residential device?”

Then I once again plead with him to escalate this to the network support team, or to connect me to his manager. He mumbles, a bit confused, then asks to put me on hold while he consults with his supervisor. A coworker from the next cubicle chimes in:

“You tell them, dude! Gotta be stern with these assholes.”

I interpret this as a compliment, seeing how I am usually not the most assertive person. In fact I sort of feel an inkling of pride for coming up with that question and answer session. Then the phone suddenly goes to dial-tone.

I have just rolled like a natural 20 on my persuasion check. I have had this guy on the ropes! He was about to do what I asked him to do! And the motherfucker hangs up on me! To make matters worse, now I have to call them again, and repeat the exact same exercise with another asshole who will ask me questions about my ActionTech router.

Boiling… Murderous… Rage…

If I was Bruce Banner, I would probably be a rampaging green-skinned monster in ripped up purple pants by now. Fortunately I never participated in any gamma radiation experiments, so I merely let out an agonizing groan, slam my phone down real hard and decide to take an early lunch. I nearly collide with The Intern, who has just fetched himself a fresh cup of java. The fight or flight reaction kicks in, and he spills half of the mug as he frantically scampers out of the path of my angry walk.

I return sometime later, with a clear head and full stomach. I’m determined to get this call done today, so I sit down and repeat the entire procedure. This time I don’t put it on speaker because I don’t think anyone else is in the mood to sing along. At least I know I’m not.

I spend close to an hour on hold, but fortunately this time I get someone whose IQ does not seem to be a single digit number. He still tries to make me fetch the ActionTech router, but he eventually drops that troubleshooting path. Instead, he decides to follow a different branch on his troubleshooting decision tree.

“Sir, because of your unique network configuration I’m afraid I will need to send an on-site technician to perform some local tests.”

God, damn it! No! This is a configuration issue on your end. All I need is five minutes of time of one of your network engineers. There is one entry in your routing tables that got fucked. I just need someone to go and un-fuck it. Just let me speak to someone who knows what a “routing issue” is. Please! It happened before. Look in your case notes. It should have all the information. My notes say we spoke to some dude named Richard. He fixed it last time. Can you please get me that guy!

He ignores my pleas and insists on sending out a guy. We go back and forth like this for about 15 minutes, and I eventually manage to twist his arm into escalating the issue somewhere higher. In fact, he agrees to forward my traceroute logs to the Tier 2 team for reference.

“Ok sir, you can send it to my email. It is V as in Victory, Z as in Zebra 123456789995-0 at hotmail.com”

Hotmail?

“Is that your personal email? Don’t you have like a Verizon email account?” I inquire out of sheer curiosity.

“I don’t know. This is what they set up for me and told me to use sir…”

A few more innocent probing questions reveal that my friend on the other end of the line doesn’t even work for Verizon. He works for an outsourcing company. They are not directly affiliated with Verizon – they are just hired by it to act as a storm shield against the wrath of dissatisfied customers. So of course they don’t get to have Verizon email accounts. Providing legit emails to folks who handle their front-line customer support is apparently not important to Verizon. They are perfectly fine with the legion of support drones sharing a few dozen Hotmail accounts, and looking very, very unprofessional.

I get him the logs, I grab a case number and finally hang up. It’s almost 4pm. I have wasted almost an entire work day trying to get a single stupid issue logged in the Verizon system and escalated to the proper department. Still, I feel like I have accomplished something. The case is being sent to the second tier, so perhaps someone who actually works for Verizon will get a chance to look at it. I might have wasted way too many hours on this, but I feel like I have a realistic chance at beating Barry’s week-long turnaround for this issue.

The next day, I spend the entire morning catching up on work I didn’t have a chance to do while fucking around with outsourced Verizon support drones. Around lunch time, I get a phone call from the front desk. Apparently some “Verizon Guy” showed up, searching for Luke.

It turns out that Verizon sent out a field technician to our office anyway. They said they wouldn’t. They said the issue was being escalated to Tier 2, but apparently that is not what happened. Right now there is a guy in our lobby and there is absolutely nothing he can do to fix the issue. But I figure that maybe I can explain the problem to him and get him to forward that information up the stream. Hell, maybe he can plug himself in on our network and run whatever diagnostics he needs to rule out a local configuration issue being the cause of our problem.

I go and fetch him, bring him back to the server room, show him where the FiOS box is, and how it connects to the firewall. I ask him what tests he needs to do, and make sure he knows we can’t bring anything down during business hours. The guy looks at the server rack, the tangle of network cables going in and out of various switches, all wide eyed and slack jawed. He goes:

“Dude… I just thought you are going to have a bad connection, or maybe a broken router or something… This…” he gestures at the server rack housing the firewall “This is way out of my league, man.”

Apparently no one even told him what the problem was. The dispatch just said the client was experiencing connection issues. There was nothing there about routing problems. And even if there was, this guy was not trained to troubleshoot issues like that. He was armed with a spare modem and a wire crimper, and trained to jiggle cables and power cycle basic network appliances. But he seems like a nice guy, so we chat for a bit, and laugh at Verizon’s lack of competence. He says he will talk to his supervisor and see if he can pass the message along up the chain of command.

He leaves, and I clear the rest of my schedule for another grueling Verizon support call. Also I contemplate committing ritual suicide.

Next time on Firewall Saga: Verizon sends out a “Network Specialist” to our location. Hilarity ensues.

]]>
http://www.terminally-incoherent.com/blog/2011/08/29/the-firewall-saga-part-4/feed/ 7