We are out of space: Part 2

In the previous installment of this story, I learned that the company had somehow managed to fill 200GB of free space on the network shares, overnight. I was more baffled than surprised, as this sort of thing was not new. Our company was known to go on huge data digitization binges without ever bothering to tell IT about it, or to purchase appropriate hardware to store the results. In fact, most such projects were closely guarded secrets, hidden away from IT because the people who came up with them did not want to be blamed for additional tech-related expenses. I got my buddy Larry up to speed on this case, and gave him a mission:

“I need you to use your contacts upstairs to get some intel for us. See if the bright heads over there got any new genius ideas like the scanning project.”

This was the kind of task that required some finesse, charm and social agility. It was a delicate social engineering hack that warranted a light touch. Larry was about as sneaky as a beached walrus, and as socially agile as a wounded elephant in a china shop. He was perfect for the job. His investigation would be brash, abrasive and accusatory, and it would instill fear in the legions of the luserati. He would be the bad cop, to my… Lawful evil, but less threatening cop. Or something like that. Someone would eventually squeal under the pressure, or call me with the information hoping I could shield them from the wrath of Larry.

While my trusty minion went to ruffle some feathers and throw his weight around, I decided to investigate the file shares themselves. My plan was to identify large folders, and ask Jeremy why they grew so big, and/or if they could be archived to a different share to make space. Thankfully I was running Linux, so this would be somewhat easy. I mounted the problem directory as a Samba share, then did:

du -skh *

If you are Unix illiterate, this command prints out disk usage stats. The -s makes it print a single total for each top level file and directory, while still recursing into them to calculate the size, and -h makes it display the sizes in human readable format (i.e. KB, MB and GB rather than raw bytes). The -k asks for sizes in kilobytes, which is redundant here because the -h that follows it takes over. The result of this command was eye opening. Most of the folders on the drive were tiny. Only a few took more than a gig of storage. There was nothing outrageously big there, except one entry:

257G DfsrPrivate
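With hundreds of folders on a share, a culprit like this is easy to miss in unsorted output. Here is a minimal sketch of how you could let the shell do the sorting for you. The biggest_entries function name and its two arguments are made up for this example, and I am assuming a du that understands -s and -k, plus the usual sort and head:

```shell
# Hypothetical helper: list the largest entries in a directory, biggest first.
# Usage: biggest_entries [directory] [how_many]
biggest_entries() {
    dir=${1:-.}      # directory to inspect, defaults to the current one
    count=${2:-10}   # number of entries to show, defaults to 10
    # -s gives one total per entry, -k forces plain kilobytes so that
    # sort -rn can compare the numbers (human readable sizes sort badly).
    ( cd "$dir" && du -sk -- * | sort -rn | head -n "$count" )
}
```

Running something like biggest_entries /mnt/share 5 would print the five largest entries with their sizes in kilobytes (the mount point is a placeholder). Note that the * glob skips dotfiles, so hidden Unix-style entries will not show up in this listing.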

DfsrPrivate is a hidden system folder created and maintained by the Microsoft DFS Replication Service. Without getting into too much technical explanation, this service keeps file shares on different servers in sync with each other. We set up most of our file sharing servers this way as a means for rapid disaster recovery – always in working pairs, and if one of them fails, you can immediately fail over to the second one while you gut and restore the first.

The DfsrPrivate folder is used for staging the files that are to be replicated, and for storing copies of files on conflict. That conflict folder turned out to be the actual culprit. Normally, the contents of DfsrPrivate\ConflictAndDeleted are supposed to be kept under 600MB in size. Every once in a while though, the DFSR service decides to ignore the quota and starts dumping huge amounts of data there, without ever deleting anything.

A quick Google search revealed that one of the Microsoft TechNet blog posts described the exact same problem I was having and outlined a solution. In case that blog ever goes down, here is what you do.

First run the following command:

WMIC.EXE /namespace:\\root\microsoftdfs path dfsrreplicatedfolderconfig get replicatedfolderguid,replicatedfoldername

Yes, this is Windows administration, and we are using a shell. Is your mind blown yet? Anyways, the output of the above should give you the GUIDs for all the network shares you have on the affected server. If you only have one share, then you will get one entry that will look something like: 70bebd41-d5ae-4524-b7df-4eadb89e511e. If you have more shares, make sure you pick the right one, and copy the GUID.

Then you run the following command:

WMIC.EXE /namespace:\\root\microsoftdfs path dfsrreplicatedfolderinfo where "replicatedfolderguid='70bebd41-d5ae-4524-b7df-4eadb89e511e'" call cleanupconflictdirectory

Make sure you substitute the GUID I posted with the one that was generated for your share, otherwise this won’t work. The output will actually look like the script bugged out and dumped a weird error message like this:

[Screenshot: DFSR Cleanup]

This is actually what you want to see. It means it’s working. Once you do it, wait a few minutes and check your DfsrPrivate\ConflictAndDeleted folder. It should be either empty, or significantly reduced in size. If it’s not, you can go to Plan B, which is manual deletion.

I run through this solution, run the WMIC scripts and get virtually no results. Plan B it is then. In case you didn’t know, the B in Plan B stands for “Brute Force” or “Brace Yourself”. I take a deep breath, kill the DFSR service, and then manually delete EVERY-FUCKING-THING in that folder. I restart the service and all is well. Replication continues as normal, the quota is once again respected, and the users have more than 250 GB to fill before we have a storage problem again. We are back on schedule for Peak Storage 2014. The crisis has been temporarily averted. Now it’s back in its place – hanging above our heads along with 113 other critical problems that management refuses to address until they start threatening to shut down the company, or interfere with the browsing of Facebook.

A few hours later, I get the following email:

Luke, buddy! I hear there were some problems with downloading or something. Larry was here in Marketing asking all kins of weird questions about network sharewares and what not. It made me nervous. I hope I don’t get in trouble for this, but about a week ago I downloaded my iTunes music onto the G: drive on my computer (that’s the one with lots of free space) cause I like to listen to my music as I work. That’s ok, right? It’s all legit stuff I paid for, and it’s on my computer so it should not matter. I didn’t tell Larry cause he would probably write me up or something. Let me know if I could get in trouble for this. I put the music under In Process\Marketing\Legal cause I knew no one would ever look there.

Apparently my Larry “the bad cop” gambit has worked like a charm, and spooked at least one dude who was making unauthorized use of the network resources to store his pirated music. My response was along the lines of:

No worries, I got your back – I deleted all that music so that you don’t get in trouble. Make sure you don’t bring any more music on the G:, H: or I: drives because those are shared network resources and it will get you in trouble. I promise not to tell Larry.

Email CC’d to Larry and that guy’s supervisor of course. Pity that his collection was less than a gig – I was hoping for more space savings.

The moral of the story: if something eats hundreds of gigabytes of storage over the weekend, don’t automatically blame the users. Chances are it’s a shitty Microsoft service instead. Actually, scratch that – blame the users anyway. They deserve it.

Posted in sysadmin notes

We are out of space: Part 1

It’s Monday morning, and I’m sick. I thought that I was smart when I got my flu shot a few months ago. I figured that I could cock-block the influenza virus, and skate through the winter unscathed. What I did not anticipate was that, being free of the flu, I was a prime target for things that are much worse. One of these super-bugs that makes you congested, and makes swallowing feel like gargling with sandpaper, got the best of me. I would have stayed home, but I don’t have sick days.

In their infinite wisdom, the powers that be decided that staying home sick counts as a “vacation” so I wisely opted to drag my sneezing, germ producing carcass to work, making it my mission to spread the malignant disease as far and wide as possible. My hope was that I was going to be able to burrow myself in my cubicle, put some code on the monitor so it looks like I’m programming and then just survive till 5pm battling fever and sneezing fits.

Alas, this was not to be. As soon as I sit down and open my email, support ticket notifications start streaming in:

  • Ticket #6451 [High]: CAN”T SAVE AANYTHING!!1
  • Ticket #6452 [Medium]: G Drive not working
  • Ticket #6453 [Critical]: facbook.com saiz page not fund – pls unblock!
  • Ticket #6454 [High]: Can’t save to LAN.
  • Ticket #6455 [High]: When trying to save to G: drive it says “Drive full or right protected”
  • Ticket #6456 [Medium]: Is the public network share full? Can’t save anything.
  • Ticket #6457 [Low]: Can’t save to network share. Probably over quota. Please extend.

It goes on like this for at least 20 more messages. The emerging pattern is somewhat clear – something is terribly wrong with the network shares. I’m fraught with a sense of déjà vu. We already had this problem last week, and we narrowly avoided catastrophe by installing some extra hard drives and moving a bunch of archival data off the main network shares. The whole operation took several hours, but I managed to reclaim close to 200 GB of space by deleting logs, dumping out garbage files and moving really, really old files that no one had touched in ages to a separate drive. Then I sent out an email to some of the administrative staff telling them what had been done, and advising them to do some more archiving. Last Thursday things seemed to be mostly under control.

I quickly log into one of the servers to check how much of that rescued space was devoured over the weekend. Apparently all of it. The drive used for one of the network shares has exactly 57 KB of free space.

At that very moment, Jeremy bursts into the IT cave system. I have no clue what Jeremy does at our company, because I personally do not care. I am sure he has told me what his important job functions are on more than one occasion, but I have never considered that information to be relevant enough to actually commit neurons to it.

“Thank God you’re in! I know you just got here, but we are having some serious problems with the LAN so if you could look into that…”

I lift a finger to stop him from rambling.

“I know. I was just reading the tickets… Listen Jodie…”

One of the other things I have never committed to memory is Jeremy’s name. There is a limited number of neurons in my head, and now that I have shaved my beard I can no longer offload Unix skills to facial hair follicles, so I tend to locally optimize storage this way. I just call him girl names for consistency.

“Jeremy.”

“Julie…”

“Jeremy.”

“Genevieve…”

“Oh, come on – that’s not even close!”

“Whatever. Listen, Sally Ann Margaret…” I pause for a second to see if he approves this new handle I just made up. He gives up, and allows me to continue.

“Remember last week when we moved the 1997 through 2009 folders off the main network share to the ‘Archive’ one?”

He nods.

“That freed up around 200 GB of space. How the hell did you guys fill out 200GB over the weekend?”

He shrugs. “I don’t know. We’ve been scanning a lot of invoices, and other stuff…”

Last summer the corporate deities that bestow the blessed health insurance benefits upon us decided that it would be a good idea to scan everything in sight. This development coincided with a batch of new “all-in-one” printers we had installed near High Hrothgar – also known as the area where all the directorial critters have their offices. Someone discovered that you can scan a document and have it emailed to you as a PDF. So naturally, their first reaction was to scan everything in sight. The initial feature shock soon became company policy, but no one said anything to the IT department. So we were oblivious to the fact that a few floors above us there was a major digitization project going on, with only a few aging Windows servers and replicated file shares to support it.

We only discovered the scanning project by accident, when we noticed the servers were low on drive space. After shaking up a few users we managed to get them to spill the beans. Apparently they were not supposed to tell us, because their supervisors were afraid we would start making a ruckus about buying new servers and shit.

So we went and put in a proposal to build a dedicated architecture for this project, preferably some sort of a NAS to which we could slowly add new drives as the project grew in size. That got rejected immediately because it would actually cost money, versus the current solution of not doing shit, which was completely free. Eventually we did get approval to buy some new drives and beef up the Windows servers, but that was a temporary fix.

A fellow NOC denizen, Larry, actually made a chart. He measured the amount of storage we consume every day, adjusted it for possible growth and cross referenced that with the amount of internal storage we could possibly add to these servers. According to his math, we would get starved for space, and the system would become unsustainable, within 3-4 years if we were lucky.

“Pshh, 5-7 years?” said the omnipotent directorial oracles, who always think we are low-balling sustainability figures. “In 7-8 years we might have flying cars, and colonies on Mars. Don’t you worry yourself about things that will happen 8-10 years from now.”

And so we were left with a teetering wreck just waiting to happen. In other words, business as usual.
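For the curious, a projection like that boils down to fairly simple arithmetic. Below is a rough sketch of the idea, not Larry’s actual math: the days_until_full function and its three parameters are invented for illustration, and any numbers you feed it are your own guesses rather than figures from our servers.

```shell
# Hypothetical projection: how many days until the spare capacity is gone?
# Usage: days_until_full <spare_gb> <daily_gb> <yearly_growth_percent>
days_until_full() {
    spare_gb=$1     # total space that can still be freed or added, in GB
    daily_gb=$2     # how much storage gets consumed per day right now, in GB
    growth_pct=$3   # assumed yearly growth of that daily figure, in percent
    awk -v free="$spare_gb" -v daily="$daily_gb" -v g="$growth_pct" 'BEGIN {
        days = 0
        # Burn through the spare space one day at a time, bumping the
        # daily consumption once a year. Give up after 100 years.
        while (free > 0 && days < 36500) {
            free -= daily
            days++
            if (days % 365 == 0) daily *= 1 + g / 100
        }
        print days
    }'
}
```

For example, days_until_full 100 1 0 prints 100: a hundred gigs of headroom, eaten one gig a day with no growth, lasts a hundred days. The once-a-year growth step is a simplification I picked to keep the loop short; a real forecast would be fitted to measured usage.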

I wander off to Larry’s desk. He is redditing like a pro, and ignoring the morning commotion surrounding the storage shortage. I don’t blame him. I would too, but they found me first.

“Hey asshole, your math skills are shit.” I give him a friendly greeting.

Larry, a rotund gentleman, swivels around in his chair and gives me a shit eating grin:

“I will have you know, that my math skills are impeccable, you abominable cunt.”

Pleasantries out of the way, I give him the basic rundown of our current crisis. Something just ate 200GB over the weekend. Whatever it was, it was not included in our “Impending Storage Disaster of 2014” projections (also known as “Peak Storage 2014” for short). Now we need to identify it, and figure out how to deal with it. I had a very specific mission for Larry…

But more on that next time.

Posted in sysadmin notes

Batman: Arkham City

I recently finished Batman: Arkham City. I have been trying to figure out how to review it for a few days now. I’m really not sure what to say about it. I actually kinda liked Arkham Asylum, and this one is a very similar game. In fact, it is an improved game.

A lot of my gripes from the previous installment were fixed in this one. For example, Batmaning around the city feels great now. In Asylum, the grapple points were a scarce resource. Most of the time, you were constrained by chest high walls that you could not Batjump over. You could Batglide using your cape sometimes, but the level design carefully limited the distance you could cover by never putting grapple points where you could glide, or glide friendly open distances where you could grapple. In Arkham City, every building has multiple grapple points, and gliding is your primary mode of locomotion. You can avoid 90% of the open world combat by simply running, gliding and grappling your way over the roofs. Sometimes I would just aimlessly explore the city this way, ignoring all the objectives, and I must say it worked really well.

[Screenshot: Arkham City]

Arkham City also got better at the boss sequences. This time around, most bosses (with the exception of the Joker, of course) can eventually be punched in the face, to your satisfaction. You even get to beat up the clown a bit at some point. Kill stealing and running away at the last second are kept to a minimum in this one.

They even lampshade the plot driven unlocking of new Bat-accessories – Batman at one point comments he once tried taking a bigger accessory belt into the field, but it would weigh him down, and get in the way. Very nice touch if you ask me.

The mood is about the same as in the last game, if not darker. The plot is actually very decent, but I’m not going to talk about it much. Why? Because I would mostly be re-iterating the same points that Shamus made in his very eloquent and comprehensive post on this very topic. I don’t think I could explain why the excellent foreshadowing of the main plot twist worked so well in this game any better than he did. Furthermore, he covered all my plot related gripes (and then some) in his followup nitpicking post. So I can’t do that either.

What is left for me to say about this game then? Well, I could say it feels very… Video Gamey I guess.

I know, I know. It is a video game. Duh! That’s not my point. I mentioned this on Twitter before, but I don’t think I explained myself very thoroughly. It’s kinda hard to do that on Twitter. This has to do with immersion.

When I play games like Skyrim, Deus Ex or Mass Effect (for example) I get immersed in the game world and identify with my character. I’m Shepard, I’m the Grey Warden, I’m Adam Jensen, I’m Skippy McPooperpants the crazy Khajiit archer. I don’t think of the combat in terms of push-button gimmicks but in terms of strategy, stealth, weapons, etc. Combat in Arkham City often breaks that immersion. When you do normal stealth sequences, and you get to silently take out the enemies one by one without being seen, the immersion is usually intact. When you fight hordes of enemies on the ground, it mostly works too. But boss battles ruin it.

Let me give you an example – when you fight Solomon Grundy, the only way of defeating him is to use your detonation gel on the floor reactors that are reanimating him. This is a classic video game gimmick straight out of a Mario game or something. When I think Batman, I don’t think of blowing up floor things to whittle down the life bar of a big, invulnerable boss. Almost all big battles are like that. Near the end of the game, most henchmen sequences get like that too. At that point every group includes a couple of shield dudes and armor dudes who have to be taken out using a very specific combo.

Most of the time the game has a really great free-flowing combat mechanic that allows you to mix a lot of techniques – you can hit people and rack up combo points, and then use special moves. You can block and counter to stay almost invulnerable. You can stun people and hit them with flurries, etc. But when shield dudes, cattle prod dudes and armor dudes show up, they ruin everything. They break up your combos, they can’t be easily blocked and they can only be knocked out in one specific way. I don’t know about you, but that always made me lose my immersion and start thinking about combat mechanics. Arguably, that’s not a good thing.

I guess some people like it. Some folks prefer their games to be challenging and like boss battles with clear repeat patterns. I personally am more of a story and immersion guy. I like to experience the game “as” my character, and not as an outside spectator waiting for another button-mashing puzzle like you wait for a bus that’s 15 minutes late. To me, this constant immersion dropping is the main weakness of the Arkham series. I like the stories they are telling, I like the setting, I like the mood and I like playing as Batman. What I don’t like is to be constantly reminded that I’m in a video game and I need to do some contrived actions that would not make sense in a comic or a movie.

How about you? Did you notice the same thing about this game, or am I full of shit on this? Did you think at least some of the boss battles could be more intense if they at least tried to keep you immersed in the story rather than resorting to classic video game gimmicks? Let me know in the comments.

Posted in video games