What Have I Learned From My Hard Drive Failure

As you may have read, my Windows box suffered a hard drive failure after a power outage yesterday. Well, my machine is back – at least in a way. I’m running on a bare-bones Windows installation with just the antivirus, a firewall and a few other crucial applications such as Firefox.

Hard Drive

I mentioned that Knoppix was able to access the drive without much trouble. Windows was not as nice. The access was sketchy, and the system partition, which was the one I wanted to access, was missing. After a few tries I pulled the drive out, mounted my secondary drive, and closed the box up. I just didn’t feel like dealing with it. What exactly did I lose?

  1. My Firefox Profile – I miss my familiar setup and my adblock filters which I’ve been tweaking for the last few years. Still, all my bookmarks are either in del.icio.us or in Google Notebook so nothing irrecoverable was really lost.
  2. The whole MUGEN folder with the 50+ characters I downloaded. Oh well. I can re-download all this stuff, it will just take time.
  3. Morrowind and HL2 saved games. While these are irreplaceable, I can live without them.
  4. All the emails I pulled from my school email account over the last 2-3 years. Most of that stuff was mirrored on my laptop, and I don’t think I will need any of it any time soon.
  5. Some applications that I obtained… Um… Let’s say, less than legally. I don’t think I ever bothered archiving the ISOs and installation files for these things.

That’s about the extent of the damage.

I did learn that my backup strategy needs to be more robust. Because of my unique drive situation, I have been very diligent in backing up the failing drive. I’d usually back up both of my drives to an external device twice a week using the Windows NTBackup software. Each time it was a full backup (not an incremental one) and, because of space constraints, I would simply overwrite the previous file.

Of course, this plan has one big hole in it. What happens if the machine dies in the middle of a backup? Well, you end up with an unusable, corrupted file. Since Murphy’s Law never fails, this is exactly what happened to me. I thought I was ready, and I thought I was “doing it rite” but I guess I was not.

My new backup plan is:

  1. Back up twice a week like before
  2. Always keep at least 2 backups on the drive
  3. Before each run, automatically delete the backup with the suffix _old from the drive
  4. Then rename the current backup with the suffix _old, so the new backup never overwrites the last good one
  5. If necessary get another external drive and start a weekly rotation
  6. Check the integrity of backups at least once in a while
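
The rotation in steps 3 and 4 could be sketched roughly like this. This is only an illustration, not what NTBackup does: the function names, the `.partial` temporary name and the `make_backup` callable (which stands in for whatever tool actually writes the backup file) are all assumptions of mine.

```python
from pathlib import Path


def old_name(current: Path) -> Path:
    """Return the sibling path carrying the _old suffix (full.bkf -> full_old.bkf)."""
    return current.with_name(f"{current.stem}_old{current.suffix}")


def rotate(current: Path) -> None:
    """Steps 3 and 4: delete the _old backup, then rename the current one to _old."""
    if not current.exists():
        return  # first run, or the last run died: keep whatever _old we have
    previous = old_name(current)
    previous.unlink(missing_ok=True)  # step 3 (needs Python 3.8+)
    current.rename(previous)          # step 4


def run_backup(current: Path, make_backup) -> None:
    """Rotate, then write the new backup under a temporary name.

    make_backup is a hypothetical callable that writes a full backup to the
    path it is given. The file only gets its final name once the callable
    returns, so a crash in the middle never leaves a half-written file
    sitting under the name of a good backup.
    """
    rotate(current)
    partial = current.with_name(current.name + ".partial")
    make_backup(partial)
    partial.rename(current)
```

Called as `run_backup(Path("E:/backups/full.bkf"), my_backup_tool)` (both names made up), a run that dies halfway leaves the `_old` file untouched, and the next run will not rotate it away because no finished current backup exists.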

Fortunately, my policy of keeping crucial data on a non-system drive did pay off big time. This is the least amount of data I have ever lost in a critical failure of this magnitude.

I also learned that NTBackup does not like failing drives. I had an old backup from April stashed away somewhere, and it seemed to be in pristine condition… Unfortunately I was unable to recover anything from the system partition. All the other drives and partitions were fine. Which just goes to show you that auditing the integrity of your backups is crucial even in a home environment.
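
One cheap way to audit a backup file is to store a checksum next to it when it is written and compare it later. The sketch below is an assumption on my part (the `.sha256` sidecar file and the function names are made up, and NTBackup has nothing to do with it). Keep in mind that a matching checksum only tells you the file has not changed since it was written, not that it will actually restore; an occasional trial restore is still the real test.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sidecar_for(backup: Path) -> Path:
    """Hypothetical convention: full.bkf gets its checksum in full.bkf.sha256."""
    return backup.with_name(backup.name + ".sha256")


def record_checksum(backup: Path) -> Path:
    """Run right after a backup completes; stores the hash next to the file."""
    sidecar = sidecar_for(backup)
    sidecar.write_text(sha256_of(backup))
    return sidecar


def verify_backup(backup: Path) -> bool:
    """Return True only if the backup exists and still matches its stored hash."""
    sidecar = sidecar_for(backup)
    if not backup.exists() or not sidecar.exists():
        return False
    return sha256_of(backup) == sidecar.read_text().strip()
```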

I decided to finally shell out some cash for a UPS. I saw small 1hr ones at Best Buy a few days ago for $50. They seemed like the perfect size for a home desktop. This is another lesson that came out of this whole ordeal. If I had a working UPS underneath my desk, chances are the machine would have shut down gracefully, perhaps extending the life of my drive by a few more months. Power surges and hard reboots are definitely not healthy for your hardware.

Finally, I will never use a computer with just one hard drive, unless it’s a laptop. It is a security policy that prevents you from pulling your hair out and murdering innocent bystanders in a fit of rage. That second hard drive is crucial to my mental health and I will always have one.



14 Responses to What Have I Learned From My Hard Drive Failure

  1. Fr3d UNITED KINGDOM Mozilla Firefox Windows says:

    Bad luck with that… but at least you can replace most of the lost stuff :)

    I back up all my important stuff (emails, Firefox profile, documents, photos, music etc) more than once a week to a local Linux box (which has 2 HDDs in hardware RAID1 and two UPSes to help protect it from power problems). But if one of my local drives died, I would still lose a hell of a lot – my next PC will have a lot more HDDs in, which will all be in RAID 1 or 5 :)

    A quick note about cheap UPSes: they are pretty rubbish. I don’t know the specs of your pc so I couldn’t say how long it would last, but I have a 1000VA UPS (£200/$400 new) for my PC (AMD A64 3000+, 3GB DDR, 4x250GB SATA2, 7800GT), and I doubt it would even last 30 mins… eBay is a great place for UPSes – all of mine came from eBay (costing about 1/4 of the price when new) and have all worked fine :)

    You’re probably right though, saying that the small UPS you have should be enough to allow a graceful shutdown, although even switching the PC to standby would extend the battery life a lot.

  2. vacri AUSTRALIA Mozilla Firefox Windows says:

    I use a UPS to help smooth out the brownouts I used to get at my old home. A few times a week you’d notice the lights dim for a second or two, then come back. I went through a couple of power supplies for the one PC in a year, made the mental link, and nicked an old UPS from work. The UPS has a surge filter on it (which protects from non-lightning power spikes) and of course protects against power drops. Since installing the UPS, I’ve not had to replace a power supply (~18 months).

    So, anecdotally at least, they’re not just good for protecting against power outages, they’re also good for protecting against dirty power sources.

    Re: RAID – I learned the hard way from taking over control of someone else’s “my RAIDed file server” that RAID0 is worse than useless, except in very specific circumstances. Instead of having one filesystem that is vulnerable to an error on one drive, you have one filesystem that is vulnerable to a single error on either of two (or more) drives. Ergh.

  3. Luke UNITED STATES Mozilla Firefox Windows says:

    Why would one use RAID 0 anyway? I never saw the point of it. I mean how much more expensive and/or difficult can it be to use RAID 3, 4, 5 or 6 instead? Then at least you have some fault tolerance.

    I’m actually considering using a RAID 1 array as my system drive on my next box. It might help to avoid stupid shit like this.

  4. Fr3d UNITED KINGDOM Mozilla Firefox Windows says:

    RAID 0 is good for very fast storage of non-important stuff, such as games.

    Take a look here for diagrams and descriptions of all the RAID levels. RAID 0, 1, 5 and 10 are the most commonly present on modern motherboards.

  5. crucial applications such as Firefox.

    … Yeah i remember windows not booting up because it didn’t have firefox ;)

  6. Luke UNITED STATES Mozilla Firefox Ubuntu Linux says:

    [quote comment=”5433″]RAID 0 is good for very fast storage of non-important stuff, such as games.[/quote]

    Is RAID 5 that much slower though? I always figured that if you are RAID’ing anyway, you might as well have some built in fault tolerance.

    [quote comment=”5434″]… Yeah i remember windows not booting up because it didn’t have firefox ;)[/quote]

    LOL. Actually, Windows is pretty much self-contained, so there are not many applications that can be considered crucial for OS functionality. I mean crucial for me being able to work on the machine without getting pissed off. I can’t browse the web with IE6 – it just can’t be done.

  7. Fr3d UNITED KINGDOM Mozilla Firefox Windows says:

    [quote post=”1763″]Is RAID 5 that much slower though? I always figured that if you are RAID’ing anyway, you might as well have some built in fault tolerance.[/quote]
    It’s slower for writing (the same speed as one single drive) but it’s (usually) much faster for reading.

  8. Luke UNITED STATES Mozilla Firefox Ubuntu Linux says:

    Ah, ok. That kinda makes sense. Of course in my mind Fault Tolerance > Speed but I can see how that may not always be the case.

    Btw, I see you’re on Vista now. Are you loving it or are you hating it?

  9. Fr3d UNITED KINGDOM Mozilla Firefox Windows says:

    [quote post=”1763″]Of course in my mind Fault Tolerance > Speed but I can see how that may not always be the case.[/quote]
    Indeed – If you want real speed you’d buy a Western Digital Raptor (or Raptor X) and then use RAID 1 ;)

    It has its ups and downs :p

    There are some bits that I really like, such as the search in the start menu, which makes it much faster to find a program. The mobility center for laptops is pretty good too.

    However both my installations (PC and Laptop) have worse driver installation procedures than Windows 95… But I think that’s something to do with the ISO, since it’s the same problem on two completely different machines. I did manage to fix it once, but it soon broke again and I haven’t managed to mend it again.

    Bottom line, once you get used to it, and get it configured how you like it, it’s not that bad :)

  10. vacri AUSTRALIA Mozilla Firefox Windows says:

    From what I’ve heard, you’d use RAID0 for the express purpose of increasing speed for temporary data – things like squid cache, where you don’t need data reliability, but getting data fast is very important. I’ve also heard the same thing of the reiser4 filesystem…

  11. Luke UNITED STATES Mozilla Firefox Windows says:

    Fr3d – how about Hitachi Ultrastar series – 15K RPM! LOL

    I feel kinda inadequate with my 7200 RPM drives right here. Sigh…

  12. Matt` UNITED KINGDOM Mozilla Firefox Windows Terminalist says:

    My “backup policy” is essentially just to have a copy of anything important in several places – coursework from school was on the school servers, my laptop and a thumb drive, updated whenever possible, most of the contents of My Documents is mirrored between the PC and the laptop, updated whenever I have anything important enough to warrant it.

    In the event of a crash of either machine all I lose is a ton of legitimately acquired *cough* videos that I don’t have the space or inclination to keep 2 copies of (and the stuff I haven’t watched yet is normally on both anyway, so I don’t have to worry about transferring it when I want to watch it)

    It’s probably not the best way, but I’m lazy, I don’t yet have the fear instilled by experience of a fatal hard drive crash (just a few buggered Windows installations, and they gave me time to evacuate data) and hard drive space enough to back up everything is kinda expensive – I would want to be storing something new on it to justify the cost

  13. I wish i learned from you…
    see my blog

  14. Luke UNITED STATES Mozilla Firefox Ubuntu Linux says:

    Oh man! It sucks. I commented back with some tips.

    Knoppix saves lives. Not an alternative to a good backup strategy mind you, but a hell of a lifeline to have. :)

