Here is an interesting question: how long would it take to send a petabyte of data from San Francisco to Hong Kong? Jonathan Schwartz of Sun makes a very interesting point: it would actually be faster to ship that data on a sailboat than to send it over the internet. At current data transfer speeds, such a transmission would take at least a few years. You can follow Mr. Schwartz’s article and calculate it yourself.
I have a follow-up question. Assume that you do have a petabyte of data on tape, and a facility in Hong Kong that can actually store that petabyte and process it. How long would it take to load that data from the tape? I looked around and saw a bunch of tape drives that claim transfer speeds anywhere between 25MB/s and 80MB/s. So let’s be generous and assume the Hong Kong facility has a good drive that can read data at 100MB/s. Let’s also assume that there is no tape switching involved in the process (or that it is instantaneous). A petabyte is 10^15 bytes, so at 10^8 bytes per second it would still take around 10 million seconds which, if I’m not mistaken, is about 116 days, or close to 4 months of continuous operation. If you add tape switching times, it ends up being more (most of the tape drives I worked with would take up to a minute to align the heads and position themselves after a new tape was plopped into them).
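If you want to redo this arithmetic with your own numbers, here is a minimal Python sketch. The 100MB/s drive speed and the one minute swap time come from the paragraph above; the 800 GB cartridge size is purely my assumption for illustration, as are the function and variable names.

```python
PETABYTE = 10**15          # bytes (decimal petabyte)
MB = 10**6                 # bytes (decimal megabyte)
SECONDS_PER_DAY = 86_400


def tape_read_seconds(total_bytes, drive_mb_per_s, tape_bytes=None, swap_seconds=60):
    """Seconds to stream total_bytes off tape, optionally counting tape swaps."""
    streaming = total_bytes / (drive_mb_per_s * MB)
    if tape_bytes is None:
        return streaming   # treat tape switching as instantaneous
    tapes = -(-total_bytes // tape_bytes)   # ceiling division: number of cartridges
    return streaming + tapes * swap_seconds


no_swaps = tape_read_seconds(PETABYTE, 100)
# Hypothetical 800 GB cartridges, one minute lost per cartridge.
with_swaps = tape_read_seconds(PETABYTE, 100, tape_bytes=800 * 10**9)

print(f"no swaps:   {no_swaps / SECONDS_PER_DAY:.1f} days")
print(f"with swaps: {with_swaps / SECONDS_PER_DAY:.1f} days")
```

Under these assumptions the swaps add less than a day to the total, so the raw streaming speed of the drive is what really hurts.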
This is, of course, only the time it takes to read data off the tape. We still need to remember that this information must be written somewhere. I think the best you can get on the market right now are drives that rotate at 15,000 RPM, giving you a transfer rate of approximately 110 MB/s. Of course there is no petabyte drive out there, so we are talking about a disk array here, and you probably need to add some overhead for writing across many drives.
How fast can we push data from the tape to the hard drive? An 800MHz, 32 bit FSB moves 4 bytes per cycle, so on paper it can send about 3200 MB/s, which means it is still much faster than both the disk and the tape, by a factor of well over 10. But it does give us an upper limit on how fast we can go. Even if you find super fast media devices, the bus caps you at a few GB/s at best, and probably closer to 1GB/s once you account for overhead. At 1GB/s, transferring a petabyte would still take you over a week, and even at the full 3200 MB/s it would take more than three days.
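The whole chain can only run as fast as its slowest link, which is easy to sketch in Python. The stage speeds below are the hypothetical figures used in this post (100MB/s tape, 110MB/s disk, roughly 1GB/s for the bus); the function name is mine, and the model ignores every kind of overhead.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(total_bytes, *stage_mb_per_s):
    """Days to move total_bytes through a pipeline limited by its slowest stage."""
    bottleneck = min(stage_mb_per_s)   # MB/s of the slowest stage
    return total_bytes / (bottleneck * 10**6) / SECONDS_PER_DAY


petabyte = 10**15   # bytes

# Imaginary super fast media, limited only by a bus at roughly 1GB/s.
bus_only = transfer_days(petabyte, 1000)

# Tape drive, single disk, and bus in series: the tape drive is the bottleneck.
tape_disk_bus = transfer_days(petabyte, 100, 110, 1000)

print(f"bus only:        {bus_only:.1f} days")
print(f"tape, disk, bus: {tape_disk_bus:.1f} days")
```

Swap in other numbers to see how little a faster bus helps while the tape drive stays the slowest stage.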
Even in the best case scenario, moving this much data is a pain in the ass. No one in their right mind would attempt to ship a petabyte of live operating data to another location, and no one would actually need this much data transferred to them in one go. It would be much more feasible to transfer the data as needed.
The only feasible scenario where it would make sense to ship around this much data is if you were moving your data center, or relocating/copying your backup archives. In both cases, moving the actual physical media would be your first instinct anyway. So what this article illustrates is not really a real world problem, at least not yet. It shows us something a little bit different.
I found Jonathan’s article interesting because it shows a strange dichotomy in the way we think about data. When someone asks about shipping a petabyte of data, network transfer seems like a plausible solution, at least initially, because it is hard for us to imagine how much data that actually is. On the other hand, if someone showed you a petabyte of data in its physical form (a truckload of tapes or a gigantic disk array), you would immediately cross network transfer off your list.
Btw, if my math or the hypothetical hardware specs are dead wrong, please feel free to correct me in the comments.