Huge SSDs will force changes to data protection strategies – with @alexgalbraith

Summary: 60TB is a lot of data for a single device that could fail unexpectedly, as disk drives sometimes do. Alex Galbraith (@alexgalbraith) goes on a ridecast with us to discuss the tactics and strategies for protecting data in the new world of enormous SSDs.

Follow on blog:

Alex wrote a blog post on the topic which discusses data protection for these large SSDs in greater detail:

VulcanCast Follow Up – A few thoughts on 60TB SSDs

Transcript:

MF: Hi, this is a technology ridecast. I’m Marc Farley, and our special guest this afternoon is Alex Galbraith. How are you doing, Alex?

AG: I’m good, thanks Marc, how are you?

MF: Thanks for coming on. So the hot topic this week is the new Seagate 60TB drive. Have you had a chance to look at it and what do you think its impact is going to be on the industry?

AG: It’s pretty interesting. The thing that really impressed me was obviously the power-to-capacity ratios that you’re going to be able to get out of these things. Working in the service provider industry, it’s a really big thing for us, but I can’t really see these necessarily being used for primary workloads. They’re more for archiving and secondary tier, maybe media storage and things like a sneakernet-type solution.

MF: Yeah like Amazon, what you call it, umm …

AG: Snowball

MF: Snowball,

AG: that’s the one

MF: Data in a suitcase where you ship it over

AG: Exactly. If you can get 50 terabytes in something that’s about the size of your hand it’s going to be a lot more convenient to take one with you on the airplane when you want to move it from AWS and out.

MF: Yeah, very interesting. So how do you think this changes the data protection landscape?

AG: It’s a similar challenge to the one we’re facing with object storage these days, which is that the data volumes are just getting so massive. I mean, how can you possibly back up that data when you’re talking about data quantities of this scale? You can imagine the rate of change on them is going to be pretty high as well, so backing up that data in a single night, or even just the changes, might be beyond the capability of a typical backup solution. I’m starting to think that once you’re reaching these kinds of multi-petabyte scale solutions, it’s going to come down to looking at how we ensure the durability of the data in the first place. If you can increase the durability of the data by using things like erasure coding, it’s going to reduce the likelihood of you having to utilize your backup solutions. But on top of that, when it’s 60 terabytes of data in one go, how quickly can you rebuild that data? Say you’re using RAID: taking 24 hours or so to rebuild a drive is not really feasible, so again I’m looking at other technologies to increase the rate at which we can get that data back for you. It’s almost like we need to entirely rethink the way that we protect data from the ground up.
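
Editor’s note: to put rough numbers on the rebuild and backup-window concerns Alex raises, here is a minimal back-of-the-envelope sketch in Python. The throughput figures, the 10 PB pool size, the 2% daily change rate and the 8-hour window are all illustrative assumptions, not published specifications for the drive or figures from the conversation. Real rebuilds also compete with production I/O, so sustained rates are often lower than the raw device or link speed.

```python
def hours_to_transfer(capacity_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to move capacity_tb terabytes at a sustained
    throughput_mb_s megabytes per second (decimal units)."""
    capacity_mb = capacity_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return capacity_mb / throughput_mb_s / 3600


def nightly_changes_fit(pool_tb: float, daily_change_fraction: float,
                        throughput_mb_s: float, window_hours: float) -> bool:
    """True if one day's changed data can be copied within the backup window."""
    changed_tb = pool_tb * daily_change_fraction
    return hours_to_transfer(changed_tb, throughput_mb_s) <= window_hours


if __name__ == "__main__":
    # Assumed sustained rebuild rates for a single 60 TB device.
    for rate in (500, 1000, 2000):
        print(f"60 TB at {rate} MB/s: {hours_to_transfer(60, rate):.1f} hours")

    # Assumed 10 PB pool, 2% daily change, 1000 MB/s, 8-hour backup window.
    print("Nightly changes fit:", nightly_changes_fit(10_000, 0.02, 1000, 8))
```

Under these assumptions, a full 60 TB rebuild takes well over half a day even at 1000 MB/s, and the nightly changes on a multi-petabyte pool do not fit in an overnight window, which is in line with Alex’s point that durability techniques such as erasure coding become more important than relying on backup and restore.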

MF: So considering that, what kind of customers are likely to use this device then?

AG: Well, I can see things like media companies. I mean, one example would be Netflix. You can get quite reasonable read rates, and random read rates, out of these drives. They’ve not produced the stats around how decent the write rates are, and I suspect that’s because they’re probably not very good. It’s the kind of thing that, say, a Netflix might use for their caching devices on their edge. You might also see people using it for photo media, so when you go into Facebook, for example, you might not look at a photo for two years but then you go back and you expect it to appear almost instantaneously. Using tape and slower S3, that’s just not really feasible, whereas these kinds of technologies might actually help to accelerate those pieces of data you don’t access often but when you want them, you want them now.

MF: Yeah, I can see where it would be a great fit for Facebook, you know, that seldom-accessed data. But if someone wants to access it, they don’t want to wait 15 seconds for it. They want to see it in one or two seconds, and that’s reasonable.

AG: Exactly

MF: Alex, this was great having you on. Thanks for your insights on this drive.

AG: My pleasure, thanks very much, Marc.

 
