Home > Backup, De-dupe > The Cost of Data De-duplication

The Cost of Data De-duplication

Backup data-deduplication has been one of the hottest technologies of the last few years.   We’ve seen major players like EMC spend 2 billion to purchase DataDomain, and the industry in general is a 2 billion dollar market annually.   Why all the focus here?  Two reasons I believe:

1) These technologies solve an important business need to back up an ever-increasing volume of data (much of it redundant). 

2) Storage manufacturers have to find a way to maintain their growth rates to satisfy Wall St. and backup de-duplication is still one of the fastest areas of growth.

I was shocked to hear the other day when a very large client reported they were moving away from backup de-duplication and simply going to backup on SATA/NL-SAS 3TB drives.   What was the reasoning behind that decision?   The cost of SATA/NL-SAS drives is coming down faster than the cost of data de-duplication.  

That is certainly an interesting theory, and one that deserves some further consideration.  If there is a common challenge I’ve seen with customers, it’s dealing with the cost associated with next-generation backup, and prices have only come down minimally in the past 2 years.   Backup is still an oft-forgotten step-child within IT infrastructure and it’s hard to explain to corporate management why money is needed to fix backups.   Often, when I’m designing a storage and backup solution for a customer, the storage is no longer the most expensive piece of the solution.   Thanks to storage arrays being built on industry-standard x86 hardware, iSCSI taking away market share from Fibre Channel, and advancements in SAS making it the preferred back-end architecture instead of FC, the storage has become downright cheap and backup is the most costly part of the solution.   This issue affects the cost-conscious low-end of the market more than anywhere else, but nonetheless it can be a challenge across all segments.  

In the case of this particular customer who decided it was more economical to move away from de-dupe, there is certainly more to the story.   Namely, they were backing up a large amount of images and only seeing 3:1 de-dupe ratios.    However, I have recently seen another use case for a customer who only needed to keep backups for 1 week where it was more economical to do straight B2D on fat SATA/NL-SAS solution.  By layering in some software that does compression yielding 2:1 savings, it becomes even more economical.   

From the manufacturer perspective, I’m sure it’s not easy to come up with pricing for de-dupe Purpose Built Backup Appliances (PBBAs).   The box can’t be priced based on the actual amount of SATA/NL-SAS in the box as that would be too cheap based on the amount of data you can truly store on it, but it can’t priced for the full logical capacity as there less incentive to use a de-dupe PBBA vs. straight disk.   Generally speaking, to make a de-dupe PBBA a good value, you need to have a data retention schedule that can yield at least 4:1 or 5:1 de-dupe in my experience.   

Even if you can’t obtain 5:1 or greater de-dupe, there are a few additional things worth considering that may still make a PBBA the right choice instead of straight disk.   First, a PBBA with de-dupe can still offer a lot of benefits for bandwidth-friendly replication to a remote site.    Second, a PBBA with de-dupe can offer a significantly better environmental savings in terms of space, power, and cooling than straight disk.   

Advertisements
Categories: Backup, De-dupe
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: