Storage Efficiency: Multi-Layering of Deduplication
Thursday, 06 August 2009 by Michel Roth
As we're all aware, the amount of data being created today is growing exponentially as new devices (iPhone, BlackBerry) and applications (social media) encourage users to create more digital bits. Analytics also play an ever larger role in how we conduct business and predict future trends.
So IT managers are now facing three issues:
How is an IT manager supposed to deal with these conflicting challenges? For those IT managers who use NetApp storage, one of the most powerful tools in their toolbox is deduplication.
Why is deduplication so powerful? First of all, it addresses the first issue: data is often replicated (with limited changes) into multiple forms. NetApp deduplication removes that replicated, redundant data from your primary storage arrays, immediately saving disk space. Typically we see this form of deduplication save 50-70% of space. And the value of deduplication extends beyond primary storage: when data is replicated via SnapMirror to backup or disaster-recovery sites, the deduplication savings carry over to the remote site. That saves on WAN bandwidth as well as on disk at the remote site.
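To make the idea concrete, here is a minimal sketch of block-level deduplication in plain Python. It is not NetApp's implementation (NetApp dedupe operates on 4 KB blocks inside the storage controller); the block size, hash, and data structures below are purely illustrative of the fingerprint-and-reference idea.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size

def dedupe(data: bytes):
    """Store each unique block once; duplicate blocks become references."""
    store = {}   # fingerprint -> block contents (the only physical copies)
    refs = []    # one fingerprint per logical block, enough to rebuild the data
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)   # new blocks consume space, duplicates do not
        refs.append(fp)
    return store, refs

# Ten identical logical blocks collapse to a single physical block.
store, refs = dedupe(b"A" * BLOCK_SIZE * 10)
print(len(refs), "logical blocks ->", len(store), "physical block(s)")
```

A real array also has to worry about the fingerprint database, hash verification, and collisions; the point here is simply that identical blocks are written once and referenced many times.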
But we still haven't addressed the second issue: more data typically requires more storage, whether for capacity or for performance reasons. This is where deduplication is once again a powerful tool, this time in the form of dedupe-aware intelligent caching. By adding a NetApp Performance Acceleration Module (PAM), the storage array can significantly increase performance by serving frequently accessed data from cache instead of constantly going to disk. This intelligent cache reduces the amount of disk required without sacrificing application performance. Taking this a step further, the PAM is aware of deduplication, so it avoids keeping redundant copies of the same data in cache. That leaves more room for other frequently accessed data, again reducing the amount of storage needed to deliver the required performance.
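The benefit of dedupe-awareness in a cache can also be shown with a toy example. The sketch below (again plain Python, not the actual PAM behavior) keys the cache on a block's fingerprint rather than its logical address, so many logical blocks that deduplicate to one physical block occupy a single cache slot.

```python
from collections import OrderedDict

class DedupeAwareCache:
    """Toy LRU read cache keyed by block fingerprint, not logical address."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()   # fingerprint -> block contents

    def get(self, fingerprint: str, read_from_disk):
        if fingerprint in self.entries:
            self.entries.move_to_end(fingerprint)   # cache hit, refresh LRU order
            return self.entries[fingerprint]
        block = read_from_disk(fingerprint)          # cache miss, go to disk
        self.entries[fingerprint] = block
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)         # evict the least recently used block
        return block

# Many logical reads of deduplicated data resolve to the same fingerprint
# and share one cache entry instead of crowding out other blocks.
cache = DedupeAwareCache(capacity=100)
for _ in range(10):
    cache.get("fp-abc123", lambda fp: b"block contents")
print(len(cache.entries))   # 1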
Hi Michel...one other technology admins are using is real-time compression. It reduces the size of every file by up to 10x before it's even written to disk. There's no degradation in performance and admins can still dedupe compressed data. The combination of real-time compression and NetApp dedupe creates huge savings throughout the data lifecycle. For more, check out www.storwize.com.
Thanks...Peter Smails
Hi Peter,
Thanks for the reply. What puzzles me is how one can compress without performance degradation. No CPU overhead at all? Or are you just saying it is small :-) ?
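For readers curious how inline compression and block-level dedupe can stack, here is a toy sketch in plain Python (not Storwize or NetApp code). It assumes deterministic per-block compression, so identical blocks still compress to identical output and therefore still deduplicate.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # illustrative block size

def compress_then_dedupe(data: bytes):
    """Compress each block before 'writing' it, then fingerprint the compressed block."""
    store = {}   # fingerprint -> compressed block (physical storage)
    refs = []    # per-block fingerprints (logical view of the data)
    for i in range(0, len(data), BLOCK_SIZE):
        compressed = zlib.compress(data[i:i + BLOCK_SIZE])
        fp = hashlib.sha256(compressed).hexdigest()
        store.setdefault(fp, compressed)
        refs.append(fp)
    logical_bytes = len(data)
    physical_bytes = sum(len(b) for b in store.values())
    return logical_bytes, physical_bytes

# Ten identical 4 KB blocks: compression shrinks each block, dedupe keeps only one copy.
print(compress_then_dedupe(b"A" * BLOCK_SIZE * 10))
```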