
Delete all you like, but it won't free up space

You've been (de)duped ...


Comment: Networker blog author Preston de Guise has pointed out a simple and inescapable fact: deleting files on a deduplicated storage volume may not free up any space.

De Guise points out that, in un-deduplicated storage: "There is a 1:1 mapping between amount of data deleted and amount of space reclaimed." Also, space reclamation is near-instantaneous. With deduplication neither need be true.

Huh? Think about it. You add files to a deduplicated volume and any blocks of data in them that are identical to existing stored block groups get deduplicated out of existence and replaced by pointers. The file shrinks. This carries on as more files are added. The drive's capacity gets used up. You become aware of this. You start deleting files to reclaim space. You may find that much of the deleted files' originally fat content is actually skinny pointers and you just reclaim a few bytes of space instead of megabytes or terabytes. Oops; you just got stuffed by deduplication.
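To make the arithmetic concrete, here is a minimal sketch of a deduplicating volume in Python. Everything in it is an assumption for illustration, not any vendor's implementation: the `DedupeVolume` class, the fixed 4-byte block size, SHA-256 block fingerprints and per-block reference counts are all made up for the example.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; unrealistically small, chosen to keep the example readable


class DedupeVolume:
    """Toy fixed-block deduplicating store with per-block reference counts."""

    def __init__(self):
        self.blocks = {}    # digest -> block bytes (each unique block stored once)
        self.refcount = {}  # digest -> number of file pointers to that block
        self.files = {}     # file name -> list of digests (the "skinny pointers")

    def write(self, name, data):
        """Store a new file; overwriting an existing name is not handled here."""
        pointers = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block  # first copy: really stored
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            pointers.append(digest)
        self.files[name] = pointers

    def delete(self, name):
        """Remove a file and return the number of block bytes actually freed."""
        freed = 0
        for digest in self.files.pop(name):
            self.refcount[digest] -= 1
            if self.refcount[digest] == 0:  # no other file depends on this block
                freed += len(self.blocks.pop(digest))
                del self.refcount[digest]
        return freed

    def used_bytes(self):
        return sum(len(block) for block in self.blocks.values())


vol = DedupeVolume()
vol.write("a.bin", b"AAAABBBBCCCC")
vol.write("b.bin", b"AAAABBBBDDDD")  # shares two of its three blocks with a.bin
used_before = vol.used_bytes()
freed = vol.delete("b.bin")
print(used_before, freed, vol.used_bytes())  # 16 4 12
```

Deleting the 12-byte `b.bin` hands back only 4 bytes, because the other two blocks are still referenced by `a.bin`.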

Space reclamation with dedupe also requires the dedupe function to do some scanning once a file is deleted:

"Whenever data is deleted from a deduplication system, the system must scan remaining data to see if there are any dependencies. Only if the data deleted was completely unique will it actually be reclaimed in earnest; otherwise all that happens is that pointers to unique data are cleared. (It may be that the only space you get back is the equivalent of what you’d pull back from a Unix filesystem when you delete a symbolic link.)

"Not only that, reclamation is rarely run on a continuous basis on deduplication systems – instead, you either have to wait for the next scheduled process, or manually force it to start."
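The deferred behaviour de Guise describes can be sketched the same way. This is a hedged toy model, assuming a store where `delete()` only drops pointers and a separate `reclaim()` pass does the dependency scan; the `LazyDedupeVolume` class and its method names are hypothetical and do not come from any real product.

```python
import hashlib


class LazyDedupeVolume:
    """Toy dedupe store: deletes only drop pointers; space returns on reclaim()."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes
        self.files = {}   # file name -> list of digests

    def write(self, name, chunks):
        digests = []
        for chunk in chunks:
            digest = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(digest, chunk)  # store each unique chunk once
            digests.append(digest)
        self.files[name] = digests

    def delete(self, name):
        # Fast path: only the pointer list goes away; no block is freed here.
        del self.files[name]

    def reclaim(self):
        """Scan remaining files for dependencies, then free unreferenced blocks."""
        live = {d for digests in self.files.values() for d in digests}
        dead = [d for d in self.blocks if d not in live]
        return sum(len(self.blocks.pop(d)) for d in dead)

    def used_bytes(self):
        return sum(len(block) for block in self.blocks.values())


lazy = LazyDedupeVolume()
lazy.write("mon.bak", [b"base", b"mon!"])
lazy.write("tue.bak", [b"base", b"tue!"])
lazy.delete("tue.bak")
after_delete = lazy.used_bytes()   # still 12: nothing has been reclaimed yet
reclaimed = lazy.reclaim()         # stands in for the scheduled or forced pass
after_reclaim = lazy.used_bytes()
print(after_delete, reclaimed, after_reclaim)  # 12 4 8
```

Until `reclaim()` runs, the volume looks exactly as full as it did before the delete, and even then only the one unshared block comes back.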

His conclusion is this:

"The net lesson? Eternal vigilance! It’s not enough to monitor and start to intervene when there’s say, 5 per cent of capacity remaining. Depending on the deduplication system, you may find that 5 per cent remaining space is so critically low that space reclamation becomes a complete nightmare."

He recommends the use of "alerts, processes and procedures targeting" a set of capacity utilisation levels such as 60 per cent, 70 per cent, 75 per cent and so on.
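As a rough illustration of that kind of tiered alerting, the sketch below checks utilisation against the three levels named above. The function name `crossed_thresholds` and the example capacity figures are assumptions; how a real system reports used and total capacity, and what it does with an alert, will vary by product.

```python
THRESHOLDS = (60, 70, 75)  # per cent; the levels named in the article


def crossed_thresholds(used_bytes, capacity_bytes, thresholds=THRESHOLDS):
    """Return every utilisation threshold (in per cent) that has been reached."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    used_pct = 100 * used_bytes / capacity_bytes
    return [t for t in thresholds if used_pct >= t]


alerts = crossed_thresholds(720, 1000)  # a volume that is 72 per cent full
print(alerts)  # [60, 70]
```

Each crossed level could then be tied to its own process or procedure, so intervention starts long before the last 5 per cent.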

Great idea. Preston de Guise is a clever guy. ®
