Feeds

Delete all you like, but it won't free up space

You've been (de)duped ...

Next gen security for virtualised datacentres

Comment: Networker blog author Preston de Guise has pointed out a simple and inescapable fact: deleting files on a deduplicated storage volume may not free up any space.

De Guise points out that, in un-deduplicated storage: "There is a 1:1 mapping between amount of data deleted and amount of space reclaimed." Also, space reclamation is near-instantaneous. With deduplication neither need be true.

Huh? Think about it. You add files to a deduplicated volume and any blocks of data in them that are identical to existing stored block groups get deduplicated out of existence and replaced by pointers. The file shrinks. This carries on as more files are added. The drive's capacity gets used up. You become aware of this. You start deleting files to reclaim space. You may find that much of the deleted files' originally fat content is actually skinny pointers and you just reclaim a few bytes of space instead of megabytes or terabytes. Oops; you just got stuffed by deduplication.

Space reclamation with dedupe also requires the dedupe function to do some scanning once a file is deleted:

Whenever data is deleted from a deduplication system, the system must scan remaining data to see if there are any dependencies. Only if the data deleted was completely unique will it actually be reclaimed in earnest; otherwise all that happens is that pointers to unique data are cleared. (It may be that the only space you get back is the equivalent of what you’d pull back from a Unix filesystem when you delete a symbolic link.)

Not only that, reclamation is rarely run on a continuous basis on deduplication systems – instead, you either have to wait for the next scheduled process, or manually force it to start.

His conclusion is this:

The net lesson? Eternal vigilance! It’s not enough to monitor and start to intervene when there’s say, 5 per cent of capacity remaining. Depending on the deduplication system, you may find that 5 per cent remaining space is so critically low that space reclamation becomes a complete nightmare.

He recommends the use of "alerts, processes and procedures targeting" a set of capacity utilisation levels such as 60 per cent, 70 per cent, 75 per cent and so on.

Great idea. Preston de Guise is a clever guy. ®

5 things you didn’t know about cloud backup

More from The Register

next story
The Return of BSOD: Does ANYONE trust Microsoft patches?
Sysadmins, you're either fighting fires or seen as incompetents now
Oracle reveals 32-core, 10 BEEELLION-transistor SPARC M7
New chip scales to 1024 cores, 8192 threads 64 TB RAM, at speeds over 3.6GHz
Docker kicks KVM's butt in IBM tests
Big Blue finds containers are speedy, but may not have much room to improve
US regulators OK sale of IBM's x86 server biz to Lenovo
Now all that remains is for gov't offices to ban the boxes
Gartner's Special Report: Should you believe the hype?
Enough hot air to carry a balloon to the Moon
Flash could be CHEAPER than SAS DISK? Come off it, NetApp
Stats analysis reckons we'll hit that point in just three years
Dell The Man shrieks: 'We've got a Bitcoin order, we've got a Bitcoin order'
$50k of PowerEdge servers? That'll be 85 coins in digi-dosh
prev story

Whitepapers

Endpoint data privacy in the cloud is easier than you think
Innovations in encryption and storage resolve issues of data privacy and key requirements for companies to look for in a solution.
Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Top 8 considerations to enable and simplify mobility
In this whitepaper learn how to successfully add mobile capabilities simply and cost effectively.
Solving today's distributed Big Data backup challenges
Enable IT efficiency and allow a firm to access and reuse corporate information for competitive advantage, ultimately changing business outcomes.
Reg Reader Research: SaaS based Email and Office Productivity Tools
Read this Reg reader report which provides advice and guidance for SMBs towards the use of SaaS based email and Office productivity tools.