This move by Dropbox will reduce users' files to tiers: Rarely, regularly accessed data now kept separate

Not all documents are created, er, stored equal

Tue 7 May 2019 // 18:17 UTC

Cloudy storage provider Dropbox has enhanced its bit barns with a tiered storage architecture that divides the contents of the platform into frequently accessed "warm" data and "cold" data, with the latter less likely to be disturbed.

The storage shop has changed the way it does replication for older data, cutting the amount of disk space needed by 25 per cent while providing the same level of reliability and only slightly longer access times. One would hope the savings will be passed on to customers.

"The end experience for users is almost indistinguishable between the two tiers," the company said this week.

The cold data repository is based on the same dedicated Magic Pocket infrastructure that Dropbox announced in 2016. It features servers operated by the company, rather than standard infrastructure from AWS, which Dropbox used to store files in the early days of the platform.

AWS offers its own cold storage service called Glacier – designed for the enterprise and billed separately from the company's Simple Storage Service (S3). Dropbox has created something very different and built cold storage into the back-end so data tiering is achieved automatically – the customer can't decide how their files will be classified.

In a technical blog post, staff engineer Preslav Le explained that over 40 per cent of all requests on the company's platform are for data uploaded in the last 24 hours, and over 90 per cent of requests are for data uploaded in the past year.

One of the simplest ways to cut the costs of storing older data was to tweak replication. Previously, full copies of all files were replicated across multiple data centres, located thousands of miles apart. "We needed to somehow remove the full cross-region replication, but still be able to tolerate geographic outages," Le said.

After several experiments, Dropbox came up with a model that split storage blocks into fragments and striped those fragments across multiple regions. To get a block, the system issues a get request to all three regions, waits for the fastest two responses, and cancels the remaining request.
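
For the curious, the read path described above can be sketched in a few lines of Python. This is a toy illustration, not Dropbox's code: it assumes a block is cut into two halves plus one XOR parity fragment (the article does not spell out the encoding), and the region names, simulated latency and every function name here are made up for the example.

```python
import asyncio
import random

REGIONS = ("region_a", "region_b", "region_c")  # hypothetical region names


def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(a ^ b for a, b in zip(x, y))


def encode(block: bytes) -> dict[str, bytes]:
    """Assumed layout: two halves of the block plus an XOR parity fragment."""
    half = (len(block) + 1) // 2
    first = block[:half]
    second = block[half:].ljust(half, b"\0")  # pad so both halves match in length
    return dict(zip(REGIONS, (first, second, xor_bytes(first, second))))


def decode(fragments: dict[str, bytes], length: int) -> bytes:
    """Rebuild the block from any two of the three fragments."""
    first, second, parity = (fragments.get(region) for region in REGIONS)
    if first is None:
        first = xor_bytes(second, parity)
    elif second is None:
        second = xor_bytes(first, parity)
    return (first + second)[:length]  # drop any padding added by encode()


async def fetch(region: str, store: dict[str, bytes]) -> tuple[str, bytes]:
    """Stand-in for a cross-region get request with variable latency."""
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return region, store[region]


async def read_block(store: dict[str, bytes], length: int) -> bytes:
    """Ask all three regions, keep the fastest two replies, cancel the rest."""
    tasks = [asyncio.create_task(fetch(region, store)) for region in REGIONS]
    received: dict[str, bytes] = {}
    for finished in asyncio.as_completed(tasks):
        region, fragment = await finished
        received[region] = fragment
        if len(received) == 2:
            break
    for task in tasks:
        task.cancel()  # has no effect on tasks that already completed
    return decode(received, length)
```

Calling `asyncio.run(read_block(encode(block), len(block)))` returns the original block whichever two regions answer first. Under this assumed layout, three half-size fragments take up 1.5 times the block size, against 2 times for two full copies, which squares with the 25 per cent saving quoted above (ignoring whatever redundancy exists inside each region).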

"The most obvious downside of this model is that, even in the best-case, we can't satisfy a read without fetching fragments from multiple regions," Le said.

"Overall, the results we got significantly beat our expectations. Such a small difference would not affect the end user experience, which is dominated by transferring data over the internet. That allowed us to be more aggressive in what data we consider 'cold' and eligible for migration." ®
