Facebook adds Flash to up the tempo of its enormous disk-o-tech

'Anyone wanting to deliver Terabytes to the web might be interested'

Facebook has updated an open source tool that lets admins cheaply wring fast performance from disk-based arrays fronted by PCI-e flash cards.

The "Flashcache" tool was updated to version 3.0 by the company on Wednesday. The tool lets the company sit a high-performance cache on PCI-e flash cards to speed access to important data for applications, without having to break the bank and start using all-SSD arrays.

Flashcache is a writeback block caching technology and is implemented as a Linux kernel device mapper target, which makes it easy to use as a general purpose system for highly trafficked applications, Facebook said.
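For readers unfamiliar with the term, the idea behind writeback caching can be shown with a toy model. This is only a sketch of the general technique, not Flashcache's kernel code: the class name and the dictionaries standing in for flash and disk are hypothetical.

```python
class WritebackCache:
    """Toy write-back block cache (illustrative only, not Flashcache code)."""

    def __init__(self, backing):
        self.backing = backing  # dict standing in for the slow disk array
        self.cache = {}         # block number -> data, standing in for flash
        self.dirty = set()      # blocks written to cache but not yet to disk

    def write(self, block, data):
        # A write completes as soon as it lands in the fast tier.
        self.cache[block] = data
        self.dirty.add(block)

    def read(self, block):
        # On a miss, fill the cache from the slow tier first.
        if block not in self.cache:
            self.cache[block] = self.backing[block]
        return self.cache[block]

    def flush(self):
        # Dirty blocks reach the disk later, in a batch.
        for block in sorted(self.dirty):
            self.backing[block] = self.cache[block]
        self.dirty.clear()


disk = {0: b"old"}
cache = WritebackCache(disk)
cache.write(0, b"new")
before_flush = disk[0]   # the disk still holds the old data
cache.flush()
after_flush = disk[0]    # the disk now holds the new data
```

The application sees the new data immediately through the cache, while the disk is only updated when dirty blocks are flushed.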

"Our setup of enterprise flash plus massive arrays may be interesting to anyone who wants to build a multiple-terabyte system that needs web access latencies - it does not need rewrite of software to get benefits, so investment even at few machine scale is smaller than putting everything on all-flash," Domas Mituzas, a Facebook data engineer, told The Register via email.

Version 3.0 of the technology has been given better read-write distribution by tuning the disk-side and flash-side sizes of sets to disperse hot data over more of the cache and avoid bottlenecks. Facebook also modified its cache eviction and write efficiency techniques to provide more predictable performance.
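To see why set sizing affects how hot data spreads, consider a simplified set-associative mapping. The mapping function, the parameter names and the numbers below are assumptions chosen for illustration; they are not taken from Flashcache's source.

```python
def set_index(block, blocks_per_run, num_sets):
    """Map a disk block to a cache set.

    Hypothetical mapping: runs of `blocks_per_run` contiguous disk blocks
    share one cache set.
    """
    return (block // blocks_per_run) % num_sets


def sets_touched(hot_blocks, blocks_per_run, num_sets):
    """Count how many distinct cache sets a group of hot blocks lands in."""
    return len({set_index(b, blocks_per_run, num_sets) for b in hot_blocks})


NUM_SETS = 1024        # assumed number of sets in the cache
hot = range(0, 512)    # a hot region of 512 contiguous disk blocks

coarse = sets_touched(hot, 512, NUM_SETS)  # whole region shares one set
fine = sets_touched(hot, 16, NUM_SETS)     # region spreads across 32 sets
```

With the coarse mapping every hot block competes for the same set; with the finer one the same region is spread across 32 sets, so no single set becomes a bottleneck.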

Though originally designed at Facebook, the open source technology has received some interest from the wider community. "We see community efforts around it – there is activity on mailing lists, open source code submissions and consulting companies in the database space are providing support for it," Mituzas said.

The next areas of development for Flashcache include restructuring its metadata to make accessing data more efficient, and ensuring it does not write too much into the cache, which would flood the underlying disk infrastructure with queued writes.

"As we end up having multiple terabytes of cache and tens of terabyte of data per machine, we need to cautiously balance usage of memory and CPU," Mituzas explains. "More CPU-efficient algorithms tend to consume more memory. For example, adding additional pointer or timestamp to metadata entry for a system page requires 4GB of RAM if 2TB of cache is being used ... as applications can have great uses for it as well."
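Mituzas's 4GB figure can be checked with back-of-the-envelope arithmetic. The quote does not state a page size or field size; the sketch below assumes 4KB pages, an 8-byte pointer or timestamp, and binary units.

```python
# Assumptions (not stated in the article): 4 KiB pages, 8-byte field.
PAGE_SIZE = 4 * 1024
FIELD_SIZE = 8
GIB = 1024 ** 3
TIB = 1024 ** 4


def extra_metadata_bytes(cache_bytes, page_size=PAGE_SIZE, field_size=FIELD_SIZE):
    """RAM needed to add one fixed-size field to every per-page metadata entry."""
    entries = cache_bytes // page_size
    return entries * field_size


extra = extra_metadata_bytes(2 * TIB)
print(extra / GIB)  # 4.0
```

A 2TB cache holds roughly 537 million 4KB pages, so each extra 8 bytes per entry costs 4GB of RAM across the machine.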

But it's worth noting that Facebook's tools are not for everyone, as you need a certain amount of expertise and scale in-house before a fully integrated self-built stack becomes possible.

"There is significant software work required to shift from more expensive to cheaper technology - which saves lots of money at large scale, and on the other hand, going to more capable storage devices allows to move faster in engineering storage-centric systems," Mituzas said. ®
