Facebook adds Flash to up the tempo of its enormous disk-o-tech

'Anyone wanting to deliver Terabytes to the web might be interested'

Facebook has updated an open source tool that lets admins wring fast performance cheaply out of disk-based arrays by fronting them with PCI-e flash cards.

The "Flashcache" tool was updated to version 3.0 by the company on Wednesday. The tool lets the company sit a high-performance cache on PCI-e flash cards to speed access to important data for applications, without having to break the bank and start using all-SSD arrays.

Flashcache is a writeback block caching technology and is implemented as a Linux kernel device mapper target, which makes it easy to use as a general purpose system for highly trafficked applications, Facebook said.
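For readers unfamiliar with the term, the idea behind a writeback cache can be sketched in a few lines of Python. This is a toy illustration only, not Flashcache's kernel code: the `WriteBackCache` class, the dict standing in for the disk, and the LRU eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy write-back block cache with LRU eviction (illustrative only)."""

    def __init__(self, backing, capacity):
        self.backing = backing        # dict standing in for the slow disk
        self.capacity = capacity      # maximum number of cached blocks
        self.blocks = OrderedDict()   # block number -> (data, dirty flag)

    def _evict_if_full(self):
        # Drop least recently used blocks; dirty ones are flushed first.
        while len(self.blocks) > self.capacity:
            block, (data, dirty) = self.blocks.popitem(last=False)
            if dirty:
                self.backing[block] = data  # data reaches the disk only now

    def write(self, block, data):
        # A write lands in the cache and is marked dirty; the disk is untouched.
        self.blocks[block] = (data, True)
        self.blocks.move_to_end(block)
        self._evict_if_full()

    def read(self, block):
        if block in self.blocks:            # cache hit
            self.blocks.move_to_end(block)
            return self.blocks[block][0]
        data = self.backing[block]          # cache miss: fetch from disk
        self.blocks[block] = (data, False)
        self._evict_if_full()
        return data


disk = {0: b"a", 1: b"b", 2: b"c"}
cache = WriteBackCache(disk, capacity=2)
cache.write(0, b"A")
on_disk_before = disk[0]   # still the old contents
cache.read(1)
cache.read(2)              # forces block 0 out of the cache
on_disk_after = disk[0]    # the deferred write has now been flushed
```

The point of the design is that the application sees the write complete at cache speed, while the slower disk write is deferred until the block is evicted.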

"Our setup of enterprise flash plus massive arrays may be interesting to anyone who wants to build a multiple-terabyte system that needs web access latencies - it does not need rewrite of software to get benefits, so investment even at few machine scale is smaller than putting everything on all-flash," Domas Mituzas, a Facebook data engineer, told The Register via email.

Version 3.0 of the technology has been given better read-write distribution by tuning the disk-side and flash-side sizes of sets to disperse hot data over more of the cache and avoid bottlenecks. Facebook also modified its cache eviction and write efficiency techniques to provide more predictable performance.
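Why set sizes affect how hot data spreads can be shown with a small Python sketch. The mapping below (disk segment number modulo the number of cache sets) and every number in it are hypothetical, chosen for illustration, and are not taken from Flashcache's source.

```python
def sets_touched(hot_blocks, disk_segment_blocks, num_sets):
    """Count the distinct cache sets that a range of hot disk blocks maps to.

    Assumed mapping: each run of `disk_segment_blocks` contiguous disk blocks
    is assigned to one cache set, chosen by segment number modulo `num_sets`.
    """
    return len({(b // disk_segment_blocks) % num_sets for b in hot_blocks})


hot_region = range(0, 4096)   # 4,096 contiguous hot blocks on disk
NUM_SETS = 64                 # hypothetical number of sets in the cache

# Large disk-side segments: the hot region crowds into a handful of sets.
coarse = sets_touched(hot_region, disk_segment_blocks=512, num_sets=NUM_SETS)

# Smaller disk-side segments: the same region spreads across many more sets.
fine = sets_touched(hot_region, disk_segment_blocks=64, num_sets=NUM_SETS)

print(coarse, fine)
```

Under these assumptions the coarse mapping confines the hot region to 8 sets, while the finer one spreads it over all 64, which is the kind of dispersal the update is described as aiming for.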

Though originally designed at Facebook, the open source technology has received some interest from the wider community. "We see community efforts around it – there is activity on mailing lists, open source code submissions and consulting companies in the database space are providing support for it," Mituzas said.

The next areas of development for Flashcache include restructuring metadata to make data access more efficient, and limiting how much is written into the cache so that queued writes do not flood the underlying disk infrastructure.

"As we end up having multiple terabytes of cache and tens of terabyte of data per machine, we need to cautiously balance usage of memory and CPU," Mituzas explains. "More CPU-efficient algorithms tend to consume more memory. For example, adding additional pointer or timestamp to metadata entry for a system page requires 4GB of RAM if 2TB of cache is being used ... as applications can have great uses for it as well."
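The 4GB figure can be reproduced with quick arithmetic, sketched here in Python. Two assumptions not stated in the quote are needed: a 4KB system page and an 8-byte pointer or timestamp, with both sizes taken in binary units.

```python
CACHE_BYTES = 2 * 2**40   # 2TB of flash cache
PAGE_BYTES = 4 * 2**10    # assumed 4KB system page
FIELD_BYTES = 8           # assumed 8-byte pointer or timestamp per entry

# One metadata entry per cached page.
entries = CACHE_BYTES // PAGE_BYTES

# Extra RAM consumed by adding one such field to every entry, in GB.
extra_ram_gib = entries * FIELD_BYTES / 2**30

print(entries, extra_ram_gib)   # 536870912 4.0
```

That is roughly 537 million metadata entries, so every extra 8 bytes per entry costs another 4GB of RAM that applications could otherwise use.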

But it's worth noting that Facebook's tools are not for everyone, as you need a certain amount of expertise and scale in-house before a fully integrated self-built stack becomes possible.

"There is significant software work required to shift from more expensive to cheaper technology - which saves lots of money at large scale, and on the other hand, going to more capable storage devices allows to move faster in engineering storage-centric systems," Mituzas said. ®
