Travel IT giant Amadeus making eyes at Micron's SolidScale architecture

Global biz is flying to fast-access frames of flash

Analysis Amadeus, the global travel booking business, is testing Micron's SolidScale NVMe flash arrays, thinking they can provide vastly better realtime access to the terabytes of flight information it holds on behalf of airlines and travel operators.

Nearly 75,000 travel agencies and more than 11,000 airline sales offices use the Amadeus Computer Reservation System (CRS) to run their business. Amadeus started out processing on mainframes, and still operates a couple of them. It now runs a whole lot of Linux x86 servers.

Its global system is accessed from more than 190 countries, with main sites in Madrid (HQ and marketing), Nice in France (development) and the operations centre in Erding, Germany. Its data centre stores more than 49PB of data and currently handles more than 3.8 billion transactions per day at peak load, upwards of 55,000 transactions per second.

A boarding pass checked for Lufthansa at Sydney Airport, Australia, requires network access to the Erding centre, where the departure control system information is held. The round trip takes 800ms.

The TBs of CRS data cannot be held in memory, as the cost would be astronomical, but fast-access flash arrays could be affordable and help Amadeus cope with the formidable rise in travel bookings and flight information accesses it is facing. Fast access here means NVMe over Fabrics access to shelves packed with NVMe flash drives.

To give an impression of Amadeus's system needs, 647 million passengers boarded in 2015. Its CRS processed 595 million bookings in 2016, which works out at roughly 1,130 per minute, or about 19 each second. The travel item availability requests it processes are far greater in number.
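As a rough check on rates like these, an annual or daily total can be converted to per-minute and per-second averages. The sketch below is a back-of-the-envelope calculation, not anything Amadeus supplied: it assumes load is spread evenly over the period and that 2016 is counted as 366 days, and the function names are made up for illustration.

```python
# Back-of-the-envelope rate conversion.
# Assumption: load is spread evenly over the whole period.
MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = MINUTES_PER_DAY * 60


def per_minute(total: float, days: int) -> float:
    """Average events per minute for a total spread over `days` days."""
    return total / (days * MINUTES_PER_DAY)


def per_second(total: float, days: int) -> float:
    """Average events per second for a total spread over `days` days."""
    return total / (days * SECONDS_PER_DAY)


bookings_2016 = 595e6        # bookings processed in 2016 (figure from the article)
days_in_2016 = 366           # assumption: count the leap day
peak_day_transactions = 3.8e9  # transactions on a peak day (figure from the article)

print(f"Bookings per minute: {per_minute(bookings_2016, days_in_2016):,.0f}")
print(f"Bookings per second: {per_second(bookings_2016, days_in_2016):.1f}")
print(f"Average transactions per second on a peak day: "
      f"{per_second(peak_day_transactions, 1):,.0f}")
```

These are averages only; the peak-second figures quoted for the data centre will naturally sit above the daily average.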

In May 2011 these requests passed 50 billion. This month Amadeus expects them to pass 700 billion. Here's a chart it provided at a Micron briefing event in London earlier this month showing this data:

Amadeus_processing_explosion

It says the per-second number is 8 million availability requests.

Paul Hubert from Amadeus's CTO office said the company was an early adopter of Fusion-io's PCIe flash cards in 2010. It's now transitioning to NVMe flash drives.

It holds a huge amount of data on 3.5-inch disk drives and wants to move it to flash, partly to lower power consumption.

He said the database ecosystem evolution at Amadeus includes:

  • Moving from fundamentally highly consistent to especially available (CAP* theorem)
  • Transactional engines have scalability challenges; need several solutions tackling different problems (Key Value, Document, Visualisation, Graph, Full Text Search)
  • Availability mechanisms are moving off the infrastructure up to the application
  • Database availability principle evolving from shared storage to replications
  • Flash becoming the primary storage media

He identifies two infrastructure consequences:

  1. Ephemeral storage (i.e. local to the server) is becoming a significant deployment pattern
  2. Consolidation on highly standardised x86 servers (primarily 2-socket servers)

It means Amadeus is looking more and more at flash-based storage:

Amadeus_flash_journey

Hubert prefers external storage for the app-running servers but doesn't want to pay a network access latency penalty. There is already enough latency when an availability or booking request comes into the Erding data centre.

Enter NVMe over Fabrics, which adds, he suggests, only 1 per cent extra time to a networked storage access request compared to direct-attached NVMe flash drives. The slide above expresses this thought: we "need to go back to some form of 'SAN approach' but keep the low latency and high throughput of NVMe."

And here is the kicker for suppliers of NVMe flash arrays, such as Micron:

"We hope to transition ASAP to flash for high storage density and a lower power consumption."

But he is strict on the need for sustained consistent latency, with no so-called long or tail latency issues, where some accesses take vastly longer to complete than others:

Amadeus_Low_latency_Chart
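To make the tail latency idea concrete, here is a minimal sketch that compares the median access time with high percentiles. The latency samples are synthetic, invented purely for illustration, and are not Amadeus or Micron measurements; the nearest-rank percentile method and the helper names are likewise assumptions.

```python
import math
import random


def percentile(samples, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of n) in sorted order."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def tail_ratio(samples, pct=99.9):
    """How many times slower the pct-th percentile access is than the median."""
    return percentile(samples, pct) / percentile(samples, 50)


# Synthetic latencies in microseconds (made-up numbers, fixed seed for repeatability).
rng = random.Random(42)
steady = [rng.uniform(90, 110) for _ in range(10_000)]
# Same workload, except 1 access in 200 stalls, for example behind drive housekeeping.
stalled = [t * 50 if i % 200 == 0 else t for i, t in enumerate(steady)]

for name, data in (("steady", steady), ("stalled", stalled)):
    print(f"{name}: p50={percentile(data, 50):.0f}us "
          f"p99={percentile(data, 99):.0f}us "
          f"p99.9={percentile(data, 99.9):.0f}us "
          f"tail ratio={tail_ratio(data):.1f}x")
```

In this toy data the median and even the 99th percentile look identical for both workloads; only the 99.9th percentile exposes the stalls, which is why consistent latency has to be judged on the tail rather than the average.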

What Hubert would like is for the application to schedule and manage the garbage collection of deleted cells so that it does not interfere with the consistency of data access latency. For Micron to achieve that, its NVMe SSDs would have to relinquish garbage collection control to upper-stack software, such as Excelero's NVMesh, or possibly even higher, to Amadeus's own code.

Since Micron prides itself on customising its flash firmware and the like for customers, this could be music to its ears.

Hubert talked about a pod or rack containing a bunch of servers and a bunch of flash shelves which they access as shared external storage using NVMe over Fabrics. This is very close to Micron's SolidScale concept.

He places this online transaction processing (OLTP) flash storage in a continuum of other storage setups:

Amadeus_Storage_set_ups

For Amadeus, NVMe over Fabrics flash is just one aspect of its overall storage needs but it may, in this respect, be a harbinger of what is coming to more generalised enterprise data centres.

Summation

Amadeus is an outlier in terms of high-capacity flash array adoption. It is looking to store mission-critical data in its flash arrays and have OLTP applications access it, not realtime analytical applications in a big data scenario.

For suppliers to the Big Data realtime analytics market, Amadeus is a waypoint, a sign of things they hope will come. Key to that hope is QLC, quad-level cell flash storing 4 bits per cell. Amadeus is testing a couple of 100TB QLC SSDs. Join the dots formed by Amadeus's OLTP access to masses of data stored in flash arrays, Micron's SolidScale arrays, Excelero's NVMe over Fabrics storage software and 100TB SSDs, and we get a sketchy outline of the coming realtime Big Data analytics storage systems.

*CAP stands for Consistency, Availability, and Partition tolerance. The theorem holds that a distributed data store cannot guarantee more than two of these at the same time; specifically, you must choose between consistency and availability when a network partition or failure takes place.




Biting the hand that feeds IT © 1998–2018