Huawei hugs open-sourcey Alluxio: Thanks for the memories
China giant + OS software in big data analytics acceleration scheme
Huawei has announced a Big Data analytics acceleration scheme using its FusionStorage product and Alluxio open source software, which seems to be the canine genitalia du jour for speeding up lethargic analytics queries.
Alluxio is the renamed Tachyon Nexus, an Andreessen Horowitz-backed startup.
FusionStorage is Huawei's distributed software-defined storage system. The latest FusionStorage 6.0 supports distributed block, file, and object storage, with classification, encryption and deduplication.
Alluxio's software is a memory-centric, virtual distributed storage system. It functions as a ginormous local cache for a remote storage system, such as a set of HDFS nodes. It is based on a cluster of local nodes which are accessed by compute nodes running Big Data analytics jobs and queries.
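To make the caching idea concrete, here is a minimal sketch of a read-through cache in Python. It is not Alluxio's code or API; the class `ReadThroughCache`, the function `slow_remote_read` and the file path are hypothetical names invented for illustration. The first read of a path goes to the remote store, and later reads are served from local memory.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Toy in-memory cache sitting in front of a slower remote store."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._fetch_remote = fetch_remote      # how to reach the remote system
        self._blocks: Dict[str, bytes] = {}    # locally cached data, keyed by path
        self.hits = 0
        self.misses = 0

    def read(self, path: str) -> bytes:
        if path in self._blocks:
            # Served from local memory: no trip to the remote store.
            self.hits += 1
            return self._blocks[path]
        # Cache miss: fetch once from the remote store, then keep a local copy.
        self.misses += 1
        data = self._fetch_remote(path)
        self._blocks[path] = data
        return data


def slow_remote_read(path: str) -> bytes:
    # Hypothetical stand-in for a remote store such as a set of HDFS nodes.
    return f"contents of {path}".encode()


cache = ReadThroughCache(slow_remote_read)
first = cache.read("/warehouse/events.parquet")   # miss: goes to the remote store
second = cache.read("/warehouse/events.parquet")  # hit: served from memory
print(cache.hits, cache.misses)
```

Repeated analytics queries over the same data benefit most from this pattern, since only the first pass pays the remote access cost.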
The software provides:
- Tiered storage (memory, flash, disk) with automatic data placement
- Single namespace with transparent naming
- Native S3, Google Cloud Storage, OpenStack Swift, Alibaba OSS and Microsoft Azure Blob store integrations
- FUSE connector, key-value interface
- One-command cluster deployment
- Metrics reporting
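The tiered storage item in the list above can be sketched as follows. This is an assumption-laden toy, not Alluxio's placement policy: the class `TieredStore`, its least-recently-used demotion rule and the one-block capacities are all hypothetical choices made for illustration. New and recently read blocks live in the fastest tier, and the coldest block in a full tier is pushed down to the next one.

```python
from collections import OrderedDict
from typing import List, Optional, Tuple


class TieredStore:
    """Toy tiered store: fastest tier first, LRU demotion to slower tiers."""

    def __init__(self, tiers: List[Tuple[str, int]]) -> None:
        # Each tier is (name, capacity in blocks), ordered fastest to slowest.
        self._names = [name for name, _ in tiers]
        self._caps = [cap for _, cap in tiers]
        self._data = [OrderedDict() for _ in tiers]

    def _insert(self, level: int, key: str, value: bytes) -> None:
        if level >= len(self._data):
            return  # fell off the slowest tier: the block is evicted
        tier = self._data[level]
        tier[key] = value
        if len(tier) > self._caps[level]:
            # Demote the least recently used block to the next tier down.
            old_key, old_value = tier.popitem(last=False)
            self._insert(level + 1, old_key, old_value)

    def _remove(self, key: str) -> Optional[bytes]:
        for tier in self._data:
            if key in tier:
                return tier.pop(key)
        return None

    def put(self, key: str, value: bytes) -> None:
        self._remove(key)
        self._insert(0, key, value)  # new data always lands in the fastest tier

    def get(self, key: str) -> Optional[bytes]:
        value = self._remove(key)
        if value is not None:
            self._insert(0, key, value)  # promote on access
        return value

    def tier_of(self, key: str) -> Optional[str]:
        for name, tier in zip(self._names, self._data):
            if key in tier:
                return name
        return None


store = TieredStore([("memory", 1), ("flash", 1), ("disk", 1)])
for block in ("a", "b", "c"):
    store.put(block, block.encode())
print(store.tier_of("a"), store.tier_of("b"), store.tier_of("c"))
```

With one block per tier, writing a, b and c in turn leaves c in memory, b on flash and a on disk; reading a again promotes it back to memory and pushes the others down.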
Alluxio supports many underlying (remote) storage systems, including HDFS, Gluster, S3, OpenStack, GCS, NFS, OrangeFS, IBM Spectrum Scale, Ceph, Isilon and others.
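One way to picture a single namespace over several underlying stores is a mount table that maps a path prefix to a backing location. The sketch below is illustrative only and is not Alluxio's interface: the class `MountTable`, the host name and the bucket name are hypothetical.

```python
from typing import Dict


class MountTable:
    """Toy unified namespace: longest matching prefix picks the backing store."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, namespace_path: str, under_storage_uri: str) -> None:
        # Strip trailing slashes so prefixes and URIs join cleanly later.
        self._mounts[namespace_path.rstrip("/")] = under_storage_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        best = None
        for prefix in self._mounts:
            # Match whole path components only, so /logs does not match /logs2.
            if path == prefix or path.startswith(prefix + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            raise KeyError(f"no mount covers {path}")
        return self._mounts[best] + path[len(best):]


table = MountTable()
table.mount("/logs", "hdfs://namenode.example:8020/raw/logs")  # hypothetical host
table.mount("/archive", "s3://example-bucket/archive")         # hypothetical bucket
resolved_logs = table.resolve("/logs/2017/01/app.log")
resolved_archive = table.resolve("/archive/2016.tar")
print(resolved_logs)
print(resolved_archive)
```

An analytics job would then address every file through one path scheme, regardless of which remote system actually holds the data.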
Baidu ran Spark queries 30 times faster with Alluxio. Batch queries that took 15 minutes completed in less than 30 seconds, and a 1000-user Alluxio cluster provided more than 50TB of RAM space.
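The two Baidu figures line up, as a quick check of the arithmetic shows. The variable names are purely illustrative.

```python
before_seconds = 15 * 60   # batch query time before: 15 minutes
after_seconds = 30         # upper bound on query time with Alluxio
speedup = before_seconds / after_seconds
print(f"speedup: at least {speedup:.0f}x")
```

Since the queries finished in less than 30 seconds, 30x is the floor rather than the exact figure.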
Barclays accelerated Spark jobs from hours to seconds by interposing an Alluxio set-up between the query-running compute nodes and a Teradata data repository.