ScaleBase shatters MySQL for scalability

And glues it back together for speed

For both performance and capacity reasons, companies running large transaction processing systems, whether they are tickled directly by Web users or just end users working behind the company firewall, sometimes have to partition their production relational databases. This practice, called sharding, is a pain in the neck. Actually, it is several pains in the neck. And ScaleBase has some software unguents to cope with it.

ScaleBase is a startup founded in Israel and now located in Boston that is rolling out its first product, called the Database Load Balancer, today. As the name suggests, it is a proxy server that sits in front of the actual database; in this case, the tool breaks a monolithic relational database into chunks and spreads them out across multiple physical servers. (That's the sharding part.)

Companies have been doing sharding for years, and it is a particularly popular technique for goosing the performance of databases working on very large servers or spread across many clustered nodes. However, if you shard your database, you basically have to rewrite the entire data access layer of a database management system, which is more or less what Oracle Real Application Clusters, or RAC, does. Homegrown sharding algorithms often spread data over a fixed number of nodes, and reports and applications based on the database have to be tweaked to be aware of the shards. Backing up and tuning each database node also has to be done more or less by hand.
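
To see why a fixed number of nodes is such a nuisance, here is a minimal Python sketch of the kind of homegrown scheme described above. It is an illustration only: the hash-modulo placement, the function names, and the `customer-N` keys are all assumptions, not anything ScaleBase or any particular shop has published.

```python
import zlib


def shard_for(key: str, node_count: int) -> int:
    """Pick a shard by taking a stable hash of the key modulo the node count."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys that land on a different shard after resizing."""
    moved = sum(
        1 for k in keys if shard_for(k, old_nodes) != shard_for(k, new_nodes)
    )
    return moved / len(keys)


# Hypothetical keys; any stable identifier would do.
keys = [f"customer-{i}" for i in range(10_000)]

# Growing from four nodes to five relocates most of the rows.
print(f"{moved_fraction(keys, 4, 5):.0%} of keys change shard")
```

With a well-mixed hash, a key keeps its place only when its hash gives the same remainder under both node counts, so most of the data has to be moved when a node is added. Every application that computes `shard_for` itself has to be updated at the same moment.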

The ScaleBase Database Load Balancer wants to trick those applications, backup programs, and report writers into thinking they are talking to one database even though they are legion. The automated sharding software is the brainchild of Doron Levari, currently the CEO at ScaleBase, and Liran Zelkha, vice president of business development. Levari has been a database administrator for 15 years, and ran Aluna, a database consulting firm that was eventually sold to Matrix, the largest system integrator in Israel. Zelkha has worked for a number of large-scale database and cloud projects and kept running into the same issues of performance and scalability.

"The third time we wrote a sharding layer for a customer, we knew we were on to something," Zelkha tells El Reg with a laugh.

The Database Load Balancer looks exactly like a MySQL or Oracle database would at the network level to any application. But it shards the database across multiple nodes, and does so automatically. The database proxy then accepts SQL commands and depending on what those commands are, it either runs the query against the appropriate subset of the database or across all the shards at once. You don't have to change one line of your application code, but you may have to work out a different license with your database vendor.
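
The routing decision described here, one shard or all of them, can be sketched in a few lines of Python. This is a toy under loud assumptions: the `customer_id` shard key, the four-node cluster, the modulo placement, and the regular-expression matching are all hypothetical, and ScaleBase has not said how its proxy actually parses or routes SQL.

```python
import re
from typing import List

SHARD_COUNT = 4            # assumed number of back-end database nodes
SHARD_KEY = "customer_id"  # hypothetical column the data is split on

# Matches an equality test on the shard key, e.g. "customer_id = 42".
_KEY_EQUALS = re.compile(rf"\b{SHARD_KEY}\s*=\s*(\d+)", re.IGNORECASE)


def route(sql: str) -> List[int]:
    """Return the shard indexes a statement should be sent to."""
    match = _KEY_EQUALS.search(sql)
    if match:
        # The statement pins the shard key, so one shard holds the rows.
        return [int(match.group(1)) % SHARD_COUNT]
    # No usable shard key: run on every shard and merge the results.
    return list(range(SHARD_COUNT))


print(route("SELECT * FROM orders WHERE customer_id = 42"))
print(route("SELECT COUNT(*) FROM orders"))
```

The first statement goes to a single node; the second fans out to all four. A real proxy would also have to merge the partial results of the fan-out case (summing the counts, re-sorting ordered rows, and so on), which this sketch leaves out.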

The Database Load Balancer is packaged up in a virtual machine that is compatible with Amazon's EC2 compute cloud (that's where the 500 beta testers have been playing around with it since it quietly went into beta in January) as well as VMware's ESXi hypervisor. The database shards themselves can be run inside virtual machines or on bare metal; the Database Load Balancer doesn't care. Each node of the ScaleBase tool can manage from 8 to 12 database nodes, according to Zelkha, and the server running the Database Load Balancer is "nothing too fancy": a two-socket machine with four-core x64 processors and 16GB of memory does the trick.

MySQL is just a start

At the moment, the Database Load Balancer supports the open source MySQL database, now controlled by Oracle, and Zelkha says that the next database to get front-ended and sharded will likely be Oracle's eponymous database. Depending on customer demand, ScaleBase will add support for IBM's DB2, Microsoft's SQL Server, and other open source databases such as PostgreSQL. The sharding program was written in Java and requires a Java SE6-compliant runtime to operate. While all of the beta testers have deployed the tool on top of Linux, the program will run atop AIX, Solaris, HP-UX, and any other box that has the right Java support. Customers should cluster their ScaleBase sharding nodes for high availability, of course, and the architecture recommends having standby servers for each shard as well.

The Database Load Balancer is not for everyone, says Zelkha. It is aimed at databases that are 50GB or larger and that have to field tens to hundreds of requests per second. Depending on the configuration that customers use to shard the database, they can cut response time to between one-quarter and one-half of whatever it was on a monolithic database setup and get in position for linear scaling as they need to add data and therefore nodes to their sharded database clusters. (ScaleBase has posted some initial performance metrics to give you a feel for it.) Zelkha warns that on databases that are under 10GB in size, using the tool will probably hurt performance.

At the moment, ScaleBase is targeting transaction processing and hyperscale Web applications with the tool, but over time it will make tweaks that help companies build clustered data warehouses. Within the next twelve months, ScaleBase will add support for at least one more database – either Oracle 11g or SQL Server. It could do better than this, depending on how sales take off and on the technical issues with front-ending these databases.

Unlike a lot of software vendors, ScaleBase has also published its price list. A development version of the Database Load Balancer costs $1,500 per year per back-end database node and comes with 9x5 business hour support. The Enterprise Edition allows for the ScaleBase tool to be installed on multiple servers and has 24x7 support; it costs $5,000 per back-end database node, or $4,500 per node if you order ten or more. The Premium Edition boosts support response to one-hour turnaround (from four hours in the Enterprise Edition) and adds phone support as well (instead of just web and email); it costs $6,000 per database node, or $5,400 if you order ten or more. ®
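
For anyone wanting to run the numbers, the price list can be turned into a small Python calculator. This is a back-of-the-envelope sketch: the function and table names are made up, the per-node figures are the ones quoted above, and it assumes the volume price applies to every node once an order reaches ten, which the price list as quoted does not spell out.

```python
# Per-node list prices in US dollars: (standard, ten-or-more).
# The development edition has no volume price quoted, so both entries match.
PRICES = {
    "development": (1500, 1500),
    "enterprise": (5000, 4500),
    "premium": (6000, 5400),
}
VOLUME_THRESHOLD = 10  # nodes needed to qualify for the lower price


def list_price(edition: str, nodes: int) -> int:
    """Total list price for an order of `nodes` back-end database nodes."""
    standard, volume = PRICES[edition]
    per_node = volume if nodes >= VOLUME_THRESHOLD else standard
    return per_node * nodes


print(list_price("enterprise", 9))   # nine nodes at the standard price
print(list_price("enterprise", 10))  # ten nodes at the volume price
```

Under that assumption, nine Enterprise nodes at $5,000 and ten at $4,500 both come to $45,000, so the tenth node is effectively free.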
