Inside the Hekaton: SQL Server 2014's database engine deconstructed

Nadella's database squares the circle of cheap memory vs speed


It's 1996 and Mission: Impossible has just arrived on cinema screens. RAM is $10 per megabyte and falling. Against this backdrop, Microsoft has quietly commenced its own seemingly impossible mission: a complete rewrite of a little-known database that will be released two years into the future as SQL Server 7.

Redmond's new database will compete with the entrenched kings of the database world, Oracle and DB2.

Fast-forward to 2014.

MI:4 has come and gone. RAM costs $10 a gigabyte. And now we have the latest incarnation of one of the industry's best-selling databases: SQL Server 2014. Microsoft's freshly minted chief executive Satya Nadella presides over the launch at an overly orchestrated corporate event in San Francisco, California, intended to demonstrate his company is all over "big data."

SQL Server is consistently one of the strongest growers in Microsoft's product portfolio, and its share of RDBMS market revenue has risen steadily over nearly two decades. Exact numbers vary according to the analyst you buy into, but what's not in dispute is that Microsoft has clawed out a third-place position behind IBM and Oracle (number one), both of which have been at this for longer.

Arguably, the only thing that's prevented SQL Server from doing better is the fact it's only available for Windows.

Market share aside, perhaps the two most significant changes in those 18 years have been technological: the increase in server processing power and the price of RAM, the latter falling a thousandfold. This matters, because these two factors are the major drivers behind SQL Server 2014's key theme of performance.

Memory is the new disk

When designing this release, Microsoft re-evaluated the reasons data is stored on disk rather than in memory. Assumptions that were valid when SQL Server 7 was written no longer hold true – namely, that main memory was expensive and limited, and that CPUs and operating systems had only 32-bit address spaces. That rethink is reflected in the three key features of this release.

Main memory is now huge, dense, and most importantly, very cheap. Durable flash memory has followed, and will almost certainly continue to follow, the same curve. Whilst 32-bit Windows coped with at most 3GB of usable memory, CPUs are now 64-bit and Windows Server 2012 can address 4TB of RAM. Memory sizes are growing so rapidly that an estimated 95 per cent or more of disk-based online transaction processing (OLTP) databases are now small enough to be stored entirely in memory.

Conversely, there are problems with disk. Data retrieval latency is orders of magnitude slower than memory. We're talking milliseconds compared to nanoseconds, a million-fold difference.

Also, as processors have hit commercially viable clock-speed limits, CPU manufacturers have increased computational throughput by pipelining and scaling out their processors. In 1996, dual-processor systems were a rare and hugely expensive novelty. In 2014, encountering 16, 24 or 32 cores inside a database server is a given.

These parallel giants consume data at a staggering rate, spending much of their time waiting for the disk subsystem to supply it. Meanwhile, disk-oriented mechanisms have evolved within the database engine to ensure data updates behave in the consistent way we expect them to, in order to maintain ACID compliance.

It is time to rethink why data is stored on slow, unreliable disk for anything other than recovery and archive purposes.


In SQL Server 2014, Microsoft has started again. The Hekaton database engine – an addition to rather than a replacement for the core SQL Server database engine – now has the capability to store tables in memory. In contrast to Oracle's TimesTen and IBM's SolidDB, SQL Server's in-memory database engine is fully integrated into the core product, and at no additional cost. This makes adoption attractive both technically and financially.

Migration one table at a time is practical, reducing risk without requiring expensive consultancy or new skill sets – and in contrast to its competitors, either all tables or a single table at a time can be moved. All common data types are supported; only the less-often-used XML, CLR and (max) types are excluded. Either the table schema alone, or the schema and its data, can be persisted in memory – the former useful where temporary data does not have to survive a restart, such as staging tables.
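The durability choice is made when the table is created. A minimal sketch, assuming a hypothetical session-state table (in SQL Server 2014 a memory-optimized table must declare at least one index, here a hash index on the primary key):

```sql
-- Schema-only: the table structure survives a restart, the rows do not,
-- which suits staging or session-state data. Swap in SCHEMA_AND_DATA
-- for fully durable, logged storage.
CREATE TABLE dbo.SessionState (
    SessionId INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    Payload   VARBINARY(2000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);
```

The bucket count is fixed at creation time and is typically sized to the expected number of distinct key values.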

Full ACID compliance is ensured through optimistic multi-version concurrency control (MVCC). Row versioning and time-stamping are used rather than page locks, a technique now feasible because each row is stored in memory rather than on disk. This eliminates locking and blocking, and the waits and timeouts associated with them – a major bane of any production database administrator's life.
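In practice this surfaces as snapshot-based isolation: explicit transactions touching a memory-optimized table state it either as a table hint or via a database-level default. A sketch with a hypothetical account table:

```sql
-- The read sees a consistent row version as of the transaction's
-- start; writers never block readers and vice versa.
BEGIN TRANSACTION;

SELECT Balance
FROM   dbo.Account WITH (SNAPSHOT)
WHERE  AccountId = 42;

COMMIT TRANSACTION;
```

Setting the database option MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT avoids sprinkling the hint through existing code.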

As a result, disk read waits are eliminated entirely, as all reads are now performed from memory. Schema changes still have to be logged to disk – not a big deal, as they are comparatively rare. If the schema-plus-data table type is used, data changes are also logged to disk to provide the durability required for ACID compliance and recovery.

In this case, checkpointing and log writing still have to be performed to maintain durability, but they are now implemented much more efficiently, and in a well-configured system rarely present a bottleneck.
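Those checkpoint files need somewhere to live: before any in-memory table can be created, the database must have a filegroup declared as containing memory-optimized data (database name and path below are hypothetical):

```sql
-- One-off setup per database; the filegroup holds the checkpoint
-- file pairs used to recover durable in-memory tables at startup.
ALTER DATABASE SalesDb
    ADD FILEGROUP SalesDb_mod CONTAINS MEMORY_OPTIMIZED_DATA;

ALTER DATABASE SalesDb
    ADD FILE (NAME = 'SalesDb_mod_file',
              FILENAME = 'C:\Data\SalesDb_mod')
    TO FILEGROUP SalesDb_mod;
```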

Hekaton's in-memory features are supported by traditional backup and restore and by SQL Server's high-availability options, AlwaysOn and log shipping. Puzzlingly, replication is not supported. Neither is database mirroring, which is understandable as it has already been deprecated.

A new migration aid, the Analysis, Migration and Reporting tool, is included as part of SQL Server Management Studio to assist the DBA in converting a schema to in-memory tables. With a few clicks, this generates a report containing recommendations and any necessary changes for successful migration.

Although this is of great assistance, the DBA still needs to acquire a little new knowledge to move data from disk to memory-resident tables, particularly regarding the restrictions imposed upon table schemas. The management tooling is thin in the current release, but Microsoft's recent track record with SQL Server is to ship functionality and reliability first, then add GUI-based front-end management tools later – the way it should be.

In the final pre-manufacturing release, the most notable issues are the lack of support for TRUNCATE and MERGE statements, and a clutch of disallowed query hints that might cause existing code to break. There are other minor restrictions, but Microsoft needs to address these keyword constraints, as Hekaton will probably be adopted first by data warehouse customers, where use of both statements is likely to be more prevalent. The migration wizard also needs some work in terms of its recommendations, but this will no doubt improve in time.
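Both gaps have straightforward, if less elegant, workarounds – a sketch with hypothetical table names:

```sql
-- TRUNCATE TABLE is not supported on memory-optimized tables;
-- an unqualified DELETE achieves the same end, albeit row by row.
DELETE FROM dbo.Staging;

-- MERGE with a memory-optimized target is also disallowed; the usual
-- substitute is an UPDATE of matches followed by an INSERT of misses.
UPDATE t
SET    t.Qty = s.Qty
FROM   dbo.Stock AS t
JOIN   dbo.StockFeed AS s ON s.Sku = t.Sku;

INSERT INTO dbo.Stock (Sku, Qty)
SELECT s.Sku, s.Qty
FROM   dbo.StockFeed AS s
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Stock AS t WHERE t.Sku = s.Sku);
```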




