Microsoft hopes it has a sequel better than Godfather Part II: SQL Server 2019 previewed
Hadoop? Spark? Teradata? Oh my!
SQL Server 2019, the latest version of Microsoft’s venerable database, dropped into preview at the company’s Orlando shindig, Ignite, this week.
Head honcho of SQL Server at Redmond, Asad Khan, was keen to mention the 25 years of history behind the database software. Graybeards who remember the first 16-bit OS/2 release of the software in 1989 would be forgiven for raising an eyebrow at that figure, as would Unix fans recalling the original Sybase release back in 1987.
The 2019 version is quite different to the first version shipped on Windows NT, 4.21, in 1993 and it would be difficult to imagine Microsoft back then being as happy to cuddle up with all manner of alternative platforms as its modern day counterpart.
Additional support for Linux arrives in the preview. The port, announced back in 2016 and released in 2017 with some notable functionality missing, sees some of the holes closed as Replication (Transactional, Snapshot and Merge) makes an appearance, along with support for the Microsoft Distributed Transaction Coordinator (MSDTC). In-database Machine Learning is also present, as is OpenLDAP.
However, if you want to talk to the server from your Linux desktop, the command line or VS Code remains your friend, since Microsoft has yet to port its legacy SQL Server Management Studio. Alternatively, you could take a look at the freshly released Azure Data Studio, a renamed SQL Operations Studio that is available on Windows, macOS and Linux.
Big Data, SQL Server's big thing
Hidden away in the preview (and you’ll need to apply to get on the limited public preview) is support for deploying a Big Data cluster with SQL Server and Spark Linux containers on Kubernetes, accessible using the Hadoop Distributed File System (HDFS). As Khan excitedly observes, SQL Server 2019 can now provide a unified data platform. It's all quite some way from its relational database beginnings.
Going further along the “single source for reporting” path, Microsoft has also expanded its PolyBase product (which allows SQL Server to process T-SQL queries using data from external sources) beyond Hadoop and Azure Blob storage to include Oracle, Teradata and MongoDB.
Revving up the engine
Down among the oil and gears of the SQL Server 2019 database engine, users will be pleased to find UTF-8 characters finally supported, right down to column-level collation for text data. Microsoft reckons that by switching away from the likes of NVARCHAR, with its UTF-16 encoding, to a UTF-8 configured CHAR type, users could realize almost 50 per cent savings in storage.
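Where that "almost 50 per cent" comes from is easy to check with a little arithmetic. The following is a minimal Python sketch, not anything from Microsoft: the function names storage_bytes and utf8_saving are made up for illustration, and it only counts raw encoded bytes, ignoring whatever per-row or per-column overhead SQL Server adds on top.

```python
def storage_bytes(text: str) -> dict[str, int]:
    """Return the raw encoded size of `text` in UTF-8 and UTF-16 (no BOM)."""
    return {
        "utf8": len(text.encode("utf-8")),
        "utf16": len(text.encode("utf-16-le")),
    }


def utf8_saving(text: str) -> float:
    """Fraction of bytes saved by UTF-8 relative to UTF-16 (negative = larger)."""
    sizes = storage_bytes(text)
    return 1 - sizes["utf8"] / sizes["utf16"]


if __name__ == "__main__":
    # Illustrative sample strings, chosen to show both ends of the range.
    for sample in ["SQL Server 2019", "naïve café", "日本語"]:
        sizes = storage_bytes(sample)
        print(f"{sample!r}: utf8={sizes['utf8']} utf16={sizes['utf16']} "
              f"saving={utf8_saving(sample):.0%}")
```

Pure ASCII text halves in size, since every character drops from two bytes to one. Accented Latin text saves a little less, and text dominated by CJK characters actually grows under UTF-8, so the headline figure applies mainly to Western-language data.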
Indexing sees improvements, most notably with the arrival of Resumable Index Create, which will allow a lengthy, resource hungry, index operation to be paused in order to free up system resources. Clustered columnstore indexes can now also be built online, removing the need to take things offline while SQL Server does its work.
Other tweaks have been made in query processing, with more big data functions making an appearance, including APPROX_COUNT_DISTINCT to return a result typically within 2 per cent of the correct answer without taking out the database while the engine churns through a massive data set. Java code can also be executed once Machine Learning Services is enabled.
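For a feel of how an approximate distinct count can trade a small error for a fixed memory budget, here is a minimal Python sketch of one well-known technique, the K-minimum-values estimator. It is an assumption-laden illustration and not a description of how SQL Server implements APPROX_COUNT_DISTINCT; the function name merely borrows the T-SQL name, and the default k of 4096 is an arbitrary choice.

```python
import hashlib
import heapq


def _hash01(value) -> float:
    """Map a value to a pseudo-uniform float in [0, 1) via SHA-256."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def approx_count_distinct(values, k: int = 4096) -> int:
    """Estimate the number of distinct values using at most k stored hashes.

    Keeps the k smallest distinct hash values seen. If the k-th smallest
    is u, roughly (k - 1) / u distinct values must exist in total.
    """
    heap = []      # negated hashes, so -heap[0] is the largest hash kept
    kept = set()   # the same hashes, for duplicate detection
    for value in values:
        h = _hash01(value)
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # Replace the current largest kept hash with the smaller one.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct values seen: count is exact
    return round((k - 1) / -heap[0])


if __name__ == "__main__":
    # 200,000 rows containing 50,000 distinct values.
    rows = [i % 50_000 for i in range(200_000)]
    print(approx_count_distinct(rows))
```

Memory use is bounded by k regardless of how many rows stream past, which is the appeal for big data workloads. The expected relative error of this estimator shrinks roughly as one over the square root of k, so a larger k buys accuracy at the cost of memory.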
But what of the cloud?
With SQL Server 2019, Microsoft has taken a huge step toward making the product a fully fledged member of the big data club, certainly in terms of giving developers a single point (if spread over compute groups) to query from an array of virtualised data sources.
However, with Azure SQL Managed Instances arriving this week and providing nearly complete compatibility with the on-premises version of SQL Server, it is only a matter of time before a cloudier version of Microsoft’s database product line eclipses its venerable predecessor. ®