The Register talks to Azure Data Veep about Synapse and SQL Server
How to query the telemetry from 1 trillion queries a day
You may recall that Azure SQL Data Warehouse got a blasting with the Redmond rebrandogun at the company's Ignite event earlier this month. The Reg caught up with corporate veep for Azure Data, Rohan Kumar, at the recent Big Data event in London to find out more.
Kumar is a Microsoft veteran, having joined at the end of the Bill Gates era, clung on during the days of Steve Ballmer and, finally, survived the purges of current CEO Satya Nadella. Kumar's transition from work on the doomed WinFS file system to SQL Server and then onto cloudier projects has stood him in good stead.
He was wheeled out on stage during Nadella's Ignite keynote to show off the performance of the rebranded and revved-up Azure Synapse Analytics service, which was compared to cloud rivals (to dark mutterings of "lies, damn lies and benchmarks" from some of the more cynical attendees, although Kumar insisted to us that the company had used "an industry standard" for the things). Kumar also unveiled a "code-free visual environment" in the form of Azure Synapse Studio for managing workspaces.
While we have to confess to a certain cynicism when anyone mentions "code-free" since, in our experience, getting stuff done always needs glue somewhere and those consultants rarely come for free, Kumar's use of T-SQL was intriguing. The deep integration of the likes of Apache Spark meant that queries could use Spark tables over massive relational and non-relational data sets without tiresome faffing with the likes of external table creation.
And, of course, Microsoft was keen to emphasise the "limitless" nature of the service, because it's the cloud, innit? Er, maybe not. Azure has had its struggles with capacity; recently users in the East US 2 region found quota requests turned down and customers in Blighty are long familiar with Microsoft failing to put its compute where its mouth is.
Those capacity issues will need addressing, since the newly rebranded service allows serverless on-demand queries for ad hoc analysis or provisioned resources for something a little more intensive.
Kumar told us that Synapse can leap across the company's data centres with a single bound if things get tight, although a glance at Azure's storage and compute pricing indicates that such replication will not come cheap.
"At any point, if we believe we are running into concurrency limits, we can transparently spin up another one," said Kumar when questioned about his demo that simulated 10,000 concurrent users bashing the service at the same time.
Of course, lurking behind the scenes of much of the database work coming out of Redmond is SQL Server 2019, and Kumar is quick to state that these days the company has adopted a "cloud first development model."
"The cloud," Kumar told The Register, "is really helpful."
Let's talk telemetry
Microsoft has more than six million SQL DBs "all-up", as a spokesperson put it, with approximately a trillion queries running per day. And all those queries make for some delightful telemetry to allow the SQL Server gang, many of whom have been with the product for decades, to work out what is actually useful rather than mere frippery.
Microsoft was quick to reassure us that it filtered any personally identifiable information from the telemetry before it leapt from the customer and into the Windows giant's analytics. Any information tagged as customer data cannot leave the boundary for customer data, the company added. That same customer data is also removed from query text.
We've naturally asked Microsoft how that categorisation works and will update with a response.
The idea of a "cloud first" SQL Server is enough to strike fear into the heart of many a business still happily running on-premises versions of the database, and Kumar acknowledged the public cloud would not be an option for some users, although Microsoft would dearly like those workloads shunted its way.
"You can basically pick a SQL Server 2008 R2 database and just simply migrate and things should just work. And that's a promise," said Kumar. From now on, the plan for SQL Server in Azure is to have the service be evergreen and continually updated.
However, going the other way and getting Azure Synapse Analytics out of Microsoft's cloud isn't currently a go-er. While Azure Arc has taken some small steps to move things off the company's servers, there remains work to do.
"If I had my way, I'd want it now," said Kumar. "It's one of the highest priorities." ®