# Mark Whitehorn

Professor Mark Whitehorn is chair of analytics at Dundee University's School of Computing.

### Combinations? Permutations? Those words don't mean what you think they mean

Hello, wrong number At the heart of machine learning are patterns, and patterns are all about counting, so it's important to make sure we are counting the correct items in the correct way. Combinatorics is the branch of mathematics concerned with counting things; more specifically, all the wonderful ways you can count, arrange and manipulate finite …

### What is the probability of being drunk at work and also being tested? Let's find out! Correctly

Hello, wrong number Analytical skills are in big demand, so it is really important not to make the basic, common mistakes that show you up as a newbie. For example, probability calculations are often performed on binary outcomes such as "What is the probability that a given policy holder will claim?" The result is binary because they will either …

### Numbers war: How Bayesian vs frequentist statistics influence AI

If you want to develop your ML and AI skills, you will need to pick up some statistics and before you have got more than a few steps down that path you will find (whether you like it or not) that you have entered the Twilight Zone that is the frequentist/Bayesian religious war. I use the term "war" advisedly because war, by …
Mark Whitehorn, 22 Jun 2017

### 8 out of 10 cats fear statistics – AI doesn't have this problem

If statistics were a human being, it would have been in deep therapy all of its 350-year life. The sessions might go like this: Statistics: "Everyone hates me." Pause. Therapist: "I'm sure it's not everyone..." Statistics: "And they misunderstand me." Pause. Therapist: "Sorry, I didn't quite get what you meant there..." …
Mark Whitehorn, 24 May 2017

### Do the numbers, Einstein: AI is more than maths as some know it

Microsoft arrived on the graph-database scene last month. Three years in the making, Microsoft released Trinity under a typical-by-now-of-Microsoft-boring-trade name of Graph Engine. Already on that scene are Neo4J, MarkLogic, Oracle, SAP and Teradata - among others. Driving Microsoft, like those before, is the desire to …
Mark Whitehorn, 17 Mar 2017

### Hard numbers: The mathematical architectures of Artificial Intelligence

Pity the 34 staff of Fukoku Mutual Life Insurance in Japan, diligently calculating insurance payouts and brutally replaced by an AI system. If you believe the reports from January, the AI revolution is here. In my opinion, the goings-on in Japan cannot possibly qualify as AI, but, in order to explain why, I have to explain …

### We ain't in 1996 anymore, Dorothy: SQL Server 2016 proves it

Microsoft has had a database since 1989, initially working with Ashton-Tate and Sybase to create a variant of Sybase SQL Server for IBM’s OS/2. But it wasn’t until 1995 that Microsoft really got serious with SQL Server 6 for Microsoft’s rock-solid server operating system Windows NT. Back then, however, engines like SQL Server …
Mark Whitehorn, 18 Jul 2016

### Who you callin' stoopid? No excuses for biz intelligence's poor stats

Business Intelligence (BI) systems are designed to turn raw data into useful information, so why don’t they do the job properly? Why do most of them fail so completely to make use of the huge range of capabilities that the analytics world has to offer? Even at the most basic level, they fail catastrophically to take simple …

### Land Rover's return: Last orders and leather seats for Defender nerds

We all know there’s only one true Land Rover: the Defender. A cheerful, competent, boxy-shaped device that’s been in production since 1948, inspired by the Jeep, the Allies' WWII workhorse. It looks as good pulling logs from a forest as it does pulling up outside a house in Mayfair and it was voted Greatest Car of All Time …

### Thank heavens for the silicon chip: A BRIEF history of data

Data Pair – Part 1 Data was born around 20,000 years ago, around the time the last ice age was at its peak and Cro-Magnon man was appearing in Europe. Data was made both by those early humans' minds and these humans’ ability to store facts outside their brains. Why the human mind? It is because data doesn’t exist outside the context of the mind …
Mark Whitehorn, 21 Apr 2015

### Car hacker secrets revealed: Clutching up a tank engine in a classic motor

+Diagrams Several people were kind enough to comment favourably on this article and asked for more information, pictures and video. We are delighted to oblige. Technically no one asked for poorly drawn diagrams, but I’ve included some of these as well. First, the video. Mark and chum turn over the Rolls Royce Meteor engine. The sound …
Mark Whitehorn, 21 Dec 2014

### Beyond the genome: YOU'VE BEEN DECODED, again

Most people have heard of the human genome project (HGP); few have yet heard of the human proteome project (HPP), but it is going to transform your life in a far more fundamental way than the HGP ever did. The human genome project was completed in April 2003 - we are currently the only species known to have deciphered its own …
Mark Whitehorn, 28 Nov 2014

### Hi-torque tank engines: EXTREME car hacking with The Register

Most car marques – Lagonda, Ford, Morgan and so on – have a proud history and the respective car clubs often worship the original form; if you present a car for judging, it had better be exactly as per factory spec. Or else. There are notable exceptions and perhaps the most surprising, given the value of some of the cars, is …
Mark Whitehorn, 26 Nov 2014

### You can crunch it all you like, but the answer is NOT always in the data

Evidence-based decision making is so clearly sensible because the alternative — making random decisions based on no evidence — is so clearly ludicrous. The “evidence” that we often use is in the form of information that we extract from raw data, often by data mining. Sadly, there has been an upsurge in the number of people who …
Mark Whitehorn, 20 Oct 2014

### About to make a big bet? Don't crash out, cash in with the power of maths

Big Data's Big 5 When and how to make changes to a successful business or popular website can be a huge risk. Get things right and - at best - nobody notices. Get things wrong, however, and you run the risk of losing business and suffering a damaged reputation. A good recent example is that of film and TV service Netflix, whose fluffed …
Mark Whitehorn, 29 May 2014

Big Data's Big 5 When his class is asked to give an example of a paradox in The Simpsons, Bart offers: "You're damned if ya' do, and you're damned if ya' don't." The dictionary defines a paradox as an absurd or seemingly absurd or contradictory statement that might prove to be true and when it comes to data a seemingly contradictory situation …
Mark Whitehorn, 28 May 2014

### Big data hitting the fan? Nyquist-Shannon TOOL SAMPLE can save you

Big Data's Big 5 You are working on a big data project that collects data from sensors that can be polled every 0.1 of a second. But just because you can doesn’t mean you should, so how do we decide how frequently to poll sensors? The tempting answer is to collect it all, every last ping. That way, not only is your back covered but you can …
Mark Whitehorn, 23 May 2014

### Achtung! Use maths to smash the German tank problem – and your rival

Big Data's Big 5 You've employed Benford's Law to out fraudsters hidden in seemingly random numbers. Now what do you do if you need answers but some of your data is missing? Welcome to the German tank problem, the second in The Reg's guide to crafty techniques from the world of mathematics that can help you quickly solve niggling data problems …
Mark Whitehorn, 20 May 2014

### How to catch a fraudster – using 'top cop' Benford and the power of maths

Big Data's Big 5 Yes, we've been hit over the head enough times with the phrase "big data" to be aware of its presence, even though we've been up to our armpits in streams of huge unstructured datasets for years. Those of you who are analysts or data scientists will have already picked up a set of tools that help you find hidden information …
Mark Whitehorn, 14 May 2014

### Tell me, professor, what is big data?

Big Data may be misunderstood and overhyped - but the promise of data growth enabling a goldmine of insight is compelling. Professor Mark Whitehorn, the eminent data scientist, author and occasional Register columnist, explains what big data is and why it is important. Sometimes life is generous and hands you an unexpected gift …
Mark Whitehorn, 12 Aug 2013

### We gave SQL Server 2012 one year to prove itself: What happened?

Deep dive I reviewed SQL Server 2012, codenamed Denali, just over a year back and highlighted the major improvements in Microsoft's relational database. After a year in production, was I right, or have other features proved more important in practice? I'm bound to agree with my last review, so I polled colleagues who also work with SQL …
Mark Whitehorn, 28 May 2013

### Hey, software snobs: Hardware love can set your code free

Comment In computing there are many, many different ways to run down other people’s work, not the least of which is: “OK, so they removed the bottleneck, but only by throwing faster hardware at it.” The implication is that tackling an issue just with software is intrinsically better. When did you ever hear anyone say: “OK, so they …

### Big Data versus small data: Unpicking the paradox

NoSQL and Big Data crashed into the ordered world of relational architectures a few years back, thanks to services like Twitter, Facebook and LinkedIn. But while concepts such as key value stores and content-specific stores have certainly enriched our environments, the downside to their arrival is that it has created quite a …

### Big Data's big issue: Where are all the data scientists coming from?

Analysis Plug “data scientist” into Google and it is clear the job title has finally come of age and, suddenly, there is a huge skills shortage. An oft-quoted source about this shortage is a McKinsey Global Institute study. This predicts a talent gap of 140,000 to 190,000 people by 2018 in the US alone. I am always sceptical of IT …

### Big Data bites back: How to handle those unwieldy digits

Data is easy. It comes in tables that store facts and figures about particular items – say, people. The columns define the data to be stored about each item (such as FirstName, LastName) and there is one row for each person. Most tabular database engines are relational and we use SQL for querying. So this "Big Data" thang must …
Mark Whitehorn, 27 Aug 2012