Feeds

Running queries on the HMRC database fiasco

Dis-information systems management

Build a business case: developing custom apps

Why wasn't the sensitive, non-required, data removed?

According to the Telegraph, the National Audit Office (NAO) asked for the names, National Insurance numbers, and child benefit numbers of every child so that it could select 100 cases at random for its annual audit of Revenue and Customs. The NAO asked for bank and other details to be removed, but an HMRC official replied that, to keep costs down, the HMRC could only provide all of the details on the database.

Now, let's go back to Database 101 for a moment (not a theoretical Database 101; I actually designed and teach on the database course at Dundee University). The first example in the first lecture on querying shows the students how to subset the data by column and then by row. Database engines are built, from the ground up, to perform sub-setting. It doesn't get any easier than this. So, where's the problem?

But, let's assume the worst: that the HMRC uses a very complex, unwieldy engine that cannot subset data easily. Well, no matter how complex the original database was, it is reasonable to assume that it was reduced to a simple format for the CDs. In which case, importing it into an engine that can subset easily is trivial.

So, unless there are some very odd circumstances to which we are not privy, I find it impossible to believe that removing the bank and other details would have involved significant cost.

Go on, Mark, stick your neck out. How much?

Well, the Telegraph has an estimate:

However The Telegraph has established that a typical clean-up operation would cost around £5,000 and take a software engineer less than a week. A spokesman for HMRC said that the £5,000 cost of removing the information "was not a figure we recognise" and declined to discuss the cost because the matter is the subject of a review.

I don't recognise the figure either, but that's because I think the Telegraph is being far, far too generous to the HMRC. Assuming 25 million CSV records, I would estimate half a day's work to subset by column. If I was familiar with the data structure and had done the job before, maybe an hour. Any competent DBA/DBA could do it in the same time. Now DBAs are expensive, but not £10,000 per day. I'd do it for £500.

A spokesperson for the HMRC said: "We don't have infinite resources, we have to use our resources rationally."

One has to wonder about the definition of the word "rationally" here.

Endpoint data privacy in the cloud is easier than you think

More from The Register

next story
14 antivirus apps found to have security problems
Vendors just don't care, says researcher, after finding basic boo-boos in security software
'Things' on the Internet-of-things have 25 vulnerabilities apiece
Leaking sprinklers, overheated thermostats and picked locks all online
iWallet: No BONKING PLEASE, we're Apple
BLE-ding iPhones, not NFC bonkers, will drive trend - marketeers
Multipath TCP speeds up the internet so much that security breaks
Black Hat research says proposed protocol will bork network probes, flummox firewalls
Only '3% of web servers in top corps' fully fixed after Heartbleed snafu
Just slapping a patched OpenSSL on a machine ain't going to cut it, we're told
Microsoft's Euro cloud darkens: US FEDS can dig into foreign servers
They're not emails, they're business records, says court
Plug and PREY: Hackers reprogram USB drives to silently infect PCs
BadUSB instructs gadget chips to inject key-presses, redirect net traffic and more
How long is too long to wait for a security fix?
Synology finally patches OpenSSL bugs in Trevor's NAS
prev story

Whitepapers

7 Elements of Radically Simple OS Migration
Avoid the typical headaches of OS migration during your next project by learning about 7 elements of radically simple OS migration.
Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Consolidation: The Foundation for IT Business Transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.
Solving today's distributed Big Data backup challenges
Enable IT efficiency and allow a firm to access and reuse corporate information for competitive advantage, ultimately changing business outcomes.
A new approach to endpoint data protection
What is the best way to ensure comprehensive visibility, management, and control of information on both company-owned and employee-owned devices?