Original URL: http://www.theregister.co.uk/2007/05/17/dna_v_rozzers/
100,000 'erroneous' records on DNA database
Load failures of 26,000 are part of the problem
The complex relationship between the police, the National DNA Database Unit and the forensic service has left the UK's DNA database with at least 100,000 erroneous records, The Register can reveal.
Which makes the NDNAD Unit's admission in its annual report today that between 1995 and 2005 it failed to load 26,200 records to the DNA database because of errors sound trifling. 183 crimes went undetected as a result of this failure.
90 per cent of these 26,200 "load failures" occurred only after the NDNAD was linked to the Police National Computer (PNC) in 2001. After the link was created, new NDNAD records were routinely checked against the PNC and rejected if they were found to be erroneous.
But prior to 2001, most erroneous records were not being picked up, so they were input directly onto the NDNAD and remain there, a spokesman for the NDNAD Unit admitted today.
"There's in the order of 100,000 unreconciled records now," said the source.
"We don't actually know," he said when asked exactly how many erroneous entries the database contains.
"There was a lower stringency on loading checks. There might have been an error but it wouldn't have been apparent," he added.
It would have been more accurate for the NDNAD to say in its report today that between 1995 and 2005 it tried to load approximately 126,200 erroneous records onto the database, of which only 26,200 were stopped by the system.
The revelation also makes a mockery of the Home Office's claim today that the problem had been cleaned up already.
"Swift action was taken to resolve the situation and by January 2006 all the profiles had been investigated and subsequently loaded or otherwise resolved," it said in a statement.
According to the annual report, about 80 per cent of the 26,200 load failures had been cleaned of errors and loaded. The other 100,000 records would have been load failures if they had been added to the database after 2001. But they weren't and they are still on the NDNAD.
The spokesman said that a further 10 per cent of records on the DNA database, which contains about 4.1m records, are duplicates. These are records that have identical DNA samples to other records on the database, but different identifying information.
The extent of data errors in the DNA database only came to light after it was linked to the PNC in 2001 because, though the link was primitive, it was nevertheless good enough to reject about 10-12 per cent of DNA samples because of errors stemming from one of their key identifying numbers. These errors stemmed, said the spokesman, from the one part of the procedure that was still done on paper.
The reason the problems were never tackled, he said, was poor co-operation between the police, the DNA database unit and the forensic service companies that process the samples, which in England and Wales are the Forensic Science Service, LGC Ltd and Orchid Cellmark.
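As a rough illustration of what "duplicate" means here, the sketch below groups records by DNA profile and flags any profile that appears under more than one identity. It is only a sketch under stated assumptions: the record layout, the `find_duplicates` helper and the sample data are all invented for illustration and do not describe how the NDNAD actually stores or compares profiles.

```python
from collections import defaultdict


def find_duplicates(records):
    """Return profiles that are attached to more than one identity.

    `records` is an iterable of (profile, identity) pairs. Both fields
    are hypothetical stand-ins for whatever the real database holds.
    """
    by_profile = defaultdict(set)
    for profile, identity in records:
        by_profile[profile].add(identity)
    # Keep only profiles linked to two or more distinct identities.
    return {p: sorted(ids) for p, ids in by_profile.items() if len(ids) > 1}


# Invented sample data: the same profile filed under two spellings of a name.
records = [
    ("profile-A", "J. Smith"),
    ("profile-A", "John Smith"),
    ("profile-B", "A. Jones"),
]
duplicates = find_duplicates(records)
```

In this toy data set only `profile-A` is flagged, because it is the only profile recorded against two different sets of identifying information.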
Police DNA samples are identified and linked to both the DNA database and the PNC with two unique numbers. One of these is automatically printed on a label, along with a barcode, in a police officer's DNA sampling kit. The other is the Arrest Summons Number, which a police officer writes by hand onto the sample label.
The samples are sent to the forensic service firms for processing, where both numbers are inputted into a Laboratory Information Management System (LIMS). This information is then emailed to the NDNAD Unit, where another computer system automatically extracts the information and sticks it in a skeleton DNA record, which would have been created when a police officer took the original sample.
If the lab technician misread the police officer's handwriting, there would be a mismatch, because both numbers need to match a corresponding record in the PNC in order for the DNA record to be loaded successfully.
These problems only came to light when the PNC link was established in 2001 because records were refused onto the NDNAD unless they could be matched with the PNC. The extent of this problem, with 10 to 12 per cent of records failing to load, was found to be so great that it swamped the forensic firms.
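The check described above can be pictured with a minimal sketch: a record loads only if the pair of numbers on the sample matches a pair already held on the PNC. Everything in it is an assumption made for illustration, including the `passes_load_check` helper, the idea of holding PNC entries as a set of pairs, and the made-up number formats; the article does not say how either system is implemented.

```python
def passes_load_check(barcode, arrest_summons_number, pnc_pairs):
    """Return True only if both numbers match one PNC entry.

    `pnc_pairs` is a hypothetical set of (barcode, arrest summons number)
    pairs standing in for the Police National Computer.
    """
    return (barcode, arrest_summons_number) in pnc_pairs


# Invented identifiers; real formats are not given in the article.
pnc_pairs = {("KIT-0001", "ASN-12345")}

# Both numbers transcribed correctly: the record would load.
accepted = passes_load_check("KIT-0001", "ASN-12345", pnc_pairs)

# One handwritten digit misread at the lab (3 read as 8): a load failure.
rejected = passes_load_check("KIT-0001", "ASN-12845", pnc_pairs)
```

A single misread character in the handwritten number is enough to make the pair fail, which is consistent with the spokesman's account of where the errors came from.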
"The [forensic] suppliers had a backlog built up largely because the suppliers hadn't been able to get the support of the police force," said the source.
"The electronic link revealed the extent of the problem," said the source. "The problems were there before but they were in the database. Now the DQIT team are looking at cleaning it up."
The DQIT (Data Quality and Integrity Team) of the NDNAD Unit started dealing with the forensic service firms' backlog in 2005 after the police gave them a link to the PNC. Neither the forensic firms nor the police had to pull their finger out to clean up their administrative mess - the DQIT did most of it for them.
So the load failure rate for DNA samples was whittled down to its current 1 per cent, but prior to 2001, the problem records were not being identified because no checks were being made on the two key identifying numbers.
"Until we had the Data Quality and Integrity Team, we didn't know about the problem," said the spokesman today.®