'Cyber Genome Project' kicked off by DARPA
The code you write - it'll be as traceable as your DNA
Applecart-bothering Pentagon boffinry bureau DARPA is at it again. This time, the military scientists want to establish a "Cyber Genome" project which will allow any digital artifact - a document, a piece of malware - to be probed to its very origins.
According to an announcement put out yesterday by DARPA, the "Cyber Genome Program" will "produce revolutionary cyber defense and investigatory technologies". In detail:
Digital artifacts may be collected from live systems (traditional computers, personal digital assistants, and/or distributed information systems such as 'cloud computers'), from wired or wireless networks, or collected storage media. The format may include electronic documents or software (to include malicious software - malware).
The Cyber Genome Program will encompass several program phases and technical areas of interest. Each of the technical areas will develop the cyber equivalent of fingerprints or DNA to facilitate developing the digital equivalent of genotype, as well as observed and inferred phenotype in order to determine the identity, lineage, and provenance of digital artifacts and users.
In essence it seems that almost any data trawled from a relevant network, a computer, a flash drive, someone's phone or whatever is to be analysed much as human genetic material now can be. The code or document's relationships with other "digital artifacts" will be revealed, perhaps its origins, and other info of interest to a Pentagon admin defending military networks or a military/spook investigator tracing online adversaries.
Or in other words, any code you write, perhaps even any document you create, might one day be traceable back to you - just as your DNA could be if found at a crime scene, and just as it used to be possible to identify radio operators even on encrypted channels by the distinctive "fist" with which they operated their Morse keys. Or something like that, anyway.
There are to be workshops for interested industrial participants shortly, but it's US citizens only. The wider world may not find out about the Cyber Genome effort unless and until it starts to produce results. ®
I'll believe it when I see it
"any code you write, perhaps even any document you create, might one day be traceable back to you"
Exactly how are they going to be able to tell which of us wrote "hello world" or which of us was the person who wrote that little 10 line XML hack?
You need hundreds of lines of code to spot any patterns and even then there are going to be very similar ones in different people's code...
And exactly who do they hope to catch? Virus writers? Hackers? They've had very limited success so far and I expect that to continue pretty much forever.
Typical sales pitch
John Smith has it right. Analogies (v. technical descriptions) work best for obtaining grant money in certain houses.
Pity the poor sci/eng techie whose boss (understanding neither genes nor programming) uses equally obtuse analogies to win the grant, then dumps it on his whiz-kid underlings to deliver. Been there, seen that happen too many times, and, unless his whiz kids are equally bright in redefining the problem into something realizable (with realistic objectives) and selling that back to the granters, they are all in for a rough, unpleasant ride.
Code style - not necessarily for "who are you" identification as much as "who you are not"
It may be useful to see that a modification to a piece of code has been made - you write 10,000 lines of assembler for network hardware IO, someone adds 100 to enable keystroke logging. You get picked up (YOUR code has the hack) but the defence shows a different "fingerprint".
Many moons ago, our pathology records system wasn't word wrapping properly. The engineer was staring at the source in the coffee room. I pointed out to him that the documentation wasn't up to date. "What do you mean?" "Header block shows one author, but two other people have edited this." "How can you tell?" "From the way they're manipulating the strings. The original author does it C-style, this block was written by someone who's making the transition to C from Basic and this section was written by someone who's used to Pascal."
Once I'd pointed that out and we removed the kludges the editors had written to translate strings into their favourite format, it all worked.
Engineer got hell though - customers aren't allowed to change source code...
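For anyone wondering what a "different fingerprint" could even look like in practice, here is a toy sketch. It is not anything DARPA has described, and the features (tab indentation, brace placement, comment density, snake_case vs camelCase names) are my own assumptions picked purely for illustration. The function names `style_features` and `cosine_similarity` and both code samples are hypothetical. With snippets this short the numbers mean very little, which is rather the point made above about needing hundreds of lines.

```python
import math
import re


def style_features(source: str) -> list[float]:
    """Return a crude style vector for a block of source code.

    The five features are illustrative assumptions, not an established method.
    """
    lines = [ln for ln in source.splitlines() if ln.strip()]
    n = len(lines) or 1
    # Share of non-blank lines indented with a tab.
    tab_indent = sum(ln.startswith("\t") for ln in lines) / n
    # Share of lines that are nothing but an opening brace.
    brace_own_line = sum(ln.strip() == "{" for ln in lines) / n
    # Share of lines that start with a comment marker.
    comment_lines = sum(ln.strip().startswith(("//", "/*", "#")) for ln in lines) / n

    idents = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    m = len(idents) or 1
    # Naming habits: underscores vs a lower-to-upper case transition.
    snake = sum("_" in name for name in idents) / m
    camel = sum(bool(re.search(r"[a-z][A-Z]", name)) for name in idents) / m
    return [tab_indent, brace_own_line, comment_lines, snake, camel]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two made-up snippets in visibly different habits.
original_author = "int read_port(int port_id)\n{\n\treturn io_read(port_id);\n}\n"
later_editor = "int logKeys(int keyCode) {\n    return sendKey(keyCode);\n}\n"

if __name__ == "__main__":
    fa = style_features(original_author)
    fb = style_features(later_editor)
    print("original vs itself:", cosine_similarity(fa, fa))
    print("original vs editor:", cosine_similarity(fa, fb))
```

A low score between the bulk of a file and one suspicious block would only ever be a hint that someone else touched it, in the "who you are not" sense, and never proof of who did.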