Why you'll never make really big money as an AI dev
Artificial Intelligence? How the future was back in the '80s
Among the stupider things I said in the 1980s was a comment about Artificial Intelligence, including neural nets - or perceptrons as we called them back then - saying we needed "maybe a processor that worked at a hundred megahertz and literally gigabytes of storage".
I also believed that following our success using Fuzzy Logic to optimize Cement Kilns (from which my college made serious cash), Fuzzy was the future. I was wrong and am now envious of the power you have to play with.
I now go to more conferences than any rational person, like Intel's recent Nervana show, and part of me feels like I’m revising my mid-1980s degree again. Neural networks can classify pictures of goats, despite the occasional confusing of women's feet and various species of crab. Hardware vendors like Cray, Intel and nVidia lurrve neural nets, since massively parallel stupidity masquerading as Artificial Intelligence soaks up whatever power you throw at it.
Backward chaining is appearing as a “new” technique, and that points to a big risk factor in plotting a career in AI: because we’re mostly mining existing but buried techniques, demand for any given technique can be capsized as soon as another one is dug up.
AI’s not just neural nets and bright young things who’ve re-discovered linear regression (not part of many CS degrees). One speaker gushed at how he’d optimized his neural network by differentiation, showing us using the 17th century Newtonian notation of fluxions rather than the later, better Leibniz notation. The smarter talks at the Alan Turing Institute also do graph theory, which isn’t actually new either but at least it’s hard and, no, it’s not pie charts.
Neural Networks were a joke in the 1980s. I built one, for a given value of "built" since it never ever did anything useful or even go wrong in a particularly interesting way. That's because we used wires "just like the human brain" - if your brain has 30 neurons and runs off a second hand Sinclair Spectrum PSU. They’d been around for decades even then and fell into such obscurity that I can forgive the kids for thinking they’re new. Like coal which has lain buried for millions of years, techniques like Monte Carlo (used in Google’s Go champion) are there for the taking. So is backward chaining as people realise that straight multiply and add takes too much training and working backward from the goal can get there faster.
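For anyone who missed it first time round, working backward from the goal can be sketched in a few lines of Python. This is a toy, not any particular product's engine: the rule base, the fact names and the prove function are all invented for illustration.

```python
# Hypothetical rule base: each goal maps to one or more lists of sub-goals,
# any one of which is enough to establish it.
RULES = {
    "is_goat": [["has_horns", "bleats", "is_mammal"]],
    "is_mammal": [["has_fur"], ["gives_milk"]],
}

# Hypothetical observations we already hold to be true.
FACTS = {"has_horns", "bleats", "gives_milk"}


def prove(goal, facts=FACTS, rules=RULES, seen=frozenset()):
    """Return True if goal can be established by working backward from it."""
    if goal in facts:
        return True
    if goal in seen:
        # We are already trying to prove this goal: a circular rule, give up.
        return False
    for sub_goals in rules.get(goal, []):
        if all(prove(g, facts, rules, seen | {goal}) for g in sub_goals):
            return True
    return False


print(prove("is_goat"))  # only the rules that lead to the goal are explored
print(prove("is_crab"))  # no fact and no rule concludes this
```

The point of the design is that nothing gets evaluated unless it might contribute to the goal, which is where the saving comes from.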
We also thought Lisp was cool, easy to knock up a chatbot with (trendy again), and that with a bit of simple BASIC or SLIP you could create Eliza for clinical psychiatry. At first the researchers feared patients wouldn’t like a soulless machine, but it turned out they liked a non-judgmental teletype so much that some broke into the lab to get more time with their new friend. Computer addiction is thus older than the Daily Mail writers who whine about “young people”.
But the hardware problem was quite crushing once you started to scale up. Lisp works out the type at run time, making even simple operations several times slower and of course seducing you into creating structures that were both elegant and astonishingly slow because the machine architectures of the time just weren’t up for it.
They’re not much better now, but are stupid literally 8,000 times as quickly. Symbolics.com is now just a click farm but back in the '80s it was the home of the Lisp machine (and the first .COM domain). The CPU supported the sort of discriminated union C/C++ programmers know and (sometimes) love, cutting the overhead right down to size. But C and then C++ beat Lisp to death in straight performance, leading us to try such abominations as Lisp compilers, and mostly to use Lisp only where it had a clear edge, and not always then. So cheap commodity CPUs optimized for C/C++ and Pascal won.
The rising cost of programmers combined with this to give the biggest trend in programming for the last 30 years and for all I know the next 30. The simplest software development process, using the most stupid programming language, running on the cheapest stupid hardware will win.
That’s why we have deep learning right now. Management and too many IT pros believe that you can just shove a pile of slightly correct data into a deep learning cloud and out will pop business solutions. They used to believe that about databases, and some of us still make money out of resolving the blind faith in Oracle.
Big Data - or Very Large Databases as we called them in the 1980s - were going to be a Thing. Japan poured truly vast amounts of money into AI in the 1980s and 1990s, which is why every AI product in the world is now produced by Japan. Oh. No, it isn’t. The 5th Wave (yes that’s a real name, not a tedious surfer biopic) scared the UK so much that it put together the Alvey Programme. A pathetic stub on Wikipedia tells you how successful that was.
In today’s money, it was an easy half a billion quid for research into giving us the sort of hardware we needed. It neither gave us this, nor the UK leadership in User Experience and chip making it sought, being crushed by Visual Basic apps front-ending legacy databases running on Intel. We also watched in horrified fascination at the short brutal lives of radical new architectures now lost in the foreign country of the past.
The Transputer would have given us the horsepower and the data rate, through a fast serial bus, but got killed by politics; the Distributed Array Processor and Content Addressable File Store are lost in that other country. A joy of living in Buckhurst Hill, home of The Only Way is Essex, is that a new next-door neighbour may be a Page 3 star, a footballer, or an assertive New Yorker who informed me that "my dad invented expert systems".
Just before I made a sarcastic remark she shared that he was called Feigenbaum, who’d worked on Mycin, a medical AI which had caused one of my N=N+1 career mistakes.
I had worked for a while on Knowledge Based Systems, which I commend as training for anyone who wants to be a journalist. KBS or “expert systems” as mundane people called them were going to replace people whose job used knowledge and was experience-based (stop me when this gets too familiar) and since they would only get better, by the 21st century it would barely be worth training people for jobs like doctors or judges.
The reason Prolog coding for KBS is so like journalism is not only the wretched pay, but the incessant torrent of lies one is fed in trying to write a system capturing the decision making of an expert. Aside from the jargon (“swizzy is looking a bit toppy”, AKA “futures prices for Swiss Francs are going down”) it turned out that the facts they claimed to be using weren’t the real ones - and not just because some suspected the geeks would replace them.
It was because humans are really shit at explaining their decisions. When later I studied economics I learned that social scientists have known this for decades and have experimentally shown that people will claim a factor was important in their decision despite only being told it after they’d made the choice. Also the guy who told me CHF was too high was wrong: it went up and stayed there for 10 years.
So we developed better toolchains, with Intelligent Environments’ Crystal being the sane way to put together a rules-based system. It used an (If it quacks like a duck AND WebbedFeet == TRUE) style, which was great for automating humans on helpdesks following scripts, so the next time you have to speak to an Indian call centre, you can thank us pioneers.
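The flavour of that rule style can be sketched in Python. To be clear, this is not Crystal's syntax or engine, which I'm not reproducing here; the rule table, the fact names and the forward_chain function are assumptions made up for the example.

```python
# Hypothetical rules: (conditions that must all hold, conclusion to add).
RULES = [
    (("quacks", "webbed_feet"), "is_duck"),
    (("is_duck",), "suggest_pond"),
]


def forward_chain(facts, rules):
    """Fire rules repeatedly until no rule adds a new fact; return all facts."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                changed = True
    return known


# A helpdesk-style run: the script starts from what the caller tells us.
print(sorted(forward_chain({"quacks", "webbed_feet"}, RULES)))
```

Each pass can unlock rules that depend on earlier conclusions, which is all a scripted helpdesk flow really needs.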
So Crystal morphed into Application Manager, which ended up with enough connectivity to industrial scale data sources that, as the first AI boom faded, it carried on.
Do what I say, not what I did
Another reason we didn’t change the world was that our systems were like human experts - the strange, socially awkward oddball who you somehow need, but don’t really talk to, sitting in a corner muttering in Burmese. Our systems did cool but isolated things that weren’t written in the VB/SQL/C/C++ monoculture of the time and were quickly seen as “legacy” and “hard to support”. Not being built of the same stuff meant feeding them with data was hard and brittle, so whereas our first generation of GUIs was so cool that the character-based cave dwellers came over to our workstations to drool with envy, we weren’t knocking up “business reports” or write-only database designs that made you an “architect”.
In a lecture I gave a couple of years ago when AI was beginning to heat up again I introduced the children (mostly postgrads) to the idea of AI architecture and other moral virtues in software engineering. I used HearSay, the 1970s speech recognition system, which integrated several wildly disparate kinds of signal processing into an AI.
Rather than start at the “top”, ie signals to be broken down into words, or at the bottom, assembling data points into syllables, a Blackboard system puts all the data in a pooled area and the “knowledge sources” are fired at it, depending on what offers the best return on investment in CPU time, sometimes gathering data together, sometimes casting it apart.
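A minimal sketch of the blackboard idea in Python follows. It is not HearSay's actual scheduler: the KnowledgeSource class, the cost figures and the "cheapest first" scoring are all stand-ins I've assumed to keep the example short.

```python
class KnowledgeSource:
    """A hypothetical specialist that adds one item once its inputs exist."""

    def __init__(self, name, needs, adds, cost):
        self.name = name
        self.needs = set(needs)
        self.adds = adds
        self.cost = cost  # assumed CPU cost, arbitrary units

    def can_fire(self, board):
        return self.needs <= board and self.adds not in board

    def value(self):
        # Toy "return on investment in CPU time": cheaper is better.
        return 1.0 / self.cost


def run_blackboard(initial, sources, goal):
    """Keep firing the best-value ready source until the goal is posted."""
    board = set(initial)
    trace = []
    while goal not in board:
        ready = [ks for ks in sources if ks.can_fire(board)]
        if not ready:
            break  # nobody can contribute, so stop rather than spin
        best = max(ready, key=lambda ks: ks.value())
        board.add(best.adds)
        trace.append(best.name)
    return board, trace


SOURCES = [
    KnowledgeSource("segmenter", ["signal"], "syllables", cost=2),
    KnowledgeSource("lexicon", ["syllables"], "words", cost=3),
    KnowledgeSource("grammar", ["words"], "sentence", cost=5),
    KnowledgeSource("noise_filter", ["signal"], "clean_signal", cost=1),
]

board, trace = run_blackboard({"signal"}, SOURCES, goal="sentence")
print(trace)
```

No source knows about any other; they only read and write the shared board, which is why a new one can be bolted on without rewriting the rest.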
Advice I’d beg you to follow is to act like one of these KS threads and grab the easy money from Neural Nets whilst it’s there, but don’t expect the pot to be refilled. The reason I’ve skimmed over so many techniques here is to alert you that when NNs aren’t your paymaster any more, you need to be ready with Markov chains for chatbots or the extreme Monte Carlo used in Google’s DeepMind Go champion.
That’s not just a system architecture. This is a career architecture, if you want to make more money out of AI than I have. Odds are your employer “wants an AI” possibly because you told them so, but they need a system. So do you, if your job is going to be longer than constructing one new robotic overlord. Your pay is driven by your future value to the firm, not what you did last year, so you’ve got to architect a collection of systems that work together. Given that your AI doesn’t actually work yet (we’re alone here, you can be straight with me), the various nearly working systems need what amounts to a data bus (or data blackboard) where each can share without screwing the others up too much. My editor says I need to give a conclusion, so it’s one line.
Architects get paid more than programmers. ®