SpinVox: The Inside Story
Santa's Little Helpers
Spinning SpinVox's "Brain"
SpinVox's human agents use a software application that predicts words as they type. When the agent performing the transcription types "Hello", the software can anticipate that the most likely next words are "How are you?".
Without this software, SpinVox's human agents couldn't translate the messages "in near-real time", as the company claims.
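To make the distinction concrete, word prediction of this kind can be sketched in a few lines. What follows is a minimal illustration, assuming a simple count of which word follows which in past transcripts. It is not SpinVox's software: the function names and the sample messages are invented for the example.

```python
from collections import Counter, defaultdict

def train(messages):
    """Count, for each word, which words followed it in past transcripts."""
    following = defaultdict(Counter)
    for message in messages:
        words = message.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def suggest(following, typed_word, k=3):
    """Return up to k words most often seen after typed_word."""
    counts = following.get(typed_word.lower(), Counter())
    return [word for word, _ in counts.most_common(k)]

# Made-up sample transcripts, for illustration only.
history = [
    "hello how are you",
    "hello how are things",
    "hello it is me",
]
model = train(history)
print(suggest(model, "Hello"))  # ['how', 'it']
```

Nothing here listens to audio or recognises speech. The program only ranks likely next words for a human who is already typing.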
But SpinVox has consistently sought to blur the precise role and definition of this software. It goes by several names. SpinVox agents know it as "Tenzing", while SpinVox's publicity literature refers to something it calls D2, which it says is "The Brain".
"D2’s pretty smart. It’s bound to be, as D2’s a combination of artificial intelligence, voice recognition and natural linguistics," says SpinVox.
Meanwhile the company's IP filings refer to VMCS, an acronym for the Voice Message Conversion System. The company also markets VMCS as a "cloud platform" for transcription.
A poster claiming to be from SpinVox last year described VMCS as "a carrier grade engine capable of converting voice-into-text in four different languages," asserting that, "the important point is that it is the SpinVox VMCS, not humans in the Philippines or anywhere else, that converts voice messages into text."
That's false, say people familiar with VMCS.
But on the basis of SpinVox's patent applications, the software that agents use for transcribing most of SpinVox's messages seems not to be performing machine translation at all, but doing something much more mundane: word prediction, something commonly performed by specialist packages for disabled users, such as Penfriend XL, and even by mobile phones.
"If the majority of messages get converted by machine - why do they need world class call centres?" said one former SpinVox employee.
The IP arsenal that doesn't exist
Clear evidence that SpinVox depends largely on humans rather than some artificial intelligence breakthrough (and knows it) comes from SpinVox itself.
In SpinVox's boilerplate text, attached to every press release, the company claims to have made "significant innovations in voice and network technologies which are protected by over 70 patents worldwide".
Where does the figure of 70 come from? A global search of IP databases reveals just eight listings. Most are patent applications, which offer the inventor some, but only limited, protection. Each listing is a cluster containing multiple applications for the same patent, one to each patent authority, such as the UK IPO, the EPO and WIPO. Add them all up, and you get around 70.
SpinVox's filings cover a number of innovations and, when read chronologically, tell their own story. The first filing simply describes a human-powered call centre, while the most recent draws the "VMCS" as simply a box in the cloud. All of them describe business methods and applications. For example, recent filings describe uses for a translation system, such as speaking blog posts or emails, and ideas for inserting media or advertising information into an SMS text message.
Quite significantly, something is missing.
"None of the patents I have seen describes speech translation," says Lyndsay Williams, head of Girton Labs in Cambridge and a former Microsoft researcher.
In the United States, only one patent has been granted, and it's for a human-powered call centre.
In 2004 co-founder Daniel Doulton filed a patent for a "Method of providing voicemails to a wireless information device". This accurately describes SpinVox's business operation in great detail, insiders confirm.
In the patent description's own words:
"...the operator intelligently transcribes the actual message from the original voice message by entering the corresponding text message (actually a succinct version of the original voice message, not a verbose word-for-word conversion) into the computer to generate a transcribed text message. The transcribed text message is then sent to the wireless information device from the computer. Because human operators are used instead of machine transcription, voicemails are converted accurately, intelligently, appropriately and succinctly into text messages (SMS/MMS)."
The AI magic is merely an attribute. One section of the patent describes "Automated Voice Recognition" which, the application (No. 20060223502) explains,
"is to speed up the processing of inbound voice files and reduce operating costs. The prime function will be to auto-detect spoken phone numbers, and detect language to route audio files to the correct human operator staffed transcription bureau. It will also be used for detecting names and spoken numbers and addresses from the users online phone-book (see below) and commands for VoicemailManager controls."
This undermines SpinVox's claim this week that "data is encrypted". A SpinVox user's privacy depends not on encryption, but on its overseas call centre agents behaving. As the patent notes:
"All transcription employees must have signed a confidentiality agreement before being able to deal with any messages and must not divulge, share, copy, forward or otherwise share any user information."
While agents cannot see the sender ID, they can identify the recipient by phone number. SpinVox insisted in a statement that agents are unaware of the recipient's phone number unless the person sending the message includes it in the body of the message.
"All the operators are still bound to comply with the strict data protection requirements of their contracts," SpinVox told us.
But the agents do not always behave. In one notorious incident, now part of SpinVox company folklore, the agents took over the call centre.
What happened in Pakistan?
SpinVox sources describe a company struggling to cope with its own rapid growth. Allegations of unpaid expenses and unpaid agents abound. In one instance, these factors came to a head, with staff sending an "SOS message" to bemused phone users in North America.
"Dear customer. We are employees of SpinVox. We convert your messages here in Pakistan. Since SpinVox has stopped has paying us. We won’t be able to convert your messages from now onwards. There is no software that converts the messages. We humans do. – powered by SpinVox"
"A couple of girls were out there as trainers," sources told us. "They got them out quick. SpinVox cancelled the contract."
What now for SpinVox?
The SpinVox story is becoming increasingly confused. Domecq claims 3,000 agents are used by the company, while sources put the number much higher, between 8,000 and 10,000. The company stands by the figure of 3,000.
Questions have also been raised about the judgment of investors Goldman Sachs and Ariadne Capital, who between them injected $200m of other people's money into the company.
In an emotional blog posting, Ariadne's Julie Meyer played psychologist - musing about the motives of "a cheerleadership of malcontents" - and also played the gender card. Amongst many things, Meyer wrote that she loved CEO Christina Domecq's "search for excellence, and her driiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiive", and said SpinVox was "the first major technology success story out of Europe founded and led by a woman. You’ve got to love that too. Go girl!"
We asked SpinVox if, with so many unanswered questions, it was wise to go on the offensive. The company said that it "is under anonymous and malicious attack from a group of disgruntled former employees. After two weeks of investigation the company has identified nine of those who are responsible and has started legal action against them."
Meyer claimed SpinVox had "strong IP", "world class technology" and a "staggering" rate of innovation. Yet in five years, SpinVox has had just one patent approved in the United States: Doulton's application described above, for a human-powered call centre, was finally granted (No. 7,532,913) on May 12.
SpinVox's carrier customers don't make decisions based on sentiment or emotion, but on the assumption that the business can grow. As one insider told us:
"Carriers are going to go, 'Hang on a second!'. They are going to ask to be shown the technology that does this transcription. SpinVox won't be able to do that, because it doesn't exist." ®