Text to speech is getting emotional

WAKE UP!!!

In the early days of text to speech (TTS), the only requirement was that the listener could understand what was said. One of the best-known examples is Professor Stephen Hawking, the author of A Brief History of Time, who has for many years used a speech synthesiser that sounds Dalek-like. The other well-known example, although far fewer people have heard it, is the screen reader, such as JAWS or IBM Home Page Reader, used by many people with vision impairments.

These basic speech synthesisers provide a very valuable service to people with vision or speech impairments. However, advances in TTS are being driven by new mass-market applications such as mobile phones, in-car communications and sat-nav.

If you are sitting in a top-of-the-range Mercedes, you would not be impressed by a tinny voice telling you to "fasten your seat belt". The voice becomes part of the Mercedes corporate image and needs to be as smooth, unique and well mannered as the car. As more information is provided to the driver through speech, the quality of that speech has to improve. Words must be pronounced correctly in context: for example, "please close (cloze) the door" and "you are too close (clos) to the car in front" need different pronunciations. The intonation of sentences and words must also match the context.
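
The "close" example can be illustrated with a toy sketch. This is not how SVOX or any real TTS engine resolves homographs; the cue-word list and the function name `pronounce_close` are assumptions made purely for illustration, and the respellings are the ones used above.

```python
import re

# Respellings taken from the article: verb "cloze", adjective "clos".
PRONUNCIATIONS = {"verb": "cloze", "adjective": "clos"}

# Hypothetical cue words: an intensifier before "close", or "to" after it,
# suggests the adjective reading. A real engine would use far richer context.
INTENSIFIERS = {"too", "very", "so", "quite"}


def pronounce_close(sentence: str) -> str:
    """Return the respelling of 'close' chosen from its neighbouring words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if "close" not in words:
        raise ValueError("sentence does not contain 'close'")
    i = words.index("close")
    before = words[i - 1] if i > 0 else ""
    after = words[i + 1] if i + 1 < len(words) else ""
    if before in INTENSIFIERS or after == "to":
        return PRONUNCIATIONS["adjective"]
    return PRONUNCIATIONS["verb"]
```

Calling `pronounce_close` on each of the two sentences quoted above selects a different respelling for each. The rule is deliberately crude and would fail on many other sentences.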

SVOX AG is a Swiss-based company that specialises in developing TTS technology. It was founded in 2000 as a spin-off from the Swiss Federal Institute of Technology in Zurich (ETH Zurich). Its products are therefore the culmination of over 15 years of research and development. Switzerland is a multilingual country, and this is reflected in the whole architecture and philosophy of the products.

Having shown that it can produce high-quality voice output, SVOX's next step is to deal with the greater levels of complexity required in-car. These include:

  • How to deal with multiple languages: first, speaking in the preferred language of the driver; second, and the greater challenge, deciding how directions should be given as the car moves across borders. Should it say Munich or München and, in either case, should the name be pronounced as a local would say it (Mun-chen) or as the listener would read it (Munch-en)?
  • Avoiding overloading the on-board computer, which is typically multi-tasking between monitoring the state of the car and other activities (information about a likely engine failure should not be delayed by the TTS engine trying to decipher an SMS).
  • Avoiding overloading the driver (the command "Wake up" should override information that "the weather is going to get worse in half an hour").
  • How to add emotion to the voice ("Wake up" should be assertive, while "we are low on fuel but the next services are only 15 kilometres away" should be soothing and calm).
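
The last two points can be sketched as a small priority queue in which urgent messages jump ahead of routine ones and each message carries a tone. This is only an illustration under stated assumptions: the class names `Message` and `SpeechQueue`, the numeric priorities and the tone labels are all hypothetical and do not describe the SVOX engine.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Message:
    priority: int                      # lower number = more urgent (assumed convention)
    seq: int                           # tie-breaker that keeps arrival order
    text: str = field(compare=False)
    tone: str = field(compare=False)   # e.g. "assertive", "calm" (hypothetical labels)


class SpeechQueue:
    """Hands messages to the speech engine most-urgent first."""

    def __init__(self) -> None:
        self._heap: list[Message] = []
        self._counter = count()

    def push(self, text: str, priority: int, tone: str = "neutral") -> None:
        heapq.heappush(self._heap, Message(priority, next(self._counter), text, tone))

    def pop(self) -> Message:
        return heapq.heappop(self._heap)

    def __len__(self) -> int:
        return len(self._heap)


queue = SpeechQueue()
queue.push("The weather is going to get worse in half an hour", priority=5)
queue.push("Wake up", priority=0, tone="assertive")
first = queue.pop()    # the urgent, assertive message comes out first
second = queue.pop()   # the routine weather message follows
```

Although the weather report was queued first, the "Wake up" message is spoken first because of its lower priority number. Messages with equal priority are spoken in the order they arrived.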

The SVOX TTS engine is scalable between mobile, personal and server solutions, so as these new challenges are solved for the mass market, the results will become available to the specialised accessibility market. Soon we may hear Professor Hawking talking emotionally about how happy he is that he can speak with passion, or love, or frustration about his latest discoveries; or even just shout at one of his students: "Wake up to the possibilities of TTS!"

Copyright © 2006, IT-Analysis.com
