Foresights
Speech synthesis

What is it?
Speech synthesis is the artificial production of human speech by computer systems through a combination of hardware and software. Speech synthesis systems typically join together pieces of recorded speech to create sentences, and the size of the stored speech units dictates the overall quality and realism of the output. Systems that store entire words provide clarity at the expense of vocabulary size and range, but are useful for domain-specific applications.
Most speech synthesis systems today assemble words by concatenating phones and diphones, the individual phonetic building blocks of human speech stored by the system. Other approaches use acoustic modeling (formant synthesis) or simulated models of the human vocal tract (articulatory synthesis), but they are less widely used. Formant synthesis does not require a large sample database, which makes it useful in small-footprint and embedded applications.
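The concatenative approach described above can be illustrated with a minimal sketch. It assumes a diphone is the unit spanning two adjacent phones, pads the utterance with a silence marker, and looks each unit up in a stored inventory. The names `SILENCE`, `phones_to_diphones`, `concatenate` and the toy inventory are hypothetical, and the "recordings" are placeholder numbers rather than real audio. A real system would also smooth the joins and adjust pitch and duration, which this sketch leaves out.

```python
from typing import Dict, List, Tuple

Diphone = Tuple[str, str]

SILENCE = "_"  # hypothetical marker for silence at the utterance boundaries


def phones_to_diphones(phones: List[str]) -> List[Diphone]:
    """Pair each phone with its successor, padding both ends with silence."""
    padded = [SILENCE] + list(phones) + [SILENCE]
    return list(zip(padded, padded[1:]))


def concatenate(diphones: List[Diphone],
                inventory: Dict[Diphone, List[float]]) -> List[float]:
    """Join the stored samples for each diphone, in order, into one waveform."""
    samples: List[float] = []
    for unit in diphones:
        if unit not in inventory:
            raise KeyError(f"no stored recording for diphone {unit}")
        samples.extend(inventory[unit])
    return samples


# Toy inventory: placeholder sample values standing in for recorded audio.
toy_inventory: Dict[Diphone, List[float]] = {
    ("_", "k"): [0.0, 0.1],
    ("k", "ae"): [0.2, 0.3],
    ("ae", "t"): [0.4, 0.5],
    ("t", "_"): [0.6, 0.0],
}

units = phones_to_diphones(["k", "ae", "t"])
waveform = concatenate(units, toy_inventory)
```

Because every unit must already exist in the inventory, the sketch also shows why the size and coverage of the stored database matter for this approach: a missing diphone raises an error instead of producing speech.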
Speech synthesis systems have existed for many years, but technology improvements promise to expand the market for the technology into new application domains and market sectors. In particular, better software algorithms for accentuation, pitch and rhythm determine how natural and realistic the synthesized speech sounds.
To learn more about speech synthesis, download our full Foresight PDF file (252 KB).