Contemporary speech synthesis is perceived as unsuitable for general adoption in user interaction, largely because it rests on an inadequate model of human speech production and perception. This book reviews the underlying model, brings out areas of inadequacy and suggests how improvements might be made. It argues that a greater understanding of the fine detail of speech will enable new research and application initiatives. The authors draw on their extensive experience in both theoretical and applied research to bring forward proposals for producing more natural-sounding synthetic speech.
- Provides an overview of current work in speech synthesis, with a critical review of markup systems (including XML and SSML) embedded in interactive applications.
- Argues that naturalness in synthesis will benefit from enhanced underlying models of prosody which account more accurately for the properties of human speech, yet can still be productively transferred to speech synthesis.
- Emphasises the importance of an explicit and extensible architecture as the basis for future developments, stressing particularly the importance of close modelling of expressive and emotive content – key features of naturalness.
- Focuses on the dynamic nature of prosody, as opposed to the more usual static treatment, especially as an adaptive model compliant with pragmatic and environmental constraints.
Developments in Speech Synthesis provides the basis for a comprehensive approach that will appeal to speech synthesis and language technology engineers specialising in building dialogue systems. It will also be an invaluable resource for computer science and engineering students at both advanced undergraduate and postgraduate levels, as well as researchers in the general field of speech synthesis.