SPEECH AND LANGUAGE TECHNOLOGIES

Document information

SPEECH AND LANGUAGE TECHNOLOGIES
Edited by Ivo Ipšić

Published by InTech
Janeza Trdine 9, 51000 Rijeka, Croatia

Copyright © 2011 InTech

All chapters are Open Access articles distributed under the Creative Commons Non Commercial Share Alike Attribution 3.0 license, which permits users to copy, distribute, transmit, and adapt the work in any medium, so long as the original work is properly cited. After this work has been published by InTech, authors have the right to republish it, in whole or in part, in any publication of which they are the author, and to make other personal use of the work. Any republication, referencing or personal use of the work must explicitly identify the original source.

Statements and opinions expressed in the chapters are those of the individual contributors and not necessarily those of the editors or publisher. No responsibility is accepted for the accuracy of information contained in the published articles. The publisher assumes no responsibility for any damage or injury to persons or property arising out of the use of any materials, instructions, methods or ideas contained in the book.

Publishing Process Manager: Iva Lipovic
Technical Editor: Teodora Smiljanic
Cover Designer: Jan Hyrat
Image Copyright Marko Poplasen, 2010. Used under license from Shutterstock.com

First published June, 2011
Printed in India

A free online edition of this book is available at www.intechopen.com
Additional hard copies can be obtained from orders@intechweb.org

Speech and Language Technologies, Edited by Ivo Ipšić
p. cm.
ISBN 978-953-307-322-4

Free online editions of InTech Books and Journals can be found at www.intechopen.com

Contents

Preface

Part 1 Machine Translation
Chapter 1 Towards Efficient Translation Memory Search Based on Multiple Sentence Signatures
Juan M. Huerta
Chapter 2 Sentence Alignment by Means of Cross-Language Information Retrieval
Marta R. Costa-jussà and Rafael E. Banchs
Chapter 3 The BBN TransTalk Speech-to-Speech Translation System
David Stallard, Rohit Prasad, Prem Natarajan, Fred Choi, Shirin Saleem, Ralf Meermeier, Kriste Krstovski, Shankar Ananthakrishnan and Jacob Devlin

Part 2 Language Learning
Chapter 4 Automatic Feedback for L2 Prosody Learning
Anne Bonneau and Vincent Colotte
Chapter 5 Exploring Speech Technologies for Language Learning
Rodolfo Delmonte

Part 3 Language Modeling
Chapter 6 N-Grams Model for Polish
Bartosz Ziółko and Dawid Skurzok

Part 4 Text to Speech Systems and Emotional Speech
Chapter 7 Multilingual and Multimodal Corpus-Based Text-to-Speech System – PLATTOS –
Matej Rojc and Izidor Mlakar
Chapter 8 Estimation of Speech Intelligibility Using Perceptual Speech Quality Scores
Kazuhiro Kondo
Chapter 9 Spectral Properties and Prosodic Parameters of Emotional Speech in Czech and Slovak
Jiří Přibil and Anna Přibilová
Chapter 10 Speech Interface Evaluation on Car Navigation System – Many Undesirable Utterances and Severe Noisy Speech –
Nobuo Hataoka, Yasunari Obuchi, Teppei Nakano and Tetsunori Kobayashi

Part 5 Speaker Diarization
Chapter 11 A Review of Recent Advances in Speaker Diarization with Bayesian Methods
Themos Stafylakis and Vassilis Katsouros
Chapter 12 Discriminative Universal Background Model Training for Speaker Recognition
Wei-Qiang Zhang and Jia Liu

Part 6 Applications
Chapter 13 Building a Visual Front-end for Audio-Visual Automatic Speech Recognition in Vehicle Environments
Robert Hursig and Jane Zhang
Chapter 14 Visual Speech Recognition
Ahmad B. A. Hassanat
Chapter 15 Towards Augmentative Speech Communication
Panikos Heracleous, Denis Beautemps, Hiroshi Ishiguro and Norihiro Hagita
Chapter 16 Soccer Event Retrieval Based on Speech Content: A Vietnamese Case Study
Vu Hai Quan
Chapter 17 Voice Interfaces in Art – an Experimentation with Web Open Standards as a Model to Increase Web Accessibility and Digital Inclusion
Martha Gabriel

Preface

The book “Speech and Language Technologies” addresses state-of-the-art systems and achievements in various topics in the research field of speech and language technologies. Book chapters are organized in different sections covering diverse problems, which have to be solved in speech recognition and language understanding systems.

In the first section, machine translation systems based on large parallel corpora using rule-based and statistical-based translation methods are presented. The third chapter presents work on real-time two-way speech-to-speech translation systems.

In the second section, two papers explore the use of speech technologies in language learning.

The third section presents work on language modeling used for speech recognition.

The chapters in the section Text-to-speech systems and emotional speech describe corpus-based speech synthesis and highlight the importance of speech prosody in speech recognition.

In the fifth section, the problem of speaker diarization is addressed.

The last section presents various topics in speech technology applications, like audio-visual speech recognition and lip reading systems.

I would like to thank all authors who have contributed research and application papers from the field of speech and language technologies.

Ivo Ipšić
University of Rijeka
Croatia

[...]
...queries and documents are typically transformed into a suitable representation. One of the most popular representations is the vector space model, where documents and queries are represented as vectors, each dimension corresponding to a separate term. Usually, terms are weighted with the term frequency and inverse...

Fig. 2. Machine translation approaches

...different meaning depending on context, etc. RBMT technology applies a set of linguistic rules in three different phases: analysis, transfer and generation. Therefore, a rule-based system requires: syntax analysis, semantic analysis, syntax generation and semantic generation. Statistical Machine Translation (SMT), a corpus-based approach, is a more complicated...

...service that provides automatic translation among several language pairs including the four Spanish languages plus English, Portuguese and French. See Figure 3. Besides, Opentrad is...

2 http://www.opentrad.com/

Language: English. Sentence example: The entire wealth of the country in its different forms, irrespective of ownership, shall be subordinated...

...worth noticing the high quality of cross-language sentence matching using the query translation approach. This high quality is mainly due to the quality of translation. Figure 6 shows some examples of the system performance.

4 http://lucene.apache.org/solr/tutorial.html

Fig. 5. SOLR screenshot
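The vector space representation mentioned in the first excerpt above can be illustrated with a small sketch. It is not the chapter's implementation: it assumes the truncated phrase refers to standard TF-IDF weighting, and the function names (`tfidf_vectors`, `cosine`, `rank`), the smoothed IDF formula `log(N/df) + 1`, and whitespace-tokenized input are all hypothetical choices made here for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    # Assumption: smoothed IDF, so a term found in every document keeps weight 1.
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append({term: tf[term] * idf[term] for term in tf})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query_tokens, docs):
    """Score every document against the query; best match first."""
    vectors, idf = tfidf_vectors(docs)
    tf = Counter(t for t in query_tokens if t in idf)  # ignore unseen terms
    query_vec = {term: tf[term] * idf[term] for term in tf}
    scores = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, reverse=True)
```

Each dimension of a vector corresponds to one term, as in the excerpt; storing vectors as dictionaries simply avoids materializing the zero entries. In a cross-language setting the query would first be translated into the document language, a step this sketch leaves out.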
The detect language option automatically determines the language of the text the user is translating. The accuracy of the automatic language detection increases with the amount of text entered. Google is constantly working to support more languages and introduce them as soon as the automatic translation meets their standards. In order to develop...

...highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world’s largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat...

...For example, one can run with a standard k for hypotheses larger than 10 and a smaller k for hypotheses smaller than or equal to 10. One can see from the distribution of the data that the overall result of this first signature step is the elimination of between 70% and more than 90% of the possible candidates, depending on the specific memory distribution.

Fig. 2. Histogram of...

...if one wanted to obtain 2700 sentences, one can use cutoffs 2 and 5 and run in 271 seconds, or alternatively use 4 and 4 and run in 898 seconds. The typical configuration attains 2770 sentences (cutoffs 2 and 5) in 271 seconds for an input of size 7795, which means 34 ms per query. This response time is typical of, or faster than, a translation engine, and this allows our approach to be a feasible runtime technique...
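The length-dependent cutoff k described in the excerpt above can be sketched as a cheap pre-filter. The excerpt does not say which signature the first step uses, so this sketch assumes it is the sentence length in words; the function name `length_filter` and the default values `k_long=4` and `k_short=2` are hypothetical and not taken from the chapter. The filter relies on a general fact: the edit distance between two sequences is never smaller than the difference of their lengths, so any candidate whose length differs by more than k cannot be within distance k.

```python
def length_filter(hypothesis, memory, k_long=4, k_short=2, short_len=10):
    """Keep only memory sentences that could lie within edit distance k.

    Assumption: sentences are compared at word level, and k depends on the
    hypothesis length (a smaller k for hypotheses of at most `short_len` words).
    """
    hyp_len = len(hypothesis.split())
    k = k_long if hyp_len > short_len else k_short
    # |len(A) - len(B)| is a lower bound on the edit distance between A and B,
    # so candidates beyond that bound are discarded without running the full DP.
    return [s for s in memory if abs(len(s.split()) - hyp_len) <= k]
```

Only the sentences that survive this filter would need the more expensive edit distance computation.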
...on the string edit distance computation, and then we present our approach, which focuses on speeding up the translation memory search using increasingly stringent sentence signatures. We then describe how to implement our approach using a Map/Reduce framework, and we conclude with experiments that illustrate the advantages of our method.

2 Translation memory search...

... (1.1)

The initial condition is D[0,0]=0. The edit distance between A and B is found in the lower right cell of the matrix, D[m,n]. We can see that the computation of the Dynamic Programming can be carried out in practice by filling out the columns (j) of the DP array. Figure 1 below shows the DP matrix between sentences Sentence1=”A B C A A” and Sentence2=”D...
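The dynamic programming computation described above can be sketched as follows. Equation (1.1) is not recoverable from the excerpt, so this is the textbook Levenshtein recurrence with unit costs for insertion, deletion and substitution, which is an assumption and may differ from the chapter's cost definition. The function name `edit_distance` and the word-level tokenization are also choices made here.

```python
def edit_distance(a, b):
    """Word-level edit distance between token lists a (length m) and b (length n)."""
    m, n = len(a), len(b)
    # D[i][j] holds the distance between the first i tokens of a
    # and the first j tokens of b; D[0][0] = 0 is the initial condition.
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = i  # delete i tokens
    for j in range(1, n + 1):
        D[0][j] = j  # insert j tokens
    # Fill the array column by column (index j), as the text describes.
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            substitution = 0 if a[i - 1] == b[j - 1] else 1  # assumed unit cost
            D[i][j] = min(
                D[i - 1][j] + 1,                 # deletion
                D[i][j - 1] + 1,                 # insertion
                D[i - 1][j - 1] + substitution,  # match or substitution
            )
    return D[m][n]  # lower right cell
```

For example, `edit_distance("A B C A A".split(), "A B D A".split())` compares the excerpt's Sentence1 with a made-up second sentence (the original Sentence2 is truncated in the excerpt). The cost is O(m·n) per sentence pair, which is why discarding most candidates before this step matters.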

Date posted: 29/06/2014, 13:20

Table of contents

preface_Speechnovo

part 1 Machine Translation

01_Towards Efficient Translation Memory Search Based on Multiple Sentence Signatures

02 Sentence Alignment by Means of Cross-Language Information Retrieval

03_The BBN TransTalk Speech-to-Speech Translation System

part 2 Language Learning

04_Automatic Feedback for L2 Prosody Learning

05_Exploring Speech Technologies for Language Learning

part 3 Language Modeling

06 N-Grams Model For Polish

part 4 Text to Speech Systems and Emotional Speech

07_Multilingual and Multimodal Corpus-Based Text-to-Speech System – PLATTOS –

08 Estimation of Speech Intelligibility Using Perceptual Speech Quality Scores

09_Spectral Properties and Prosodic Parameters of Emotional Speech in Czech and Slovak

10_Speech Interface Evaluation on Car Navigation System – Many Undesirable Utterances and Severe Noisy Speech –

part 5 Speaker Diarization

11 A Review of Recent Advances in Speaker Diarization with Bayesian Methods

12 Discriminative Universal Background Model Training for Speaker Recognition

part 6 Applications

13_Building a Visual Front-end for Audio-Visual Automatic Speech Recognition in Vehicle Environments
