... June 2005. ©2005 Association for Computational Linguistics. A Phonotactic Language Model for Spoken Language Identification. Haizhou Li and Bin Ma, Institute for Infocomm Research, Singapore ... the 1996 NIST Language Recognition Evaluation database. 1 Introduction. Spoken language and written language are similar in many ways. Therefore, much of the research in spoken language identification, ... Chen of Institute for Infocomm Research for insightful discussions. References. Jerome R. Bellegarda. 2000. Exploiting latent semantic information in statistical language modeling. In Proc. ...
... Language", Cognition, Vol. 2, No. 1, pp. 15-47. MADCOW (1992). "Multi-site Data Collection for a Spoken Language Corpus", in Proceedings of the DARPA Speech and Natural Language ... could also be enforced by the parse preference component de- ... GEMINI: A NATURAL LANGUAGE SYSTEM FOR SPOKEN-LANGUAGE UNDERSTANDING* John Dowding, Jean Mark Gawron, Doug Appelt, John Bear, ... Internet: dowding@ai.sri.com 1. INTRODUCTION Gemini is a natural language (NL) understanding system developed for spoken language applications. This paper describes the details of the system,...
... Distributed language modeling for N-best list re-ranking. The 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP), 216-223. Y. Zhang, 2008. Structured language models for statistical ... Syntax-based language models for statistical machine translation. MT Summit IX., Intl. Assoc. for Machine Translation. C. Chelba and F. Jelinek. 1998. Exploiting syntactic structure for language modeling. ... Large language models in machine translation. The 2007 Conference on Empirical Methods in Natural Language Processing (EMNLP), 858-867. E. Charniak. 2001. Immediate-head parsing for language models. ...
... performance on the statistical spoken language understanding (SLU) problem. The statistical natural language parsers trained on text perform unreliably to encode non-local information on spoken ... selection algorithm is very efficient for both performance and time complexity. 2 Spoken Language Understanding as a Sequential Labeling Problem. 2.1 Spoken Language Understanding. The goal of SLU ... (the unstructured model) is better than CRF (the structured model) not only for time cost but also for the performance on our experiment. This result shows that local information provides...
... number. • For verbs, generated forms had to match the original form for tense and negation. • For adjectives, generated forms had to match the original form for degree of comparison and negation. • For ... exponential models with surface and lemma features can be straightforwardly trained for all of them. For the experiments described below we trained an exponential model for the p(Y | X) lexical model. ... generated forms had to match the original form for number, case, and gender. • Non-standard inflection forms for all POS were excluded. The following criteria were used to select rules for which...
... Markov language model, and a simple set of unification grammar rules for the Chinese language, although the present model is in fact language independent. The system is written in C language ... signal preprocessor is included to form a complete speech recognition system. The language processor consists of a language model and a parser. The language model properly integrates the unification ... summarized. The Language Model. The goal of the language model is to participate in the selection of candidate constituents for a sentence to be identified. The proposed language model is composed...
... probability evaluated by the Conceptual Language Model, described in the next section. 2.1 Stochastic Conceptual Language Model (SCLM). An SCLM is an n-gram language model built on semantic tags. Using ... 202–210, Athens, Greece, 30 March – 3 April 2009. ©2009 Association for Computational Linguistics. Re-Ranking Models For Spoken Language Understanding. Marco Dinarelli, University of Trento, Italy, dinarelli@disi.unitn.it. Alessandro ... convenient way could improve the SLU performance. The best choice in this case is a discriminative model, since it allows for the use of informative features, which, in turn, can model easily feature dependencies...
... consistently outperforms the similarity-based baseline on all the lecture datasets. We attribute this gain to the presence of more attenuated topic transitions in spoken language. Since spoken language is ... did not try to adjust our model to optimize its performance on the synthetic data. The smoothing method developed for lecture segmentation may not be appropriate for short segments ranging from ... increase for the UI system. We attribute this feature to the fact that the model is less dependent on individual recognition errors, which have a detrimental effect on the local segment language modeling...
... Domain A class is defined for each constant of PAL. A class object for a lexical item contains linguistic knowledge in a procedural form. In other words, a class contains information as to how a ... answered based on a predetermined set theoretical model. For example, a noun is interpreted as a set of entities; the noun "penguin", for instance, is interpreted as a set of all penguins. ... to share methods for these cases. Any exceptional method can be attached to lower level items. For example, we can define a class "action verb" which has methods for instrumental...
... Computational Linguistics. Grammar Approximation by Representative Sublanguage: A New Model for Language Learning. Smaranda Muresan, Institute for Advanced Computer Studies, University of Maryland, College ... have formally defined the ILP-learning problem as the tuple , where is the provability relation (also called the generalization model), is the language of the background knowledge, is the language ... only for very limited subclasses of first-order logic (Kietz and Džeroski, 1994; Cohen, 1995), which are not appropriate to model natural language grammars. Our grammar induction problem can be formulated...
... Sessions, pages 109–112, Ann Arbor, June 2005. ©2005 Association for Computational Linguistics. A Flexible Stand-Off Data Model with Query Language for Multi-Level Annotation. Christoph Müller, EML Research ... Germany, mueller@eml-research.de. Abstract. We present an implemented XML data model and a new, simplified query language for multi-level annotated corpora. The new query language involves automatic conversion of queries ... language. It offers a simpler and more concise way to formulate certain types of queries for multi-level annotated corpora. Queries are automatically converted into the underlying query language and...
... sir for how many people please" Figure 3: Structure for (1). 3.2 Splitting input into well-formed parts and ill-formed parts. Item (C) splits input into well-formed parts and ill-formed ... Introduction. A spoken-language translation system requires the ability to treat long or ill-formed input. An utterance given as input to a spoken-language translation system is not always one well-formed ... Since our splitting method is performed under left-to-right parsing, translation efficiency is not ... Splitting Long or Ill-formed Input for Robust Spoken-language Translation. Osamu FURUSE...