a stochastic language model using dependency

Scientific report document: "A Phonotactic Language Model for Spoken Language Identification" pptx

Scientific report

... NIST Language Recognition Evaluation database. 1 Introduction Spoken language and written language are similar in many ways. Therefore, much of the research in spoken language identification, ... 2003. Acoustic, Phonetic and Discriminative Approaches to Automatic language recognition, In Proc. of Eurospeech. Masahide Sugiyama. 1991. Automatic language recognition using acoustic features, ... of acoustic vocabulary (AV) with mixture of token unigram, bigram, and trigram: a) AV1: 32 broad class phonemes as unigram, selected from 12 languages, also referred to as P-ASM as detailed...
Scientific report document: "A Structured Language Model" ppt

Scientific report

... Proceedings of the Human Language Technology Workshop, 272-277. ARPA. Raymond Lau, Ronald Rosenfeld, and Salim Roukos. 1993. Trigger-based language models: a maximum entropy approach. In Proceedings ... University, Baltimore, MD. Frederick Jelinek, John Lafferty, David M. Magerman, Robert Mercer, Adwait Ratnaparkhi, Salim Roukos. 1994. Decision Tree Parsing using a Hidden Derivational Model. ... those assigned manually in the Penn Treebank (Marcus95) after undergoing headword percolation and binarization. All four LMs predict a word wk and they were implemented using the Maximum...
Scientific report: "SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations" pptx

Scientific report

... second dataset contains three annotated presidential debates (Boydstun et al., 2011) between Barack Obama and John McCain and a vice presidential debate between Joe Biden and Sarah Palin. Each ... Quintana, F. A. (2004). Nonparametric Bayesian data analysis. Statistical Science, 19(1):95–110. [Murray et al., 2005] Murray, G., Renals, S., and Carletta, J. (2005). Extractive summarization of meeting ... moderator. Similarly, the “Question” speaker had a relatively high variance, consistent with an amalgamation of many distinct speakers. These topic shift tendencies suggest that all candidates manage to...
Scientific report: "A Discriminative Language Model with Pseudo-Negative Samples" pptx

Scientific report

... DLMs are trained using correct sentences from a corpus and negative examples from a Pseudo-Negative generator. An advantage of sampling is that as many negative examples can be collected as correct ... that they have the disadvantage of being computationally expensive, and not all relevant features can be included. A discriminative language model (DLM) assigns a score to a sentence, measuring ... specific applications and therefore were able to obtain real negative examples easily. For example, Roark (2007) proposed a discriminative language model, in which a model is trained so that a correct...
Scientific report: "Combining a Statistical Language Model with Logistic Regression to Predict the Lexical and Syntactic Difficulty of Texts for FFL" potx

Scientific report

... features, as described below: a statistical language model and a measure of tense difficulty. 4.1 The language model The lexical difficulty of a text is quite an elaborate phenomenon to parameterise. ... poems as outliers). 4 Selection of lexical and syntactic variables Any text classification tasks require an object (here a text) to be parameterised into variables, whether qualitative or quantitative. ... Belgium thomas.francois@uclouvain.be Abstract Reading is known to be an essential task in language learning, but finding the appropriate text for every learner is far from easy. In this context, automatic...
Scientific report document: "A probabilistic generative model for an intermediate constituency-dependency representation" pptx

Scientific report

... Portugal. Federico Sangati and Chiara Mazza. 2009. An English Dependency Treebank à la Tesnière. In The 8th International Workshop on Treebanks and Linguistic Theories, pages 173–184, Milan, ... (92). Michael J. Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania. Marie-Catherine de Marneffe and Christopher D. Manning. ... coordination, a linguistic phenomena highly abundant in natural language production, but often neglected when it comes to evaluating parsing resources. We have therefore proposed a special evaluation...
Scientific report document: "A Large Scale Distributed Syntactic, Semantic and Lexical Language Model for Machine Translation" doc

Scientific report

... significantly. Bear in mind that Charniak et al. (2003) integrated Charniak’s language model with the syntax-based translation model Yamada and Knight proposed (2001) to rescore a tree-to-string ... Stochastic analysis of lexical and semantic enhanced structural language model. The 8th International Colloquium on Grammatical Inference (ICGI), 97-111. K. Yamada and K. Knight. 2001. A syntax-based ... (EMNLP), 858-867. E. Charniak. 2001. Immediate-head parsing for language models. The 39th Annual Conference on Association of Computational Linguistics (ACL), 124-131. E. Charniak, K. Knight and K. Yamada. 2003....
Scientific report document: "Smoothing a Tera-word Language Model" doc

Scientific report

... and Linda C. Bauman Peto. 1995. A hierarchical Dirichlet language model. Natural Language Engineering, 1(3):1–19. Y.W. Teh. 2006. A hierarchical Bayesian language model based on Pitman-Yor processes. ... n-grams: C(ab) − C(ab∗). A(ab) = max(1, K(C(ab) − C(ab∗))) A different K constant is chosen for each n-gram order. Using this formulation as an interpolated 5-gram language model gives a cross ... Speech and Language. R. Kneser and H. Ney. 1995. Improved backing-off for m-gram language modeling. In International Conference on Acoustics, Speech, and Signal Processing. David J. C. Mackay and...
Scientific report document: "A Succinct N-gram Language Model" ppt

Scientific report

... compression tasks achieved a significant compression rate without any loss. 1 Introduction There has been an increase in available N-gram data and a large amount of web-scaled N-gram data has been ... the ACL-IJCNLP 2009 Conference Short Papers, pages 341–344, Suntec, Singapore, 4 August 2009. ©2009 ACL and AFNLP A Succinct N-gram Language Model Taro Watanabe Hajime Tsukada Hideki Isozaki NTT ... Communication Science Laboratories 2-4 Hikaridai Seika-cho Soraku-gun Kyoto 619-0237 Japan {taro,tsukada,isozaki}@cslab.kecl.ntt.co.jp Abstract Efficient processing of tera-scale text data is an important...
Scientific report document: "Lexical transfer using a vector-space model" doc

Scientific report

... using matrix PRICAI-00, 2000, (to appear). Tanaka H. (1995) Statistical Learning of “Case Frame Tree” for Translating English Verbs, Journal of NLP, 2/3, pp. 49-72, (in Japanese). Yamada, ... Laboratories 2-2 Hikaridai, Seika, Soraku Kyoto 619-0288, Japan sumita@slt.atr.co.jp Abstract Building a bilingual dictionary for transfer in a machine translation system is conventionally ... generalization (Akiba et al., 1996 and Tanaka, 1995); (2) approaches using structural matching: to obtain transfer rules, several search methods have been proposed for maximal structural matching between...
Scientific report document: "Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model" pptx

Scientific report

... 923 Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model Masaaki NAGATA NTT Information and Communication Systems Laboratories 1-1 Hikari-no-oka Yokosuka-Shi ... such as Japanese and Chinese. It consists of a statistical OCR model, an approximate word matching method using character shape similarity, and a word segmentation algorithm using a statistical ... Yokosuka-Shi Kanagawa, 239-0847 Japan nagata@nttnly.isl.ntt.co.jp Abstract We present a novel OCR error correction method for languages without word delimiters that have a large character...
