Scientific report: "Flow Network Models for Word Alignment and Terminology Extraction from Bilingual Corpora" docx

Scientific report: "Learning Expressive Models for Word Sense Disambiguation" pot

... words to the right and left of the verb, identified using POS tags, represented by has_narrow(snt, word_position, word): has_narrow(snt1, 1st_word_left, mind). has_narrow(snt1, 1st_word_right, ... preposition to the right, 1st and 2nd words to the left and right, 1st noun, 1st adjective, and 1st verb to the left and right. These are represented using definitions of the form has_collocation(snt, ... the models produced contain a small number of rules (from 6, for verbs with a few examples, to 88) and all knowledge sources are used across different rules and verbs. In general, results from...

Scientific report: "Soft Syntactic Constraints for Word Alignment through Discriminative Training" pot

... bilingual word alignment finds word-to-word connections across languages. Originally introduced as a byproduct of training statistical translation models in (Brown et al., 1993), word alignment ... new information resulting in improved alignments. 2 Constrained Alignment. Let an alignment be the complete structure that connects two parallel sentences, and a link be one of the word-to-word ... cohesion constraint: for every possible alignment, a corresponding binary constituency tree must exist for which the alignment maintains phrasal cohesion. Figure 2 shows a word alignment and the corresponding...

Document, scientific report: "Finding Synonyms Using Automatic Word Alignment and Measures of Distributional Similarity" pdf

... on Word Alignment. The alignment approach to synonym extraction is based on automatic word alignment. Context vectors are built from the alignments found in a parallel corpus. Each aligned word ... context and one using translational context based on word alignment and the combination of both. For both approaches, we used a cutoff n for each row in our word-by-context matrix. A word is ... extracted from automatic word alignment. We applied GIZA++ and the intersection heuristics as explained in section . From the word aligned corpora we extracted word type links, pairs of source and...

Document, scientific report: "Incremental Parsing Models for Dialog Task Structure" doc

... utterances and automatically annotated with part-of-speech tag and supertag information and named entities. They were annotated by hand for dialog acts and tasks/subtasks. The dialog act and task/subtask ... types of information provide rich clues for building dialog models (Grosz and Sidner, 1986). Dialog models can be built offline (for dialog mining and summarization), or online (for dialog ... $DA_{i-k}^{i-1}, c_{i-k}^{i-1})$ (4). Table 1: Equations used for modeling dialog act and subtask labeling of agent and user utterances. $c^u_i / c^a_i$ = the words, syntactic information and named entities associated with...

Scientific report: "Employing Topic Models for Pattern-based Semantic Class Discovery" doc

... modeling provides a formal and convenient way of grouping documents and words to topics. In order to apply topic models to our problem, we map RASCs to documents, items to words, and treat the output ... fewer “documents”, “words”, and “topics”. To further improve efficiency, we also perform preprocessing (refer to Section 3.4 for details) before building topic models for CR(q), where some ... Choose  from a Dirichlet distribution with parameter . 3. For each of the N words $w_i$: a. Choose a topic z from a Multinomial distribution with parameter . b. Pick a word $w_i$ from ,...

Scientific report: "Re-Ranking Models For Spoken Language Understanding Marco Dinarelli University of Trento Italy" potx

... (Shawe-Taylor and Cristianini, 2004) and tree kernels (Raymond and Riccardi, 2007; Moschitti and Bejan, 2004; Moschitti, 2006) to implicitly encode n-grams and other structural information in ... parameters, models and results of our experiments of word chunking and concept classification. Our baseline relates to the error rate of systems based on only FST and SVMs. The re-ranking models are ... (Bonneau-Maynard et al., 2005) for development and evaluation of spoken understanding models and linguistic studies. The corpus is composed of 1257 dialogs, from 250 different speakers, acquired...

Document, scientific report: "Conditional Random Fields for Word Hyphenation" docx

... conditional random fields. We create new training sets for English and Dutch from the CELEX European lexical resource, and achieve error rates for English of less than 0.1% for correctly allowed ... 1, $I(y_{i-1} = 1 \text{ and } y_i = 0 \text{ and } x_3 x_4 = \text{ph}) = 1$. All other similar functions have value 0: $I(y_{i-1} = 1 \text{ and } y_i = 1 \text{ and } x_2 x_3 = \text{yp}) = 0$, $I(y_{i-1} = 1 \text{ and } y_i = 0 \text{ and } x_2 x_3 = \text{yq}) = 0$, and so on. ... overall word-level errors, #words with at least one FP or FN; swe: serious word-level errors, #words with at least one FP; ower: overall word-level error rate, owe / (total #words); swer: serious word-level...

Document, scientific report: "Head-Driven Parsing for Word Lattices" ppt

... current evaluation metrics, and suggestions for new metrics. Experiments on strings and word lattices are reported in Section 5, and conclusions and opportunities for future work are outlined ... and better models of spoken language (Hall and Johnson, 2003; Roark, 2001; Chelba and Jelinek, 2000). Our goal is integration of head-driven lexicalized parsing with acoustic and n-gram models ... through various forms of extensions to the CKY algorithm, has been applied to word lattices for speech recognition (Hall and Johnson, 2003; Chappelier and Rajman, 1998; Chelba and Jelinek, 2000)....

Document, scientific report: "A Syntactic Framework for Speech Repairs and Other Disruptions" doc

... grammatical and processing framework for handling the repairs, hesitations, and other interruptions in natural human dialog. The proposed framework has proved adequate for a collection ... (um) and speech repairs (I mean) and give meta-comments on the utterance (right). specify how speech repairs should be handled by the parser. (Hindle, 1983) and (Bear et al., 1992) performed ... Dialog. From a traditional parsing perspective, a text is a series of sentences to be analyzed. An interpretation for a text would be a series of parse trees and logical forms, one for each...

Scientific report: "Exploiting Parallel Texts for Word Sense Disambiguation: An Empirical Study" potx

... Treebank, and Sinorama, is then given to GIZA++ to perform one word alignment run. It took about 40 hours on our 2.4 GHz machine with 2 GB memory to perform this alignment. After word alignment, ... for most of these nouns. The parallel text alignment approach works well for nature and sense, among these nouns. For nature, the parallel text alignment approach gives better accuracy, and ... are segmented into words. The resulting parallel texts are then input to the GIZA++ software (Och and Ney 2000) for word alignment. In the output of GIZA++, each English word token is aligned...
