Scientific report: "Word Alignment for Languages with Scarce Resources Using Bilingual Corpora of Other Language Pairs" pptx

... improve word alignment for languages with scarce resources using bilingual corpora of other language pairs. To perform word alignment between languages L1 and L2, we introduce a third language ...

Uploaded: 20/02/2014, 12:20

8
359
0

Scientific report: "Word Alignment with Synonym Regularization" doc

... framework for word alignment that incorporates synonym knowledge collected from monolingual linguistic resources in a bilingual probabilistic model. Synonym information is helpful for word alignment ... occurrences of ‘chief’ and ‘forefront’ with ‘head’ do sometimes harm with word alignment accuracy, and we have to model either the context or senses of words. We propos...

Uploaded: 20/02/2014, 04:20

5
470
2

Scientific report: "Inducing Gazetteers for Named Entity Recognition by Large-scale Clustering of Dependency Relations" ppt

... than 48 GB of memory is not widely available even today. Therefore, we parallelized the clustering algorithm, to make it suitable for running on a cluster of PCs with a moderate amount of memory ... Torisawa (2007), which encodes the matching with a gazetteer entity using IOB tags, with the modification for Japanese. They describe using two types of gazetteer features....

Uploaded: 20/02/2014, 09:20

9
428
0

Scientific report: "Discriminative Pruning for Discriminative ITG Alignment" pdf

... alignment system of GIZA++. 1 Introduction Inversion transduction grammar (ITG) (Wu, 1997) is an adaptation of SCFG to bilingual parsing. It does synchronous parsing of two languages with phrasal ... expanding the list of alignment hypotheses of minimal number of span pairs. The first type of pruning is equivalent to minimizing the number of hypernodes in...

Uploaded: 20/02/2014, 04:20

9
429
0

Scientific report: "Word representations: A simple and general method for semi-supervised learning" doc

... corpus of 160 million word tokens with a vocabulary size W of 70K word types. There are 2·W types of context (columns): The first or second W are counted if the word c occurs within a window of 10 ... EACL. Honkela, T. (1997). Self-organizing maps of words for natural language processing applications. Proceedings of the International ICSC Symposium on Soft Computing. Honkel...

Uploaded: 20/02/2014, 04:20

11
687
0

Scientific report: "Word to Sentence Level Emotion Tagging for Bengali Blogs" doc

... This tag weight for each emotion tag has been calculated based on the frequency of occurrence of an emotion tag with respect to the total number of occurrences of all six types of emotion tags ... Bengali part of speech tagger (Ekbal et al. 2008) based on Support Vector Machine (SVM) technique. The POS tagger was developed with a tagset of 26 POS tags, defined for...

Uploaded: 20/02/2014, 09:20

4
429
0

Scientific report: "Word Vectors and Two Kinds of Similarity" pptx

... semantic processing. Other methods use a variety of other information: cooccurrence of two words (Burgess, 1998; Schütze, 1998), occurrence of a word in the sense definitions of a dictionary (Kasahara ... kind of semantic similarity between words in the same level of categories or clusters of the thesaurus, in particular synonyms, antonyms, and other coordinates. Assoc...

Uploaded: 20/02/2014, 12:20

8
473
0

Scientific report: "Word Order in German: A Formal Dependency Grammar Using a Topological Hierarchy" pptx

... domain, with positions for all of its dependents, or a restricted phrase, which forms the verb cluster, with no positions for dependents other than predicative elements. These two kinds of phrases ... infinitives (with zu) and bare infinitives (without zu): Bare infinitives cannot form an embedded domain outside of the Vorfeld. Consequently, there are two different prosodie...

Uploaded: 20/02/2014, 18:20

8
575
0

Scientific report: "Word Translation Disambiguation Using Bilingual Bootstrapping" doc

... and repeatedly boosts the performances of the classifiers by further classifying data in each of the two languages and by exchanging between the two languages information regarding the classified ... data in both languages, (2) using the constructed classifiers in each of the languages to classify some unclassified data and adding them to the classified training data se...

Uploaded: 20/02/2014, 21:20

9
480
0

Scientific report: "WORD, PHRASE AND SENTENCE" pptx

... primarily concerned with analysis of language at the sentence level. The most glamourous areas of natural language research are at levels above the sentence, concerned with dialogues and ... level. Yet, with regard to most of the topics in this and other sessions, there is a strong sense of déjà vu; the earliest natural language studies featured automatic extrac...

Uploaded: 21/02/2014, 20:20

2
381
0