Bilingual bootstrapping for WSD

Scientific report document: "Semi-Supervised Learning of Partial Cognates using Bilingual Bootstrapping" doc

Scientific report

... in the Monolingual Bootstrapping (MB) and Bilingual Bootstrapping (BB) sections. 5.2.1 Monolingual Bootstrapping The monolingual bootstrapping algorithm that we used for experiments on French ... steps 2 and 3 for t times endFor For the first step of the algorithm we used the NB-K classifier because it was the classifier that consistently performed better. We chose to perform attribute ... sentences was limited by the value of parameter k in the algorithm. 5.2.2 Bilingual Bootstrapping The algorithm for bilingual bootstrapping that we propose and tried in our experiments is: 1....
  • 8
  • 418
  • 1
Scientific report document: "Word Translation Disambiguation Using Bilingual Bootstrapping" doc

Scientific report

... disambiguation method using a bootstrapping technique called Bilingual Bootstrapping. Experimental results indicate that BB significantly outperforms the existing Monolingual Bootstrapping technique ... context information for translation disambiguation. For an English word ε, we define a binary classifier for resolving each of its translation ambiguities in T_ε in a general form as: ... We perform Bilingual Bootstrapping as described in Figure 2. Hereafter, we will only explain the process for English (left-hand side); the process for Chinese (right-hand...
  • 9
  • 480
  • 0
Scientific report: "A Bilingual Concordancer for Domain-Specific Computer Assisted Translation" potx

Scientific report

... Association for Computational Linguistics, pages 55–60, Jeju, Republic of Korea, 8-14 July 2012. © 2012 Association for Computational Linguistics DOMCAT: A Bilingual Concordancer for Domain-Specific ... web-based bilingual concordancer, DOMCAT, for domain-specific computer assisted translation. Given a multi-word expression as a query, the system involves retrieving sentence pairs from a bilingual ... characteristics of normalized frequency and is adjusted for spotting rare translations. These characteristics are especially important for a domain-specific bilingual concordancer to spot translation...
  • 6
  • 371
  • 0
Scientific report: "Automatically Creating Bilingual Lexicons for Machine Translation from Bilingual Text" ppt

Scientific report

... lexicon was used for content words. Only a bilingual lexicon for closed class words and a set of bilingual templates were used. Therefore, new bilingual entries were obtained for all the content ... intervention is foreseen. Furthermore, techniques for the automatic resolution of template overlaps are under investigation. Such techniques assume the presence of a bilingual lexicon. The information ... pre-existing bilingual lexicon is in use, bilingual entries are prioritized over bilingual templates. Consequently, only new entries are created, the others being retrieved from the existing bilingual...
  • 8
  • 375
  • 0
Document: POS-Tagger for English-Vietnamese Bilingual Corpus pdf

Electrical - Electronics

... 4). For example, “can” in English may be “Aux” for the ability sense, “V” for the to-make-a-container sense, and “N” for the container sense, and there is hardly any existing POS-tagger which can tag POS for ... exploited all linguistic information in English texts and there is no way for us to improve the English POS-tagger in the case of such monolingual English texts. By contrast, in the bilingual texts, we ... no POS-annotated corpus available for Vietnamese, we had to manually build a small golden corpus for Vietnamese POS-tagging with approximately 1000 words for evaluation. The results of Vietnamese...
  • 8
  • 676
  • 1
Scientific report document: "Exploring Syntactic Structural Features for Sub-Tree Alignment using Bilingual Tree Kernels" docx

Scientific report

... experiment is shown in Table 1. We use 5000 sentences for experiment and divide them into three parts, with 3k for training, 1k for testing and 1k for tuning the parameters of kernels and thresholds ... the performance of both phrase and syntax based SMT systems. 2 Bilingual Tree Kernels In this section, we propose the two BTKs and study their capability and complexity in modeling the bilingual ...  is the distance for the instances classified as aligned and  is that for the unaligned. We use |, as the confidence to conduct the sure links for those classified...
  • 10
  • 467
  • 0
Scientific report document: "Bilingual Sense Similarity for Statistical Machine Translation" ppt

Scientific report

... disambiguation (WSD) techniques in SMT for translation selection. However, WSD techniques for SMT do so indirectly, using source-side context to help select a particular translation for a source ... of the Association for Computational Linguistics, pages 834–843, Uppsala, Sweden, 11-16 July 2010. © 2010 Association for Computational Linguistics Bilingual Sense Similarity for Statistical Machine ... co-occurrence counts of the two units. Therefore, questions emerge: how good is the sense similarity computed via VSM for two units from parallel corpora? Is it useful for multi-lingual applications,...
  • 10
  • 594
  • 0
Scientific report document: "HITS-based Seed Selection and Stop List Construction for Bootstrapping" doc

Scientific report

... of each system. Due to lack of space, only the results for the second most frequent sense for each word are reported; i.e., the results for more minor senses are not in the table. However, they ... improvement over the baseline on interest, for all n = 10, 20, and 30. 6 Conclusions We have proposed a HITS-based method for alleviating semantic drift in the bootstrapping algorithm Espresso. Our ... method on other bootstrapping tasks, including named entity extraction. Acknowledgements We thank Masayuki Asahara and Kazuo Hara for helpful discussions and the anonymous reviewers for valuable...
  • 7
  • 382
  • 0
Scientific report document: "Word Alignment for Languages with Scarce Resources Using Bilingual Corpora of Other Language Pairs" pptx

Scientific report

... are two parameters for the distortion probability: one for head words and the other for non-head words. Distortion Probability for Head Words The distortion probability for head words represents ... approach for languages with scarce resources using bilingual corpora of other language pairs. To perform word alignment between languages L1 and L2, we introduce a pivot language L3 and bilingual ... (2005). λ_t represents the weights for translation probability. λ_n represents the weights for fertility probability. λ_d3 and λ_d4 represent the weights for distortion probability in model...
  • 8
  • 359
  • 0
Scientific report document: "Adaptive String Distance Measures for Bilingual Dialect Lexicon Induction" pdf

Scientific report

... only use single word information for training and testing, which means that the rich contextual information encoded in texts, as well as the morphologic and syntactic information available in ... cost: a training bilingual lexicon of sufficient size must be available. For scarce resource languages, such lexicons often need to be built manually. 3.3 Training without a Bilingual Corpus In ... tests were performed on 10 corpora of 100 word pairs each. The numbers represent the percentage of correctly induced word pairs. similar words (for example, different inflected forms of the same...
  • 6
  • 388
  • 0
