... Association for Computational Linguistics
Applying Morphology Generation Models to Machine Translation
Kristina Toutanova
Microsoft Research
Redmond, WA, USA
kristout@microsoft.com
Hisami Suzuki
Microsoft ... base MT system. In the second setting, we
allow the model to use up to 100 translations, and
to automatically select the best number to use. As
seen in Table 3, (n=16)...
... Phrase-Based Backoff Models for Machine Translation of Highly Inflected
Languages
Mei Yang
Department of Electrical Engineering
University of Washington
Seattle, WA, USA
yangmei@ee.washington.edu
Katrin ... discounting factor (generally between 0 and 1) that is
applied to the higher-order distribution. The normalization
factor $\alpha(w_{i-1}, w_{i-2})$ ensures that the
distribution sums t...
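The roles of the discounting factor and the normalization factor can be illustrated with a minimal sketch. It is not the model from the excerpt: it assumes a simpler bigram-to-unigram backoff (the excerpt conditions on two preceding words), a single constant discount `d`, and maximum-likelihood counts. The names `train_backoff_bigram`, `history`, and `alpha` are hypothetical.

```python
from collections import Counter

def train_backoff_bigram(tokens, d=0.7):
    """Build a toy backoff bigram model; returns (prob, vocab).

    Assumption: one constant discount d in (0, 1) multiplies the
    higher-order (bigram) relative frequency.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    history = Counter(tokens[:-1])  # how often each word occurs as a context
    total = len(tokens)

    def p_unigram(w):
        return unigrams[w] / total

    def alpha(prev):
        # Normalization factor: spreads the mass freed by discounting over
        # the words never seen after `prev`, so the distribution sums to one.
        seen = [w for (p, w) in bigrams if p == prev]
        if not seen:
            return 1.0  # unseen history: use the unigram distribution as is
        discounted = sum(d * bigrams[(prev, w)] / history[prev] for w in seen)
        unseen_mass = 1.0 - sum(p_unigram(w) for w in seen)
        return (1.0 - discounted) / unseen_mass if unseen_mass > 0 else 0.0

    def prob(w, prev):
        count = bigrams[(prev, w)]
        if count > 0:
            return d * count / history[prev]   # discounted higher-order estimate
        return alpha(prev) * p_unigram(w)      # back off to the lower order

    return prob, set(unigrams)

prob, vocab = train_backoff_bigram("a b a c a b".split(), d=0.7)
total_mass = sum(prob(w, "a") for w in vocab)  # should be 1 up to rounding
```

In this toy corpus every seen bigram after "a" keeps 70% of its relative frequency, and the remaining 30% is redistributed by `alpha` to the unseen continuation.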
... {$c_1$–$e_3$, $c_2$–$e_4$, $c_3$–$e_1$, $c_3$–$e_2$, $c_3$–$e_5$},
which means that $c_1$ is aligned to $e_3$, $c_2$ is aligned to
1 In order to check this requirement, we extended Hiero to
make word alignment information available to the decoder.
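As a small illustration of the alignment-set notation, the sketch below stores each link c_i–e_j as an index pair (i, j) and groups the English positions by Chinese position. The representation and the function name `group_alignments` are assumptions made for illustration; the excerpt does not say how the extended decoder stores alignments.

```python
from collections import defaultdict

def group_alignments(pairs):
    """Map each source (Chinese) index to its sorted target (English) indices."""
    by_source = defaultdict(list)
    for c, e in pairs:
        by_source[c].append(e)
    return {c: sorted(es) for c, es in by_source.items()}

# The alignment set from the text, written as (c index, e index) pairs.
alignment = {(1, 3), (2, 4), (3, 1), (3, 2), (3, 5)}
grouped = group_alignments(alignment)
```

Here `grouped[3]` collects all three English positions linked to the third Chinese word, which is the one-to-many case in the example.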
Input: rule R ... WSD
To incorporate WSD into Hiero, we use the translations
proposed by the WSD system to help Hiero
obtain a better or more proba...
...
assured to be the same. Therefore, we need to
first map a vector into the space of the other vector,
so that the similarity can be calculated. Fung
(1998) and Rapp (1999) map the vector
one-dimension-to-one-dimension (a context word is a
dimension in each vector space) from one language
to another language via an initial bilingual
dictio...
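The one-dimension-to-one-dimension mapping can be sketched as follows, assuming sparse context vectors stored as dictionaries and a one-to-one seed dictionary. The function names, the choice of cosine as the similarity measure, and the toy Spanish-English entries are illustrative assumptions, not details taken from Fung (1998) or Rapp (1999).

```python
import math

def map_vector(vec, seed_dict):
    """Project a source-language context vector into the target space.

    Each dimension (a context word) is moved to its dictionary translation;
    dimensions with no dictionary entry are dropped.
    """
    mapped = {}
    for word, weight in vec.items():
        target = seed_dict.get(word)
        if target is not None:
            mapped[target] = mapped.get(target, 0.0) + weight
    return mapped

def cosine(u, v):
    """Cosine similarity between two sparse vectors in the same space."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy data: "xyz" has no dictionary entry and is dropped.
seed_dict = {"casa": "house", "perro": "dog"}
source_vec = {"casa": 2.0, "perro": 1.0, "xyz": 5.0}
target_vec = {"house": 4.0, "dog": 2.0}
similarity = cosine(map_vector(source_vec, seed_dict), target_vec)
```

Only after the mapping do both vectors share dimensions, so the similarity is meaningful.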
... m-word, m-pos to refer to head and modifier
words and POS tags, and append a numerical
value to shift the word offset either to the left or to
the right (e.g., h-pos+1 is the POS to the right ... language models are key to
state-of-the-art performance (Brants et al., 2007), and
the ability of phrase-based decoders to handle
large-size, high-order language models with no...
... LOGON MT demonstrator assembles
independently valuable general-purpose
NLP components into a machine translation
pipeline that capitalizes on output
quality. The demonstrator embodies an
interesting ... to
right, the corpus sub-division by input length, total number
of items, and average string length, ambiguity rate, grammatical
coverage, and generation time, respectively.
transl...
... vector of feature weights specific to
predicting at anchor size j, and φ is a vector of size-independent
configuration features, detailed below.
We then perform inference using these models to
predict ... train
models to predict the BLEU score at m anchor sizes
$s_1, \ldots, s_m$, based on a set of features globally
characterizing the configuration of interest. We restrict
our atten...
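A minimal sketch of per-anchor prediction, assuming each anchor size j has its own weight vector and the predicted BLEU score is the dot product of that vector with the shared feature vector φ. The linear form, the function name `predict_bleu_at_anchors`, and all numbers are assumptions for illustration; the excerpt does not show the actual model or its features.

```python
def predict_bleu_at_anchors(weights, phi):
    """Return one predicted score per anchor size.

    weights: list of m weight vectors, one per anchor size s_1..s_m
    phi:     size-independent configuration features, shared by all anchors
    """
    for w_j in weights:
        if len(w_j) != len(phi):
            raise ValueError("each weight vector must match the feature vector length")
    return [sum(w * f for w, f in zip(w_j, phi)) for w_j in weights]

# Hypothetical weights for m = 2 anchor sizes and two configuration features.
weights = [[0.1, 0.2], [0.3, 0.4]]
phi = [1.0, 2.0]
predictions = predict_bleu_at_anchors(weights, phi)
```

Because φ does not depend on the anchor size, only the weight vector changes from one anchor to the next.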
... prison and sentenced
him to twenty years to life , slightly less than the maximum possible of twenty-five years to
life .
Simple Wikipedia he was sentenced to twenty-five years to life in prison in ... years to life to life .
PBMT-R the judge ordered that chapman should get psychiatric treatment in prison and sentenced him
to twenty years to life , a little bit less than the h...
...
Template Approach to Statistical Machine Translation.
Computational Linguistics, 30(4): 417-449.
Kishore Papineni, Salim Roukos, Todd Ward, and
Wei-Jing Zhu. 2002. BLEU: a method for automatic
evaluation ... 22,660 entries is used
to convert and
into their stem forms
and
by replacing each word into its
stem form. This feature is computed similarly
to th...
...
“legal” is related to “rule”, which in turn is related
to “mandatory”; that “age” is related to “aged”;
and that “Argentine” is related to “Argentina”. It is
not difficult to see by now that ...
generative story (Figure 1 lists some of the factors
specific to this computation). Readers familiar
with the statistical machine translation (SMT)
literature should recognize...