... respectively. The mean values are
−9.57 for SO-PMI and 0.08 for SimRank; the
standard deviations are 13.75 and 0.22. SimRank
values range between −0.67 and 0.41; SO-PMI
ranges between −46.21 and 46.59. ... human ratings and SO-PMI and
SimRank scores which illustrate advantages and possible
shortcomings of the two methods. The medians
of SO-PMI and SimRank scores are −15.58
and −0.05, re...
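
For readers unfamiliar with how an SO-PMI score arises, the following is a minimal sketch. It assumes the commonly used definition in which a word's semantic orientation is its summed PMI with a set of positive seed words minus its summed PMI with a set of negative seed words. The excerpt does not say which variant, corpus, or seed sets were used, so the function names, the additive smoothing constant, the document-level co-occurrence window, and the toy corpus below are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count in how many documents each word and each unordered word pair occurs."""
    word_df, pair_df = Counter(), Counter()
    for doc in docs:
        vocab = sorted(set(doc))
        word_df.update(vocab)
        pair_df.update(frozenset(pair) for pair in combinations(vocab, 2))
    return word_df, pair_df


def pmi(w1, w2, word_df, pair_df, n_docs, smoothing=0.01):
    """Pointwise mutual information (base 2) of two distinct words.

    The additive smoothing is an assumption of this sketch; it keeps the
    logarithm finite when the two words never co-occur.
    """
    p_joint = (pair_df[frozenset((w1, w2))] + smoothing) / n_docs
    p_w1 = (word_df[w1] + smoothing) / n_docs
    p_w2 = (word_df[w2] + smoothing) / n_docs
    return math.log2(p_joint / (p_w1 * p_w2))


def so_pmi(word, pos_seeds, neg_seeds, word_df, pair_df, n_docs):
    """Summed PMI with positive seeds minus summed PMI with negative seeds."""
    pos = sum(pmi(word, s, word_df, pair_df, n_docs) for s in pos_seeds)
    neg = sum(pmi(word, s, word_df, pair_df, n_docs) for s in neg_seeds)
    return pos - neg


# Hypothetical toy corpus: each inner list is one tokenized document.
docs = [["good", "tasty"], ["good", "tasty"], ["bad", "bland"], ["bad", "bland"]]
word_df, pair_df = cooccurrence_counts(docs)
score_tasty = so_pmi("tasty", ["good"], ["bad"], word_df, pair_df, len(docs))
score_bland = so_pmi("bland", ["good"], ["bad"], word_df, pair_df, len(docs))
print(score_tasty > 0, score_bland < 0)
```

Because the score is a difference of sums of logarithms, it is unbounded in both directions, which is consistent with SO-PMI spanning a much wider range than SimRank in the figures above.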
... Creating Speech and Language Data
With Amazon’s Mechanical Turk, pages 108–113.
Zhifei Li, Chris Callison-Burch, Chris Dyer, Juri Gan-
itkevitch, Ann Irvine, Sanjeev Khudanpur, Lane
Schwartz, Wren ... ex-
istence to data like the Canadian Hansards (which by
law must be published in both French and English).
SMT can be applied to any language pair for which
there is sufficient data, and it has...
... translations as well as the four human ref-
erence translations, using both the original named-
entity translation annotation and the re-annotation:
Gold Standard BBN GS Re-annotated GS
Human ... burtran to Pow-
ertrain. The human reference translations for this
phrase are
1. Portran site in Tremolo
2. Termoli plant (one name dropped)
3. Portran in Tirnoli
4. Portran assembly plant, in ... wo...
... phrases and their abbrevi-
ations is an interesting and important task for many
natural language processing applications (e.g., machine
translation, question answering, information
retrieval, and ... thank Yi Su, Sanjeev Khudan-
pur, Philip Resnik, Smaranda Muresan, Chris Dyer
and the anonymous reviewers for their helpful com-
ments. This work was partially supported by the De-
fense Adv...
... an assembly of indi-
visible sub-sentential elementary trees (ETs), we
can find a proper way to transduce the input tree to
the output tree. An ET is a single “symbol” in a
transducer’s language. ... I 1947 year since always live in Canada
[ITERATION 1 & 2] Partition at word pair
(“I” and “wo”) (“Canada” and “janada”)
[ITERATION 3] (“been” and “zhu”) are chosen but no
p...
...
Let ε stand for a random variable on E, γ a
random variable on C. Also let e stand for a
random variable on E, c a random variable on C,
and t a random variable on T. While ε and γ
represent ... word translation
disambiguation. For instance, we are concerned
with an ambiguous word in English (e.g., ‘plant’),
which has multiple translations in Chinese (e.g.,
‘(gongchang)’ a...
... Resolving Translation Mismatches With Information Flow
Megumi Kameyama, Ryo Ochitani, Stanley Peters
The Center for the Study of Language and Information
Ventura Hall, Stanford University, Stanford, ... source language (SL) and those
of the target language (TL). This process can be
best modelled with information flow graphs (IFG)
defined in Barwise and Etchemendy 1990. An IFG
is a se...
... text by its translation wherever it appears,
leaving the rest of the translation to be done by the human
translator, systems where the translator as he produces the
translation can consult specialist ... translation will normally be post-edited,
just as human translation is normally revised. Some systems
aim at giving nothing more than a very rough raw trans-
lation, to be used by the hu...
...
tion and disambiguates speech acts. The
final goal is to improve translation quality
in a speech-to-speech translation system.
1 Ambiguity in Speech Translation
For any given utterance out ... thousand two hundred and
fifteen", the room number "one two one five", and so
on. Although English can conflate all those possible
meanings into one expression, the translat...
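
The two readings quoted above can be made concrete with a small sketch. It renders the same digit string once as a cardinal number and once digit by digit, which is the kind of distinction a translation component has to resolve before choosing a target-language expression. The function names and the restriction to values from 0 to 9999 are assumptions made for illustration only; nothing here is taken from the system described in the excerpt.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def below_hundred(n):
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def cardinal(n):
    """Spell out 0..9999 as a cardinal number, with 'and' before the tail."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only covers 0..9999")
    if n < 100:
        return below_hundred(n)
    thousands, rest = divmod(n, 1000)
    hundreds, tail = divmod(rest, 100)
    parts = []
    if thousands:
        parts.append(f"{ONES[thousands]} thousand")
    if hundreds:
        parts.append(f"{ONES[hundreds]} hundred")
    if tail:
        parts.append(f"and {below_hundred(tail)}")
    return " ".join(parts)


def digit_by_digit(digits):
    """Read a digit string one digit at a time, as for a room number."""
    return " ".join(ONES[int(d)] for d in digits)


print(cardinal(1215))          # one thousand two hundred and fifteen
print(digit_by_digit("1215"))  # one two one five
```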