... Exploiting Parallel Texts for Word Sense Disambiguation:
An Empirical Study
Hwee Tou Ng
Bin Wang
Yee Seng Chan
Department of Computer Science
National ... of senses before and after sense lumping
is 5.07 and 3.52, respectively.
After sense lumping, we trained a WSD classifier
for each noun w, using the lumped senses in
the manually sense-tagged ... transl...
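
As a rough illustration of the sense-lumping step, the sketch below merges the fine-grained senses of a word that map to the same label and recomputes the average number of senses per word. The toy inventory, the lumping criterion (a shared, hypothetical translation label), and the names `lump_senses` and `average_sense_count` are assumptions made for this example; they are not the data or the exact procedure of the study.

```python
from collections import defaultdict


def lump_senses(sense_to_label):
    """Group the senses of one word by the label they map to.

    sense_to_label: dict from a fine-grained sense id to a label
    (here a hypothetical translation). Senses sharing a label are lumped.
    Returns a dict from label to the sorted list of sense ids it covers.
    """
    lumped = defaultdict(list)
    for sense, label in sense_to_label.items():
        lumped[label].append(sense)
    return {label: sorted(senses) for label, senses in lumped.items()}


def average_sense_count(inventory):
    """Average number of senses per word in an inventory (word -> senses)."""
    return sum(len(senses) for senses in inventory.values()) / len(inventory)


# Toy inventory, invented for illustration only.
toy = {
    "channel": {"s1": "t_a", "s2": "t_a", "s3": "t_b"},
    "bar": {"s1": "t_c", "s2": "t_d", "s3": "t_d", "s4": "t_d", "s5": "t_e"},
}

before = average_sense_count(toy)
after = average_sense_count({word: lump_senses(m) for word, m in toy.items()})
```

A classifier trained after lumping would then predict the lumped labels (the dictionary keys returned by `lump_senses`) rather than the original sense ids.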
... using one's feet” and “to direct or control”.
WSD can be useful for many applications, including
information retrieval, information extraction,
and machine translation. Sense ambiguity has ... (Resnik
and Yarowsky, 1997). For example, in machine
translation, WSD, or translation disambiguation, is
responsible for identifying the correct translation
for an ambiguous sour...
... OF TEXTS FOR INFORMATION RETRIEVAL
N.J. Belkin, B.G. Michell, and D.G. Kuehner
University of Western Ontario
The representation of whole texts is a major concern of
the field known as information ... the following:
a. A user, recognizing an information need, presents to
an IR mechanism (i.e., a collection of texts, with a
set of associated activities for representing, stor-...
... answer. As an example, Harabagiu
and Maiorano (1999) describe answer validation as
an abductive inference process, where an answer is
valid with respect to a question if an explanation for
it, ... the
question words influence the appearance of answer
words. Therefore, we introduce additional linguistic
techniques for pattern and query formulation,
such as keyword extraction, an...
... for Word Hyphenation
Nikolaos Trogkanis
Computer Science and Engineering
University of California, San Diego
La Jolla, California 92093-0404
tronikos@gmail.com
Charles Elkan
Computer Science and ... conditional
random fields. We create new training
sets for English and Dutch from the
CELEX European lexical resource, and
achieve error rates for English of less than
0.1% for correc...
... Journal treebank and lattice corpora
show word error rates competitive with the
standard n-gram language model while extracting
additional structural information useful for speech
understanding.
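
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a reference transcript and a hypothesis (substitutions, deletions, and insertions) divided by the number of reference words. The following is a minimal sketch of that standard definition; the function name `word_error_rate` and the whitespace tokenization are assumptions for illustration, not the evaluation code behind the reported results.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.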
1 ... training.
1 corpus are annotated with trigram scores trained
using a 20,000-word vocabulary and a 40-million-word
training sample. The word lattices have a
unique start and end...
... Similarly,
any index x ∉ [i, k] is external to T[i,k]. An invalid
span is any span for which our provided tree
Figure 3: Illustration of invalid spans. [j, j] and
[j, ... alignment be the complete structure that
connects two parallel sentences, and a link be
one of the word-to-word connections that make
up an alignment. All word alignment meth...
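
The terminology above (an alignment as the complete structure, a link as one word-to-word connection) can be pictured with a small data structure. The sketch below is only one possible representation: the index-pair encoding, the type aliases, the helper names, and the example sentence pair are all assumptions introduced here.

```python
from typing import FrozenSet, Iterable, List, Tuple

Link = Tuple[int, int]        # (source word index, target word index)
Alignment = FrozenSet[Link]   # the complete set of links for a sentence pair


def make_alignment(links: Iterable[Link], source_len: int, target_len: int) -> Alignment:
    """Build an alignment, rejecting links that point outside either sentence."""
    checked = []
    for s, t in links:
        if not (0 <= s < source_len and 0 <= t < target_len):
            raise ValueError(f"link ({s}, {t}) is out of range")
        checked.append((s, t))
    return frozenset(checked)


def targets_of(alignment: Alignment, source_index: int) -> List[int]:
    """Target positions linked to one source word, in ascending order."""
    return sorted(t for s, t in alignment if s == source_index)


# Hypothetical sentence pair, for illustration only.
source = ["the", "house"]
target = ["la", "maison"]
alignment = make_alignment([(0, 0), (1, 1)], len(source), len(target))
```

Storing links as a set makes the alignment order-independent and allows one word to take part in several links or in none.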
... the hum-
Table 1: Spearman rank correlation analysis of
the neighborhood density and frequency effects
for empirical and theoretical words of length 4.
Dutch Mand. Mand Simon
dens.
freq. ... function words excluded, and
charts the lexical similarity effects of the subset
of words with length 4 by means of boxplots.
These show the mean (dotted line), the median,
the upper and lowe...
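
The Spearman rank correlation used in the table can be computed as the Pearson correlation of the ranks of the two variables, with tied values receiving the average of their ranks. The sketch below follows that standard definition in plain Python; the function names are assumptions, and any numbers passed in (for example, neighborhood density paired with a similarity score) would be the reader's own data, not values from the study.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5
```

Because only ranks enter the computation, the coefficient is insensitive to monotone rescaling of either variable, which suits skewed quantities such as word frequency.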
... natural form of a parser which utilizes
abandonment would be an IPA model. The construction
of more than one analysis for an ambiguity would
trigger the parser to throw out the analyses and
wait ... other sort can be called
strong parallelism, in which the possible analyses
can stay active and be expanded as new input is
received. If further input is inconsistent with any
of the...
... the European Union, where texts must
be translated daily into eleven languages, or
even in the U.S.A. where Spanish and English
speaking communities are intermingled.
Parallel texts
(texts that ... varies
according to language similarity. For instance,
on average, it is higher for Portuguese–Spanish
than for Portuguese–English.
These words end up being mainly numbers
and names....