... describes SENSELEARNER – a
minimally supervised word sense disambiguation
system that attempts to disambiguate
all content words in a text using
WordNet senses. We evaluate the accuracy
of SENSELEARNER ... present a method for solving the
semantic ambiguity of all content words in a text. The
algorithm can be thought of as a minimally supervised
word sense disambi...
... frequent words in T (cf.
Section 3.1), the star pattern σ(s) associated with
s is obtained by replacing with * all the words
w_i ∉ F, that is, all the tokens that are non-frequent
words. For instance, ... 2008. Word
lattice reranking for Chinese word segmentation and
part-of-speech tagging. In Proceedings of the 22nd
International Conference on Computational Linguistics
(C...
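The star pattern described in the snippet above can be illustrated with a minimal sketch. The function name `star_pattern`, the parameter `frequent_words` (standing in for the set F), and the sample sentence are assumptions made for illustration only; the way F is actually chosen (cf. Section 3.1 of the source paper) is not shown here.

```python
def star_pattern(tokens, frequent_words):
    """Return the tokens with every non-frequent word replaced by '*'.

    `frequent_words` plays the role of the set F; how F is built is
    outside this sketch and is simply passed in by the caller.
    """
    return [tok if tok in frequent_words else "*" for tok in tokens]


if __name__ == "__main__":
    # Hypothetical sentence and frequent-word set, for illustration only.
    sentence = "a dog is an animal".split()
    print(" ".join(star_pattern(sentence, {"a", "is", "an"})))
```

Matching here is exact and case-sensitive; whether the original system normalizes case is not stated in the snippet.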
... have
usually been investigated within Senseval using
the All Words dataset, which does not include
training examples. In this paper we preferred using
the same test set which was used for the ... polysemous
source words provide poor training models
for sense matching. This can be explained by observing
that polysemous source words can be substituted
with the target word...
... t on a word containing char c (not the
starting or ending character)
12  tag t on a word starting with char c_0 and containing char c
13  tag t on a word ending with char c_0 and containing char ... POS tagging
using an HMM-based approach. Word information is
used to process known words, and character information
is used for unknown words in a similar way
to Ng and Low (2004)....
... the same number of
senses for all the words, since tuning this number
individually for each word would be prohibitive.
We experimented with values ranging from three
to nine senses. Figure 3 shows ... a grouping of these instances into classes
corresponding to the induced senses. In other words,
contexts that are grouped together in the same
class represent a specific word...
... work
Pre-processing approaches to word reordering aim
at permuting input words in a way that minimizes
the reordering needed for translation: deterministic
reordering aims at finding a single optimal ... with a 42% increase of the run time.
Results in the row “allReo” are obtained by encoding
all the rule-generated reorderings in L×F chunk-to-word
conversion mode. Except...
... word in the Wikipedia page for
the word sense; (ii) occurrences of the word in
Wikipedia pages pointing to the page for the word
sense; (iii) occurrences of the word in external
pages linked in ... found
in the page for the sense being trained.
• TiMBL-inlinks uses the examples found in
Wikipedia pages pointing to the sense being
trained.
• TiMBL-all uses b...
... a document D into a queried question q.
Rather than translating single words in isolation,
the phrase-based model translates one sequence
of words into another sequence of words, thus
incorporating ... ranking algorithm
proceeds as follows. First, all the words in
a given document are added as vertices in a graph
G. Then edges are added between words if the
words...
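The graph-construction step described in the snippet above can be sketched as follows. The snippet is cut off before it states when two words are connected, so the co-occurrence criterion used here (two words appear within `window` positions of each other) is purely an assumption, as are the names `build_word_graph` and `window`. The ranking step itself is not shown.

```python
def build_word_graph(tokens, window=2):
    """Build an undirected word graph as a dict mapping word -> set of neighbours.

    Every distinct token becomes a vertex. An edge is added between two
    different words that occur at most `window` positions apart; this
    edge criterion is an assumption, not taken from the source.
    """
    graph = {tok: set() for tok in tokens}
    for i, tok in enumerate(tokens):
        # Look ahead at the next `window` tokens only; edges are symmetric.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != tok:
                graph[tok].add(other)
                graph[other].add(tok)
    return graph


if __name__ == "__main__":
    g = build_word_graph(["a", "b", "c", "d"], window=1)
    for word in sorted(g):
        print(word, sorted(g[word]))
```

A dictionary of neighbour sets keeps the sketch dependency-free; a graph library could be substituted if a ranking algorithm is run on top of it.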
... experiment with splitting words into
their stem and suffix components for modeling
morphologically rich languages. We
show that using a morphological analyzer
and disambiguator results in a significant
... contains about 600 thousand sentences
in the training set and 60 thousand sentences in the
test set (giving a total of about 10 million words).
The versions of the corpus we...
... interaction in the domain of meeting
retrieval and for developing NLP modules
for this specific domain.
1 Introduction
In the past few years, there has been an increasing
interest in research ... controlled
for in the experiment increases substantially. For
instance, if it is the case that within a single interface
any task that can be performed using natural
language can...