...
Computational Linguistics.
J. Stetina, S. Kurohashi, and M. Nagao. 1998.
General word sense disambiguation method
based on a full sentential context. In
Usage of WordNet in Natural Language ... Mihalcea and D.I. Moldovan. 1999. An
automatic method for generating sense tagged
corpora. In
Proceedings of AAAI-99,
Orlando, FL, July. (to appear).
G. Miller, M. Chodorow, S. Landes, ... verb and
noun appear.
First, Algorithm 1 was applied, and the Internet was searched using AltaVista for all
possible V-N pairs that may be created using revise
and the words from the similarity...
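The pair-generation step can be illustrated with a minimal sketch. The excerpt does not show the similarity lists or the query syntax, so the word lists, the function name `verb_noun_pairs`, and the quoted-phrase query form below are all assumptions made for illustration.

```python
from itertools import product


def verb_noun_pairs(verbs, nouns):
    """Return one quoted phrase query per (verb, noun) combination."""
    return [f'"{v} {n}"' for v, n in product(verbs, nouns)]


# Hypothetical similarity lists; the real lists are not given in the excerpt.
verbs = ["revise", "amend"]
nouns = ["law", "statute", "rule"]

# Every verb is combined with every noun, giving len(verbs) * len(nouns) queries.
queries = verb_noun_pairs(verbs, nouns)
```

Each query string would then be submitted to a search engine and the hit counts compared; that step is omitted here because the excerpt does not describe it.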
... lexical sample format, which is an
XML-based format that has been used for both the
SENSEVAL-2 and SENSEVAL-3 exercises. A file in
this format includes a number of instances, each one
made up ... Poster and Demonstration Sessions,
pages 73–76, Ann Arbor, June 2005.
© 2005 Association for Computational Linguistics
SenseRelate::TargetWord – A Generalized Framework
for Word Sense Disambiguation
Siddharth ... disambiguation that computes
the intended sense of a target word,
using WordNet-based measures of semantic
relatedness (Patwardhan et al., 2003).
SenseRelate::TargetWord is a Perl package
that implements...
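As a rough illustration of relatedness-based target word disambiguation, the sketch below picks the target sense whose summed relatedness to the context words is highest. It is written in Python rather than Perl and does not reproduce the SenseRelate::TargetWord API. The scoring scheme, the sense identifiers, and the toy relatedness values are assumptions; a real system would use a WordNet-based measure instead.

```python
def disambiguate(target_senses, context_senses, relatedness):
    """Pick the target sense with the highest total relatedness to the context.

    target_senses: candidate sense ids for the target word.
    context_senses: one list of sense ids per context word.
    relatedness: function (sense_a, sense_b) -> float.
    """
    def score(sense):
        # For each context word, count only its best-matching sense.
        return sum(
            max((relatedness(sense, c) for c in senses), default=0.0)
            for senses in context_senses
        )

    return max(target_senses, key=score)


# Toy relatedness table with hypothetical values; lookup is symmetric.
TOY = {
    frozenset(["bank#finance", "money#1"]): 0.9,
    frozenset(["bank#river", "money#1"]): 0.1,
    frozenset(["bank#finance", "loan#1"]): 0.8,
    frozenset(["bank#river", "loan#1"]): 0.05,
}


def toy_relatedness(a, b):
    return TOY.get(frozenset([a, b]), 0.0)


best = disambiguate(
    ["bank#river", "bank#finance"],
    [["money#1"], ["loan#1"]],
    toy_relatedness,
)
```

Passing the relatedness measure in as a function keeps the selection logic independent of any particular measure.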
... ambiguous
word in a given context. As a fundamental
task in natural language processing (NLP), WSD
can benefit applications such as machine translation
(Chan et al., 2007a; Carpuat and Wu, 2007)
and ... systems are publicly available – the only
other publicly available WSD system that we are
aware of is SenseLearner (Mihalcea and Csomai,
2005). Therefore, for applications which employ
WSD as a component, ... 200
megabytes.
3 Evaluation
In our experiments, we evaluate our IMS system
on SensEval and SemEval tasks, the benchmark
data sets for WSD. The evaluation on both lexical-sample
and all-words tasks...
... Electric Corporation
5-1-1 Ofuna, Kamakura, Kanagawa, Japan
{Imamura.Makoto@bx,Takayama.Yasuhiro@ea}.MitsubishiElectric.co.jp
Nobuhiro Kaji, Masashi Toyoda
and Masaru Kitsuregawa
Institute ... with Positive and Unlabeled Examples for Word Sense
Disambiguation: An Empirical Study on Japanese Web Search Query
Makoto Imamura
and Yasuhiro Takayama
Information Technology R&D Center, ... train WSD systems we need a large
amount of positive and negative examples. In
real Web mining applications, how to acquire
training data for various targets of analysis has
become a major...
... of a military person-year is equal to a
metric called regular military compensation (RMC). RMC includes
average basic pay for each military grade, basic allowance for housing,
basic allowance ... on data for the last 50 years would
be only a starting point. The nature of modern warfare and modern casualty treatment
options have changed the ratio and cost of deaths and disabilities drastically. ... two boards of actuaries
after analysis of past data trends and comparisons to similar assumptions
in other relevant federal programs and private plans. In cases
where the current rate is...
... that this process also refers to the inability of the multinational naval forces
wydyf an hdhh alamlyt ayda tshyr aly adm qdrt almtaddt aljnsyt alqwat albhryt
(a) Source phrase
Source POS and ... high-level
linguistic structures are likely to transfer across certain
language pairs. For example, prepositional phrases
(PP) in Arabic and English are similar in a sense
that PPs generally appear at the end of ... hdhh alamlyt ayda tshyr aly adm qdrt almtaddt aljnsyt alqwat albhryt
He adds that this process also refers to the inability of the multinational naval forces
MT output
Source POS
Source
Target...
... their aligned translations (and probabil-
algorithm parameters in machine learning of language.
Machine Learning, pages 84–95.
I. Dagan and A. Itai. 1994. Word sense disambiguation
using a second ... state-of-the-art systems for all languages,
except for Spanish where the results are very similar.
As all steps are run automatically, this multilingual
approach could be an answer for the acquisition ... bottleneck,
as long as there are parallel corpora available
for the targeted languages. Although large multilingual
corpora are still rather scarce, we strongly
believe there will be more parallel...
... grammatical
and n-gram based statistical language constraints, and uses
a robust parsing technique to apply the grammatical
constraints described by context-free grammar (Tsukada
et al., 97). ... the Error-Pattern-Database and String-Database
can be mechanically prepared, which reduces the effort
required to prepare the databases and makes it possible to
apply this method to a new recognition ... correcting
accuracy by changing algorithms and will also try to
improve translation performance by combining our
method with Wakita's method.
References
T. Araki et al., 93. A Method for Detecting...
... the
two SENSEVAL tasks. This gave a set of 6 nouns
for SENSEVAL-2 and 9 nouns for SENSEVAL-3.
For each noun, we gathered a maximum of 500
parallel text examples as training data, similar to
what ... sampling with incomplete information.
Annals of Mathematical Statistics, 26(4).
Yee Seng Chan and Hwee Tou Ng. 2005a. Scaling
up word sense disambiguation via parallel texts. In
Proc. of AAAI05.
Yee ... on data which
was automatically gathered from the Internet. The
authors reported a 14% improvement in accuracy
if they have an accurate estimate of the sense priors
in the evaluation data and...
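To see why an accurate estimate of the evaluation-data sense priors can help, consider the standard Bayes re-scaling of classifier posteriors. The excerpt does not say how the authors used the priors, so the sketch below is a generic illustration only. The function name and all numbers are assumptions.

```python
def adjust_for_priors(posteriors, train_priors, eval_priors):
    """Re-scale posteriors from the training priors to the evaluation priors.

    Each posterior is multiplied by eval_prior / train_prior for its sense,
    and the results are renormalised so that they sum to one.
    """
    scaled = {
        s: posteriors[s] * eval_priors[s] / train_priors[s]
        for s in posteriors
    }
    total = sum(scaled.values())
    return {s: v / total for s, v in scaled.items()}


# Hypothetical numbers: the classifier was trained on balanced data,
# but sense s2 dominates the evaluation data.
posteriors = {"s1": 0.6, "s2": 0.4}
train_priors = {"s1": 0.5, "s2": 0.5}
eval_priors = {"s1": 0.2, "s2": 0.8}

adjusted = adjust_for_priors(posteriors, train_priors, eval_priors)
```

With these numbers the preferred sense switches from s1 to s2 once the evaluation priors are taken into account.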
... be available
for many examples. The problem of data sparseness
increases as more knowledge is exploited and
this can cause problems for the machine learning
algorithms.
A final disadvantage ... 1st_prep_right, back).
Rule_2. sense(A, chegar) :-
    has_rel(A, subj, B), has_bigram(A, today, B),
    has_bag_trans(A, hoje).
Rule_3. sense(A, chegar) :-
    satisfy_restriction(A, [animal, human], [concrete]);
... Introduction
to Machine Translation. Academic Press,
Great Britain.
Abolfazl K. Lamjiri, Osama El Demerdash, Leila Kosseim.
2004. Simple features for statistical Word
Sense Disambiguation. Proceedings...
... and accuracy improvement is less than
1% after all the available WSJ adaptation examples are added
as additional training data. To obtain a clearer picture of the
adaptation process, we discard ... in BC and
WSJ, average MFS accuracy, average number of BC
training, and WSJ adaptation examples per noun.
data, and the rest of the WSJ examples are designated
as in-domain adaptation data. The ... posteriori
(MAP) estimation, and successfully used it
for probabilistic context-free grammar domain adaptation
(Roark and Bacchiani, 2003) and language
model adaptation (Bacchiani and Roark, 2003).
Count-merging...
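The excerpt is cut off before count-merging is described. As a hedged sketch of the general idea, the code below adds weighted in-domain counts to out-of-domain counts before normalising. The weight `beta`, the function name, and the example counts are assumptions and are not taken from the source.

```python
from collections import Counter


def count_merge(out_counts, in_counts, beta=1.0):
    """Merge out-of-domain and in-domain counts, weighting the latter by beta.

    Returns relative frequencies computed from the merged counts.
    """
    merged = Counter()
    for key, count in out_counts.items():
        merged[key] += count
    for key, count in in_counts.items():
        merged[key] += beta * count
    total = sum(merged.values())
    return {key: value / total for key, value in merged.items()}


# Hypothetical sense counts: many out-of-domain examples, few in-domain ones.
out_domain = {"s1": 8, "s2": 2}
in_domain = {"s1": 1, "s2": 3}

probs = count_merge(out_domain, in_domain, beta=3.0)
```

A larger `beta` lets a small amount of in-domain data outweigh the larger out-of-domain sample. Here the two senses end up equally likely.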
... training data
so that we can do a fair comparison between the
accuracy of the parallel text alignment approach
versus the manual sense-tagging approach.
After training a WSD classifier for w ...
However, large-scale, good-quality parallel
corpora have recently become available. For example,
six English-Chinese parallel corpora are
GIZA++. For two of the corpora, Hong Kong Hansards
and ... corpora. To ensure a fairer
comparison, for each of the 10-trial manually
sense-tagged training data that gave rise to the accuracy
figure M2 of a noun w, we extracted a new
subset of 10-trial...
... Shimo-tsuruma, Yamato-shi, Kanagawa-ken 242 Japan
{uramoto, takeda}@trl.ibm.co.jp
Abstract
This paper describes methods for relating (thread-
ing) multiple newspaper articles, and for visualizing ... quantity of information available today
makes it difficult to search for and understand the
information that we want. If there are many related
documents about a topic, it is important to capture ... newspaper
articles automatically, and its application for
a Webcasting application. A set of articles on a particular
topic is ordered chronologically, and the...
1 http://www.pointcast.com
... retrieval
and data mining, in which case it is important
to be able to read through them automatically,
without resorting to a human annotator. The holy
grail in this area would be an application ... gray are unreachable.
The cell at (d) is filled using the trigram probabilities and the probability of the path starting at (a).
In all of the data considered, the frequency of
spaces was far ... 48th Annual Meeting of the Association for Computational Linguistics, pages 1040–1047,
Uppsala, Sweden, 11-16 July 2010.
© 2010 Association for Computational Linguistics
An Exact A* Method for...
... were discarded.
tions caused by tagging or lemmatization errors,
we manually corrected any bad tags and lemmas
for the target instances.
4 Sense Paraphrases
For word sense disambiguation tasks, ... Boyd-Graber
et al. (2007) enhance the basic LDA algorithm by
incorporating WordNet senses as an additional latent
variable. Instead of generating words directly
from a topic, each topic is associated ... in sense paraphrases
increases performance. Longer paraphrases contain
more information, and they are statistically
more stable for inference.
We find that nouns get the greatest performance...