... Association for Computational Linguistics, pages 678–687,
Uppsala, Sweden, 11-16 July 2010.
© 2010 Association for Computational Linguistics
A Taxonomy, Dataset, and Classifier for Automatic Noun Compound
Interpretation
Stephen ... annotated dataset, and a supervised
classification method for automatic noun compound
interpretation.
1 Introduction
Noun comp...
... the Association for Computational Linguistics, pages 180–189,
Portland, Oregon, June 19-24, 2011.
© 2011 Association for Computational Linguistics
A New Dataset and Method for Automatically Grading ... a ceiling for the
performance of our system, we calculate the average
correlation between the CLC and the examiners’ scores,
and find an upper bound of 0.796 and 0.792 Pear-...
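The ceiling described above is an average of correlations between sets of scores. As a minimal sketch only, the following computes a Pearson correlation and averages it over several score lists. The function names and the input layout (one reference score list plus one list per examiner) are assumptions made for illustration and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


def average_correlation(reference, examiner_scores):
    """Mean Pearson correlation of `reference` with each examiner's scores.

    `examiner_scores` is a hypothetical list of score lists, one per examiner.
    """
    values = [pearson(reference, scores) for scores in examiner_scores]
    return sum(values) / len(values)
```

A rank-based correlation could be averaged in the same way by replacing `pearson` with the corresponding function.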
... $T^{*} = \bigcup_{i} T^{i}$. If the function $\psi$ exists for every text
$t^{i}_{z} \in T^{*}$ and for every language $L^{j}$, and is known, then the
corpus is parallel and aligned at document level.
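The document-level alignment condition can be illustrated with a small check. This is a sketch under stated assumptions: the corpus is represented as a mapping from language code to a set of text identifiers, and the known translation function ψ is represented as a dictionary keyed by (source language, text id, target language). Both representations and the function name are hypothetical.

```python
def is_document_aligned(corpus, psi):
    """Return True if every text has a known translation in every other language.

    corpus: dict mapping a language code to a set of text ids (assumed layout).
    psi:    dict mapping (source_lang, text_id, target_lang) to the id of the
            translated text in target_lang (assumed encoding of the function).
    """
    for lang, texts in corpus.items():
        for text_id in texts:
            for target in corpus:
                if target == lang:
                    continue
                partner = psi.get((lang, text_id, target))
                # The translation must be known and must belong to the corpus.
                if partner is None or partner not in corpus[target]:
                    return False
    return True
```

If any text lacks a known counterpart in some language, the check fails and the corpus would not count as aligned at document level under this reading.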
For the purpose of this paper it ... words. We randomly
split both the English and Italian parts into 75%
training and 25% test (see Table 2). We processed
the corpus with PoS taggers, keeping only nouns,
verbs...
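The random 75%/25% split can be sketched as follows. The function name, the fixed seed, and the use of a plain list of documents are assumptions for illustration; the source does not say how the split was implemented.

```python
import random


def split_train_test(documents, train_fraction=0.75, seed=0):
    """Shuffle a copy of `documents` and cut it into train and test parts."""
    shuffled = list(documents)
    # A seeded generator keeps the split reproducible (an assumption, not
    # something the source states).
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Applying the same function separately to each language's collection gives an independent split per language.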
... using automatically
acquired statistical information from the
POS tagged corpus and extracts nouns by
detecting word boundaries. Furthermore,
it does not require any labor for constructing
and ... sentence
by using statistical information and extracts nouns
by detecting the word boundaries. The statistical
information is automatically acquired from a POS
annotated corpus and t...
... performed before dinner, getting
dinner, and activities to be performed after dinner.
It knows activities such as playing football,
squash, or badminton; going to the gym or shopping;
and ... starting point for
generation is predicate-form descriptions provided
by the dialogue manager. Further details and
contextual information are retrieved from the
dialogue history and the u...
... A UNIFIED MANAGEMENT AND PROCESSING OF WORD-FORMS,
IDIOMS AND ANALYTICAL COMPOUNDS
Dan Tufiş
Octav Popescu
Research Institute for Informatics
Miciurin 8-10, 71316, Bucharest, ... governing the
compound verbal forms (including interrogative forms
and "aliens" (adverbs, reflexive pronoun insertion))
for English, French, Romanian, Russian and Spanish.
As an exa...
... Kudo and Matsumoto, 2002; Sassano,
2004) for bunsetsu-based parsers. We use the
following features for each morpheme:
1. major POS, minor POS, conjugation type,
conjugation form, surface form ... chunking
and dependency parsing and, in addition, does
them with a single scan. Most of the modern
dependency parsers for Japanese require bunsetsu
chunking (base phrase chunking) before...
... Kornai
(1995), Bird and Ellison (1994), Pulman and Hepple
(1993), whose formalism Kiraz adopts, and
others.
4 Design Goals for MAGEAD
This work is aimed at a unified processing architecture
for the morphology ... many of the analyses incorrect, and only
the analysis chosen for the token in context usually
hand-corrected. We use LATB files fsa 16* for
development, and fo...
... experiment, we randomly selected two sets
of 100 thousand sentences. The first 100 thousand
sentences are used for training the language model.
The second 100 thousand sentences are used for test- ...
<WT>, <U-t>) and
$f(c_i \mid c_{i-1}, \text{<WT>}, \text{<U-t>})$ are the relative
frequencies of the character unigram and bigram for each
word type and part of speech,
f(...
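As an illustration of the relative frequencies mentioned above, the sketch below counts character unigrams and bigrams over a list of words. It assumes the words have already been grouped into a single word type and part-of-speech class, and it ignores word-boundary symbols. The function name and both simplifications are assumptions, not details from the paper.

```python
from collections import Counter


def char_relative_frequencies(words):
    """Relative frequencies of character unigrams and bigrams.

    `words` is assumed to hold the words of one (word type, part of speech)
    class. Returns (unigram, bigram), where unigram[c] estimates f(c) and
    bigram[(prev, c)] estimates f(c | prev).
    """
    unigram_counts = Counter()
    bigram_counts = Counter()
    context_counts = Counter()  # how often a character precedes another one
    for word in words:
        unigram_counts.update(word)
        for prev, cur in zip(word, word[1:]):
            bigram_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    total = sum(unigram_counts.values())
    unigram = {c: n / total for c, n in unigram_counts.items()}
    bigram = {
        (prev, cur): n / context_counts[prev]
        for (prev, cur), n in bigram_counts.items()
    }
    return unigram, bigram
```

Running the function once per class yields one table per word type and part of speech, which is one simple way to realise the conditioning in the notation above.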
... non-copula
questions and build the model for only copula questions.
ponent of a candidate sentence. For example, for
the given question, “When did Nixon die?”, when
the following candidate sentence, ... affirmed questions
did not contain any object and they are also
in copula (linking) sentence form; that is, they
are formed only by a subject and information about
the subject as: {su...