... would initially require a
50,000 x 50,000 array of values (or a trian-
gular array of about half this size). With
our current hardware, the largest array we
can comfortably handle is about 100 ...
To compare the similarity of two groups of
nouns, we define similarity as the average of
the cosines between each pair of nouns made
up of one noun from each of the...
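
The group similarity defined above can be sketched in a few lines. This is only a minimal illustration, assuming each noun is represented by a numeric vector; the function names `cosine` and `group_similarity` and the toy vectors are hypothetical and do not come from the text.

```python
import math
from itertools import product

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a zero vector is treated as having similarity 0.
        return 0.0
    return dot / (norm_u * norm_v)

def group_similarity(group_a, group_b):
    """Average cosine over every pair made of one vector from each group."""
    pairs = list(product(group_a, group_b))
    if not pairs:
        raise ValueError("both groups must be non-empty")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Toy vectors, invented for illustration only.
group_a = [[1.0, 0.0], [1.0, 1.0]]
group_b = [[0.0, 1.0]]
print(group_similarity(group_a, group_b))
```

Because every cross-group pair is visited, the cost grows with the product of the two group sizes.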
... Kyoto, Japan
{kenji.imamura,eiichiro.sumita}@atr.co.jp
Yuji Matsumoto
Nara Institute of
Science and Technology
Ikoma-shi, Nara, Japan
matsu@is.aist-nara.ac.jp
Abstract
When machine translation (MT) ... Automatic Construction of Machine Translation Knowledge
Using Translation Literalness
Kenji Imamura, Eiichiro Sumita
ATR Spoken Language Translation
Research Laboratories
Seika-cho, Soraku-g...
... learning a foreign
language.
A subcategorization frame is a statement of
what types of syntactic arguments a verb (or ad-
jective) takes, such as objects, infinitives, that-
clauses, participial ... manuals.
3. Hand-coded lists are expensive to make, and in-
variably incomplete.
4. A subcategorization dictionary obtained auto-
matically from corpora can be updated quic...
... result.
Classifier and data sets As a classifier, we
chose Naive Bayes with bag-of-words features,
because it is one of the most popular ones for this
task. Negation was processed in a similar way as
previous ... works (Pang et al., 2002).
To validate the accuracy of the classifier, three
data sets were created from review pages in which
the review is associated with meta-data. To buil...
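
A Naive Bayes classifier over bag-of-words features can be sketched as follows. This is a rough illustration and not the authors' implementation: it assumes negation is handled by prefixing `NOT_` to each token between a negation word and the next punctuation mark, which is the scheme usually attributed to Pang et al. (2002). The negation word list, the class name `NaiveBayes`, the add-one smoothing, and the training sentences are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

NEGATIONS = {"not", "no", "never"}  # assumption: small illustrative list
PUNCTUATION = set(".,!?;:")

def tokenize_with_negation(text):
    """Lowercase and split the text; prefix NOT_ to each token that follows
    a negation word, up to the next punctuation mark."""
    tokens = re.findall(r"[a-z']+|[.,!?;:]", text.lower())
    out, negated = [], False
    for tok in tokens:
        if tok in PUNCTUATION:
            negated = False          # punctuation ends the negation scope
        elif negated:
            out.append("NOT_" + tok)
        else:
            if tok in NEGATIONS or tok.endswith("n't"):
                negated = True       # following tokens get tagged
            out.append(tok)
    return out

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        for tok in tokenize_with_negation(text):
            self.word_counts[label][tok] += 1
            self.vocab.add(tok)

    def predict(self, text):
        tokens = tokenize_with_negation(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data, invented for illustration only.
clf = NaiveBayes()
clf.train("I like this camera, it is great", "pos")
clf.train("I do not like this camera", "neg")
print(clf.predict("great camera"))
```

Tagging negated tokens lets the classifier keep "like" and "NOT_like" as separate features while still treating the document as an unordered bag of words.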
... Combinatory Categorial Grammar
Combinatory Categorial Grammar (Ades and Steed-
man, 1982; Steedman, 2000) is an extension to
the classical Categorial Grammar (CG) of
Ajdukiewicz (1935) and Bar-Hillel ... is added from “araba” to “uyuduğum”
to emphasize that the predicate is intransitive and it
may have a locative adjunct. Similarly, a T.OBJECT
link is added from “kitap” to “okudu...
... ("non-mappable") or
are ungrammatical), the remainder of 47 clauses al-
ready has a success-rate of 44.7%. Improvements of
the system components before the mapping stage as
well as to ... of
partial parsing ("chunking") with the mapping of
the verb arguments onto subcategorization frames
that can be extracted automatically, in this case,
from WordNet...
... learning approach, which is more
attractive because it is trainable and adaptable, and
subsequently the porting of a machine learning sys-
tem to another domain is much easier than that of a
rule-based ... procedures and NE instances are
finally annotated with the appropriate NE categories.
This automatically tagged corpus may have lower
quality than the manually tagged ones but its si...
... Bisani Paul Vozila Olivier Divay Jeff Adams
Nuance Communications
One Wayside Road
Burlington, MA 01803, U.S.A.
{maximilian.bisani,paul.vozila,olivier.divay,jeff.adams}@nuance.com
Abstract
Written ... significant amount of editing to obtain
a document conforming to the customary standards.
We need to look for what the user wants rather than
what he says.
Natural language processing resea...
... 5′-ACTCAAATCACTAGTATTCTTCCACCA-3′
and 5′-CATTTGAACATAAACATGAACAAATAAGTT-3′
and the following conditions: annealing temperature 55 °C,
25 cycles, Phusion polymerase used according to the
manufacturer’s ... PhosphorImager. The percentage of release from position 2 was calculated as follows: area fatty acid ⁄ (area fatty acid +
area lysoPtdCho). Pancreatic PLA2 was used as a positive contr...
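
The release calculation above amounts to a simple ratio of peak areas. A minimal sketch, assuming the ratio is multiplied by 100 to express it as a percentage; the function name `percent_release` and the example areas are hypothetical.

```python
def percent_release(area_fatty_acid, area_lyso_ptdcho):
    """Percentage release: area fatty acid / (area fatty acid + area lysoPtdCho).

    Assumption: the ratio is scaled by 100 to report a percentage.
    """
    total = area_fatty_acid + area_lyso_ptdcho
    if total <= 0:
        raise ValueError("total area must be positive")
    return 100.0 * area_fatty_acid / total

# Example areas, invented for illustration only.
print(percent_release(1.0, 3.0))
```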
... and accurate lexicon from a
machine-readable dictionary of variable accuracy and
consistency.
5 Conclusion
Practical natural language applications require vocab-
ularies substantially larger ... have developed a representa-
tional system which is capable of describing compactly
a variety of data relevant to the task of building a lex-
icon with grammatical definitions;...