... 157–164,
Ann Arbor, June 2005.
© 2005 Association for Computational Linguistics
Machine Learning for Coreference Resolution:
From Local Classification to Global Ranking
Vincent Ng
Human Language Technology ... need to make tough or potentially
suboptimal design decisions.¹
For instance, if we
¹ We still need to determine the
coreference systems to be
employed in...
... method to
solve this problem is to decompose the whole task
into a set of individual tasks for each token in the
input sequence, and solve these small tasks in a fixed
order, usually from left to ... to select samples for training.
In general, this novel learning framework lies
between supervised learning and reinforcement
learning. Guided learning is more difficult th...
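The left-to-right decomposition described above can be illustrated with a minimal sketch. It is not the guided-learning method itself: `toy_score` is a hypothetical hand-written scorer standing in for a trained classifier, and the tag set and sentence are invented for the example.

```python
from typing import Callable, List, Sequence

Scorer = Callable[[Sequence[str], int, List[str], str], float]


def greedy_left_to_right(tokens: Sequence[str], tags: Sequence[str], score: Scorer) -> List[str]:
    """Label tokens one at a time in a fixed left-to-right order.

    Each per-token decision may look at the labels already chosen
    to its left (the history), but never revises them.
    """
    history: List[str] = []
    for i in range(len(tokens)):
        best = max(tags, key=lambda t: score(tokens, i, history, t))
        history.append(best)
    return history


def toy_score(tokens: Sequence[str], i: int, history: List[str], tag: str) -> float:
    # Hypothetical scorer (an assumption for illustration only).
    word = tokens[i].lower()
    if tag == "DET":
        return 1.0 if word in {"the", "a"} else 0.0
    if tag == "NOUN":
        return 0.9 if history and history[-1] == "DET" else 0.3
    if tag == "VERB":
        return 0.8 if history and history[-1] == "NOUN" else 0.2
    return 0.0


print(greedy_left_to_right(["the", "dog", "barks"], ["DET", "NOUN", "VERB"], toy_score))
```

Because the order is fixed, an early mistake cannot be corrected by later, easier decisions.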
... of maltotetraose
(labeled from Glc1′ to Glc4′ as shown in Fig. 4B)
bound to EcoMBP exhibits a curved form, which is
similar to the portion (Glc1–Glc4) of the round shape
of γ-CD bound to TvuCMBP. ... labeled from 1 to 8. (B) EcoMBP–maltotetraose complex. The
glucose residues of maltotetraose are labeled from 1′ to 4′. (C) TliTMBP–trehalose complex. The glucose residues of tr...
... on the top of the stack.
Left-Arc_l
Add to the analysis an arc with label l
from the token at the head of the buffer to the token
on the top of the stack, and push the buffer-token
onto the stack.
Right-Arc_l
Add ... oracle,
i.e. to try to always pick the transitions that
lead to the correct parse. The information given to
the classifier is the current configuration. Theref...
... tannine-wijn kamer-toren atoom-molecule paleis-stad
(tannin-wine) (room-tower) (atom-molecule) (palace-city)
structural — kinine-tonic beeld-kerk wervel-ruggengraat paleis-stad
(quinine-tonic) (statue-church) ... set to 1.
The top-k most reliable patterns are selected to
find new tuples. The reliability of each tuple i,
r_ι(i), is computed according to (4), where P is the
set of harveste...
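The top-k selection step mentioned above can be sketched as follows. This is only an illustration: the pattern strings and reliability values are invented, and the reliability computation itself (equation (4) in the text) is not reproduced here.

```python
import heapq
from typing import Dict, List


def top_k_patterns(reliability: Dict[str, float], k: int) -> List[str]:
    """Return the k patterns with the highest reliability, best first."""
    return heapq.nlargest(k, reliability, key=reliability.get)


# Hypothetical patterns and scores, for illustration only.
scores = {
    "X is a Y": 0.82,
    "X such as Y": 0.91,
    "X and Y": 0.12,
    "Y, including X": 0.40,
}
print(top_k_patterns(scores, 2))
```

Only the selected patterns would then be used to harvest new tuples in the next round.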
... algorithm
given by (Brown, 1993), which is guaranteed to
converge to a global maximum of the likelihood
for Model 1. However, running the EM algorithm
to optimization for each considered segmentation
model ... Morfessor-bi
Morfessor-bi: (i) to consistently identify the root
word “anahtar” (top portion), and (ii) to match the
English plural word form “games” with the
Turkish...
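For readers unfamiliar with the EM procedure for Model 1 referred to earlier in this passage, the following is a minimal sketch. It makes simplifying assumptions: the NULL source word is omitted, the translation table starts uniform, and the three-sentence corpus is a toy example, not data from the text. It does not implement the segmentation-model search discussed in the passage.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[List[str], List[str]]  # (source tokens, target tokens)


def train_model1(corpus: List[Pair], iterations: int = 10) -> Dict[Tuple[str, str], float]:
    """Estimate t(target word | source word) with EM from a uniform start."""
    tgt_vocab = {f for _, tgt in corpus for f in tgt}
    uniform = 1.0 / len(tgt_vocab)
    t: Dict[Tuple[str, str], float] = defaultdict(lambda: uniform)
    for _ in range(iterations):
        count: Dict[Tuple[str, str], float] = defaultdict(float)
        total: Dict[str, float] = defaultdict(float)
        # E-step: distribute each target word over the source words
        # of its sentence in proportion to the current t values.
        for src, tgt in corpus:
            for f in tgt:
                z = sum(t[(f, e)] for e in src)
                for e in src:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalize expected counts per source word.
        t = defaultdict(float, {(f, e): count[(f, e)] / total[e] for (f, e) in count})
    return t


toy_corpus: List[Pair] = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
table = train_model1(toy_corpus)
print(round(table[("das", "the")], 3), round(table[("Haus", "the")], 3))
```

Each run of this loop is cheap on a toy corpus, but repeating it to convergence for every candidate segmentation model is what makes the exhaustive approach expensive.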
... Moreover, we
also intend to perform a user study on our visualization
prototype to see if it increases the productivity of
post-editors.
Acknowledgements
We would like to thank Christoph Tillmann and ... Viterbi algorithm to find the best label sequence.
To estimate the confidence of a sentence S we rely on
the information from the forward-backward inference.
One approach is to...
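As a rough illustration of how Viterbi decoding and forward inference can be combined into a sentence-level confidence, the sketch below scores the best label sequence and normalizes it by the partition function. This is one common choice and an assumption here; it is not necessarily the approach the text goes on to describe. The linear-chain scores (`emit`, `trans`) are hypothetical log-potentials.

```python
import math
from typing import List, Tuple


def viterbi(emit: List[List[float]], trans: List[List[float]]) -> Tuple[List[int], float]:
    """Return the highest-scoring label sequence and its log-score."""
    n = len(trans)
    delta = list(emit[0])
    back: List[List[int]] = []
    for t in range(1, len(emit)):
        prev, delta, ptr = delta, [], []
        for y in range(n):
            bp = max(range(n), key=lambda yp: prev[yp] + trans[yp][y])
            ptr.append(bp)
            delta.append(prev[bp] + trans[bp][y] + emit[t][y])
        back.append(ptr)
    last = max(range(n), key=lambda y: delta[y])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


def log_sum_exp(xs: List[float]) -> float:
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_partition(emit: List[List[float]], trans: List[List[float]]) -> float:
    """Forward pass: log of the summed scores of all label sequences."""
    n = len(trans)
    alpha = list(emit[0])
    for t in range(1, len(emit)):
        alpha = [
            emit[t][y] + log_sum_exp([alpha[yp] + trans[yp][y] for yp in range(n)])
            for y in range(n)
        ]
    return log_sum_exp(alpha)


def sentence_confidence(emit: List[List[float]], trans: List[List[float]]) -> Tuple[List[int], float]:
    """Best path and its probability under the model (assumed confidence measure)."""
    path, score = viterbi(emit, trans)
    return path, math.exp(score - log_partition(emit, trans))


# Hypothetical scores: 2 positions, 2 labels, no transition preference.
emit = [[2.0, 0.0], [0.0, 1.0]]
trans = [[0.0, 0.0], [0.0, 0.0]]
print(sentence_confidence(emit, trans))
```

A confidence near 1 means the best sequence dominates all alternatives; a low value signals that many labelings score similarly.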
... and semantically well-formed parallel sentences
from an existing corpus. To achieve this, we first
collect a set of rules as the candidates for the substitution.
We also need to know where we should ... idea of paraphrasing is to find
alternative ways that convey the same information.
In contrast, we propose to build new parallel sentences
that convey different information, yet...
... semantic analysis.
3 Challenges in learning from examples
In the introductory section, we have shown that,
to carry out automatic learning from examples, we
need to define a cross-pair similarity ... T_3 is structurally (and somehow lexically) similar to T_1
and H_3 is more similar to H_1 than to H_2. Thus, from T_1 ⇒ H_1 we may extract
rules to derive that T_3 ⇒ H_3....
... system. We describe experiments on translation from
German to English, showing an improvement from 25.2% Bleu
score for a baseline system to 26.8% Bleu score for the system
with reordering, a statistically ... lead to significantly
different word order from that of English. We now
describe how these characteristics can lead to
difficulties for phrase-based translation syst...