... 19-24, 2011.
© 2011 Association for Computational Linguistics
Joint Training of Dependency Parsing Filters through
Latent Support Vector Machines
Colin Cherry
Institute for Information Technology
National ... The
identity of the responsible classifier is modeled as
a latent variable, which is filled in during training
using a latent SVM (LSVM) formulation. Our use
of...
... Here is an example of a tweet: “mycraftingworld:
#Win Microsoft Office 2010 Home and Student
#Contest from @office http://bit.ly/ ···
”, where “mycraftingworld” is the name of the user
who published ...
Introduction
Tweets, short messages of less than 140 characters
shared through the Twitter service, have become
an important source of fresh information. As a re-
sult, the task...
... paths
of each pair is calculated by employing the Dynamic
Time Warping (DTW) algorithm. The input to the
calculation is the correlations between dependency
relations, which are estimated from a set of training
... relation
correlations from a set of training path pairs. The
collection of the training data is described in
Section 6.1.
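As an illustrative sketch of this step (not the authors' code), standard DTW over two dependency-relation paths can be written as follows, where `corr` stands in for the estimated relation correlations and the match cost is taken as one minus the correlation; all names here are assumptions:

```python
def dtw_distance(path1, path2, corr):
    """Minimum-cost DTW alignment of two dependency-relation paths.

    The cost of aligning two relations is 1 - corr(r1, r2), so highly
    correlated relations are cheap to match.
    """
    n, m = len(path1), len(path2)
    INF = float("inf")
    # d[i][j]: best cost of aligning path1[:i] with path2[:j]
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 1.0 - corr(path1[i - 1], path2[j - 1])
            # standard DTW moves: stretch path1, stretch path2, or advance both
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

With a 0/1 correlation (identical relations correlate perfectly), identical paths align at zero cost.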
For each question and its answer sentences in
traini...
... dependency structure.
3.1 Dependency parsing
The aim of dependency parsing is to find the most
probable D for a given W by maximizing the
probability P(D|W). Let D be a set of probabilistic
dependencies ... the training data. This provides an
updated set of dependencies.
Re-estimate the parameters of the parsing model:
We then re-estimated the parsing model par...
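The loop implied above (parse with the current model, collect the resulting dependencies, re-estimate) can be sketched with a toy relative-frequency re-estimation step; the function and data format below are illustrative assumptions, not the paper's implementation:

```python
from collections import Counter

def reestimate(dependency_sets):
    """Relative-frequency re-estimation of dependency parameters.

    Each dependency's parameter is its count over all dependencies
    produced in the latest parsing pass, normalized to sum to one.
    """
    counts = Counter(dep for deps in dependency_sets for dep in deps)
    total = sum(counts.values())
    return {dep: count / total for dep, count in counts.items()}
```

Iterating parse-then-reestimate until the dependency set stops changing gives the kind of fixed-point training loop the passage describes.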
... structure building has been previously
explored by a number of authors. Stoness et al.
(2004, 2005) describe a proof-of-concept implementation
of a “continuous understanding” module that uses ...
[Figure 1 diagram: TextualWordIU units (“nimm”, “den”, “winkel”, “in”) and a TagIU unit, linked between $TopOfWords and $TopOfTags]
Figure 1: An example network of incremental units, including...
... , w_n of space-delimited
words from a set W. We assume a lexicon
LEX, distinct from W, containing pairs of segments
drawn from a set T of terminals and PoS categories
drawn from a set N of nonterminals.
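The lexicon just described can be represented minimally as a set of (segment, category) pairs, with segments drawn from T and PoS categories drawn from N; the concrete entries below are invented for illustration and are not from the paper:

```python
# Minimal sketch of the lexicon LEX: pairs of terminal segments and
# PoS categories. Entries are invented transliterated examples.
LEX = {
    ("b", "IN"),    # a prepositional prefix segment
    ("cl", "NN"),   # a noun segment
    ("bcl", "NN"),  # an unsegmented word form
}

def analyses(segment, lex=LEX):
    """Return all PoS categories the lexicon licenses for a segment."""
    return sorted(cat for seg, cat in lex if seg == segment)
```

Keeping LEX distinct from W means a surface word may decompose into several lexicon segments, each carrying its own category.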
LEX ... 0.7273.
4 Experiments
We aim to evaluate state-of-the-art parsing architectures
on the morphosyntactic disambiguation of Hebrew
texts in three different parsing sc...
... joint inference method, the
feature vector of each e comes from two parts: the
vector of translating e_s to {f} and the vector of
translating {f} to e; the two vectors are jointly
learned at the
same ...
Learning of a Dual SMT System for Paraphrase Generation
Hong Sun∗
School of Computer Science and Technology
Tianjin University
kaspersky@tju.edu.cn
Ming Zhou
Microsoft Research...
... dependency parsing, instead of
constituent-based syntactic parsing. Thus the
SRL performances of their systems are not di-
rectly comparable to ours.
5.2 Results and Discussions
Results of ...
performance of both syntactic and semantic
parsing, in particular the performance of se-
mantic parsing (in this paper, semantic role
labeling). This is done at two levels. Firs...
...
and non-supporters of that new rule and are no
longer supporters or non-supporters of the par-
ent. If the parent still has at least one non-
supporter, the remaining supporters and non-
supporters ... 4 shows how the
number of generated lemmatization rules for Pol-
ish grows as a function of the number of training
pairs.
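The supporter bookkeeping described above can be sketched as follows; the suffix-replacement rule format and the helper names are illustrative assumptions, not the paper's representation:

```python
def apply_rule(form, suffix, replacement):
    """Apply a suffix-replacement lemmatization rule to a word form."""
    if form.endswith(suffix):
        return form[:len(form) - len(suffix)] + replacement
    return form

def partition(pairs, suffix, replacement):
    """Split (form, lemma) training pairs into supporters of a rule
    (pairs the rule lemmatizes correctly) and non-supporters."""
    supporters, non_supporters = [], []
    for form, lemma in pairs:
        if apply_rule(form, suffix, replacement) == lemma:
            supporters.append((form, lemma))
        else:
            non_supporters.append((form, lemma))
    return supporters, non_supporters
```

When a more specific child rule is added, its supporters and non-supporters are removed from the parent's lists in exactly this fashion, and the parent is refined further only while it retains at least one non-supporter.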
Figure 4. Number of rules vs. number of trainin...
... joint
probability of the incorrect parses by making
these parses worse predictors of the words in
the sentence. The combination of training the
correct parses to be good predictors of the words
and training ... the beginning of training but reduced to near
0 by the end of training. Training was stopped when
maximum performance was reached on the validation
set, using...