... An Algorithm for Simultaneously Bracketing Parallel Texts
by Aligning Words
Dekai Wu
HKUST
Department of Computer Science
University ... serve as generative
models for parallel bilingual sentences with
weak order constraints. Focusing on transduction
grammars for bracketing, we formulate
a normal form, and a stochastic version ...
paper is that the lexical i...
...
memory. The algorithm is given no information whatsoever
about the phonemic transcription used, and even though
cognate identification is carried out on the basis of a
context-free one-for-one ...
corresponding to z scores of 4 and beyond.
The last improvement in the performance of
the algorithm to date was brought by a redefinition
of the cognation index. Once the individ...
... slightly by having independent
parameters for 1-count, 2-count, and many-count
n-grams, but still assumes that $\bar{d}(i)$ is constant
for i greater than two. Second, by using the same
discount for ... for a given n-gram count
is well-approximated by its mean. For similar cor-
pora, this seems to be true, with a histogram of test
counts for trigrams of count 10 that is nearly...
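The claim that the test count for a given training count is well approximated by its mean can be checked empirically. The following is a minimal sketch, not the paper's procedure: it assumes two hypothetical count tables, `train_counts` and `heldout_counts`, built from similar corpora, and the function name `mean_discount` is illustrative only.

```python
from collections import defaultdict

def mean_discount(train_counts, heldout_counts):
    """For each training count i, average (i - held-out count) over all
    n-grams that were seen exactly i times in training."""
    totals = defaultdict(float)  # sum of discounts per training count
    sizes = defaultdict(int)     # number of n-grams per training count
    for ngram, i in train_counts.items():
        if i <= 0:
            continue
        totals[i] += i - heldout_counts.get(ngram, 0)
        sizes[i] += 1
    return {i: totals[i] / sizes[i] for i in sizes}

# Toy data (invented for illustration only).
train = {("a", "b", "c"): 1, ("b", "c", "d"): 1, ("c", "d", "e"): 2}
heldout = {("a", "b", "c"): 1, ("c", "d", "e"): 1}
discounts = mean_discount(train, heldout)
```

Plotting the returned values against i on real data would show whether the average discount is roughly flat for i greater than two.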
... lexical resources transform these resources into
a network or graph and compute relatedness using
paths in it (see Budanitsky & Hirst (2006) for an ex-
tensive review). For instance, Rada et ... relatedness scores.
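As a rough illustration of computing relatedness from paths in a graph, the sketch below runs a breadth-first search over an adjacency-list graph and turns the shortest path length into a score. The toy graph and the scoring function 1 / (1 + path length) are assumptions made for this example; they are not taken from any of the measures cited above.

```python
from collections import deque

def path_length(graph, source, target):
    """Number of edges on a shortest path from source to target, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def relatedness(graph, a, b):
    """Hypothetical score: shorter paths give higher relatedness."""
    d = path_length(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

# Toy undirected graph (invented for illustration only).
toy = {
    "car": ["vehicle"],
    "vehicle": ["car", "bicycle", "entity"],
    "bicycle": ["vehicle"],
    "entity": ["vehicle", "fruit"],
    "fruit": ["entity"],
}
```

Here `relatedness(toy, "car", "bicycle")` is higher than `relatedness(toy, "car", "fruit")` because the first pair is connected by a shorter path.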
The information flow of the API is summarized by
the sequence diagram in Figure 2. The higher in-
put/output layer the user interacts with is provided
by a Java API from...
... following,
cw is calculated for each noun. On the other hand,
cw' is calculated for each combination of noun and its case information.
Therefore, cw' is calculated for each (noun, case ...
access is realized by linking them in hypertext for-
mat by hypertext authoring.
Automatic hypertext authoring has been a focus of attention
in recent years, and much work has been done....
... correspondences
for English "source" phrases. The algorithm is re-
versible, by swapping E with F.
The model for correspondence is that a source
noun phrase in Ei is responsible for producing ... phrases for both languages. Noun phrases
are then mapped to each other using an iterative
re-estimation algorithm that bears similarities to
the Baum-Welch algorithm w...
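To give a feel for iterative re-estimation of phrase correspondences, here is a minimal sketch of a generic expectation-maximization loop in the style of IBM Model 1. It is swapped in for illustration and is not the paper's algorithm; the function name `reestimate`, the input format, and the toy phrases are all assumptions.

```python
from collections import defaultdict

def reestimate(pairs, iterations=10):
    """pairs: list of (source_phrases, target_phrases), one per aligned
    sentence pair. Returns t[(e, f)], an estimate of P(f | e)."""
    targets = {f for _, fs in pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(targets))  # uniform starting point
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target phrase among the source phrases
        # of the same sentence pair, in proportion to the current t.
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[(e, f)] for e in es)
                for e in es:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts per source phrase.
        t = defaultdict(float, {(e, f): c / total[e]
                                for (e, f), c in count.items()})
    return t

# Toy aligned data (invented for illustration only).
pairs = [
    (["the house"], ["la maison"]),
    (["the house", "the garden"], ["la maison", "le jardin"]),
]
t = reestimate(pairs)
```

Swapping the two elements of each pair re-estimates correspondences in the opposite direction, which mirrors the reversibility noted in the text.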
... dominance graph by checking each split
for eliminability before it is added to the chart.
We compare the performance of this algorithm to
the baseline of computing the complete chart. For
comparison, ... this algorithm by keep-
ing track of how often each subgraph is referenced
Figure 4: A graph for which the algorithm is n...
... alternative LR algorithm for TAGs
Mark-Jan Nederhof
DFKI
Stuhlsatzenhausweg 3
D-66123 Saarbrücken, Germany
E-mail: nederhof@dfki.de
Abstract
We present a new LR algorithm for tree-
adjoining ... difficulties, the algorithm as it was
published is also incorrect. Brief indications of
the nature of the incorrectness have been given
before by Kinyon (1997). There seems to...
... An Efficient Generation Algorithm for Lexicalist MT
Victor Poznański, John L. Beaven & Pete Whitelock *
SHARP Laboratories of Europe Ltd.
Oxford Science Park, Oxford OX4 4GA
United Kingdom ... Shake-and-Bake generation
algorithm of (Whitelock, 1992) is NP-
complete. We present a polynomial time
algorithm for lexicalist MT generation pro-
vided that sufficient information...
... pre-
cision will be overestimated.
For the BP/EM training, we used 10 BP iterations
for each sentence, and 5 global EM iterations.
By using a damping scheme for the BP
algorithm, we never observed ... here
with an algorithm trained on the same data and
with no possibilities for fine-tuning; therefore the
comparison should be fair.
The comparison shows that performance-wise,
th...