... quantitative improvement in automatic indexing
through concept combination. The increase
in the volume of indexing is 10.5% for
free indexing and 52.3% for controlled indexing.
The resulting ... Improving Automatic Indexing through Concept Combination
and Term Enrichment
Christian Jacquemin*
LIMSI-CNRS
BP 133, F-91403 ORSAY ...
an automatic indexer is...
... Meeting of the ACL and the 4th IJCNLP of the AFNLP, pages 764–772,
Suntec, Singapore, 2-7 August 2009.
© 2009 ACL and AFNLP
Improving Automatic Speech Recognition for Lectures through
Transformation-based ... the best candidate for ASR
adaptation or training (Riccardi and Hakkani-Tur,
2005; Huo and Li, 2007).
Even with all of these,
however, there remains a significant gap betwe...
... Computational Linguistics
Correcting Automatic Translations through Collaborations between MT
and Monolingual Target-Language Users
Joshua S. Albrecht and Rebecca Hwa and G. Elisabeta Marai
Department ... approach for mediating between an
MT system and users who do not understand
the source language and thus cannot
easily detect translation mistakes on their
own. Through a v...
... better capture the semantics
of words by incorporating both local
and global document context, and 2) accounts
for homonymy and polysemy by learning multiple
embeddings per word. We introduce ... to capture
homonymy and polysemy. Reisinger and Mooney
(2010b) introduced a multi-prototype VSM where
word sense discrimination is first applied by clustering
contexts, and then prot...
... system are a
combination of features described in (Xue, 2008;
Ding and Chang, 2008) as well as the word formation
and coarse frame features introduced in
(Sun et al., 2009), the c-command thread ... POS, head word of PP phrases, category
of c_k's left and right siblings, CFG rewrite
rule that expands c_k and c_k's parent (from (Ding
and Chang, 2008)).
3.2 New Word Features
W...
... representations, synonyms and
polysemous terms, that is, terms with multiple
senses or meanings, are not handled well. Methods
for smoothing the term distributions through
the use of latent ... that is constant across terms and
normalize the exponential term to a probability:
Relating the term in the PLSA model to the
distribution of the LSA term over documents,
,
and relati...
... definitions and notations are
needed for understanding and reimplementation of the presented
algorithms, but can be safely skipped on first reading
and consulted when encountering an unfamiliar term.
write ... stand-in for M(G), analogous to the stand-ins for WSAs and
WSTs described in Section 2.
Algorithm 3, PRODUCE, takes as input a
WRTG G_in = (N_in, ∆, P_in, n_0, M, G...
... (Bikel et al., 1997), but with multiple hypotheses
as output and a larger number of states
(12) to handle name prefixes and suffixes, and
transliterated foreign names separately. It operates
on ... tried to combine the advantages
of the prior work, and incorporate broader
knowledge into a more general re-ranking model.
3 Task and Terminology
Our experiments were conducted i...
... been studied previously
(Seymore and Rosenfeld, 1996; Stolcke, 1998; Gao
and Lee, 2000; etc). A comparative study of these
techniques is presented in (Goodman and Gao,
2000).
In this paper, ...
simplicity and efficiency, recent research shows
that its correlation with error rate is not as strong as
once thought. Clarkson and Robinson (2001)
analyzed the reason behind it an...
... with explicit
subjects and verbs with zero subjects (zero
pronouns), using rule-based methods (Ferrández
and Peral, 2000; Rello and Illisei, 2009b). The
Ferrández and Peral algorithm (2000) ... subjects and impersonal
constructions. Connexor yields 74.9% overall accuracy
and 80.2% and 65.6% F-measure for explicit
and elliptic subjects, respectively.
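For readers unfamiliar with the metric, the F-measure reported above is conventionally the harmonic mean of precision and recall. The excerpt does not give the underlying precision and recall values or say which weighting was used, so the following is only a minimal sketch of the standard definition; the function name `f_measure` and the sample inputs are hypothetical and are not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F_beta score: weighted harmonic mean of precision and recall.

    beta = 1.0 gives the balanced F1 score. Whether the excerpt's figures
    use beta = 1 is an assumption; the source does not say.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical inputs, chosen only to illustrate the arithmetic.
example = f_measure(0.8, 0.6)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a class with low recall (or low precision) can score well below overall accuracy, which is one way a gap like the one between the explicit and elliptic figures can arise.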
To compare with...