... Association for Computational Linguistics
Biographies, Bollywood, Boom-boxes and Blenders:
Domain Adaptation for Sentiment Classification
John Blitzer Mark Dredze
Department of Computer and Information ... trained without adaptation, while the
gold standard is an in-domain classifier trained on
the same domain as it is tested.
Figure 1 gives accuracies for all pairs o...
... unavailable and therefore information gain cannot
be practically computed. Figure 3 and Figure 4
show results for the Lotus and Amazon datasets,
respectively, and are representative of performance
on ... sentiment, and $(G_0)_{i2} = 1$ for negative sentiment. As
with $F_0$, one can also use soft sentiment labeling
for documents, though our experiments are conducted
with hard...
... alignment and consistency (Pickering
and Garrod, 2004; Halliday and Hasan, 1976) on
the one hand, and variation (to improve text quality
and readability) on the other hand (Belz and Reiter,
2006; ... consistency
(Halliday and Hasan, 1976), and variation,
which influence people’s assessment of discourse
(Levelt and Kelter, 1982) and generated output (Belz
and Reiter, 2...
... Figure 2. Both for 10
and 20 frames, the results are better for 50 than for
20 clusters, with small differences between 10 and
20 frames. The results vary between -11.850 and
-10.620 (for 5-50 iterations), ... classifications. For example, Siegel
and McKeown (2000) used several machine learning
algorithms to perform an automatic aspectual
classification of English verbs into...
... the ACL 2007 Demo and Poster Sessions, pages 89–92,
Prague, June 2007.
© 2007 Association for Computational Linguistics
Test Collection Selection and Gold Standard Generation
for a Multiply-Annotated ... Introduction
Opinion information processing has been studied
for several years. Researchers extracted opinions
from words, sentences, and documents, and both
rule-based and...
... health and disease: new concepts for
heparanase function in tumor progression and metastasis
Uri Barash¹, Victoria Cohen-Kaplan¹, Ilana Dowek², Ralph D. Sanderson³, Neta Ilan¹ and
Israel ... enzymatic activity-dependent
and -independent functions mediated by defined
protein domains and splice variants, and cross-talk
U. Barash et al. New concepts for heparanase funct...
... pseudo nodes and hyper edges to the forest,
which makes the forest even denser and harder for navigation
and search. As trees thrive in the search space,
especially with the pseudo nodes and edges ... Translations
Bing Zhao†, Young-Suk Lee†, Xiaoqiang Luo†, and Liu Li‡
IBM T.J. Watson Research† and Carnegie Mellon University‡
{zhaob, ysuklee, xiaoluo}@us.i...
... relations
for Information Extraction (IE) and QA, as
in (Riloff and Shepherd, 1997; Ravichandran and
Hovy, 2002; Yangarber et al., 2000).
Patterns identify rather specific and informative
... patterns we used for entailment acquisition
based on (Hearst, 1992) and (Pantel et al.,
2004). Capitalized terms indicate variables. pl and
sg stand for plural and singular...
... dictionaries and
selection strategies like select all (Hull and
Grefenstette, 1996; Davis, 1997), randomly select
N (Ballesteros and Croft, 1996; Kwok, 1997) and
select best N (Hayashi, Kikui and Susaki, ...
performance and domain shift of corpus are major
problems of these two approaches. Hybrid
approaches (Ballesteros and Croft, 1998; Bian and
Chen, 1998; Davis, 1997...
... MK–EN, and Pr(b|e) and
Pr(e|b) for EN–BG, where m, e, and b stand for a
Macedonian, an English, and a Bulgarian word.
Then, following (Callison-Burch et al., 2006; Wu
and Wang, 2007; Utiyama and ... MK–EN and an EN–BG bitext.
First, we induced IBM-model-4 word alignments
for MK–EN and EN–BG, from which we extracted
four conditional lexical translation probabilities:
Pr(m...