... Linguistics
Topic-Focused Multi-document Summarization
Using an Approximate Oracle Score
John M. Conroy, Judith D. Schlesinger
IDA Center for Computing Sciences
Bowie, Maryland, USA
conroy@super.org, ... multi-document
summary given a collection of documents.
Most automatic methods of multi-document
summarization are largely extractive.
This mimics the behavior of humans f...
... Computational Linguistics
Comparative News Summarization Using Linear Programming
Xiaojiang Huang, Xiaojun Wan∗, Jianguo Xiao
Institute of Computer Science and Technology, Peking University, Beijing ... (Peking University), MOE, China
{huangxiaojiang, wanxiaojun, xiaojianguo}@icst.pku.edu.cn
Abstract
Comparative News Summarization aims to
highlight the commonalities and differences
betwe...
... importance
of this task.
One can cast answer-finding as a traditional document
retrieval problem by considering each candidate
answer as an isolated document and ranking each candidate
answer ... containing an answer to a question is rather
stricter than mere relevance. Put another way, only a
small number of documents actually contain the answer
to a given query, while every docum...
... the analysis of such proteins, we have previously
described a rationale and an efficient algorithm,
improved here, for transforming a standard matrix
into one appropriate for any specified nonstandard
compositional ... only substantial E-value changes,
of greater than a factor of 10, i.e., score changes
greater than 3.3 bits, the case by case advantage of
mode D is vitiated. We therefore pr...
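The factor-of-10 and 3.3-bit figures in the excerpt above are related by a base-2 logarithm. The sketch below assumes the standard relation for bit scores, in which the E-value is proportional to 2 raised to the negative bit score; the function name `bits_for_evalue_ratio` is hypothetical and does not come from the excerpt.

```python
import math


def bits_for_evalue_ratio(ratio: float) -> float:
    """Bit-score change matching a given multiplicative E-value change.

    Assumes E is proportional to 2 ** (-S) for a bit score S, so a change
    of E by `ratio` corresponds to a score change of log2(ratio) bits.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return math.log2(ratio)


if __name__ == "__main__":
    # A tenfold E-value change, the threshold quoted in the excerpt.
    print(round(bits_for_evalue_ratio(10.0), 2))  # prints 3.32
```

Under this assumption, a tenfold E-value change is about 3.32 bits, which the excerpt rounds to 3.3.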
... F-measures of 96.6% and
94.1% respectively. It shows that the
performance is significantly better than
reported by any other machine-learning
system. Moreover, the performance is even
consistently ...
ContainsDigitAndAlpha and
ContainsDigitAndDash, the former will take
precedence. The first eleven features arise from
the need to distinguish and annotate monetary
amounts, percentages...
... summarisation (Jing 2000), subtitle generation
from spoken transcripts (Vandeghinste and
Pan 2004) and information retrieval (Olivers and
Dolan 1999). Sentence compression is a complex
paraphrasing ... standard, D: Decision-tree, LM: IP
language model, Sig: IP language model with
significance score)
Model            CompR    Rating
Decision-tree    56.1%    2.22∗†
LangModel        49.0%    2.23∗†
LangModel+Significanc...
... this
algorithm, using an average of tiles per
sentence (for an average input sentence length of
30 words) and an average of possible trans-
lations per tile, encodes a candidate set of about
10 possible translations. ... large set
of candidate realizations, and, in a second phase,
statistical knowledge about the target language
(such as stochastic language models) to rank the
candidat...
... errors, and the CRF models are
trained using the alignment results as supervised
data.
2.2 Insertion / Deletion
Since an insertion can be regarded as replacing an
empty word with an actual word, and ... statistical
machine translation (PBSMT), but there are three
differences: 1) it adopts conditional random fields,
2) it allows insertion and deletion, and 3) binary and
real fea...
... Polarity Using Random Walks
Ahmed Hassan
University of Michigan Ann Arbor
Ann Arbor, Michigan, USA
hassanam@umich.edu
Dragomir Radev
University of Michigan Ann Arbor
Ann Arbor, Michigan, USA
radev@umich.edu
Abstract
Automatically ... product is very important
for marketing and customer relation management
(Morinaga et al., 2002). Manually handling
reviews to identify reputation is a ver...
...
to handle a huge number of documents, which is a
tedious and time-consuming process. Instead of
reading every document, the headline can be used
to decide which of them contain important
information. ... extractive and abstractive.
In the work of Douzidia and Lapalme (2004),
an extractive method was used to produce a 10-word
summary (which can be considered a
headline) of an...