Scientific report: "Combining Multiple Resources to Improve SMT-based Paraphrasing Model∗" pdf

... generation. In this paper, we exploit multiple resources to improve the SMT-based paraphrase generation. In detail, six kinds of resources are utilized, including: (1) an automatically constructed thesaurus, ... USA, June 2008. © 2008 Association for Computational Linguistics Combining Multiple Resources to Improve SMT-based Paraphrasing Model ∗ Shiqi Zhao 1 , Cheng...

Uploaded: 17/03/2014, 02:20

Scientific report: "Combining Multiple, Large-Scale Resources in a Reusable Lexicon for Natural Language Generation" pptx

... chose well formatted resources (or manually format the resource) so as to get reliable and usable results; semi-automatic rather than fully automatic approach is adopted to ensure accuracy; ... domain-specific knowledge may need to be added to the lexicon. The problem of how to adapt a general lexicon to a particular application domain and merge domain ontologies with...

Uploaded: 31/03/2014, 04:20

Document: Scientific report: "Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification" doc

... automatic method to create a thesaurus that is sensitive to the sentiment of words expressed in different domains. • We describe a method to use the created thesaurus to expand feature vectors ... vector d ∈ R^N, where the value of the j-th element d_j is set to the total number of occurrences of the unigram or bigram w_j in the review d. To find the suitable candidates to exp...

Uploaded: 20/02/2014, 04:20

Scientific report: "Combining Multiple Knowledge Sources for Dialogue Segmentation in Multimedia Archives" ppt

... dialogue segmentation and topic labels. In the annotation process, annotators were given the freedom to subdivide a segment into subsegments to indicate when the group was discussing a subtopic. Annotators were ... of automatic dialogue segmentation is often considered as similar to the problem of topic segmentation. Therefore, research has adopted techniques previously develope...

Uploaded: 23/03/2014, 18:20

Document: Scientific report: "Mining Wiki Resources for Multilingual Named Entity Recognition" pdf

... varies sufficiently from language to language to make automatic extraction difficult. Together, these allow phrases like this (taken from the French Wikipedia) to be correctly marked in its entirety ... following forms of the verb “to be” to derive a label. For example, they used the sentence “Franz Fischler is an Austrian politician” to associate the label “politician”...

Uploaded: 20/02/2014, 09:20

Scientific report: "Using Similarity Scoring To Improve the Bilingual Dictionary for Word Alignment" doc

... hard and time-consuming task to hand-align bilingual data, the automation of this task receives a fair amount of attention. In this paper, we present an approach to improve the bilingual dictionary ... dictionaries Dict0.01 for up to one link per word rebuilding algorithm is independent of the actual word alignment method used. Furthermore, we plan to explore ways to improve...

Uploaded: 08/03/2014, 07:20

Scientific report: "Using Search-Logs to Improve Query Tagging" potx

... set. The simplest way to match query tokens to snippet tokens is to allow a query token to match any snippet token. This can be problematic when we have queries that have a token repeated with ... surprisingly powerful one – is to POS tag some relevant snippets for a given query, and then to transfer the tags from the snippet tokens to matching query tokens. This “di- re...

Uploaded: 16/03/2014, 20:20

Scientific report: "Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews" docx

... in comparison to documents from other domains: Turney (2002) observes that the movie reviews are hardest to classify since the review authors tend to give information about the storyline of the ... are about. This aboutness has been referred to as the opinion target or opinion topic in the literature from the field. In this work our goal is to extract opinion target - opinion word...

Uploaded: 17/03/2014, 00:20

Scientific report: "Using Deep Morphology to Improve Automatic Error Detection in Arabic Handwriting Recognition" pot

... N-grams+lem. to the improvements gained in Table 3). The largest improvement comes with the addition of the bigram (thus introducing context into the model), but the trigram provides only a slight improvement ... handwritten text may be too small, light or closely-spaced to readily distinguish, causing the system to drop them entirely. While Arabic disconnective letters may make it...

Uploaded: 17/03/2014, 00:20

Scientific report: Using directed evolution to improve the solubility of the C-terminal domain of Escherichia coli aminopeptidase P Implications for metal binding and protein stability pptx

... and shuffled to produce a mutant library, the members of which were then monitored for their ability to confer increased TMP resistance when fused to DHFR. The genes corresponding to resistant ... ligand to a residue that is unlikely to bind metal. The G270V residue is located next to a metal-binding residue; this mutation is likely to cause a conformational change that...

Uploaded: 23/03/2014, 07:20
