adjectival subcategorization frames from corpora

Scientific report document: "AUTOMATIC ACQUISITION OF SUBCATEGORIZATION FRAMES FROM UNTAGGED TEXT" doc

Uploaded: 20/02/2014, 21:20
... AUTOMATIC ACQUISITION OF SUBCATEGORIZATION FRAMES FROM UNTAGGED TEXT Michael R. Brent MIT AI Lab 545 Technology Square Cambridge, ... open-class dictionary) and generates a partial list of verbs occurring in the text and the subcategorization frames (SFs) in which they occur. Verbs are detected by a novel technique based on ... corpora. 1 INTRODUCTION This paper describes an implemented program that takes an untagged text corpus and generates a partial list of verbs occurring in it and the subcategorization frames...
Scientific report: "Automatic Acquisition of Adjectival Subcategorization from Corpora" docx

Uploaded: 08/03/2014, 04:22
... describes a novel system for acquiring adjectival subcategorization frames (SCFs) and associated frequency information from English corpus data. The system incorporates a decision-tree classifier ... subcategorization frames from untagged text. In Meeting of the Association for Computational Linguistics, pages 209–214. E. J. Briscoe and J. Carroll. 1997. Automatic Extraction of Subcategorization from Corpora. ... first systems capable of automatically learning a small number of verbal subcategorization frames (SCFs) from English corpora emerged over a decade ago (Brent, 1991; Manning, 1993). Subsequent...
Scientific report document: "Acquiring Lexical Generalizations from Corpora: A Case Study for Diathesis Alternations" pdf

Uploaded: 20/02/2014, 19:20
... values varied from frame to frame but not from verb to verb and were determined by taking into account for each frame its overall frame frequency which was estimated from the COMLEX subcategorization ... corpus idiosyncrasies can affect subcategorization frequencies (cf. Roland and Jurafsky (1998) for an extensive discussion). This suggests that different corpora may give different results ... shallow syntactic processing. Alternating verbs were acquired from the BNC by using Gsearch as a chunk parser. Erroneous frames were discarded by applying linguistic heuristics, statistical...
Scientific report document: "Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora" ppt

Uploaded: 20/02/2014, 12:20
... (Cucerzan and Yarowsky, 1999) and (Collins and Singer, 1999) present algorithms to obtain NEs from untagged corpora. However, they focus on the classification stage of already segmented entities, and ... feature vector from this example in the following manner: First, we split both words into all possible substrings of up to size two: We build a feature vector by coupling substrings from the two ... Computational Linguistics Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora Alexandre Klementiev Dan Roth Dept. of Computer Science University of Illinois Urbana,...
Scientific report document: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval" pptx

Uploaded: 20/02/2014, 16:20
... Japanese-English language pair, especially if involving the comparable corpora. Re-scoring through the Comparable Corpora Comparable corpora could be considered for the disambiguation of translation ... comparable corpora-based techniques, respectively compared to the hybrid two-stages comparable corpora and linguistics-based pruning. The proposed approach based on bi-directional comparable corpora ... TR2-007. P. Fung. 2000. A Statistical View of Bilingual Lexicon Extraction: From Parallel Corpora to Non-Parallel Corpora. In Jean Veronis, Ed. Parallel Text Processing. G. Grefenstette. 1999....
Scientific report document: "INSIDE-OUTSIDE REESTIMATION FROM PARTIALLY BRACKETED CORPORA" ppt

Uploaded: 20/02/2014, 21:20
... ( (from SF0) (to San Francisco))))).) GR (Tell ((me (((about the) public) transportation)) ( (from SF0) ((to San) (Francisco .))))) GB ((Tell (me (about (((the public) transportation) ( (from ... corpus, the inside probabilities of longer spans of c are computed from INSIDE-OUTSIDE REESTIMATION FROM PARTIALLY BRACKETED CORPORA Fernando Pereira 2D-447, AT&T Bell Laboratories PO Box ... inferred from raw text. In addition, the number of iterations needed to reach a good grammar can be reduced; in extreme cases, a good solution is found from parsed text but not from raw text....
Scientific report document: "A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora" doc

Uploaded: 20/02/2014, 22:20
... nouns or proper nouns is converted from their positions in the text into a vector. 3. Match pairs of positional difference vectors, giving scores. All vectors from English and Chinese are matched ... dim(V2) 240 A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora Pascale Fung Computer Science Department Columbia University New York, NY ... in the texts. For every word pair from this lexicon, we had obtained a DTW score and a DTW path. If we plot the points on the DTW paths of all word pairs from the lexicon, we get a graph...
Scientific report document: "Creating a Multilingual Collocation Dictionary from Large Text Corpora" docx

Uploaded: 22/02/2014, 02:20
... linguistic analysis. The originality of our approach comes from the fact that collocations are not extracted from raw texts, but rather from syntactically parsed texts. The linguistic analysis ... textual corpora from the World Trade Organisation (WTO), which consist in parallel documents in three languages: English, French and Spanish. All the examples given in this paper are taken from ... returns chunks of partial analyses. If 132 Creating a Multilingual Collocation Dictionary from Large Text Corpora Luka Nerima, Violeta Seretan, Eric Wehrli Language Technology Laboratory (LATL),...
Scientific report document: "Effect of Cross-Language IR in Bilingual Lexicon Acquisition from Comparable Corpora" pot

Uploaded: 22/02/2014, 02:20
... translation knowledge acquisition from WWW news sites, this paper studies issues on the effect of cross-language retrieval of relevant texts in bilingual lexicon acquisition from comparable corpora. We experimentally ... parallel/comparative corpora. However, the sizes as well as the domain of existing parallel/comparative corpora are limited, while it is very expensive to manually collect parallel/comparative corpora. ... translation knowledge acquisition from parallel/comparative corpora, various kinds of translation knowledge are acquired. Within this framework of translation knowledge acquisition from WWW news sites, this...
Scientific report: "Prototyping virtual instructors from human-human corpora" pdf

Uploaded: 07/03/2014, 22:20
... this paper we presented a novel algorithm for rapidly prototyping virtual instructors from human-human corpora without manual annotation. Using our algorithm and the GIVE corpus we have generated ... sum, this paper presents a novel way of automatically prototyping task-oriented virtual agents from corpora who are able to effectively and naturally help a user complete a task in a virtual ... world. References Sudeep Gandhe and David Traum. 2007. Creating spoken dialogue characters from corpora without annotations. In Proceedings of Interspeech, Belgium. Andrew Gargett, Konstantina...
Scientific report: "Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpora" pot

Uploaded: 08/03/2014, 01:20
... engineering is desired. Paraphrases can be extracted from non-parallel corpora using contextual similarity (Lin, 1998). They can also be obtained from parallel corpora if such data is available (Barzilay ... Ibrahim et al., 2003). Recently, there are also a number of studies that extract paraphrases from multilingual corpora (Bannard and Callison-Burch, 2005; Zhao et al., 2008). The approach in (Barzilay ... Singapore, 4 August 2009. © 2009 ACL and AFNLP Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpora Xiaoyin Wang 1,2, David Lo 1, Jing Jiang 1, Lu Zhang 2, Hong Mei 2 1 School...
Scientific report: "Extracting Parallel Sub-Sentential Fragments from Non-Parallel Corpora" pdf

Uploaded: 08/03/2014, 02:21
... field. Comparable corpora exhibit various degrees of parallelism. Fung and Cheung (2004a) describe corpora ranging from noisy parallel, to comparable, and finally to very non-parallel. Corpora from the ... comparable corpora from the Romanian translations of the European Union’s acquis communautaire which we mined from the Web, and has about 10M English words. We downloaded comparable data from three ... lexicon extraction from comparable corpora. In ACL 2004, pages 527–534. Philipp Koehn and Kevin Knight. 2000. Estimating word translation probabilities from unrelated monolingual corpora using...
Scientific report: "Automatic Identification of Word Translations from Unrelated English and German Corpora" pot

Uploaded: 08/03/2014, 06:20
... corpora, but - as empirically shown by Rapp - it also holds for non-parallel corpora. It can be expected that this clue will work best with parallel corpora, second-best with comparable corpora, ... translations from non-parallel corpora. Proceedings of the 5th Annual Workshop on Very Large Corpora, Hong Kong, 192-202. Fung, P.; Yee, L. Y. (1998). An IR approach for translating new words from ... word associations based on the co-occurrences of words in large corpora. In: Proceedings of the 1st Workshop on Very Large Corpora: Columbus, Ohio, 84-93. 526 German test word Baby...
Corporate Executive Salaries – The Argument from Economic Efficiency ppt

Uploaded: 08/03/2014, 06:20
... demonstrated that for Australian corporations, the correlation between corporate performance and executive salary was negative, that is, the highest paid executives controlled corporations with the ... distinguish the contribution of the executive from the fortunes of the corporation as a whole. Attempts to compare performance against similar corporations might allow comparative evaluation ... ciency may have been due to corporate leadership, such as through restructuring of corporations. This is plausible but difficult to prove. It cannot be isolated from other potential causes...
Scientific report: "Creating a Multilingual Collocation Dictionary from Large Text Corpora" ppt

Uploaded: 08/03/2014, 21:20
... (paragraph-level) structure of documents is examined, possibly using mark-up from text encoding. 133 Creating a Multilingual Collocation Dictionary from Large Text Corpora Luka Nerima, Violeta Seretan, Eric Wehrli Language ... linguistic analysis. The originality of our approach comes from the fact that collocations are not extracted from raw texts, but rather from syntactically parsed texts. The linguistic analysis ... textual corpora from the World Trade Organisation (WTO), which consist in parallel documents in three languages: English, French and Spanish. All the examples given in this paper are taken from...