named entities from large corpora

Scientific report: "Discovering Relations among Named Entities from Large Corpora"

Category: Scientific reports

Scientific report: "A Method for Effective and Scalable Mining of Named Entity Transliterations from Large Comparable Corpora"

Category: Scientific reports

Scientific report: "Creating a Multilingual Collocation Dictionary from Large Text Corpora"

Category: Scientific reports

... Cooccurrence Extraction with Fips. Collocations are extracted from syntactically analysed corpora. The analysis is performed by Fips, a large-scale parser based on an adaptation of Chomsky's ... returns chunks of partial analyses. If ... Creating a Multilingual Collocation Dictionary from Large Text Corpora. Luka Nerima, Violeta Seretan, Eric Wehrli. Language Technology Laboratory (LATL), ... linguistic analysis. The originality of our approach comes from the fact that collocations are not extracted from raw texts, but rather from syntactically parsed texts. The linguistic analysis...

Scientific report: "Creating a Multilingual Collocation Dictionary from Large Text Corpora"

Category: Scientific reports

... (paragraph-level) structure of documents is examined, possibly using mark-up from text encoding. Creating a Multilingual Collocation Dictionary from Large Text Corpora. Luka Nerima, Violeta Seretan, Eric Wehrli. Language ... linguistic analysis. The originality of our approach comes from the fact that collocations are not extracted from raw texts, but rather from syntactically parsed texts. The linguistic analysis ... textual corpora from the World Trade Organisation (WTO), which consist in parallel documents in three languages: English, French and Spanish. All the examples given in this paper are taken from...

Scientific report: "Learning Translations of Named-Entity Phrases from Parallel Corpora"

Category: Scientific reports

Scientific report: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval"

Category: Scientific reports

... Japanese-English language pair, especially if involving the comparable corpora. Re-scoring through the Comparable Corpora. Comparable corpora could be considered for the disambiguation of translation ... comparable corpora-based techniques, respectively compared to the hybrid two-stages comparable corpora and linguistics-based pruning. The proposed approach based on bi-directional comparable corpora ... TR2-007. P. Fung. 2000. A Statistical View of Bilingual Lexicon Extraction: From Parallel Corpora to Non-Parallel Corpora. In Jean Veronis, Ed. Parallel Text Processing. G. Grefenstette. 1999....

Scientific report: "Finding Parts in Very Large Corpora"

Category: Scientific reports

... the machines at our disposal, so still larger corpora would not be out of the question. Finally, as noted above, Hearst [2] tried to find parts in corpora but did not achieve good results. ... Lexicography 3 (1990), 235-245. [2] Marti Hearst, "Automatic acquisition of hyponyms from large text corpora," in Proceedings of the Fourteenth International Conference on Computational ... Abstract: We present a method for extracting parts of objects from wholes (e.g. "speedometer" from "car"). Given a very large corpus our method finds part words with 55% accuracy...

Scientific report: "Effect of Cross-Language IR in Bilingual Lexicon Acquisition from Comparable Corpora"

Category: Scientific reports

... translation knowledge acquisition from WWW news sites, this paper studies issues on the effect of cross-language retrieval of relevant texts in bilingual lexicon acquisition from comparable corpora. We experimentally ... parallel/comparative corpora. However, the sizes as well as the domain of existing parallel/comparative corpora are limited, while it is very expensive to manually collect parallel/comparative corpora. ... approach of acquiring translation knowledge of domain specific named entities, event expressions, and collocational expressions from the collection of bilingual news articles on WWW news sites...

Scientific report: "Learning Condensed Feature Representations from Large Unsupervised Data Sets for Supervised Learning"

Category: Scientific reports

... utilize a large amount of unsupervised data to supplement supervised data. Specifically, an approach that involves incorporating ‘clustering-based word representations (CWR)’ induced from unsupervised ... Limited Memory BFGS Method for Large Scale Optimization. Math. Programming, Ser. B, 45(3):503–528. Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1994. Building a Large Annotated Corpus ... 2011. © 2011 Association for Computational Linguistics. Learning Condensed Feature Representations from Large Unsupervised Data Sets for Supervised Learning. Jun Suzuki, Hideki Isozaki, and Masaaki...

Scientific report: "Annotating and Recognising Named Entities in Clinical Notes"

Category: Scientific reports

... 15000 clinical named entities in 11 entity types. This paper reports on the challenges involved in creating the annotation schema, and recognising and annotating clinical named entities. ... step to the extraction of structured information from these clinical notes is to achieve accurate identification of clinical concepts or named entities. An entity may refer to a concrete object ... 3 named entities - CT, pituitary macroadenoma and suprasellar cisterns in the sentence: CT revealed pituitary macroadenoma in suprasellar cisterns. In recent years, the recognition of named...

Scientific report: "Translating Named Entities Using Monolingual and Bilingual Resources"

Category: Scientific reports

... IdentiFinder named entity identifier (Bikel et al., 1999) to identify all named entities in the top retrieved documents for each sub-phrase. All named entities of the type of the named entity ... articles and hence the named entities will most likely be reported in many languages including the target language. Instead of having to come up with translations for the named entities often with ... While the identification of named entities in text has received significant attention (e.g., Mikheev et al. (1999) and Bikel et al. (1999)), translation of named entities has not. This translation...

Scientific report: "Detecting Highly Confident Word Translations from Comparable Corpora without Any Prior Knowledge"

Category: Scientific reports

... of bilingual lexicon extraction from parallel corpora. This assumption should also be reasonable for many types of comparable corpora such as Wikipedia or news corpora, which are topically aligned ... translation candidates from multilingual comparable corpora. By employing the algorithm we have improved precision scores of the methods relying on per-topic word distributions from a cross-language ... efficiently bridge the gap between languages. That seed lexicon is usually crawled from the Web or obtained from parallel corpora. Recently, Li et al. (2011) have proposed an approach that improves...

Scientific report: "CS NIPER Annotation-by-query for non-canonical constructions in large corpora"

Category: Scientific reports

... annotation tasks that require manual analysis over large corpora. The approach is generalizable to any kind of linguistic phenomena that can be located in corpora on the basis of queries and require manual ... suitable software. Their empirical distribution in corpora is thus largely unknown. A major task in recognizing NCCs is distinguishing them from structurally similar constructions ... Figure 3: KWIC ... investigation requires the analysis of large corpora due to a relatively low frequency of instances and whose identification requires expert knowledge to distinguish them from other similar constructions....

Scientific report: "Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora"

Category: Scientific reports

... sentence pairs are extracted from the aligned comparable corpora (section 2.2). The workflow for named entity (NE) and terminology extraction and mapping from comparable corpora extracts data in ... and named entities. The toolkit pairs similar bilingual comparable documents and extracts parallel sentences and bilingual terminological and named entity dictionaries from comparable corpora. ... translation; named entity dictionaries. The demonstration showcases two general use case scenarios defined in the toolkit: “parallel data mining from comparable corpora and named entity/terminology...

Scientific report: "Recognizing Named Entities in Tweets"

Category: Scientific reports

... semi-supervised learning. 1 Introduction. Named Entities Recognition (NER) is generally understood as the task of identifying mentions of rigid designators from text belonging to named-entity types such as ... Extracting personal names from email: applying named entity recognition to informal text. In HLT, pages 443–450. David Nadeau and Satoshi Sekine. 2007. A survey of named entity recognition and ... challenges and misconceptions in named entity recognition. In CoNLL, pages 147–155. Sameer Singh, Dustin Hillard, and Chris Leggetter. 2010. Minimally-supervised extraction of entities from text advertisements....

Scientific report: "Building Emotion Lexicon from Weblog Corpora"

Category: Scientific reports

... 133–136, Prague, June 2007. © 2007 Association for Computational Linguistics. Building Emotion Lexicon from Weblog Corpora. Changhua Yang, Kevin Hsin-Yih Lin, Hsin-Hsi Chen. Department of Computer Science and ... mine the relationships between words and emotions using weblog corpora. A collocation model is proposed to learn emotion lexicons from weblog articles. Emotion classification at sentence level ... Blog from January to July, 2006, spanning a period of 212 days. In total, 336,161 bloggers’ articles were collected. Each blogger posts 16 articles on average. We used the articles from...

Scientific report: "Detecting Semantic Relations between Named Entities in Text Using Contextual Features"

Category: Scientific reports

... solves problems, which result from when a parallel sentence arises from predication ellipsis. However, there are several types of parallel sentence that differ from the one we explained. (For ... are sorted in order of likelihood of being the antecedent. The sorting algorithm has two steps. First, from the beginning of the text until the pronoun appears, noun ... anaphora resolutions here. Applied centering theory to relation detection is as follows. First, from the beginning of the text until the following NE appears, noun phrases are stacked depending...

Scientific report: "Mapping Concrete Entities from PAROLE-SIMPLE-CLIPS to ItalWordNet: Methodology and Results"

Category: Scientific reports

... (henceforth TCs) clustered in three categories distinguishing 1st Order Entities, 2nd Order Entities and 3rd Order Entities. Their subclasses, hierarchically ordered by means of a subsumption ... ontology of semantic types. 2 Corpora e Lessici dell'Italiano Parlato e Scritto. The IWN Top Ontology (TO) (Roventini et al., 2003), which slightly differs from the EWN TO3, consists ... 161–164, Prague, June 2007. © 2007 Association for Computational Linguistics. Mapping Concrete Entities from PAROLE-SIMPLE-CLIPS to ItalWordNet: Methodology and Results. Adriana Roventini, Nilda...

Scientific report: "Constructing Transliteration Lexicons from Web Corpora"

Category: Scientific reports

Scientific report: "Constructing Semantic Space Models from Parsed Corpora"

Category: Scientific reports
