
Scalable Mining of Named Entity Transliterations


Scientific report: "Robust Extraction of Named Entity Including Unfamiliar Word"

... Extraction of Japanese Named Entity 2.1 Task of the IREX Workshop The task of NE extraction of the IREX workshop (Sekine and Eriguchi, 2000) is to recognize eight NE types in Table 1. The organizer of ... experiments of extracting Japanese named entities from IREX corpus and NHK corpus show the effectiveness of the proposed method. 1 Introduction It is widely agreed that extraction of named entity (henceforth, ... Chunking of Named Entities It is quite common that the task of extracting Japanese NEs from a sentence is formalized as a chunking problem against a sequence of mor- 1 The organizer of the IREX...

Scientific report: "Joint Inference of Named Entity Recognition and Normalization for Tweets"

... which named entities occur frequently with rich variations. We study the problem of named entity normalization (NEN) for tweets. Two main challenges are the errors propagated from named entity ... nature of tweets, there are rich variations of named entities in them. According to our investigation on the data set provided by Liu et al. (2011), every named entity in tweets has an average of ... an overview of our method, then detail its model and features. 4.1 Overview Given a set of tweets as input, our method recognizes predefined types of named entities and for each entity outputs...

Scientific report: "Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web"

... Part-Of-Speech Tagging of Korean. Computational Linguistics, 28(1):53–70. Manabu Sassano and Takehito Utsuro. 2000. Named Entity Chunking Techniques in Supervised Learning for Japanese Named Entity ... Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web. Joohui An, Dept. of CSE, POSTECH, Pohang, Korea 790-784, minnie@postech.ac.kr. Seungwoo Lee, Dept. of CSE, POSTECH, Pohang, Korea ... Journal of Korean Information Science Society, 24(8):900–909. GuoDong Zhou and Jian Su. 2002. Named Entity Recognition using an HMM-based Chunk Tagger. In Proceedings of the 40th Annual Meeting of...

Scientific report: "Mining Wiki Resources for Multilingual Named Entity Recognition"

... knowledge for named entity disambiguation. In Proceedings of EACL, 9-16. Cucerzan, S. 2007. Large-scale named entity disambiguation based on Wikipedia data. In Proceedings of EMNLP/CoNLL, ... trained on up to 40,000 words of human-annotated newswire. 1 Introduction Named Entity Recognition (NER) has long been a major task of natural language processing. Most of the research in the field ... Computational Linguistics Mining Wiki Resources for Multilingual Named Entity Recognition Alexander E. Richman Patrick Schone Department of Defense Department of Defense Washington, DC...

Scientific report: "Inducing Gazetteers for Named Entity Recognition by Large-scale Clustering of Dependency Relations"

... English. Proceedings of ACL-08: HLT, pages 407–415, Columbus, Ohio, USA, June 2008. ©2008 Association for Computational Linguistics. Inducing Gazetteers for Named Entity Recognition by Large-scale Clustering of ... clustering of dependency relations between verbs and multi-word nouns (MNs) to construct a gazetteer for named entity recognition (NER). Since dependency relations capture the semantics of MNs well, ... for storing only a part of classes Cl, i.e., 1/|P| of the parameter matrix, where P is the number of cluster nodes. This data splitting enables linear scalability of memory sizes. However,...

Scientific report: "Improving the Scalability of Semi-Markov Conditional Random Fields for Named Entity Recognition"

... label of a named entity is “O”, which indicates a non-named entity. For 98.0% of the named entities in the training data of the shared task in the 2004 JNLPBA, the label of the preceding entity ... End-Word” capture the tendency of the length of a named entity. “Count feature” captures the tendency for named entities to appear repeatedly in the same sentence. “Preceding Entity and Prev Word” are ... N is the length of sentence and K is the size of label set. And that of training in first order semi-CRFs is O(K²LN). The increase of the cost is used to transfer non-adjacent entity information. To...

Scientific report: "Incorporating speech recognition confidence into discriminative named entity recognition of speech data"

... distinguish words of a class from words of other classes. For NER, we used an SVM-based chunk annotator YamCha 0.33 with a quadratic kernel (1 + x · y)² and a soft margin parameter of SVMs C=0.1 ... tokenized using ChaSen. The vocabulary size of the word 3-gram model was 426,023. The test-set perplexity over the text corpus was 76.928. The number of out-of-vocabulary words was 1,551 (0.587%). ... Comparisons of NE surfaces did not include differences in word segmentation because of the segmentation ambiguity in Japanese. Note that NER recall with ASR results could not exceed the rate of the...

Scientific report: "Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora"

... language. Identification of the entity's equivalence class of transliterations is important for obtaining its accurate time sequence. In order to keep to our objective of requiring as little language ... was initialized with a set of 20 pairs of English NEs and their Russian transliterations. Negative examples here and during the rest of the training were pairs of randomly selected non-NE ... random 727 of the total of 978 NEs were matched to correct transliterations by a language expert (partly due to the fact that some of the English NEs were not mentioned in the Russian side of the...

Scientific report: "Syntax-based Semi-Supervised Named Entity Tagging" (Behrang Mohit)

... subject or the object of a sentence has a high probability of being a particular type of named entity. Thus, we expanded our syntactic analysis of the data into dependency parse of the text and ... section 5 covers the results of the evaluation of our system. Figure 1: System's architecture 3 Named Entity Recognition In this level, the system used a group of syntax-based rules to ... trained classifier generalizes well. 1 Introduction Named entity (NE) tagging is the task of recognizing and classifying phrases into one of many semantic classes such as persons, organizations...

Scientific report: "A Bootstrapping Approach to Named Entity Classification Using Successive Learners"

... 1998. Description of the MENE Named Entity System. Proceedings of MUC-7. Collins, M. and Y. Singer. 1999. Unsupervised Models for Named Entity Classification. Proceedings of the 1999 Joint ... performance of the HMM on the PRO tag. Table 4. Performance of PRODUCT NE (TYPE / PRECISION / RECALL / F-MEASURE): PRODUCT / 67.3% / 72.5% / 69.8%. Similar to the case of ORG NEs, the number of concept-based ... Discovery Engine Supported by New Levels of Information Extraction. Proceeding of HLT-NAACL 2003 Workshop on Software Engineering and Architecture of Language Technology Systems, Edmonton,...

Scientific report: "The Multilingual Named Entity Recognition Framework"

... entropy approach for named entity recognition. PhD Thesis, New York University. Collins M. and Singer Y. (1999) Unsupervised models for named entity classification. In Proceedings of EMNLP/WVLC, 1999, ... language technology is not much developed for most of them. This has a big consequence for named entity recognition: for certain languages like most of the European languages, we benefit from already ... is simply, most of the time, not realistic to tag large amount of corpus (Appelt and Israel, 1999). Moreover, tagging great amounts of data can be compared to the elaboration of dictionaries. • Grammar....

Scientific report: "Japanese Named Entity Recognition based on a Simple Rule Generator and Decision Tree Learning"

... Unsupervised models for named entity classification. In Proceedings of EMNLP/VLC. Jim Cowie. 1995. CRL/NMSU description of the CRL/NMSU system used for MUC-6. In Proceedings of the Sixth Message ... memt, January. Manabu Sassano and Takehito Utsuro. 2000. Named entity chunking techniques in supervised learning for Japanese named entity recognition. In Proceedings of the International Conference on Computational ... 705–711. Satoshi Sekine and Yoshio Eriguchi. 2000. Japanese named entity extraction evaluation - analysis of results -. In Proceedings of 18th International Conference on Computational Linguistics,...

Scientific report: "The Multilingual Named Entity Recognition Framework"

... resources and tools for named entity recognition. A team of computational linguist students develops this The members of the INaLCO Named Entity Group are: A. Acoulon, C. Avaux, L. Beroff-Beneat-, A. ... different approaches to named entity recognition. We then examine previous experiments to compare systems and techniques. Sekine and Eriguchi (2000) present an interesting classification of named entity recognition ... entropy approach for named entity recognition. PhD Thesis, New York University. Collins M. and Singer Y. (1999) Unsupervised models for named entity classification. In Proceedings of EMNLP/WVLC, 1999,...

Scientific report: "A Framework for Unifying Named Entity Recognition and Disambiguation Extraction Tools"

... extract the list of Named Entity, their classification and the URIs that disambiguate these entities. The main purpose of this interface is to enable a human user to assess the quality of the extraction ... the comparison of the performance of these services as well as their possible combination. We address this problem by proposing NERD, a framework which unifies 10 popular named entity extractors available ... 09/10/2011 to 12/10/2011. the number nd of evaluated documents, the number nw of words, the total number ne of entities, the total number nc of categories and nu URIs. Moreover, we compute...

Scientific report: "Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation"

... to find whether an N-gram is a named entity, we match it to the named entity list extracted using the OpenCalais API, which contains more than 30 types of named entities, such as Person, ... Totally, the traditional named entity disambiguation methods can be classified into two categories: the shallow methods and the knowledge-based methods. Most of previous named entity disambiguation ... knowledge captured in the structural semantic relatedness measure for named entity disambiguation. Because the key problem of named entity disambiguation is to measure the similarity between name...