automatic retrieval and clustering of similar words

Scientific report: "Automatic Retrieval and Clustering of Similar Words"

Scientific report

... Related Work and Conclusion There have been many approaches to automatic detection of similar words from text corpora. Ours is ... Automatic Retrieval and Clustering of Similar Words Dekang ... ||cell, nmod-of, architecture||=1 ||cell, obj-of, attack||=6 ||cell, obj-of, bludgeon||=1 ||cell, obj-of, call||=11 ||cell, obj-of, come from||=3 ||cell, obj-of, contain||=4 ||cell, obj-of, decorate||=2 ... shows the similarity tree for the top-40 most similar words to duty. The first number behind a word is the similarity of the word to its parent. The second number is the similarity of the word...
Scientific report: "SenseClusters: Unsupervised Clustering and Labeling of Similar Contexts"

Scientific report

... Linguistics SenseClusters: Unsupervised Clustering and Labeling of Similar Contexts Anagha Kulkarni and Ted Pedersen, Department of Computer Science, University of Minnesota, Duluth, MN 55812 {kulka020,tpederse}@d.umn.edu http://senseclusters.sourceforge.net Abstract SenseClusters ... 220–231, Mexico City, February. A. Purandare and T. Pedersen. 2004. Word sense discrimination by clustering contexts in vector and similarity spaces. In Proceedings of the Conference on Computational ... correspond to the different senses of the word. This follows the hypothesis of (Miller and Charles, 1991) that words that occur in similar contexts will have similar meanings. We have shown that...
Document: AUTOMATIC MONITORING AND CONTROL OF COOLING WATER TREATMENT PRODUCTS

Fashion - Beauty

... Controls and ProChemTech to devise and commercialize an on-line spectrophotometer based monitor and controller to control feedtraced inhibitors. After review of the technology in the hand held ... function. 3 Control of cooling water inhibitor dosage is one of the critical issues in achieving good results as to control of scale, corrosion and deposition; minimization of water management ... and control method allows automatic monitoring and control of inhibitor dosage and is currently marketed as their “TRASAR” technology. Unfortunately for the rest of the water management industry,...
Scientific report: "Automatic Detection and Correction of Errors in Dependency Treebanks"

Scientific report

... errors. 5 Automatic Correction of Errors In this section we propose our algorithm for automatic correction of errors, which consists of the following steps: 1. Automatic detection of error candidates, ... different to gold-standard. 2. Substitution of the annotation of the error candidates by the annotation proposed by one of the parsers (in our case MSTParser). 3. Parse of the modified corpus ... corpora like the Penn Treebank [1] have thousands of citations, since most of the algorithms profit from annotated data during the development and testing and thus are widely used in the field. Treebanks...
Scientific report: "Examining the Role of Linguistic Knowledge Sources in the Automatic Identification and Classification of Reviews"

Scientific report

... (e.g., Yu and Hatzivassiloglou (2003)) and classifying the polarity based solely on the subjective portions of the document (e.g., Pang and Lee (2004)). Motivated by the work of Koppel and Schler ... Computational Linguistics Examining the Role of Linguistic Knowledge Sources in the Automatic Identification and Classification of Reviews Vincent Ng and Sajib Dasgupta and S. M. Niaz Arifin, Human Language ... dataset. Briefly, a positive review has a rating of ≥ 3.5 (out of 5) or ≥ 3 (out of 4), whereas a negative review has a rating of ≤ 2 (out of 5) or ≤ 1.5 (out of 4). Finally, previous work has also...
Scientific report: "DISTRIBUTIONAL CLUSTERING OF ENGLISH WORDS"

Scientific report

... simple tabulation of frequencies of certain words participating in certain configurations, for example of frequencies of pairs of a transitive main verb and the head noun of its direct object, ... likelihood of a particular direct object for a verb from the likelihoods of that direct object for similar verbs. This requires a reasonable definition of verb similarity and a similarity ... involving the 1000 most frequent nouns in the corpus for clustering, and randomly divided it into a training set of 756721 pairs and a test set of 81240 pairs. Relative Entropy Figure 3 plots the...
Document - Scientific report: "Collecting a Why-question corpus for development and evaluation of an automatic QA-system"

Scientific report

... time-consuming process and not applicable within efficient development of systems. Automatic evaluation requires a corpus of questions and answers, a definition of what is a correct answer, and a way ... selection of natural questions. The articles varied in topic, degree of formality and the amount of details; from "Horror film" and "Christmas worldwide" to "G-Man (Half-Life)" and "History of London". ... post-process the data and in further projects of similar nature. For example, the ROUGE similarity could be used in the data collection phase as a tool of automatic approval and rejection of workers'...
Document - Scientific report: "Finding Synonyms Using Automatic Word Alignment and Measures of Distributional Similarity"

Scientific report

... distributionally similar words. In IJCAI, pages 1492–1493. Dekang Lin. 1998. Automatic retrieval and clustering of similar words. In COLING-ACL, pages 768–774. Robert Malouf and Gertjan van Noord. ... behind this is that similar words share similar contexts. Systems based on distributional similarity provide ranked lists of semantically related words according to the similarity of their contexts. ... nature of the context applied and the results of the synonym extraction task. 5.1.1 Data and Resources As our data we used the Dutch CLEF QA corpus, which consists of 78 million words of Dutch 2 Note...
Document - Scientific report: "Automatic clustering of collocation for detecting practical sense boundary"

Scientific report

... 2003) Used clustering methods cover both the popularity and the variety of the algorithms – soft and hard clustering and graph clustering etc. In all clustering methods, used similarity measure ... show the similar pattern of context. If collocation patterns between contextual words are similar, it means that the contextual words are used in a similar context - where used and interrelated ... similar context – the contextual words having similar pattern of surrounding words - into same cluster. Extracted clusters throughout the clustering symbolize the senses for the central words...
Scientific report: "Creative Language Retrieval: A Robust Hybrid of Information Retrieval and Linguistic Creativity"

Scientific report

... Systems. Kluwer Academic Publishers, Dordrecht: The Netherlands. Lin, D. (1998). Automatic retrieval and clustering of similar words. In Proc. of the 17th international conference on Computational ... We can evaluate the effectiveness of ? and @, and indirectly that of ^ too, by comparing the use of ? and @ as category builders to a hand-crafted gold standard like WordNet. Other researchers ... of the form "the P * of X" (where P ∈ @X), then @X augmented with this additional set of attributes (like hands for surgeon) Veale, T. and Butnariu, C. (2010). Harvesting and Understanding...
Scientific report: "Automatic Determination of Parts of Speech of English Words"

Scientific report

... nine hundred words to assign parts of speech to special or exceptional words. Other words are split into affix and kernel parts and assigned a part of speech on the basis of the part-of-speech ... implications of the affixes and the length of the remaining kernel. An accuracy of 95 per cent is achieved from the point of view of inclusive part of speech, where inclusive part of speech is ... part-of-speech string for words which match. For all other words, the word is separated into kernel and affix parts, and the part-of-speech implication of the affixes is looked up and applied...
A study on polysemy of antonymous words in English: Some related problems facing learners of English and suggested solutions

Social sciences

... football (Hanh, 2006:90) Lexemes like on and off, good and bad, love and hate are pairs of antonyms. They indicate the words of the same part of speech, which have contrasting or opposite ... Antonyms of dull 28 3.2. Antonyms of dry 30 3.3. Antonyms of hard 31 3.4. Antonyms of heavy 33 3.5. Antonyms of severe 35 3.6. Antonyms of short 36 3.7. Antonyms of strong 37 4. Antonyms of ... them antonymous verbs. Similarly, these pairs of antonyms are antonymous verbs (bring and take, live and die, open and close, weep and laugh are antonyms on the basis of relation ) To bring...
Scientific report: "Exploiting Aggregate Properties of Bilingual Dictionaries For Distinguishing Senses of English Words and Inducing English Sense Clusters"

Scientific report

... Properties of Bilingual Dictionaries For Distinguishing Senses of English Words and Inducing English Sense Clusters Charles SCHAFER and David YAROWSKY, Department of Computer Science and Center ... consisted of a list of pairs of the form (foreign word, English word). Because bilingual dictionary structure varies widely, and even the availability and compatibility of part-of-speech tags ... senses of blond and just Figure 1: Detecting asynonymy via unbalanced synonymy relationships among 3 words. The derived synonymy relation S holds between fair and blond, and between fair and just....
Scientific report: "Blog Categorization Exploiting Domain Dictionary and Dynamically Estimated Domains of Unknown Words"

Scientific report

... and the dynamic domain estimation of unknown words. In the Blog categorization, the method achieved the accuracy of 94%, and the domain estimation of unknown words achieved the accuracy of ... §4.4 Domain Estimation of Unknown Words The domain (and IDF) of unknown word is dynamically estimated exploiting the Web. More specifically, we use Wikipedia and Snippets of Web search, in addition ... build classifier. Ko and Seo (2004) automatically collect training data using a large amount of unlabeled data and a small amount of seed information. However, the novelty of this study is the...
