clustering of similar words

Scientific report: "SenseClusters: Unsupervised Clustering and Labeling of Similar Contexts"

Uploaded: 08/03/2014, 04:22
... correspond to the different senses of the word. This follows the hypothesis of (Miller and Charles, 1991) that words that occur in similar contexts will have similar meanings. We have shown that ... the percentage of the majority class (MAJ.) and count (N) of the total number of contexts for the names or newsgroups. The majority percentage provides a simple baseline for level of performance, ... optimal number of clusters, to avoid setting this value manually. In general all of our results significantly improve upon the majority classifier, which suggests that the clustering of contexts...
Perfect Bound Press Word Fugitives In Pursuit Of Wanted Words

Uploaded: 05/10/2012, 09:56
... of the spoken language into print, they haven’t made it into our dictionaries. Thus family words make up a half-hidden level of language. The conceptual matter of family words, like that of ... dictionary, Burgess Unabridged: A Dictionary of Words You Have Always Needed. Among the words in it is blurb, another of Burgess’s claims to fame, for this creation of his remains in use, still with ... see the bibliography for an expanded list of sources, but I will touch on some highlights. An Exaltation of Larks, the collection of venerable terms of venery, originally appeared in 1968 and...
Words That Appear to Be Misspellings of Everyday Words II

Uploaded: 25/10/2013, 16:20
... chain train of grocery stores.” (San Diego Business Journal) To be well informed, one must read quickly a great number of merely instructive ... wretch. From erroneous reading of Middle English nithing, from Old English nithing. This form of the word originated in the 1596 text of historian William of Malmesbury. ... gift, quality, trait, or power. 2. To put on (an item of clothing). The lights of stars that were extinguished ages ago still reach...
Perfect Bound Press Word Fugitives In Pursuit Of Wanted Words - TRIBULATIONS

Uploaded: 25/10/2013, 19:20
... outlet. When I ... A Little Crop of Horrors: This lexicon of tribulations consists of four dictionary words (mostly archaic, rare, or dialectal), and twelve words of the kind this book is mainly ... by Russ Harvey, of Cody’s Books, in Berkeley, Calif. Nantucket designates a pocket in The Deeper Meaning of Liff, but of course, in reality it is the name of an island off Massachusetts. ... Lennon, of Ithaca, N.Y.), fluster cluster (Charles Memminger, of Honolulu), awry spell (Connie West, of Cincinnati), and bad err day (Gina Loebell, of East Windsor, N.J.). Ilan Kinsley, of Sioux...
Perfect Bound Press Word Fugitives In Pursuit Of Wanted Words - Them

Uploaded: 25/10/2013, 19:20
... spoke of the peasants as leading “a way of life completely different from ours, from that of civilized people.” And Dany Levy, the founder and editor of DailyCandy.com, who compiles lexicons of ... the blatant attack on those of us who send follow-up e-mails,” wrote Andrew Goldberg, of New York City. Cheryl Scott Ryan, of Austin, Texas, wrote, “Our recent office move has not been kind ... (each proposed by a number of people), ribaldefiler (Romy Benton, of Portland, Ore.), opporntunist (Steve Groulx, of Cornwall, Ontario), verse-vicer (Nancy Schimmel, of Berkeley, Calif.), and...
Document, scientific report: "Identifying the Semantic Orientation of Foreign Words"

Uploaded: 20/02/2014, 05:20
... orientation of foreign words. Identifying the semantic orientation of words has numerous applications in the areas of text classification, analysis of product review, analysis of responses ... 1966) as a source of seed labeled words. The lexicon contains 4206 words, 1915 of which are positive and 2291 are negative. For Arabic and Hindi we constructed a labeled set of 300 words for each ... labeled set of positive and negative words and has shown very promising results. 1 Introduction A great body of research work has focused on identifying the semantic orientation of words. Word...
Document, scientific report: "Topological Ordering of Function Words in Hierarchical Phrase-based Translation"

Uploaded: 20/02/2014, 07:20
... Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP, pages 324–332, Suntec, Singapore, 2-7 August 2009. © 2009...
Document, scientific report: "Inducing Gazetteers for Named Entity Recognition by Large-scale Clustering of Dependency Relations"

Uploaded: 20/02/2014, 09:20
... semantics of MNs well, the MN clusters constructed by using dependency relations should serve as a good gazetteer. However, the high level of computational cost has prevented the use of clustering for ... for storing only a part of classes C_l, i.e., 1/|P| of the parameter matrix, where P is the number of cluster nodes. This data splitting enables linear scalability of memory sizes. However, ... and, in terms of execution speed, may ... 4. Acknowledgements: This corpus was provided by Dr. Daisuke Kawahara of NICT. 5. To be precise, we need two copies of these. 6. Each node has a copy of the training...
Document, scientific report: "Automatic clustering of collocation for detecting practical sense boundary"

Uploaded: 20/02/2014, 16:20
... similar context – the contextual words having similar pattern of surrounding words - into same cluster. Extracted clusters throughout the clustering symbolize the senses for the central words ... popularity and the variety of the algorithms – soft and hard clustering and graph clustering etc. In all clustering methods, used similarity measure is the cosine similarity between two sense ... the central word show the similar pattern of context. If collocation patterns between contextual words are similar, it means that the contextual words are used in a similar context - where...
Scientific report: "Improving the Use of Pseudo-Words for Evaluating Selectional Preferences"

Uploaded: 07/03/2014, 22:20
... noun-noun similarity score, Seen(v_d) is the set of seen head words filling the slot v_d during training, and C(v_d, n) is the number of times the noun n was seen filling the slot v_d. The similarity ... 2002 of the NYT portion of the Gigaword Corpus, containing approximately 225 million tokens. • Train x10: The entire NYT portion of Gigaword (approximately 1.2 billion tokens). It is an order of ... less of the training examples overall. In order to analyze why pairs are unseen, we analyzed the distribution of rare words across unseen and seen examples. To define rare nouns, we order head words...
Scientific report: "An Unsupervised Approach to Prepositional Phrase Attachment using Contextually Similar Words"

Uploaded: 08/03/2014, 05:20
... Similar Words The contextually similar words of a word w are words similar to the intended meaning of w in its context. Below, we describe an algorithm for constructing contextually similar words ... consisting of 11839 nouns, 3639 verbs and 5658 adjectives/adverbs. Given a word w, the thesaurus returns a set of similar words of w along with their similarity to w. For example, the 20 most similar words ... the contextually similar words of w. We retrieve from the collocation database the words that occurred in the same dependency relationship as w. We refer to this set of words as the cohort of w for the...
Scientific report: "A COMPUTATIONAL THEORY OF THE FUNCTION OF CLUE WORDS IN ARGUMENT UNDERSTANDING"

Uploaded: 08/03/2014, 18:20
... terms of algorithms, with measurable complexity, to allow convenient study of the effect of clue words on processing. Two important observations are made: (1) clue words cut processing of the ... categories. A COMPUTATIONAL THEORY OF THE FUNCTION OF CLUE WORDS IN ARGUMENT UNDERSTANDING Robin Cohen, Department of Computer Science, University of Toronto, Toronto, Canada M5S ... use of clue words in argument dialogues. These are special words and phrases directly indicating the structure of the argument to the hearer. Two main conclusions are drawn: 1) clue words...
Scientific report: "Analysis of Unknown Words through Morphological Decomposition"

Uploaded: 09/03/2014, 01:20
... analysis of words into morphemes based on user-defined rules. The basic system does not offer analysis of words containing unknown morphemes, nor does it provide a rank ordering of the output ... These points can multiply together and often produce a large number of possible analyses. Out of the test set of 200 words, based on a lexicon consisting of around 3500 morphemes (including ... analysing words. Another problem is that unknown words are often place-names, proper names, loanwords etc. The technique described here would probably not deal adequately with such words. ...
