bagging and distributional similarity

Scientific report: "Reducing semantic drift with bagging and distributional similarity" pdf

... ACL and the 4th IJCNLP of the AFNLP, pages 396–404, Suntec, Singapore, 2-7 August 2009. © 2009 ACL and AFNLP. Reducing semantic drift with bagging and distributional similarity. Tara McIntosh and ... and WMEB using just the hand-picked seeds (Shand) and 50 sample supervised bagging (SgoldBAG). Bagging with samples from Sgold successfully increased the performance of both BASILISK and WMEB ... Lhand to sample from and then another round with the 50 sets of randomly unsupervised seeds, Srand. The next decision is how to sample Srand from Lhand. One approach is to use uniform random sampling...

Scientific report document: "Finding Synonyms Using Automatic Word Alignment and Measures of Distributional Similarity" pdf

... correct alignments and this causes many mistakes in the distributional similarity algorithm. We have given some examples in rows 4 and 5 of table 5. We have used the distributional similarity score only ... Alignment and Measures of Distributional Similarity. Lonneke van der Plas & Jörg Tiedemann, Alfa-Informatica, University of Groningen, P.O. Box 716, 9700 AS Groningen, The Netherlands, {vdplas,tiedeman}@let.rug.nl. Abstract: There ... using measures of distributional similarity, but these typically are not able to distinguish between synonyms and other types of semantically related words such as antonyms, (co)hyponyms and hypernyms. We...

Scientific report: "Word classification based on combined measures of distributional and semantic similarity" docx

... combined and the distributional weighting schemas. The combined weighting schema thus showed relative improvement on the distributional one: 1.5% (BNC) and 2.3% (AP) in terms of precision and 9.2% ... each weighted by the distributional similarity of the neighbor to the test word. Figure 3 compares the precision and learning accuracy of the combined weighting schema to the distributional weighting. ... semantic similarity to other classes. Besides distributional data, our method integrates this semantic information: the classification decision is a function of both (1) the distributional similarity of...

Scientific report: "Finding Word Substitutions Using a Distributional Similarity Baseline and Immediate Context Overlap" potx

... (see Geffet and Dagan, 2004 and Geffet and Dagan, 2005, who improve the output of a distributional similarity system for an entailment task using a web-based feature inclusion check, and comment ... ‘murder’ and ‘abduct’ kill murder abduct two birds with babies that life her and make cancer cells and his wife and an innocent man a mocking bird thousands of innocent unsuspecting people and or ... Geffet and Ido Dagan. 2004. Feature Vector Quality and Distributional Similarity. Proceedings of the 20th International Conference on Computational Linguistics, 2004. Maayan Geffet and Ido...

Scientific report document: "Measures of Distributional Similarity" ppt

... Speech and Language, 9:123-152. Ido Dagan, Lillian Lee, and Fernando Pereira. 1999. Similarity-based models of cooccurrence probabilities. Machine Learning, 34(1-3):43-69. Ute Essen and ... Thomas M. Cover and Joy A. Thomas. 1991. Elements of Information Theory. John Wiley. Ido Dagan, Shaul Marcus, and Shaul Markovitch. 1995. Contextual word similarity and estimation from ... nando Pereira, and Stuart Shieber for helpful discussions, the anonymous reviewers for their insightful comments, Fernando Pereira for access to computational resources at AT&T, and...

Scientific report document: "Distributional Similarity Models: Clustering Neighbors" doc

... 1995. Contextual word similarity and estimation from sparse data. Computer Speech and Language, 9:123-152. Ido Dagan, Lillian Lee, and Fernando Pereira. 1999. Similarity-based models ... parison of distributional clustering and nearest-neighbors averaging on several large datasets, exploring the tradeoff in similarity-based modeling between memory usage on the one hand and estimation ... Douglas Baker and Andrew Kachites McCallum. 1998. Distributional clustering of words for text classification. In 21st Annual International ACM SIGIR Conference on Research and Development...

Scientific report: "Syntactic Features and Word Similarity for Supervised Metonymy Resolution" pot

... [Figure 1: Context reduction and similarity levels. Labels: Pakistan-subj-of-win, Scotland-subj-of-lose, context reduction, semantic class, head similarity, role similarity] ... draw this inference, two levels of similarity need to be taken into account. One concerns the similarity of the words ... both the similarity of the heads in the grammatical relation (e.g., “win” and “lose”) and that of the grammatical role (e.g. subject). Figure 1 illustrates context reduction and similarity...

Scientific report: "Using lexical and relational similarity to classify semantic relations" pptx

... distinct types of word pair similarity: lexical similarity and relational similarity. We present an efficient and flexible technique for implementing relational similarity and show the effectiveness ... relational similarity but not both in combination. Previously proposed lexical models include the WordNet-based methods of Kim and Baldwin (2005) and Girju et al. (2005), and the distributional ... co-occurrence probability vectors for w1 and w2. Taking kjsd as a measure of word similarity and introducing parameters α and β to scale the contributions of w1 and w2 respectively, we retrieve...

Scientific report: "Syntax is from Mars while Semantics from Venus! Insights from Spectral Analysis of Distributional Similarity Networks" ppt

... and intriguing question, whereby we construct the syntactic and semantic distributional similarity network (DSN) and analyze their spectrum to understand their global topology. We observe that there ... commonalities and differences between the syntactic and semantic distributional patterns of the words of a language? This study is an initial attempt to answer this fundamental and intriguing ... popular, visualization of distributional similarity is through graphs or networks, where each word is represented as nodes and weighted edges indicate the extent of distributional similarity...

Scientific report: "Exploring Distributional Similarity Based Models for Query Spelling Correction" docx

... and describe two methods that can make use of distributional similarity information in Section 3. Experiments and results are presented in Section 4. The last section contains summaries and ... of valid words in certain contexts (Golding and Roth, 1996; Mangu and Brill, 1997). Distributional similarity between words has been investigated and successfully applied in many natural language ... 1998) and language model smoothing (Essen and Steinbiss, 1992; Dagan et al., 1997). An investigation on distributional similarity functions can be found in (Lillian Lee, 1999). 3 Distributional...

Scientific report document: "Word Vectors and Two Kinds of Similarity" pptx

... similarity into two categories: taxonomic similarity and associative similarity. Taxonomic similarity, or categorical similarity, is a kind of semantic similarity between words in the same level ... LSA-based, cooccurrence-based and dictionary-based methods, were compared in terms of the ability to represent two kinds of similarity, i.e., taxonomic similarity and associative similarity. The result ... addresses three methods, LSA-based, cooccurrence-based, and dictionary-based methods, and two kinds of similarity, taxonomic similarity and associative similarity. Word vectors constructed...

Scientific report document: "The Distributional Inclusion Hypotheses and Lexical Entailment" pdf

... 1994; Lee, 1997; Lin, 1998; Pantel and Lin, 2002; Weeds and Weir, 2003). As it turns out, distributional similarity captures a somewhat loose notion of semantic similarity (see Table 1). It does ... Southampton, U.K. Geffet, Maayan and Ido Dagan, 2004. Feature Vector Quality and Distributional Similarity. In Proc. of Coling-04. Geneva, Switzerland. Grefenstette, Gregory. 1994. ... using the filter, with 20 and 40 feature sampling, compared to RFF top-40 and RFF top-26 similarities. ITA-20 and ITA-40 denote the web-sampling method with 20 and random 40 features, respectively....

Scientific report document: "Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model" pptx

... quality photocopies, and faxes are still difficult to process and cause many errors. The accuracy of handwritten OCR is still about 90% (Hildebrandt and Liu, 1993), and it worsens dramatically ... 91% for magazines and introductory textbooks of science and technology. (Ito and Maruyama, 1992) used part of speech bigram model and beam search in order to get multiple candidates in their ... al., 1991; Golding and Schabes, 1996). Similar techniques are used for correcting the output of English OCRs (Tong and Evans, 1996) and English speech recognizers (Ringger and Allen, 1996)....

Scientific report: "Learning to Grade Short Answer Questions using Semantic Similarity Measures and Dependency Graph Alignments" ppt

... grammaticality, and coherence of the essay (Higgins et al., 2004), and the assessment of short student answers (Leacock and Chodorow, 2003; Pulman and Sukkarieh, 2005; Mohler and Mihalcea, 2009), ... & St. Onge (1998) [HSO], and two corpus-based measures: Latent Semantic Analysis [LSA] (Landauer and Dumais, 1997) and Explicit Semantic Analysis [ESA] (Gabrilovich and Markovitch, 2007). Briefly, ... MA. G. Hirst and D. St-Onge, 1998. Lexical chains as representations of contexts for the detection and correction of malapropisms. The MIT Press. J. Jiang and D. Conrath. 1997. Semantic similarity...

Scientific report: "CONTEXTUAL WORD SIMILARITY AND ESTIMATION FROM SPARSE DATA" ppt

... SIGIR. Fernando Pereira, Naftali Tishby, and Lillian Lee. 1993. Distributional clustering of English words. In Proc. of the Annual Meeting of the ACL. Philip Resnik. 1992. Wordnet and distributional ... using frequency information (Good, 1953; Katz, 1987; Jelinek and Mercer, 1985; Church and Gale, 1991). Church and Gale (Church and Gale, 1991) show, that for unobserved bigrams, the estimates ... 150 pairs, were constructed randomly and were restricted to words with individual frequencies between 500 and 2500. We term these two sets as the occurring and non-occurring sets. The...