Scientific report: "Enhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities"

... 4), and in a more realistic one in which parsing and segmentation are handled jointly by the parser (Goldberg and Tsarfaty, 2008) (Sec. 5). External lexical information enhances unlexicalized ... training data. count(·) is a counting function over the training data, rare stands for any rare event, and w_rare is a specific rare event. KCA(·) is the KC Analyzer function, mapping a lexical ... as proposed in (Tsarfaty and Sima’an, 2008). Apart from Hebrew, our method is applicable in any setting in which there exist a small treebank and a wide-coverage lexical resource. For example...

Scientific report: "Building Deep Dependency Structures with a Wide-Coverage CCG Parser"

... standard, and computed precision and recall figures over the dependencies. Recall that a dependency is defined as a 4-tuple: a head of a functor, a functor category, an argument slot, and a head ... the data. If a word appears at least K times in the data, the supertagger only considers categories that appear in the word’s category set, rather than all lexical categories. The second parsing ... for a c and b d is the number of times that word-category pairs a b and c d are in the same word-category sequence in the training data. C R a b c d is the number of times that a b and c d are...

Scientific report: "Accurate Unlexicalized Parsing"

... between a compact grammar and useful Markov histories. 3 External vs. Internal Annotation. The two major previous annotation strategies, parent annotation and head lexicalization, can be seen as ... motivates various class- or similarity-based approaches to combating sparseness, and this remains a promising avenue of work, but success in this area has proven somewhat elusive, and, at any rate, ... F1 up substantially, to 80.62%. In addition to the adverb case, the Penn tag set conflates various grammatical distinctions that are commonly made in traditional and generative grammar, and from...

Scientific report: "Determining Word Sense Dominance Using a Thesaurus"

... and citations therein); (ii) computational ease: with just around a thousand categories, the word-category matrix has a manageable size; (iii) widespread availability: thesauri are available ... thesaurus. Since human annotation is both expensive and time intensive, we present an alternative approach of artificially generating thesaurus-sense-tagged data following the ideas of Leacock ... to 0 and 1. The lower accuracies for α near 0.5 are understandable as the amount of evidence towards both senses of the target word is nearly equal. Odds, pmi, and Yule perform almost equally well...

Scientific report: "Better Automatic Treebank Conversion Using A Feature-Based Approach"

... similarly as the standard shift-reduce parsing algorithm. In the training phase, each target-style parse tree in the training data is transformed into a binary tree (Charniak et al., 1998) and then ... choose actions for state transition. Moreover, beam search strategies can be used to expand the search space of a shift-reduce-based heterogeneous parser (Sagae and Lavie, 2006a). To incorporate ... (named source parser) on a source treebank, and use it to parse sentences in the training data of a target treebank. Step 2: Build a parser on pairs of golden target-style and auto-assigned (in...

Scientific report: "Part of Speech Tagging Using a Network of Linear Separators"

... data comes from a different source than the training data, and will allow the algorithm to adapt to the new context. For example, a language acquisition system with a tagger trained on a ... and evaluate its performance under various conditions. In the second set SNOW is compared with a naive Bayes algorithm and with Brill's TBL, all trained and tested on the same data. ... tagger, based on Brill's transformation based approach; we show that SNOW-based taggers already achieve results that are comparable to it, and outperform it, when we allow online update....

Scientific report: "PART-OF-SPEECH TAGGING USING A VARIABLE MEMORY MARKOV MODEL"

... assumptions for the static tag probabilities, are encouraging. VARIABLE MEMORY MARKOV MODELS. Markov models are a natural candidate for language modeling and temporal pattern recognition, ... fixed-length histories, variable memory Markov models dynamically adapt their history length based on the training data, and hence may use fewer parameters. In a test of a VMM based tagger on the Brown ... future observations. This approach is easy to implement, the learning algorithm and classification of new tags are computationally efficient, and the results achieved, using simplified assumptions...

Scientific report: The leech product saratin is a potent inhibitor of platelet integrin α2β1 and von Willebrand factor binding to collagen

... inhibitors of ADP and thromboxane A2, both saratin and 6F1, a blocking α2β1 mAb, abrogated platelet adhesion to fibrillar and soluble collagen. Additionally, saratin eliminated α2β1-dependent ... kDa leech antiplatelet protein isolated from Haementeria officinalis) and calin and saratin (approximately 65 kDa and 12 kDa proteins, respectively, both isolated from H. medicinalis), have been ... Willebrand factor-dependent platelet adhesion, decreases platelet aggregation and intimal hyperplasia in a rat carotid endarterectomy model. J Vasc Surg 34, 724–729. 20 Smith TP, Alshafie TA, Cruz...

Scientific report: "HPSG-Style Underspecified Japanese Grammar with Wide Coverage"

... U.K. Abstract. This paper describes a wide-coverage Japanese grammar based on HPSG. The aim of this work is to see the coverage and accuracy attainable using an underspecified grammar. Under- ... when a relative clause modifies a phrase. Head-marker schema: Applied when a marker like a postposition marks a phrase. Head-adjacent schema: Applied when a suffix attaches to a word or a compound ... grammar returns at least one parse tree, and "accuracy" refers to the percentage of bunsetsus which are attached correctly. To realize wide coverage and reasonable accuracy,...

Scientific report: "Enhancing Performance of Lexicalised Grammars"

... thresholds, and results are shown in Table 1. Since a gold standard treebank for our data set was available, it was possible to evaluate the accuracy of the parser. Evaluation of deep parsing ... likely to have lexicalised grammars) as a POS tagger can massively increase the parser coverage on unseen text. While annotating with named entity data or a lexical type supertagger were also found ... supertags we manage to halve parsing time with minimal loss of coverage or precision. 1 Introduction. Heavily lexicalised grammars have been used in applications such as machine translation and...
