A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing

Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging"

Uploaded: 08/03/2014, 01:20
... several subsequences and label each of them a POS tag. It is a better idea to perform segmentation and POS tagging jointly in a uniform framework. According to Ng and Low (2004), the segmentation task ... Philadelphia, PA 19104, USA jiangwenbin@ict.ac.cn lhuang3@cis.upenn.edu Abstract We propose a cascaded linear model for joint Chinese word segmentation and part-of-speech tagging. With a character-based perceptron ... multi-character word respectively. In order to perform POS tagging at the same time, we expand boundary tags to include POS information by attaching a POS to the tail of a boundary tag as a postfix...
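
The tagging scheme mentioned at the end of this excerpt can be illustrated with a small sketch. It assumes the four boundary tags `b`, `m`, `e` and `s` (begin, middle, end of a multi-character word, and single-character word), which the excerpt does not spell out, and an underscore as the separator between boundary tag and POS. The function names and the example sentence are hypothetical.

```python
from typing import List, Tuple

def words_to_char_tags(words: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Convert (word, POS) pairs into per-character joint tags.

    Assumed boundary tags: b/m/e for multi-character words, s for
    single-character words. The POS is attached as a postfix.
    """
    tagged = []
    for word, pos in words:
        if len(word) == 1:
            boundaries = ["s"]
        else:
            boundaries = ["b"] + ["m"] * (len(word) - 2) + ["e"]
        for ch, boundary in zip(word, boundaries):
            tagged.append((ch, f"{boundary}_{pos}"))
    return tagged

def char_tags_to_words(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Recover (word, POS) pairs from per-character joint tags."""
    words = []
    current = ""
    for ch, tag in tagged:
        boundary, pos = tag.split("_", 1)
        current += ch
        if boundary in ("e", "s"):  # a word ends at this character
            words.append((current, pos))
            current = ""
    return words

sentence = [("中国", "NR"), ("人", "NN")]
char_tags = words_to_char_tags(sentence)
```

Under these assumptions, segmentation and POS tagging reduce to a single character-labelling problem, and the word-level analysis is recovered by reading the tags back.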

Scientific report: "Incremental Joint Approach to Word Segmentation, POS Tagging, and Dependency Parsing in Chinese"

Uploaded: 07/03/2014, 18:20
... Kruengkrai, Kiyotaka Uchimoto, Jun’ichi Kazama, Yiou Wang, Kentaro Torisawa, and Hitoshi Isahara. 2009. An error-driven word-character hybrid model for joint Chinese word segmentation and POS tagging. ... between segmentation and POS tagging. 3 Model 3.1 Incremental Joint Segmentation, POS Tagging, and Dependency Parsing Based on the joint POS tagging and dependency parsing model by Hatori et al. ... model is fundamentally a combination of the features used in the state-of-the-art joint segmentation and POS tagging model (Zhang and Clark, 2010) and dependency parser (Huang and Sagae, 2010),...

Scientific report: "Exploring Deterministic Constraints: From a Constrained English POS Tagger to an Efficient ILP Solution to Chinese Word Segmentation"

Uploaded: 07/03/2014, 18:20
... times faster than searching in a raw space pruned with beam-width 5. Tagging accuracy is moderately improved as well. For Chinese word segmentation (CWS), which can be formulated as character tagging, ... popular as used in (Zhang and Clark, 2007) and (Jiang et al., 2008a). We propose an Integer Linear Programming (ILP) formulation of word segmentation, which is naturally viewed as a word-based ... are made available during Viterbi decoding. 3 Chinese Word Segmentation (CWS) 3.1 Word segmentation as character tagging Considering the ambiguity problem that a Chinese character may appear in any...

Scientific report: "SenseRelate::TargetWord – A Generalized Framework for Word Sense Disambiguation"

Uploaded: 08/03/2014, 04:22
... the generalized framework for Word Sense Disambiguation, the architecture and usage of SenseRelate::TargetWord, and a description of the user interfaces (command line and GUI). 2 The Framework The ... Interactive Poster and Demonstration Sessions, pages 73–76, Ann Arbor, June 2005. © 2005 Association for Computational Linguistics SenseRelate::TargetWord – A Generalized Framework for Word Sense Disambiguation Siddharth ... lexical sample format, which is an XML-based format that has been used for both the SENSEVAL-2 and SENSEVAL-3 exercises. A file in this format includes a number of instances, each one made up...

Some studies on a probabilistic framework for finding object-oriented information in unstructured data

Uploaded: 23/11/2012, 15:04
... S. Jayram, Rajasekar Krishnamurthy, Sriram Raghavan, Shivakumar Vaithyanathan, and Huaiyu Zhu. Avatar information extraction system. IEEE Data Eng. Bull. [23] Sándor Dominich. The Modern Algebra ... learning framework, which overcomes the scalability and adaptability challenges of the previous approaches. We have then adapted the probabilistic framework to a Vietnamese domain - real ... based. We also adapt the probabilistic framework to the Vietnamese real estate domain and obtain a satisfactory result. 1.4 Chapter summary This chapter brought an overview of web-page problem and...

A general framework for studying class consciousness and class formation

Uploaded: 01/11/2013, 07:20
... and class formation, but rather as a framework for defining an agenda of problems for empirical research within class analysis. In the multivariate empirical studies of class consciousness and ... "hegemonic," "reformist," "oppositional" and "revolutionary" working-class consciousness in terms of particular combinations of perceptions, theories and preferences. ... that it is never satisfactory to restrict the analysis to the "union" as a collective entity making choices and engaging in practices directed at "capitalists" or "management."...

Document: Carrots, Sticks, and Promises: A Conceptual Framework for the Management of Public Health and Social Issue Behaviors

Uploaded: 18/02/2014, 02:20
... disadvantage noncompliance. Law is also similar to what Wiener and Doescher (1991) term a structural solution, that is, a political act that mandates individual behavior. For Taylor and ... have changed dramatically in the past years, and as a result, policy with respect to managing tobacco usage behavior also has changed. The relationship of behavior management and externalities ... 1 Applications of Education, Marketing, and Law Social Dilemmas and Social Traps Social dilemmas (Dawes 1980; Wiener and Doescher 1991) are characterized as situations in which each individual...

Document: Scientific report: "A Pipeline Framework for Dependency Parsing"

Uploaded: 20/02/2014, 12:20
... accuracy (RA) and leaf accuracy (LA), as in (Yamada and Matsumoto, 2003). When evaluating the result, we exclude the punctuation marks, as done in (McDonald et al., 2005) and (Yamada and Matsumoto, 2003). 4.3 ... non-root words that are assigned the correct head. Complete accuracy (CA) indicates the fraction of sentences that have a complete correct analysis. We also measure root accuracy (RA) and leaf ... 4 words after w2 (as in (Yamada and Matsumoto, 2003)). The key additional feature we use, relative to (Yamada and Matsumoto, 2003), is that we include the previous predicted action as a feature....
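
The two metrics that this excerpt defines in full can be sketched as follows: the fraction of non-root words assigned the correct head, and complete accuracy (CA), with punctuation excluded. The representation is an assumption made for illustration: each sentence is a list of head positions, `0` marks the root, and punctuation is given as a set of 0-based token indices. The function name is hypothetical, and RA and LA are omitted because their definitions are elided in the excerpt.

```python
from typing import List, Set, Tuple

ROOT = 0  # assumed convention: head position 0 marks the root word

def dependency_scores(
    gold_heads: List[List[int]],
    pred_heads: List[List[int]],
    punct_positions: List[Set[int]],
) -> Tuple[float, float]:
    """Return (head accuracy over non-root words, complete accuracy)."""
    correct = total = complete = 0
    for gold, pred, punct in zip(gold_heads, pred_heads, punct_positions):
        # Punctuation tokens are excluded from scoring.
        scored = [i for i in range(len(gold)) if i not in punct]
        if all(gold[i] == pred[i] for i in scored):
            complete += 1
        for i in scored:
            if gold[i] != ROOT:  # only non-root words count here
                total += 1
                correct += int(gold[i] == pred[i])
    head_accuracy = correct / total if total else 0.0
    complete_accuracy = complete / len(gold_heads) if gold_heads else 0.0
    return head_accuracy, complete_accuracy

gold = [[2, 0, 2], [0, 1]]
pred = [[2, 0, 1], [0, 1]]
no_punct = [set(), set()]
head_acc, complete_acc = dependency_scores(gold, pred, no_punct)
```

In this toy input, two of the three non-root words receive the correct head and one of the two sentences is fully correct.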

Document: Scientific report: "Rethinking Chinese Word Segmentation: Tokenization, Character Classification, or Wordbreak Identification"

Uploaded: 20/02/2014, 12:20
... A. Wu and Mu Li and C. N. Huang and H. Li and X. Xia and H. Qin. 2004. Adaptive Chinese Word Segmentation. In Proceedings of ACL-2004. Meng, H. and C. W. Ip. 1999. An Analytical Study of Transformational ... not pre-suppose any lexical information and it treats character strings as context which provides information on the possible classification of character-breaks as word-breaks. We are confident that ... change our notation to allow for more precise explanation. As noted before, Chinese text can be formalized as a sequence of characters and intervals as illustrated in we call this representation...

Document: Scientific report: "A Unified Framework for Automatic Evaluation using N-gram Co-Occurrence Statistics"

Uploaded: 20/02/2014, 16:20
... various automatic evaluation metrics are able to closely approximate human evaluations for various applications. Given an application app and an evaluation guideline package eval, the faithfulness/compactness ... separately evaluated. Each version was evaluated by a human evaluator, with no reference answer available. For this evaluation 115 test questions were used, and the human evaluator was asked ... same family of metrics explain best the variations obtained with human evaluations, according to the application being evaluated (Machine Translation, Automatic Summarization, and Automatic...

Document: Scientific report: "Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data"

Uploaded: 20/02/2014, 18:20
... status ('bound' or 'separated') would be likely to be consistent with that of the local maximum. So does the second local minimum. Finally, for locations marked '?' ... Given a Chinese character string 'xy', the mutual information between characters x and y (or equally, the mutual information of the location between x and y) is defined as: mi(x:y) = ... that every location between x and y in the sentence be treated as 'combined' or 'separated' accordingly if its mi value is greater than or below a threshold (suppose the threshold...
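
The thresholding step mentioned at the end of this excerpt can be sketched as below. The sketch takes the mi value of each location between adjacent characters as given, because the formula and the threshold value are elided in the excerpt. Treating a value exactly equal to the threshold as 'separated' is an assumption, the function name is hypothetical, and the local-maximum and local-minimum refinements the excerpt alludes to are not modelled.

```python
from typing import List

def segment_by_threshold(sentence: str, mi_values: List[float], threshold: float) -> List[str]:
    """Split `sentence` at every location whose mi value is not above `threshold`.

    mi_values[i] is the (precomputed) mutual information of the location
    between sentence[i] and sentence[i + 1].
    """
    if len(mi_values) != max(len(sentence) - 1, 0):
        raise ValueError("need exactly one mi value per location between characters")
    words: List[str] = []
    current = sentence[:1]
    for ch, mi in zip(sentence[1:], mi_values):
        if mi > threshold:
            current += ch          # 'combined': extend the current word
        else:
            words.append(current)  # 'separated': close the current word
            current = ch
    if current:
        words.append(current)
    return words

example = segment_by_threshold("abcd", [2.0, 0.5, 3.0], threshold=1.0)
```

Keeping the mi values as an input makes the sketch independent of how they are estimated from a corpus.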