Word lattice reranking for Chinese word segmentation and part-of-speech tagging

Scientific report: "An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging" docx

Uploaded: 17/03/2014, 01:20
... model for joint chinese word segmentation and part-of-speech tagging. In Proceedings of ACL. Wenbin Jiang, Haitao Mi, and Qun Liu. 2008b. Word lattice reranking for chinese word segmentation and part-of-speech ... ACL and AFNLP An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging Canasai Kruengkrai †‡ and Kiyotaka Uchimoto ‡ and Jun’ichi Kazama ‡ Yiou Wang ‡ and ... discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words....

Scientific report: "Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation" doc

Uploaded: 17/03/2014, 00:20
... model for integrated morphological and syntactic parsing. First and foremost, we currently know of no other same effort in parsing the structures of Chinese words, and we have to annotate word ... many efforts to integrate Chinese word segmentation, part-of-speech tagging and parsing (Wu and Zixin, 1998; Zhou and Su, 2003; Luo, 2003; Fung et al., 2004). However, in these research all words ... 2003. Chinese word segmentation as character tagging. Computational Linguistics and Chinese Language Processing, 8(1):29–48. Yue Zhang and Stephen Clark. 2007. Chinese segmentation with a word-based...

Scientific report: "Automatic Adaptation of Annotation Standards: Chinese Word Segmentation and POS Tagging – A Case Study" potx

Uploaded: 17/03/2014, 01:20
... in the context of Chinese word segmentation and part-of-speech tagging, where no segmentation and POS tagging standards are widely accepted due to the lack of morphology in Chinese. Experiments ... that when word segmentation and POS tagging are conducted jointly, the performance for segmentation improves since the POS tags provide additional information to word segmentation (Ng and Low, ... parsing (and translation). Experiments adapting from PD to CTB are conducted for two tasks: word segmentation alone, and joint segmentation and POS tagging (Joint S&T). The performance...

Scientific report: "Discriminative Pruning of Language Models for Chinese Word Segmentation" ppt

Uploaded: 17/03/2014, 04:20
... the same Chinese word segmentation F-measure, the number of bigrams in the model can be reduced by up to 90%. Correlation between language model perplexity and word segmentation performance ... model for Chinese word segmentation. It differentiates from the previous pruning approaches in two respects. First, the pruning criterion is based on performance variation of word segmentation. ... Gao, Mu Li, Andi Wu, and Chang-Ning Huang. 2005. Chinese Word Segmentation and Named Entity Recognition: A Pragmatic Approach. Computational Linguistics, 31(4): 531-574. Jianfeng Gao and Min...

Scientific report document: "Joint Word Segmentation and POS Tagging using a Single Perceptron" docx

Uploaded: 20/02/2014, 09:20
... UK {yue.zhang,stephen.clark}@comlab.ox.ac.uk Abstract For Chinese POS tagging, word segmentation is a preliminary step. To avoid error propagation and improve segmentation by utilizing POS information, segmentation and tagging can be performed ... segmentation and POS tagging using an HMM-based approach. Word information is used to process known-words, and character information is used for unknown words in a similar way to Ng and Low (2004). ... (2002) and are specific to Chinese, are shown in Table 2. The word segmentation features are extracted from word bigrams, capturing word, word length and character information in the context. The word length...

Scientific report: "Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling" pdf

Uploaded: 17/03/2014, 04:20
... training set consisting of 5448 words, and considered alternative unlabeled training sets, (5210 words), (10,208 words), and (25,145 words), consisting of the same, 2 times and 5 times as many sentences ... programming for computing the gradient, and thereby allows us to perform efficient iterative ascent for training. We apply our new training technique to the problem of sequence labeling and segmentation, ... observation sequence , define the matrix random variable by where Here is the edge with labels and is the vertex with label . For each index define the forward vectors with base case and recurrence Similarly,...

Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" pdf

Uploaded: 08/03/2014, 01:20
... and joint segmentation and part-of-speech tagging. On the Penn Chinese Treebank 5.0, we obtain an error reduction of 18.5% on segmentation and 12% on joint segmentation and part-of-speech tagging ... that segmentation and POS tagging task is to divide a character sequence into several subsequences and label each of them a POS tag. It is a better idea to perform segmentation and POS tagging ... p (position i −l), and select for position i a N-best list of candidate results from all these candidates. When we derive a candidate result from a word-POS pair p and a candidate q at prior...

Scientific report: "A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" potx

Uploaded: 17/03/2014, 00:20
... intermediate sub-word structure for joint segmentation and tagging. Since the sub-words are large enough in practice, the decoding for POS tagging over sub-words is efficient. Finally, the Chinese language ... 1385–1394, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging Weiwei ... MA, October. Association for Computational Linguistics. Ruiqiang Zhang, Genichiro Kikui, and Eiichiro Sumita. 2006. Subword-based tagging by conditional random fields for Chinese word segmentation. In...

Scientific report document: "Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data" pdf

Uploaded: 20/02/2014, 18:20
... Chinese word segmentation is therefore the first step for any Chinese information processing system [1]. Almost all methods for Chinese word segmentation developed so far, both statistical and ... Automatic Word Segmentation System for Written Chinese Texts", Journal of Chinese Information Processing, Vol. 1, No. 2, 1987 (in Chinese) [2] Fan C.K., Tsai WH., "Automatic Word Identification ... a Chinese 1268 Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data Sun Maosong, Shen Dayang*, Benjamin K Tsou** State Key Laboratory of Intelligent Technology and...

Scientific report: "Subword-based Tagging for Confidence-dependent Chinese Word Segmentation" pdf

Uploaded: 17/03/2014, 04:20
... proposed a subword-based tagging for Chinese word segmentation to improve the existing character-based tagging. The subword-based tagging was implemented using the maximum entropy (MaxEnt) and the ... methods with Chinese word segmentation, with which our results were compared. Section 5 provides the concluding remarks and outlines future goals. 2 Chinese word segmentation framework Our word segmentation ... dictionary-based N-gram word segmentation for segmenting IV words, a maximum entropy subword-based tagger for recognizing OOVs, and a confidence-dependent word disambiguation used for merging the results of...

Study guide MOS 2010 for Microsoft: Word, Excel, PowerPoint and Outlook

Uploaded: 23/08/2013, 14:34
... book for more information. The MOS certification exams for the Office 2010 programs and SharePoint are performance based and require you to complete business-related tasks in the program for which ... highlight and comment on document content, and search for specific text by using the commands on the Tools menu. xli Getting Support and Giving Feedback Errata We’ve made every effort to ensure ... Exam Candidates for MOS-level certification are expected to successfully complete a wide range of standard business tasks, such as formatting a document or worksheet and its content; creating and...

Scientific report document: "Learning Word-Class Lattices for Definition and Hypernym Extraction" doc

Uploaded: 20/02/2014, 04:20
... RANLP workshop, pages 19–25. Wenbin Jiang, Haitao Mi, and Qun Liu. 2008. Word lattice reranking for chinese word segmentation and part-of-speech tagging. In Proceedings of the 22nd International Conference ... is the candidate sentence, l_DF, l_VF and l_GF are three lattices one for each definition field, coverage is the fraction of words of the input sentence covered by the three lattices, and support ... than for any pair of sentences in T belonging to two different clusters. 3.2.3 Word-Class Lattice Construction Finally, the third step consists of the construction of a Word-Class Lattice for...

Scientific report document: "A Joint Statistical Model for Simultaneous Word Spacing and Spelling Error Correction for Korean" pdf

Uploaded: 20/02/2014, 12:20
... Christoper C. Yang and K. W. Li. 2005. A Heuristic Method Based on a Statistical Approach for Chinese Text Segmentation. Journal of the American Society for Information Science and Technology, ... corrected candidate words are suggested by the system from the word dictionary, according to some metric to measure the similarity between the target word and its candidate word, such as edit-distance ... Gao, Mu Li and Chang-Ning Huang. 2003. Improved Source-Channel Models for Chinese Word Segmentation. Proceedings of the 41st Annual Meeting of the ACL, pp. 272-279 Seung-Shik Kang and Chong-Woo...

Scientific report document: "Rethinking Chinese Word Segmentation: Tokenization, Character Classification, or Wordbreak Identification" pdf

Uploaded: 20/02/2014, 12:20
... Processing, pp. 147-173. Gao, J. and A. Wu and Mu Li and C N.Huang and H. Li and X. Xia and H. Qin. 2004. Adaptive Chinese Word Segmentation. In Proceedings of ACL-2004. Meng, H. and C. W. Ip. 1999. An ... N. 2003. Chinese Word Segmentation as Character Tagging. Computational Linguistics and Chinese Language Processing. 8(1): 29-48 Redington, M. and N. Chater and C. Huang and L. Chang and K. Chen. ... that Chinese word segmentation is the classification of a string of character-boundaries (CB’s) into either word-boundaries (WB’s) and non-word-boundaries. In Chinese, CB’s are delimited and...

Scientific report: "Exploring Deterministic Constraints: From a Constrained English POS Tagger to an Efficient ILP Solution to Chinese Word Segmentation" ppt

Uploaded: 07/03/2014, 18:20
... likelihoods. Deterministic constraints for character tagging For the character tagging formulation of Chinese word segmentation, we discussed two tagsets IB and BMES in Section 3.1. With respect ... manner. 5.4 Chinese word segmentation Like other tagging problems, Viterbi-style decoding is widely used for character tagging for CWS. We transform tagged character sequences to word segmentations ... of a word and I all other positions; and 2) BMES: where B, M and E represent the beginning, middle and end of a multi-character word respectively, and S tags a single-character word. For example,...
