A Generative Probabilistic OCR Model for NLP Applications

Document - Scientific report: "GPSM: A GENERALIZED PROBABILISTIC SEMANTIC MODEL FOR AMBIGUITY RESOLUTION" pptx

Upload date: 20/02/2014, 21:20
... measure shows substantial improvement in structural disambiguation over a syntax-based approach. 1. Introduction In a large natural language processing system, such as a machine translation ... R&D Road II, Science-Based Industrial Park Hsinchu, TAIWAN 30077, R.O.C. ABSTRACT In natural language processing, ambiguity resolution is a central issue, and can be regarded as a ... information. Hence, we will show how to annotate a syntax tree so that various interpretations can be characterized differently. Semantic Tagging A popular linguistic approach to annotate a...
  • 8
  • 412
  • 0
Document - Scientific report: "A Syntax-Driven Bracketing Model for Phrase-Based Translation" pptx

Upload date: 20/02/2014, 07:20
... various language-pairs, one issue is that matching syntactic analysis cannot always guarantee a good translation, and violating syntactic structure does not always induce a bad translation. Marton and Resnik ... Singapore, 2-7 August 2009. © 2009 ACL and AFNLP A Syntax-Driven Bracketing Model for Phrase-Based Translation Deyi Xiong, Min Zhang, Aiti Aw and Haizhou Li Human Language Technology Institute for ... Reordering Model for Statistical Machine Translation. In Proceedings of ACL-COLING 2006. Deyi Xiong, Min Zhang, Aiti Aw, and Haizhou Li. 2008. Linguistically Annotated BTG for Statistical Machine Translation....
  • 9
  • 438
  • 0
Document - Scientific report: "A Unified Syntactic Model for Parsing Fluent and Disfluent Speech" ppt

Upload date: 20/02/2014, 09:20
... Communication Research Centre, University of Edinburgh. John Hale, Izhak Shafran, Lisa Yung, Bonnie Dorr, Mary Harper, Anna Krasnyanskaya, Matthew Lease, Yang Liu, Brian Roark, Matthew Snover, and ... 1 and 2, in which the same repair fragment is shown in a standard state such as might be used to train a probabilistic context free grammar, and after the right-corner transform. Figure 1 also ... modified for use in a special repair grammar, which not only reduces the amount of available training data, but violates our intuition that most reparanda are fluent up until the actual edit occurs. The...
  • 4
  • 581
  • 0
Document - Scientific report: "A Phrase-based Statistical Model for SMS Text Normalization" ppt

Upload date: 20/02/2014, 12:20
... groups and domains can be modeled separately without accessing and adapting the language model of the MT system for each SMS application. Another advantage is that the normalization module can ... normalization as a translation problem from the SMS language to the English language and we propose to adapt a phrase-based statistical MT model for the task. Evaluation by 5-fold cross validation ... a consensus translation technique to bootstrap parallel data using off-the-shelf translation systems for training a hierarchical statistical translation model for general domain instant...
  • 8
  • 399
  • 0
Document - Scientific report: Trophoblast-like human choriocarcinoma cells serve as a suitable in vitro model for selective cholesteryl ester uptake from high density lipoproteins pdf

Upload date: 20/02/2014, 23:20
... & Takahara, J. (1999) Evidence for a potential role for HDL as an important source of cholesterol in human adrenocortical tumors via the CLA-1 pathway. Endocr. J. 46, 27–34. 63. Cherradi, N., Bideau, M., Arnaudeau, S., Demaurex, N., James, R.W., ... proliferation and invasion. Choriocarcinoma is a malignant neoplasm that represents the early trophoblast of the attachment phase or as later invasive stage [46–48]. Thus, in most cases, choriocarcinoma ... choriocarcinoma cells were incubated as described above. Total RNA was isolated, and Northern blot analysis was performed using radiolabeled SR-BI cDNA probe for each cell line (top panel) in the absence...
  • 12
  • 470
  • 0
Document - Scientific report: "Deriving Verbal and Compositional Lexical Aspect for NLP Applications" pptx

Upload date: 22/02/2014, 03:20
... bridge and New York. Weinberg, Amy, Joseph Garman, Jeffery Martin, and Paola Merlo. 1995. Principle-Based Parser for Foreign Language Training in German and Arabic. In Melissa Holland, Jonathan ... representations, as in the examples provided from machine translation and foreign language tutoring applications. We are aware of no attempt in the literature to represent and access aspect on a ... of Lexical and Grammatical Aspect. Garland, New York. Passoneau, Rebecca. 1988. A Computational Model of the Semantics of Tense and Aspect. Computational Linguistics: Special Issue...
  • 8
  • 401
  • 0
Scientific report: "A Scalable Probabilistic Classifier for Language Modeling" pdf

Upload date: 07/03/2014, 22:20
... Ducharme, P. Vincent, and C. Jauvin. 2003. A Neural Probabilistic Language Model. Journal of Machine Learning Research, 3:1137–1155. A. Berger, V. Della Pietra, and S. Della Pietra. 1996. A Maximum ... Categorization Research. Journal of Machine Learning Research, 5:361–397. A. Mnih and G. Hinton. 2008. A Scalable Hierarchical Distributed Language Model. In Advances in Neural Information Processing ... model whose relative performance compared to N-Gram models gets better as one incorporates richer feature sets. It scales almost as well to large datasets as standard N-Gram models: training requires...
  • 6
  • 350
  • 0
Scientific report: "A DOM Tree Alignment Model for Mining Parallel Data from the Web" doc

Upload date: 08/03/2014, 02:21
... is substantial research focusing on syntactic tree alignment model for machine translation. For example, (Wu 1997; Alshawi, Bangalore, and Douglas, 2000; Yamada and Knight, 2001) have studied ... for Machine Translation in the Americas. Munteanu D. S, A. Fraser, and D. Marcu. D., 2002. Improved Machine Translation Performance via Parallel Sentence Extraction from Comparable Corpora. ... three features, the maximum entropy model is trained on 1,000 pairs of web pages manually labeled as parallel or non-parallel. The Iterative Scaling algorithm (Pietra, Pietra and Lafferty...
  • 8
  • 435
  • 0
Scientific report: "A Generalized Vector Space Model for Text Retrieval Based on Semantic Relatedness" pot

Upload date: 08/03/2014, 21:20
... measure of relatedness does (low y values for small x values and high y values for high x). The same pattern applies in the M&C and 353-C data sets. 4.2 Evaluation of the GVSM For the evaluation ... query, are computed similarly. A GVSM model aims at being able to retrieve documents that do not necessarily contain exact matches of the query terms, and this is its great advantage. This new space ... Linguistics A Generalized Vector Space Model for Text Retrieval Based on Semantic Relatedness George Tsatsaronis and Vicky Panagiotopoulou Department of Informatics Athens University of Economics and Business, 76,...
  • 9
  • 394
  • 0
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications pot

Upload date: 15/03/2014, 22:20
... ring. Assuming that the data Chord is being used to locate is cryptographically authenticated, this is a threat to availability of data rather than to authenticity. The same approach used above ... desired authentication, caching, replication, and user-friendly naming of data. Chord’s flat key space eases the implementation of these features. For example, an application could authenticate data ... mechanism also helps higher layer software replicate data. A typical application using Chord might store replicas of the data associated with a key at the nodes succeeding the key. The fact that...
  • 12
  • 441
  • 0
Scientific report: "A Class-Based Agreement Model for Generating Accurately Inflected Translations" pptx

Upload date: 16/03/2014, 19:20
... exponential translation model for target language morphology. In ACL-HLT. C. Tillmann. 2004. A unigram orientation model for statistical machine translation. In NAACL. K. Toutanova, H. Suzuki, and A. ... similar agreement phenomena as probabilistic sequences. Factored Translation Models Factored translation models (Koehn and Hoang, 2007) facilitate a more data-oriented approach to agreement modeling. Words ... phrase table annotations and can be easily implemented as a feature in many phrase-based decoders. 1 Introduction Languages vary in the degree to which surface forms reflect grammatical relations....
  • 10
  • 414
  • 0
Scientific report: "A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" potx

Upload date: 17/03/2014, 00:20
... information for each character. Each character can be assigned one of two possible boundary tags: “B” for a character that begins a word and “I” for a character that occurs in the middle of a word. ... representation (Ramshaw and Marcus, 1995) and the Start/End representation (Kudo and Matsumoto, 2001) are popular. For example, the label B-NN indicates that a character is located at the beginning of a noun. ... POS information is allowed to interact with segmentation. Note that word segmentation can also be formulated as a sequential classification problem to predict whether a character is located at the...
  • 10
  • 412
  • 0
Scientific report: "Fast, Space-Efficient, non-Heuristic, Polynomial Kernel Computation for NLP Applications" docx

Upload date: 17/03/2014, 02:20
... implementation is available as the open-source splitSVM Java library. 1 Introduction Over the last decade, many natural language processing tasks are being cast as classification problems. These are ... achieved by taking into account the Zipfian nature of natural language data, and structuring the computation accordingly. On a sample application (replacing the libsvm classifier used by MaltParser ... Changing existing Java code to accommodate our fast SVM classifier is done by loading a different model, and changing a single function call. 4.1 Evaluation: Speeding up MaltParser We evaluate...
  • 4
  • 285
  • 0
Scientific report: "A Language-Independent Unsupervised Model for Morphological Segmentation" pot

Upload date: 17/03/2014, 04:20
... thank Emily Pitler and Samarth Keshava for making available the code of the RePortS algorithm, and Stefan Bordag and Delphine Bernhard for running their algorithms on the German data. Many ... presented here have been shown to improve accuracy (Kurimo et al., 2006). Another motivation for evaluating the system on a task rather than on manually annotated data is that linguistically motivated morphological ... semantic and syntactic information is very attractive because it adds an additional dimension, but these approaches have to cope with more severe data sparseness issues than approaches that...
  • 8
  • 288
  • 0
Scientific report: "A Hierarchical Phrase-Based Model for Statistical Machine Translation" pptx

Upload date: 17/03/2014, 05:20
... Linguistics A Hierarchical Phrase-Based Model for Statistical Machine Translation David Chiang Institute for Advanced Computer Studies (UMIACS) University of Maryland, College Park, MD 20742, USA dchiang@umiacs.umd.edu Abstract We ... USA dchiang@umiacs.umd.edu Abstract We present a statistical phrase-based translation model that uses hierarchical phrases, that is, phrases that contain subphrases. The model is formally a synchronous ... in Natural Language Processing (EMNLP), pages 388–395. Shankar Kumar, Yonggang Deng, and William Byrne. 2005. A weighted finite state transducer translation template model for statistical machine...
  • 8
  • 331
  • 0
