Scientific report: "Improving On-line Handwritten Recognition using Translation Models in Multimodal Interactive Machine Translation"

... Linguistics Improving On-line Handwritten Recognition using Translation Models in Multimodal Interactive Machine Translation Vicent Alabau, Alberto Sanchis, Francisco Casacuberta Institut Tecnològic d’Informàtica Universitat ... Vera, s/n, Valencia, Spain {valabau,asanchis,fcn}@iti.upv.es Abstract In interactive machine translation (IMT), a human expert is integrated into the core of a machine translation (MT) system. ... word-based translation models have been considered: direct IBM1 and IBM2 models, and inverse IBM1-inv and IBM2-inv models with the inverse dictionary from Eq. 9. However, a more interesting set...

Document: Scientific report: "Improving Automatic Speech Recognition for Lectures through Transformation-based Rules Learned from Minimal Data"

... by partitioning each 50 minute lecture into a training and a test set, where the training set is smaller than the test set. As mentioned in the introduction, it is feasible to obtain manual transcripts for ... ρ(rbest, TASR)) and the remaining rules are scored on the transformed training text. This ensures that the scoring and ranking of remaining rules takes into account the changes brought ... applying rule r on text TASR. As outlined in Figure 1, rules that occur in the training sample more often than an established threshold are ranked according to the scoring function. The ranking...

Scientific report: "Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia"

... performance using four sets of features: (i) Monolingual Wiki-tagger based, using only the features in Group 1 (MONO); (ii) Bilingual label match and Wiki-tagger based, using features in Groups ... phrases in Wikipedia, using Wikipedia metadata. The following sources of information were used from Wikipedia: category annotations on English documents, article links which link from phrases in ... which can combine the two types of information. Similarly to the joint model of Burkett et al. (2010a), our model can incorporate both monolingual and bilingual features in a log-linear framework....

Scientific report: "Arabic Named Entity Recognition: Using Features Extracted from Noisy Data"

... class All. The baseline results, FreqBaseline, assigns a test token the most frequent tag observed for it in the gold training data, if a test token is not observed in the training data, it is assigned ... obtained model outperformed the baseline. More recently, in (Chen and Ji, 2009), the authors report their comparative study between monolingual and cross-lingual bootstrapping. Finally, in Mention Detection ... feature, yielding an improvement of up to 4 points over the baseline. We experiment with different sizes for the SE, i.e. taking the first parent versus adding neighboring non-terminal parents....

Document: Scientific report: "Learning Syntactic Verb Frames Using Graphical Models"

... to VALEX using pGRs with a narrow window width. Since POS tagging is more reliable and robust across domains than parsing, retraining on new domains will not suffer the effects of a mismatched parsing model ... lexicon for biomedical information extraction. In Computational Linguistics and Intelligent Text Processing. Springer Berlin / Heidelberg. Proceedings of the 50th Annual Meeting of the Association ... Exploring subdomain variation in biomedical language. BMC Bioinformatics. Diana McCarthy. 2000. Using semantic preferences to identify verbal participation in role switching alternations. In NAACL...

Document: Scientific report: "An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation"

... In this way, our program can introduce some randomness into weight training. Hence users do not need to repeat MERT for obtaining stable and optimized weights using different starting points. ... target-language corpus. Finally, the resulting models are incorporated into the decoder which can automatically tune feature weights on the development set using minimum error rate training (Och, 2003) ... reordering model, our toolkit supports two different reordering models which are trained independently but jointly used during decoding. The first of these is a discriminative reordering model....

Document: Scientific report: "Towards History-based Grammars: Using Richer Models for Probabilistic Parsing*"

... tailoring via the usual linguistic introspection in the hope of generating the correct parse. In head-to-head tests against one of the best existing robust probabilistic parsing models, ... definition of a history in the HBG model) and the corresponding rule used in expanding a node. Using the resulting data set we built a decision tree by classifying histories to locally minimize ... for the words in the vocabulary (the lexical heads) using automatic clustering algorithms using the bigram mutual information clustering algorithm (see (5)). Given the bitstring of a history,...

Scientific report: "A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation"

... analysis using binary branching structures under word alignment and parse tree constraints. Bod (2007) also finds that discontinuous phrasal rules make significant improvement in linguistically ... non-contiguous phrase modeling in both syntax-based and phrase-based systems. We also find that in Chinese-English translation task, gaps are more effective in Chinese side than in the English side. ... Model for Statistical Machine Translation Jun Sun1,2 Min Zhang1 Chew Lim Tan2 1 Institute for Infocomm Research 2 School of Computing, National University of Singapore sunjun@comp.nus.edu.sg...

Scientific report: "A Comparative Study of Hypothesis Alignment and its Improvement for Machine Translation System Combination"

... decoding shows the best performance in combining outputs from multiple machine translation (MT) systems. However, overcoming different word orders presented in multiple MT systems during ... Networks for Combining Machine Translation Systems. In Proceedings of COLING 2008, pp. 33–40. Manchester, Aug. S. Bangalore, G. Bordel, and G. Riccardi. 2001. Computing consensus translation ... its Improvement for Machine Translation System Combination Boxing Chen*, Min Zhang, Haizhou Li and Aiti Aw Institute for Infocomm Research 1 Fusionopolis Way, 138632 Singapore {bxchen,...

Scientific report: "Demonstration of Joshua: An Open Source Toolkit for Parsing-based Machine Translation"

... description of Joshua’s main features, described in more detail in Li et al. (2009a): • Training Corpus Sub-sampling: We support inducing a grammar from a subset of the training data, that consists ... proposed by Kishore Papineni (personal communication), outlined in further detail in (Li et al., 2009a). The method achieves a 90% reduction in training corpus size while maintaining state-of-the-art ... Monz, and Josh Schroeder. 2009. Findings of the 2009 Workshop on Statistical Machine Translation. In Proceedings of the Fourth Workshop on Statistical Machine Translation, pages 1–28, Athens,...
