
a multilingual parallel corpus

Scientific report: "Unsupervised Learning of Arabic Stemming using a Parallel Corpus" pot

... 1999). Usually, entire documents are translated by humans, and the sentence pairs are subsequently aligned by automatic means. A small parallel corpus can be available when native speakers and translators ... is a language-independent algorithm. English Phrase: the advisory committee. Arabic Phrase: Alljnp AlAst$Aryp. Task: stem AlAst$Aryp. Choices and Score: AlAst$Aryp 0.2, AlAst$Aryp 0.7, AlAst$Aryp ... 2 Approach. Figure 1: Approach Overview. Our approach is based on the availability of the following three resources: • a small parallel corpus • an English stemmer • an optional unannotated Arabic...
Document: Scientific report: "Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political debates" pdf

... sentiments for a variety of topics and corresponding targets are potentially involved (Riloff and Wiebe, 2003; Sarmento et al., 2009). Alternative approaches to automatic and manual construction ... Natural Language Processing and Computational Natural Language Learning, Prague. Krippendorff, Klaus. 2004. Content Analysis: An Introduction to Its Methodology, 2nd Edition. Sage Publications, ... 564–568, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics. Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political debates. Paula Carvalho...
Document: Scientific report: "Collecting a Why-question corpus for development and evaluation of an automatic QA-system" pdf

... each paid reward. • Qualifications: To improve the data quality, a HIT can also be attached to certain tests, “qualifications” that are either system-provided or created by the requester. An example ... the assignments have been completed. • Rewards: At upload time, each HIT has to be assigned a fixed reward, that cannot be changed later. Minimum reward is $0.01. Amazon.com collects a 10% (or a ... excess of information. FAQ-pages tend to also answer questions which are not asked, and also contain practical examples. Human-powered answers often contain unrelated information and discourse-like...
Document: Scientific report: "Creating a Multilingual Collocation Dictionary from Large Text Corpora" docx

... translations for creating a trilingual collocation dictionary, with samples of actual use in language. Using past translations as reference for the translator's further work was an ... unable to create a complete analysis of a sentence, the Fips parser returns chunks of partial analyses. If 132 Creating a Multilingual Collocation Dictionary from Large Text Corpora. Luka Nerima, ... V-Prep-N. Another argument in favour of a full syntactical analysis is that it solves the problem of all cases of extraposed elements, such as passives, topicalisation, and dislocation. To illustrate...
Document: Scientific report: "WebCAGe – A Web-Harvested Corpus Annotated with GermaNet Senses" docx

... hand-crafted sense-annotated corpora have been available (Agirre et al., 2007; Erk and Strapparava, 2012; Mihalcea et al., 2004), while WSD research for languages that lack these corpora has lagged behind ... the 3rd International Language Resources and Evaluation (LREC’02), Las Palmas, Canary Islands, pp. 609–612. Santamaría, C., Gonzalo, J., Verdejo, F. 2003. Automatic Association of Web Directories ... representative examples in Yarowsky’s approach is performed completely manually and is therefore limited to the amount of data that can reasonably be annotated by hand. Leacock et al. (1998), Agirre...
Scientific report: " a Movie Dialogue Corpus for Research and Development" potx

... Several factors, such as the availability of more powerful computers, an almost unlimited storage capacity, the availability of large volumes of data in digital format, as well as the ... dialogue management and natural language generation. Springer. Stallard D (2000) Talk’n’travel: a conversational system for air travel planning. In Proceedings of the 6th Conference on Applied ... hand, contain all additional information/texts appearing in the scripts, which are typically of narrative nature and explain what is happening in the scene. Figure 1 depicts a browser snapshot...
Scientific report: "Personalized Normalization for a Multilingual Chat System" doc

... short-forms are very irregular and hard to predict their standard forms using morphological and phonetic similarity. It is also hard to train a statistical model if training data is not available. ... Koehn & al. Moses: Open Source Toolkit for Statistical Machine Translation, ACL 2007, demonstration session. Koehn, P. (2005). Europarl: A Parallel Corpus for Statistical Machine Translation. ... flexibility and interactivity to include and manage their own vocabularies during chat. 2 ASIASPIK System Overview. AsiaSpik is a web-based multilingual instant messaging system that enables online...
Scientific report: "Generating Usable Formats for Metadata and Annotations in a Large Meeting Corpus" pptx

... metadata and annotations. The annotation files are converted to a tabular format using an easily adaptable XSLT-based mechanism, and their consistency is verified in the process. Metadata files are ... order to generate tabular files (TSV) and a table-creation script. 4. Create and populate metadata tables within database. 5. Adapt the XSLT stylesheet as needed for various table formats. 5 Results: ... names or analyse folders. Moreover, the advantage of creating IMDI files is that the metadata is compliant with a widely used standard accompanied by freely available tools such as the metadata browser....
Scientific report: "TRANSFER IN A MULTILINGUAL MT SYSTEM" pdf

... project in the world that applies (iii) not only as a matter of principle but as actual practice. We will regard a natural language as a set of texts. A translation pair is a pair of texts (T~, ... simple transfer can now be formulated as follows: If A translates-as A', then we will call A' a TN of A. We now call an element s,t of the set defined by translates-as a simple ... We can then introduce a new relation, called translates-as. This is a binary relation, probably many-to-many; its left-hand term is a subtree of R, and its right-hand term is a tree. Clearly,...
Scientific report: "Creating a Multilingual Collocation Dictionary from Large Text Corpora" ppt

... syntactical relation). When parallel corpora are available, also the translation equivalents of the collocation context are displayed, thus allowing the user to see how a given collocation was translated ... is length-based and integrates a shallow content analysis. It begins by individuating a paragraph in the target text which is a first candidate as target paragraph, and which we call "pivot". ... translations for creating a trilingual collocation dictionary, with samples of actual use in language. Using past translations as reference for the translator's further work was an...
Scientific report: "Accurate Collocation Extraction Using a Multilingual Parser" docx

... sections 4 and 5 a comparative evaluation experiment proving that a hybrid approach leads to more accurate results than a classical approach in which syntactic information is not taken into account. 2 ... helping developing nations? 1.c) make mistake: We could look back and probably see a lot of mistakes that all parties including Canada perhaps may have made. 3 Multilingual Extraction Results. In ... 2006. © 2006 Association for Computational Linguistics. Accurate Collocation Extraction Using a Multilingual Parser. Violeta Seretan, Language Technology Laboratory, University of Geneva, 2, rue de Candolle,...
Scientific report: "Analysis of Selective Strategies to Build a Dependency-Analyzed Corpus" pptx

... Makoto Nagao. 1994a. KNParser: Japanese dependency/case structure analyzer. In Proceedings of Workshop on Sharable Natural Language Resources, pages 48–55. Sadao Kurohashi and Makoto Nagao. ... human to annotate. Under this framework, the system has access to a large pool of unlabeled data, and it has to predict how much it can learn from each candidate in the pool if that candidate is labeled. Most ... E-mail Magazine is bilingual. The articles of this magazine were analyzed by the dependency analyzer CaboCha, and we manually corrected the errors. K-mag includes a wide variety articles, and...
Scientific report: "Evaluating Centering-based metrics of coherence for text structuring using a reliably annotated corpus" doc

... International Workshop on NLG, pages 98–107, Niagara-on-the-Lake, Ontario, Canada. Eleni Miltsakaki. 2002. Towards an aposynthesis of topic continuity and intrasentential anaphora. Computational ... in Italian. In Walker et al. (Walker et al., 1998b), pages 115–137. Aggeliki Dimitromanolaki and Ion Androutsopoulos. 2003. Learning to order facts for discourse planning in natural language ... into a format appropriate for seec. The first author was able to engage in this research thanks to a scholarship from the Greek State Scholarships Foundation (IKY). References: Regina Barzilay,...
Scientific report: "Towards a Resource for Lexical Semantics: A Large German Corpus with Extensive Semantic Annotation" pot

... Semantic Annotation. Katrin Erk and Andrea Kowalski and Sebastian Padó and Manfred Pinkal, Department of Computational Linguistics, Saarland University, Saarbrücken, Germany, {erk, kowalski, pado, ... desirable. FrameNet as a resource for semantic role annotation. Above, we have asked about the suitability of FrameNet for semantic role annotation, and our data allow a first, though tentative, ... suitable resource as annotation basis. FrameNet roles, which are local to particular frames (abstract situations), may be better suited for the annotation task than the “classical” thematic...
Scientific report: "JaBot: a multilingual Java-based intelligent agent for Web sites" pdf

... XXI. Revista de la UNED. Read T., Bárcena E. and Faber P. (1997) Java and its role in Natural Language Processing and Machine Translation. In Proceedings of the Machine Translation Summit ... their careers. As can be seen in the diagram below, JaBot has three modules: a natural language interface, a search engine and an interactive list of references to the Web pages on the site at ... its architecture and associated data sources. Subsequently, an illustrative example of its functionality has been presented, which demonstrated that JaBot is more flexible than a traditional...
