languages using split words

Scientific report: "Modeling Morphologically Rich Languages Using Split Words and Unstructured Dependencies" (docx)

Uploaded: 20/02/2014, 09:20
... 9.71M 0.50M 9.45M 1.19M 4.1 Using a morphological tagger and disambiguator The split version of the corpus contains words that are split into their stem and suffix forms by using a previously developed ... 2 gives the total log-probability (using log2) for the split and unsplit datasets using n-gram models of different order. We compute the perplexity of the two datasets using a common denominator: ... both the split and split+ 0 datasets; therefore we ignore the cost of the OOV tokens as is the default SRILM behavior. Table 3: Total log probability for the 6-gram word models on split and split+ 0...
Scientific report: "Designing spelling correctors for inflected languages using lexical transducers" (pdf)

Uploaded: 17/03/2014, 23:20
... is a whole dictionary of words or that the system works without lexical information. Oflazer and Guzey (1994) face the problem of correcting words in agglutinative languages. 3.1 Correcting ... authors (van Berkel & de Smedt, 88). When we faced the problem of correcting misspelled words the main problem found was that because of the recent standardisation and the widespread ... was applied to build the transducer: 1. Additional morphemes are linked to the standard ones using the possibility of expressing two levels in the lexicon. 2. Definition of additional rules...
Medical report: "Identification of Cellular Membrane Proteins Interacting with Hepatitis B Surface Antigen using Yeast Split-Ubiquitin System"

Uploaded: 02/11/2012, 11:08
... generated using random hexamer primer from BD Matchmaker™ Library Construction & Screening Kits User Manual (BD Biosciences, Clontech, USA). The second strand cDNA was synthesized using Long-Distance ... Identification of Cellular Membrane Proteins Interacting with Hepatitis B Surface Antigen using Yeast Split-Ubiquitin System Qi Chun Toh, Tuan Lin Tan, Wei Qiang Teo, Chin Yee Ho, Subhajeet Parida ... cellular proteins that interact with HBsAg and thereby contributing to HBV morphogenesis. Using the yeast split-ubiquitin system, a number of cellular membrane proteins have been isolated in this...
Scientific report: "Word Alignment for Languages with Scarce Resources Using Bilingual Corpora of Other Language Pairs" (pptx)

Uploaded: 20/02/2014, 12:20
... distortion probability: one for head words and the other for non-head words. Distortion Probability for Head Words The distortion probability for head words represents the relative position ... presented a word alignment approach for languages with scarce resources using bilingual corpora of other language pairs. To perform word alignment between languages L1 and L2, we introduce a ... the feature vector constructed using the context words in the English sentence to represent the context. So we can calculate the cross-language word similarity using the feature vectors. The...
Scientific report: "Extracting Semantic Orientations of Words using Spin Model" (pdf)

Uploaded: 20/02/2014, 15:20
... (Schmid, 1994). 35 stopwords (quite frequent words such as “be” and “have”) are removed from the lexical network. Negation words include 33 words. In addition to usual negation words such as “not” ... extracted the words tagged with “Positiv” or “Negativ”, and reduced multiple-entry words to single entries. As a result, we obtained 3596 words (1616 positive words and 1980 negative words). ... computation converged. The words with high final average values are classified as positive words. The words with low final average values are classified as negative words. 4.3 Hyper-parameter Prediction The...
Scientific report: "An Evaluation Method of Words Tendency using Decision " (docx)

Uploaded: 20/02/2014, 16:20
... classes. The words belonging to each class are called increasing-words, relatively constant-words, and decreasing-words respectively. Table 1 shows a sample of some classified words according ... the words in each group. Table 1 Sample of Classified Words Stability Class Example of words in each class Increasing Words Sammy-Sosa, McGwire, Carlos-Delgado Relatively constant words ... of words frequency with time-series variation included in both periods. The data of extracted words is shown in Table 2. In order to get the accuracy of the correct words that are words...
Scientific report: "Using WordNet to Automatically Deduce Relations between Words in Noun-Noun Compounds" (docx)

Uploaded: 08/03/2014, 02:21
... the two words in that compound. Sets of compounds from other sources would not have such associated definitions. Second, by using compounds from WordNet, we could guarantee that all constituent words ... that the correct relation between two words in a compound can be deduced by finding other compounds containing words from the same semantic categories as the words in the compound to be disambiguated: ... obtained for that relation from any other sense-pair, using the first term of the score tuple as the main key for comparison (lines 14 and 15), and using the second term as a tie-breaker (lines 16...
Scientific report: "An Unsupervised Approach to Prepositional Phrase Attachment using Contextually Similar Words" (potx)

Uploaded: 08/03/2014, 05:20
... Similar Words The contextually similar words of a word w are words similar to the intended meaning of w in its context. Below, we describe an algorithm for constructing contextually similar words ... parsed corpus. Attachment decisions are made using a linear combination of features and low frequency events are approximated using contextually similar words. Introduction Prepositional phrase attachment ... the contextually similar words of w. We retrieve from the collocation database the words that occurred in the same dependency relationship as w. We refer to this set of words as the cohort of w...
Scientific report: "USING AN ONLINE DICTIONARY TO FIND RHYMING WORDS AND PRONUNCIATIONS FOR UNKNOWN WORDS " (doc)

Uploaded: 08/03/2014, 18:20
... rhyming words, in WordSmith's rhyming dimension, for an unknown word. Z. Rhyme The WordSmith rhyme dimension is based on two files. The first is a main file keyed on the spelling of words ... of words. They also show how answers to these psycholinguistic questions can, in turn, contribute to 282 USING AN ON-LINE DICTIONARY TO FIND RHYMING WORDS AND PRONUNCIATIONS FOR UNKNOWN WORDS ... in the pronunciation of known words (Rosson, 1985). Until recently, it was generally assumed that novel words or pseudowords (letter strings which are not real words of English but which conform...
Scientific report: "Using Mazurkiewicz Trace Languages for Partition-Based Morphology" (doc)

Uploaded: 17/03/2014, 04:20
... Trace Languages. Recognizable languages may be implemented by finite-state automata in lexicographic normal form, using the morphism ϕ⁻¹. Operations on trace languages are implemented by operations on ... Recognizable trace languages are not closed under projection. The reason is that the projection may delete symbols which makes the languages of loops connected. 3 Partitioned relations and trace languages It ... describe the morphology of languages using contextual rewrite rules which are easily applied in cascade. Rules are compiled into finite-state transducers and merged using transducer composition...
Scientific report: "Efficient Unsupervised Discovery of Word Categories Using Symmetric Patterns and High Frequency Words" (ppt)

Uploaded: 23/03/2014, 18:20
... number of words present in both C and WN divided by N; (2) Precision*: the number of correct words divided by N. Correct words are either words that appear in the WN subtree, or words whose ... manner, using meta-patterns comprised of high frequency words and content words. 2. Identification of pattern candidates that give rise to symmetric lexical relationships. This is done using simple ... of words present in both C and WN divided by the number of (single) words in WN; (4) The number of correctly discovered words (New) that are not in WN. The Table also shows the number of WN words...
Scientific report: "Guessing Parts-of-Speech of Unknown Words Using Global Information" (ppt)

Uploaded: 23/03/2014, 18:20
... POS tags of unknown words. We propose a probabilistic model for POS guessing of unknown words using global information as well as local information, and estimate its parameters using Gibbs sampling. ... words can have, described later), the sizes of training, test and unlabeled data, and the splitting method of them. For the test data and the unlabeled data, unknown words are defined as words ... estimated using all the training data (Figure 2, *2). Local 3 A major method for generating such pseudo unknown words is to collect the words that appear only once in a corpus (Nagata, 1999). These words...
Scientific report: "Using bilingual dependencies to align words in English/French parallel corpora" (ppt)

Uploaded: 23/03/2014, 19:20
... align words using various syntactic relations in both languages, even though the category of the words under consideration is different. 5.4 Comparative evaluation The results achieved using ... Kluwer Academic Publishers, pp. 371-388 Wu D. 2000. Bracketing and aligning words and constituents in parallel text using Stochastic Inversion Transduction Grammars. In Véronis, J. (Ed.), Parallel ... regardless of whether the syntactic relations are identical in both languages, and regardless of whether the POS of the words to be aligned are the same. To sum up, adjectives and nouns are...
