... Valletta, Malta. D. Tufiş, R. Ion, and N. Ide. 2004. Fine-Grained Word Sense Disambiguation Based on Parallel Corpora, Word Alignment, Word Clustering and Aligned Wordnets. In Proceedings of the ... to the focus word itself being the word form of the focus word, the lemma, Part-of-Speech and chunk information • local context features related to a window of three words preceding and following ... incorporates the automatically generated word alignments as labels. We applied an automatic post-processing step on these word alignments in order to remove leading and trailing determiners and prepositions....
... reflects a certain cultural context and cannot be simply replaced by a word-to-word translation. • collocations: Some word pairs such as projects and ~(houses) are not direct translations. ... unique words in the Chinese text, N is the occurrence count of one English word and M the occurrence count of one Chinese word. We previously used some frequency difference constraints and ... influenced the results. Apart from single-word to single-word translation such as Governor/~ and prosperity/~i~fl¢~, we also found many single word translations which show potential towards...
... that our technique would work on other French and English corpora and even on other pairs of languages. The work of Gale and Church [Gale and Church, 1991], who use a very similar method ... from one or the other of the corpora. If a person is given two parallel texts and asked to match up the sentences in them, it is natural for him to look at the words in the sentences. Elaborating ... referred to as Hansards. And love and kisses to you, too. mugwumps who sit on the fence with their mugs on one side and their wumps on the other side and do not know which side to...
... scientific and technical objects, with special but not exclusive reference to Canada, and by demonstrating the products and processes of science and technology and their economic, social and cultural ... owned by the Corporation are recorded at cost and amortized over their estimated useful life. Land and buildings owned by the Government of Canada and under the control of the Corporation ... medicine, meteorology, surveying and mapping, and information technology; and Transportation: motorized and non-motorized wheel, track and trackless vehicles; motorized and non-motorized marine transportation,...
... between common English words and medical terms. We measured word frequency by "disease occurrence", (the number of disease definitions in which a given word occurs one or more ... common English word like 'of', would be used in the descriptions of all kinds of disease, and would accordingly have a high 'entropy'. Tables 2 and 3 show the top and bottom ... could co-occur in any location in the definition and in either order), and the co-occurrences expected from chance alone. Tables 4 and 5 show the top and bottom of a list of all pairs formed from...
... training, and suggest that their results with a 340k-word Spanish corpus are comparable to 20k-40k words of gold-standard training data when using MUC-style evaluation metrics. 2.1 Gold-standard corpora We ... (CVN-68) has wordtype AA Aaa (AA-00). Wordtype with functions: We also map content words to wordtypes only; function words are retained, e.g. Bank of New England Corp. maps to Aaa of Aaa Aaa Aaa ... Association for Computational Linguistics Analysing Wikipedia and Gold-Standard Corpora for NER Training Joel Nothman and Tara Murphy and James R. Curran School of Information Technologies University...
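The wordtype mapping in the excerpt above can be sketched as follows. This is a minimal illustration inferred only from the two examples shown (CVN-68 → AA-00, Bank → Aaa), not the authors' implementation: the rule that runs of the same character class are truncated to two symbols is an assumption, and the names `wordtype`, `wordtype_with_functions`, and the `FUNCTION_WORDS` list are hypothetical.

```python
import re

# Hypothetical, deliberately tiny function-word list (an assumption for
# illustration; the excerpt does not say which list is used).
FUNCTION_WORDS = {"of", "the", "and", "in", "for", "a", "an", "to"}


def wordtype(token: str) -> str:
    """Map a token to a coarse shape: uppercase -> A, lowercase -> a, digit -> 0.

    Other characters (hyphens, periods, ...) are kept unchanged.
    Assumption: runs of three or more identical class symbols are
    truncated to two, which reproduces CVN-68 -> AA-00 and Bank -> Aaa.
    """
    classes = []
    for ch in token:
        if ch.isupper():
            classes.append("A")
        elif ch.islower():
            classes.append("a")
        elif ch.isdigit():
            classes.append("0")
        else:
            classes.append(ch)
    shape = "".join(classes)
    return re.sub(r"([Aa0])\1{2,}", r"\1\1", shape)


def wordtype_with_functions(tokens):
    """Replace content words by their wordtype; keep function words as-is."""
    return [t if t.lower() in FUNCTION_WORDS else wordtype(t) for t in tokens]


print(wordtype("CVN-68"))
print(" ".join(wordtype_with_functions("Bank of New England".split())))
```

Under these assumptions, "Bank of New England" becomes "Aaa of Aaa Aaa": the function word "of" is retained while each content word is reduced to its shape.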
... mechanisms of corporate governance, compliance, and ethics, and their collective role in preventing and mitigating excesses and scandals in the corporate sector. Earlier rounds of corporate scandal ... compliance and ethics activities. Notwithstanding these and other changes in law and regulation affecting corporate directors, in the years since SOX there has been a series of further scandals and ... integrity and corporate ethics starts with a senior-level chief ethics and compliance officer (CECO) who understands the compliance and ethics field, is empowered and experienced, and who has...
... comparable corpora has focused on extracting word translations (Fung and Yee, 1998; Rapp, 1999; Diab and Finch, 2000; Koehn and Knight, 2000; Gaussier et al., 2004; Shao and Ng, 2004; Shinyama and Sekine, ... 19(1):61–74. Pascale Fung and Percy Cheung. 2004a. Mining very non-parallel corpora: Parallel sentence and lexicon extraction via bootstrapping and EM. In EMNLP 2004, pages 57–63. Pascale Fung and Percy Cheung. ... automatic creation of parallel corpora. Several researchers (Zhao and Vogel, 2002; Vogel, 2003; Resnik and Smith, 2003; Fung and Cheung, 2004a; Wu and Fung, 2005; Munteanu and Marcu, 2005) have...
... employed in word sense disambiguation work that uses parallel corpora (Diab and Resnik, 2002). The assumption made in the word sense disambiguation work is that if a source language word aligns ... Callison-Burch, David Talbot, and Miles Osborne. 2004. Statistical machine translation with word- and sentence-aligned parallel corpora. In Proceedings of ACL. Mona Diab and Philip Resnik. 2002. An ... unsupervised method for word sense tagging using parallel corpora. In Proceedings of ACL. Ali Ibrahim, Boris Katz, and Jimmy Lin. 2003. Extracting structural paraphrases from aligned monolingual corpora. ...
... unknown words, we can use the fact that most unknown words are nouns or proper nouns and merge this category with nouns. We can also merge acs that are represented with only a few distinct words ... with content words) reduces the number of parameters AUTOMATIC ALIGNMENT IN PARALLEL CORPORA Harris Papageorgiou, Lambros Cranias, Stelios Piperidis I Institute for Language and Speech ... semantic load of a sentence as the patterns of tags of its content words. Content words are taken to be verbs, nouns, adjectives and adverbs. The complexity of transfer in translation imposes...
... list. In other words, if the first translation candidate for the source word isola is the target word island, and, vice versa, the first translation candidate for the target word island is isola, ... pair (isola, island). 2. Remove the words isola and island from their respective vocabularies. 3. Since island is not in the vocabulary, the indirect association between arcipelago and island is not ... In other words, if the most probable translation candidate for a source word wS1 is a target word wT2 and, vice versa, the most probable translation candidate of the target word wT2 Proceedings...
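The symmetry condition and vocabulary-removal steps in the excerpt above can be sketched roughly as below. This is an illustrative reading of the visible fragments, not the paper's algorithm: the association scores are invented for the example, the function name `symmetric_lexicon` is hypothetical, and repeating the procedure until no mutual-best pair remains is an assumption about the elided steps.

```python
def symmetric_lexicon(scores):
    """Extract translation pairs that are each other's best candidate.

    scores: dict mapping (source_word, target_word) -> association score
            (hypothetical input format; higher means more strongly associated).
    Accepted pairs are removed from both vocabularies before the next round,
    so an indirect association with a removed word can no longer win.
    """
    src_vocab = {s for s, _ in scores}
    tgt_vocab = {t for _, t in scores}
    lexicon = []
    while True:
        best_for_src, best_for_tgt = {}, {}
        for (s, t), value in scores.items():
            if s not in src_vocab or t not in tgt_vocab:
                continue  # one of the words has already been paired
            if s not in best_for_src or value > scores[(s, best_for_src[s])]:
                best_for_src[s] = t
            if t not in best_for_tgt or value > scores[(best_for_tgt[t], t)]:
                best_for_tgt[t] = s
        # Keep only pairs where the best candidates point at each other.
        mutual = [(s, t) for s, t in best_for_src.items() if best_for_tgt.get(t) == s]
        if not mutual:
            break
        for s, t in mutual:
            lexicon.append((s, t))
            src_vocab.discard(s)
            tgt_vocab.discard(t)
    return lexicon


# Invented scores, for illustration only.
toy_scores = {
    ("isola", "island"): 0.9,
    ("arcipelago", "island"): 0.7,       # indirect association
    ("arcipelago", "archipelago"): 0.6,
    ("isola", "archipelago"): 0.1,
}
print(symmetric_lexicon(toy_scores))
```

With these toy scores, arcipelago initially prefers island, but island's best candidate is isola, so only (isola, island) is accepted in the first round; once both words are removed, arcipelago pairs with archipelago.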
... such as Japan, America and Europe, companies need to strengthen research capacity to apply international standards on the environment, such as ISO 14000 and environmental standards of the market ... have managers’ awareness and commitment to carry out and improve the initiative of CSR. If the managers are at least partly the proponents of the corporate culture, and integrating CSR requires ... and 20% of respondents have no opinion on this issue and no managers strongly disagree with the statement. Furthermore, tables 79% of respondents understand all about what CSR is and...
... multinational corporations but their acceptance and performance of CSR do not receive attention and concern; and this is the cause of much damage and many disasters for the environment and society. In ... dilute the tide, and when the tide goes down, water and waste mix colors and flow into the Dong Nai river. People living nearby said that the sewage canals have a black color and a very unpleasant ... the environmental, the cultural and the financial) and sustainability of behavior which contributes to a future for the people and the planet” (Pearce 2011); and as “voluntary disclosures of...
... figures target whilst managing and redirecting their behaviors on how to fit the ethical standards and improving the quality of life of employees and their families, and the local community. 2.2. ... rearranged, and summarized in the “Review reports” and “Annual Performance” sections. 46 Demacarty, P., 2009. Financial returns of corporate social responsibility, and the moral freedom and responsibility ... the directors and staff, customers and CSR benefit receivers in Hanoi city. 3.6.3. Sample size 100 management and staff questionnaire forms, 100 customer questionnaire forms and 300 CSR...
... (positive or negative); and are respectively the numbers of labeled instances in and ; and are parallel instances in and , respectively (i.e. ... on the right-hand side is the likelihood of labeled data for both and ; and the second term is the likelihood of the unlabeled parallel data . If we assume that parallel sentences ... for the Labeled Data Unlabeled Parallel Text and its Preprocessing. For the unlabeled parallel text, we use the ISI Chinese-English parallel corpus (Munteanu and Marcu, 2005), which was extracted...