... the first candidate. Let c_i be the i-th character in the input, x_ij be the j-th candidate for c_i in the output, and p be the probability that the first candidate is correct. The confusion ...

Training Data for the Language Model

We used the EDR Japanese Corpus Version 1.0 (EDR, 1991) to train the language model. It is a corpus of approximately 5.1 million words (208 thousand sentences). ... pronunciation, and part of speech. In this experiment, we randomly selected 90% of the sentences in the EDR Corpus for training. The first column of Table 1 shows the number of sentences, words, and...
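
The random selection of 90% of the sentences for training, mentioned above, could be sketched roughly as follows. This is only an illustration: the text does not say how the selection was implemented, so the function name split_sentences, the fixed seed, and the toy sentence list are all assumptions made for the example.

```python
import random


def split_sentences(sentences, train_fraction=0.9, seed=0):
    """Randomly partition sentences into a training list and a held-out list.

    Hypothetical helper: the fraction matches the 90% stated in the text,
    but the seed and the shuffling strategy are assumptions.
    """
    indices = list(range(len(sentences)))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_train = round(len(sentences) * train_fraction)
    train_ids = set(indices[:n_train])
    # Preserve the original corpus order inside each partition.
    train = [s for i, s in enumerate(sentences) if i in train_ids]
    held_out = [s for i, s in enumerate(sentences) if i not in train_ids]
    return train, held_out


if __name__ == "__main__":
    # Toy stand-in for a corpus; not real EDR data.
    toy_corpus = [f"sentence {i}" for i in range(10)]
    train, held_out = split_sentences(toy_corpus)
    print(len(train), len(held_out))
```

Splitting at the sentence level, rather than at the word level, keeps each sentence entirely inside one partition, so no held-out sentence contributes counts to the language model.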