... to the size of the manual corpus. When we trained on an automatic corpus of that size, the performance was very low compared to that of the manual corpus. The reason is that the ... the seeds and comparable to that with the manual corpus. Moreover, the domain of the manual training corpus is the same as that of the test corpus, i.e., news and novels, while the domain of the ... step, we split the texts of the collected documents into sentences by (Shim et al., 2002) and remove sentences without target NE instances.

2.3 Refining the Web Texts

The collected web documents may...