... one of 110M words of NYT from 2000 and 2001 (NYT00+01), and one of 110M words of AFP from 2002, 2005, and 2006 (AFP02+05+06). In both cases, we compute $\bar{d}(i)$ and tune parameters on 110M words of ... $k_{\text{train}}(w)$ denote the number of occurrences of $w$ in the training corpus, and $k_{\text{test}}(w)$ denote the number of occurrences of $w$ in the test corpus. We define the empirical discount of $w$ to be $d(w) = k_{\text{train}}(w)$ ... maximally domain-similar, we randomly assign half of these documents to train and half of them to test, yielding train and test corpora of approximately 50M words each, which we denote by NYT95 and NYT95....
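
The random document split and the two per-word counts described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `split_documents` and `count_words`, the whitespace tokenization, the fixed random seed, and the toy documents are all hypothetical and are not taken from the paper. The sketch stops at the counts for the train and test halves and does not attempt to compute the empirical discount itself.

```python
import random
from collections import Counter


def split_documents(documents, seed=0):
    """Randomly assign half of the documents to train and half to test.

    The seed is an assumption added for reproducibility.
    """
    shuffled = list(documents)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def count_words(documents):
    """Count occurrences of each token, assuming whitespace tokenization."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.split())
    return counts


# Toy stand-in for a document collection (hypothetical data).
docs = ["the cat sat", "the dog ran", "a cat ran", "the end"]

train_docs, test_docs = split_documents(docs, seed=0)
k_train = count_words(train_docs)  # occurrences of each word in the train half
k_test = count_words(test_docs)    # occurrences of each word in the test half
```

Because both halves are drawn from the same pool of documents, the two count tables come from the same domain, which is the point of the split described in the text. A `Counter` returns 0 for unseen words, so `k_test[w]` is defined even when `w` occurs only in the training half.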