... However, rather than splitting the data into separate training and test sets, they evaluate the models in an extrapolation setting, where the parameters of the model are estimated on a subset of the ... and all words were converted to lowercase). Each of these samples was then split into a training set of 1 million tokens (the training size N0) and a test set of 3 million [footnote 4: See www.natcorp.ox.ac.uk, ...] for training precede, chronologically, the data one wants to generalize to. We estimate parameters of the ZM, fZM and GIGP models on each training set, using the zipfR toolkit [footnote 5]. The models are then...
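
The chronological train/test split described above can be sketched as follows. This is a minimal Python illustration and not the zipfR workflow used in the text: the function names and the toy sentence are hypothetical, and the sketch assumes that the test portion directly follows the training portion in the token stream. It only computes the observed quantities (vocabulary size, frequency spectrum, unseen types) that a fitted model would be compared against; it does not fit the ZM, fZM or GIGP models.

```python
from collections import Counter


def split_train_test(tokens, n_train, n_test):
    """Chronological split: the first n_train tokens are used for training,
    the next n_test tokens for testing (assumed layout, see lead-in)."""
    if len(tokens) < n_train + n_test:
        raise ValueError("sample too short for the requested split")
    train = tokens[:n_train]
    test = tokens[n_train:n_train + n_test]
    return train, test


def vocabulary_size(tokens):
    """Number of distinct types observed in a token sequence."""
    return len(set(tokens))


def frequency_spectrum(tokens):
    """Map m -> number of types that occur exactly m times."""
    type_freqs = Counter(tokens)
    return Counter(type_freqs.values())


def unseen_types(train, test):
    """Number of types in the test portion that never occur in training."""
    return len(set(test) - set(train))


# Toy data standing in for a lowercased corpus sample (hypothetical).
tokens = "The cat sat on the mat and the dog sat on the log".lower().split()
train, test = split_train_test(tokens, n_train=5, n_test=8)

print(vocabulary_size(train))           # observed V at the training size
print(dict(frequency_spectrum(train)))  # training frequency spectrum
print(unseen_types(train, test))        # types first seen in the test portion
print(vocabulary_size(train + test))    # observed V at the full sample size
```

With the sizes given in the text, `n_train` would be 1 million tokens; the toy values here are only for illustration.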