... chance for error is higher. As seen from the results, the learned metrics typically perform better when the training examples include sentences from higher-quality systems. Consider, for example, ... (Liu and Gildea, 2006) in that word class information is used. Finally, researchers have begun to look for similarities at a deeper structural level. For example, Liu and Gildea (2005) developed ... the translation qualities of new systems? In this paper, we argue for the viability of a regression-based framework for sentence-level MT evaluation. Through empirical studies, we first show that...