... Tetreault, Elena Filatova, and Martin Chodorow. 2010b. Rethinking Grammatical Error Annotation and Evaluation with the Amazon Mechanical Turk. In Proceedings of the NAACL Workshop on Innovative ... Creating Speech and Language Data with Amazon’s Mechanical Turk, pages 195–203. Chris Callison-Burch. 2009. Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical ... of appropriate evaluation metrics and to make system comparison easier. Our solution is general enough for, in the simplest case, intrinsically evaluating a single system on a single dataset and,...