... random sampling and active learning. Differences for training set sizes of 20 and 40 are all significant (p < .05).

6 Related work

While most previous work on trainable NLG relies on a handcrafted ...

... 2002.

C. Thompson, M. E. Califf, and R. J. Mooney. Active learning for natural language parsing and information extraction. In Proceedings of ICML, 1999.

B. Thomson and S. Young. Bayesian update ...

... a 'gold standard' human utterance from our dataset, which they must compare with utterances generated by models trained with and without active learning on a set of 20, 40, 100, and 362 utterances...
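The contrast being evaluated above, active learning versus random sampling of training examples, can be sketched as a pool-based selection step: at each round, random sampling draws examples blindly, while active learning asks for labels on the examples the current model is least certain about. The sketch below is purely illustrative; the `uncertainty` measure and the `select_batch` helper are assumptions for exposition, not the paper's actual selection criterion.

```python
import random

def uncertainty(model, example):
    """Toy uncertainty: 1.0 when the model's score is 0.5, 0.0 at 0 or 1."""
    p = model(example)
    return 1.0 - abs(p - 0.5) * 2.0

def select_batch(pool, model, k, strategy="active"):
    """Pick k examples from the unlabelled pool for annotation."""
    if strategy == "random":
        # Baseline: sample the next training examples uniformly at random.
        return random.sample(pool, k)
    # Active learning: label the examples the current model is least sure of.
    return sorted(pool, key=lambda x: uncertainty(model, x), reverse=True)[:k]

# Toy demo with a "model" that simply scores an example by its value in [0, 1];
# the scores nearest 0.5 are the most uncertain and get selected first.
pool = [0.05, 0.45, 0.5, 0.9, 0.6, 0.1]
batch = select_batch(pool, model=lambda x: x, k=2)
```

Growing the training set to matched sizes (e.g. 20, then 40 examples) with each strategy, and retraining at each size, yields the paired learning curves that the significance comparison above refers to.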