... (N): $N = O_{1,1} + O_{1,2} + O_{2,1} + O_{2,2}$. Then the expected count for seeing the phrase in the document is:

$$E_{1,1} = \frac{O_{1,1} + O_{1,2}}{N} \times \frac{O_{1,1} + O_{2,1}}{N} \times N$$

To measure the phraseness of a candidate phrase we ...

...                             11,563,010   1:173.27
10% undersampling               66,349   1,148,532   1,214,881   1:17.31
Test data
All (original)                  7,764    6,157,034   6,164,798   1:793.02
Common word/comma filters       7,225    1,472,820   1,480,045   1:203.85

Table 1: ...

... permutation in the book, and then chooses the one with the highest frequency as the correct reconstruction of the entry. In this way, we identify the form of the index entries as appearing in the book,...
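
The two computations described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `expected_count` and `reconstruct_entry` are hypothetical, the index entry is assumed to be a comma-separated string whose parts are permuted, and frequency is approximated by case-insensitive substring counting.

```python
from itertools import permutations


def expected_count(o11, o12, o21, o22):
    """Expected count E_{1,1} from the four observed counts of a 2x2 table.

    Algebraically the same as ((O11 + O12) / N) * ((O11 + O21) / N) * N,
    with N = O11 + O12 + O21 + O22.
    """
    n = o11 + o12 + o21 + o22
    return (o11 + o12) * (o11 + o21) / n


def reconstruct_entry(entry, book_text):
    """Return the ordering of the entry's parts that occurs most often in the text.

    Assumption: the parts of an index entry are separated by commas, and
    occurrences are counted as plain, case-insensitive substrings.
    """
    parts = [p.strip() for p in entry.split(",") if p.strip()]
    lowered = book_text.lower()
    # One candidate string per ordering of the parts.
    candidates = [" ".join(order).lower() for order in permutations(parts)]
    # On ties, max() keeps the first candidate, i.e. the original order.
    return max(candidates, key=lowered.count)
```

For example, `reconstruct_entry("languages, programming", text)` would prefer "programming languages" whenever that ordering is the more frequent one in `text`. Substring counting ignores token boundaries, so a real system would likely match on tokenized text instead.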