... one extra copy sorted in position 1, for all the trigrams, we need two extra copies; one sorted in position 1, and another sorted in position 2, and so forth. Hence, in order to handle the contiguous ... System Demonstrations, pages 103–108, Portland, Oregon, USA, 21 June 2011. © 2011 Association for Computational Linguistics. An Efficient Indexer for Large N-Gram Corpora. Hakan Ceylan, Department of ... compressed form). The unigrams form the vocabulary of the corpus and are stored in a single file which includes around 13 million tokens and their associated counts. The remaining N-grams are stored...
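
The extra sorted copies mentioned in the first excerpt can be illustrated with a minimal sketch. It assumes that positions are counted from 0 (so the original order is the copy sorted in position 0), that the N-grams fit in memory as tuples rather than living in files, and that a copy "sorted in position k" uses the token at position k as its primary sort key. The names `build_sorted_copies` and `lookup` are hypothetical and do not come from the paper.

```python
from bisect import bisect_left, bisect_right


def build_sorted_copies(ngrams):
    """Return {position: copy of the n-grams sorted with that position first}.

    For n-grams of order n this yields n lists: the copy for position 0
    plus n - 1 extra copies (one extra for bigrams, two for trigrams).
    """
    if not ngrams:
        return {}
    n = len(ngrams[0])
    copies = {}
    for pos in range(n):
        # Rotate each tuple so the token at `pos` leads the sort key.
        copies[pos] = sorted(ngrams, key=lambda g, p=pos: g[p:] + g[:p])
    return copies


def lookup(copies, pos, token):
    """Return every n-gram whose token at `pos` equals `token`.

    Binary search works because the copy for `pos` is ordered by that token.
    The key list is rebuilt on each call only to keep the sketch short.
    """
    rows = copies[pos]
    keys = [g[pos] for g in rows]
    lo = bisect_left(keys, token)
    hi = bisect_right(keys, token)
    return rows[lo:hi]
```

With trigrams, `build_sorted_copies` returns three lists, of which two are the extra copies; `lookup(copies, 1, "b")` then finds all trigrams with `"b"` in the middle without scanning the whole list.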