... and 52 extended (68% and 17%, respectively). In total, these cover about 27 900 proteins, and surprisingly, given that the majority of the families are classical, 36% of the proteins are of the ... each of the candidate sequences was compared with all of the other candidates using fasta [23]. We tested clustering at various levels, and found that an initial clustering at the 40% level and ... to achieve the desired specificity. If two clusters were equal, one of the HMMs was selected, and if one cluster was a subset of another cluster, the HMM of the larger cluster was selected. If there...
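
The two selection rules stated above (equal clusters, and one cluster being a subset of another) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_clusters` and the representation of each cluster as a set of sequence identifiers are assumptions made for the example. Clusters that only partially overlap are left untouched here, because the rule for that case is cut off in the text.

```python
def select_clusters(clusters):
    """Apply the two stated rules to a list of clusters.

    Each cluster is assumed (hypothetically) to be an iterable of
    sequence identifiers; one HMM is assumed to exist per cluster.
    """
    # Rule 1: if two clusters are equal, keep only one of them.
    unique = []
    for members in clusters:
        cluster = frozenset(members)
        if cluster not in unique:
            unique.append(cluster)

    # Rule 2: if a cluster is a strict subset of another cluster,
    # keep only the larger one. Partially overlapping clusters are
    # not handled here and pass through unchanged.
    return [c for c in unique if not any(c < other for other in unique)]


# Hypothetical identifiers, for illustration only.
example = [{"seqA", "seqB"}, {"seqB", "seqA"}, {"seqA"}, {"seqC", "seqD"}, {"seqD", "seqE"}]
selected = select_clusters(example)
```

In this example the duplicate of `{"seqA", "seqB"}` and its subset `{"seqA"}` are dropped, while the two partially overlapping clusters are both retained.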