Scientific report: The carbohydrate-binding module family 20 – diversity, structure, and function

REVIEW ARTICLE

The carbohydrate-binding module family 20 – diversity, structure, and function

Camilla Christiansen 1,2,3, Maher Abou Hachem 2, Štefan Janeček 4,5, Anders Viksø-Nielsen 3, Andreas Blennow 1 and Birte Svensson 2

1 VKR Research Centre Pro-Active Plants, Department of Plant Biology and Biotechnology, Faculty of Life Sciences, University of Copenhagen, Frederiksberg, Denmark
2 Enzyme and Protein Chemistry, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
3 Novozymes A/S, Bagsvaerd, Denmark
4 Institute of Molecular Biology, Slovak Academy of Sciences, Bratislava, Slovakia
5 Department of Biotechnology, Faculty of Natural Sciences, University of SS Cyril and Methodius, Trnava, Slovakia

Keywords
α-glucan; amylolytic enzymes; glucan; molecular recognition; starch-binding domain; starch metabolism; water dikinase

Correspondence
M. Abou Hachem, Enzyme and Protein Chemistry, Department of Systems Biology, Technical University of Denmark, Søltofts Plads, Building 224, DK-2800 Kgs. Lyngby, Denmark
Fax: +45 4588 6307
Tel: +45 4525 2732
E-mail: maha@bio.dtu.dk

(Received 22 May 2009, accepted 17 July 2009)

doi:10.1111/j.1742-4658.2009.07221.x

Starch-active enzymes often possess starch-binding domains (SBDs) mediating attachment to starch granules and other high molecular weight substrates. SBDs are divided into nine carbohydrate-binding module (CBM) families, and CBM20 is the earliest-assigned and best characterized family. High diversity characterizes CBM20s, which occur in starch-active glycoside hydrolase families 13, 14, 15, and 77, and enzymes involved in starch or glycogen metabolism, exemplified by the starch-phosphorylating enzyme glucan, water dikinase 3 from Arabidopsis thaliana and the mammalian glycogen phosphatases, laforins. The clear evolutionary relatedness of CBM20s to CBM21s, CBM48s and CBM53s suggests a common clan hosting most of the known SBDs.
This review surveys the diversity within the CBM20 family, and makes an evolutionary comparison with CBM21s, CBM48s and CBM53s, discussing intrafamily and interfamily relationships. Data on binding to and enzymatic activity towards soluble ligands and starch granules are summarized for wild-type, mutant and chimeric fusion proteins involving CBM20s. Noticeably, whereas CBM20s in amylolytic enzymes confer moderate binding affinities, with dissociation constants in the low micromolar range for the starch mimic β-cyclodextrin, recent findings indicate that CBM20s in regulatory enzymes have weaker, low millimolar affinities, presumably facilitating dynamic regulation. Structures of CBM20s, including the first example of a full-length glucoamylase featuring both the catalytic domain and the SBD, are summarized, and distinct architectural and functional features of the two SBDs and roles of pivotal amino acids in binding are described. Finally, some applications of SBDs as affinity or immobilization tags and, recently, in biofuel and in planta bioengineering are presented.

Abbreviations
AMPK, AMP-activated protein kinase; CBM, carbohydrate-binding module; CGTase, cyclodextrin glucanotransferase; DP, degree of polymerization; GA, glucoamylase; GH, glycoside hydrolase; GWD3, glucan, water dikinase 3; ITC, isothermal titration calorimetry; SBD, starch-binding domain; SEX4, starch excess 4 protein.

5006 FEBS Journal 276 (2009) 5006–5029 © 2009 The Authors Journal compilation © 2009 FEBS

Introduction

Starch and glycogen – structure and enzymatic degradability

Starch is the major energy reserve in plants and the most important energy source in the human diet.
The starch granule is a complex structure composed of two distinct glucose polymers: amylose, comprising essentially unbranched α-(1→4)-linked glucose residues, and the larger and branched amylopectin, produced by the formation of α-(1→6) linkages between adjoining straight glucan chains on an α-(1→4) backbone. In starch granules, amylopectin and amylose molecules are organized as alternating semicrystalline and amorphous layers forming radial growth rings [1,2]. Whereas little is known about the structure of the amorphous layers, the semicrystalline layers are made of short linear amylopectin chains packed together as parallel left-handed double helices. These helical segments extend from glucosidic branch points, and are further packed into concentric arrays known as crystalline lamellae [3]. Amorphous regions of the semicrystalline layers and the amorphous layers are composed of amylose and nonordered amylopectin branch chains.

The enzymatic degradation of starch and other insoluble polysaccharides poses a considerable challenge to the attacking enzymes, as the polysaccharide chains are often poorly accessible to the active sites. The degradation, moreover, involves mass transfer in a two-phase system comprising the bulk of the medium and the surface of starch granules. Finally, despite the common structural features, starch granules show remarkable morphological variation, depending on botanical origin and tissue or compartmentalization [4], and such differences are also important for the degradability of the starch granule [3].

From a chemical viewpoint, starch and glycogen are very similar, but they differ considerably in their molecular fine structure, physical properties, and susceptibility to enzymatic degradation. Glycogen does not crystallize in water and has a higher degree of branching than starch, making fast enzymatic degradation possible and providing short-term energy for rapid metabolic needs [5].
However, the chemical similarity between starch and glycogen results in overlapping molecular recognition by enzymes or binding proteins targeting the two polymers.

Starch recognition and starch-binding domains (SBDs)

Microbial extracellular hydrolytic enzymes that catalyse the degradation of starch granules or plant cell walls typically possess a modular architecture, and very often contain noncatalytic ancillary domains referred to as carbohydrate-binding modules (CBMs), which target cognate catalytic modules to specific polysaccharide structures. Binding of enzymes to insoluble polysaccharide surfaces such as starch granules or crystalline cellulose is enhanced by CBMs, which have also been suggested to distort the conformation and packing of the polymers, thereby facilitating their degradation.

CBMs with affinity for starch are commonly referred to as SBDs. The first discovered SBDs were placed in the CBM20 family, which remains the best characterized SBD family to date. Thus, starch binding was described for CBM20s of, for example, glucoamylase (GA, EC 3.2.1.3) from Aspergillus niger [6–8], cyclodextrin glucanotransferase (CGTase, EC 2.4.1.19) from Bacillus circulans strain 251 [9,10], maltogenic α-amylase (EC 3.2.1.133) from Geobacillus stearothermophilus [11], and β-amylase (EC 3.2.1.2) from Bacillus cereus var. mycoides [12,13].

Whereas the early CBM20s were found in extracellular starch hydrolases secreted by fungi or bacteria [14–16], an increasing number of newly reported CBM20s show diversity with respect to phylogenetic origin and function of the appended catalytic modules. CBM20s thus occur in a wide spectrum of secreted or intracellular, amylolytic and nonamylolytic enzymes from plants, mammals, archaeans, bacteria, and fungi. A common feature of CBM20s is that they are joined to catalytic modules associated with starch or glycogen metabolism.
Recent evidence indicates that a few of the enzymes possessing CBM20s have regulatory functions [17,18]. Remarkably, the affinity of these regulatory enzymes for starch seems to be relatively low, suggesting a much more dynamic role of their SBDs than for those found in extracellular starch-degrading enzymes [19]. This review describes the current knowledge on SBDs, exemplified by well-characterized CBM20s, and follows up on recent revelations regarding SBDs with plausibly novel physiological functions that are different from those reported earlier for CBM20s and the related CBM21s, CBM48s, and CBM53s.

Classification

By analogy with glycoside hydrolases (GHs), CBMs are categorized into families based on amino acid sequence similarities, and members of each family share a common structural fold [20], but not necessarily specificity. The generic term CBM refers to noncatalytic carbohydrate-binding domains, which are currently grouped in 54 different families (see http://www.cazy.org/fam/acc_CBM.html). The first classified noncatalytic binding domains were originally defined as cellulose-binding domains, because their specificity was towards crystalline cellulose [21,22]. Similar observations were made for other polysaccharide-degrading enzymes, such as plant chitinases (EC 3.2.1.14) [23,24], and a new term, carbohydrate-binding module, was introduced to reflect the diverse polysaccharide specificity [20,25,26]. A CBM is defined as a contiguous amino acid sequence from a carbohydrate-active enzyme that folds as a separate domain and shows carbohydrate-binding ability.

Currently, nine CBM families, 20, 21, 25, 26, 34, 41, 45, 48, and 53, have been reported to contain SBDs. Their three-dimensional structures are available, except for CBM45 and CBM53.
Despite low sequence similarity, the remaining seven families share a very similar fold. The structural features of CBM20s will be discussed in further detail below.

The CBM20 family currently has about 300 entries in the CAZy database, representing high phylogenetic diversity, as CBM20s are encountered in archaeans (5), bacteria (152), and eukaryotes (123). Twenty unclassified entries, all descending from industrial patents, are very likely of bacterial origin. Bacterial and fungal CBM20s are most frequently connected with catalytic domains in GH13s, or the α-amylase family; GH14s and GH15s contain β-amylases and GAs, respectively [27,28]. Several specificities are found among the GH13s [29]: α-amylase (EC 3.2.1.1) [30,31]; CGTase (EC 2.4.1.19) [10]; maltotetraose-forming exo-amylase (EC 3.2.1.60) [32]; and maltogenic α-amylase (EC 3.2.1.133) [11]. A few CBM20s are from plant enzymes, including α-glucan, water dikinase 3 (GWD3) (EC 2.7.9.5) [18] and the GH77 4-α-glucanotransferase (EC 2.4.1.25) [33].

Predicted glucan-binding residues

CBM20s are 90–130 amino acids long, and detailed sequence analysis has revealed that there are no invariant residues in the family [34–36]. Nevertheless, the ability of CBM20s to bind to starch seems to be associated with certain consensus residues. Originally, 11 conserved positions were indicated, on the basis of eight aligned sequences from fungi and bacteria [37]. Analysis of a much larger set of CBM20s, however, suggested that some of these residues are more important than others, owing to their higher degree of conservation across different taxa and functionalities [28,34]. Four of the highly conserved residues are aromatic amino acids, which are typically directly involved in glucan interactions. Three-dimensional structures provide support for two separate glucan-binding sites in CBM20s. Binding site 1 contains two conserved tryptophans, Trp543 and Trp590, in the canonical A. niger GA SBD [7].
Another tryptophan, conserved in 96% of the 103 analysed sequences, corresponds to Trp563 in the GA SBD [34] and is assigned to binding site 2. Tyrosines, assigned to binding site 2, aligning with Tyr527 and Tyr556 in GA SBD, are conserved in 24 and 45 of the 103 analysed sequences, respectively. In addition to the original 11 conserved residues, Phe519 in A. niger GA SBD is found in 87% of the sequences and is replaced by isoleucine, leucine or valine in several bacterial β-amylases. Notably, Arabidopsis thaliana GWD3 has an arginine at this position (Fig. 1) [34]. Finally, the W615K mutant of the highly conserved Trp615 in GA SBD was difficult to produce, suggesting that this residue plays a structural role [7].

CBM20s are evolutionarily related to CBM21s, CBM48s, and CBM53s

CBM20s were originally found at the C-termini of various starch hydrolases and CGTases, whereas CBM21s were positioned N-terminally, as in GA from Rhizopus oryzae [28,37]. Bioinformatics analysis suggested that CBM20s and CBM21s constitute a CBM clan, despite their low sequence identity [34]. Thus, CBM20s and CBM21s were predicted to have similar secondary and tertiary structures, and this was confirmed by the solved structure of R. oryzae GA CBM21 [38], which has a conventional β-sandwich fold and immunoglobulin-like architecture. Several plant 4-α-glucanotransferases and GWD3, as well as the majority of CBM20-containing unknown eukaryotic proteins [34], possess N-terminal CBM20s. CBM20s are thus predominantly, but not exclusively, C-terminally positioned. In addition to CBM21s, CBM48s and CBM53s are related to CBM20s [35,36], and will be included in the evolutionary analysis.

SBDs occur with a variety of enzymatic activities

Alignment of 60 selected SBD sequences (Fig. 1) illustrates the close evolutionary relationship of the four CBM families 20, 21, 48, and 53, although occasional subtle structural differences make unambiguous classification challenging.
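
The conservation figures quoted above for individual residues (for example, a tryptophan present in 96% of 103 analysed sequences) are per-column statistics over a multiple sequence alignment such as the one in Fig. 1. The Python sketch below illustrates how such a figure can be obtained. The function name and the toy alignment are hypothetical and chosen for illustration only; they are not taken from Fig. 1, and the treatment of gaps (counted as sequences lacking the residue) is an assumption rather than the procedure used in the cited analyses.

```python
from collections import Counter


def column_conservation(alignment, column):
    """Return (most common residue, fraction of sequences carrying it).

    Gap characters ('-') are not counted as residues, but the sequences
    that carry them still count in the denominator.
    """
    residues = [seq[column] for seq in alignment]
    counts = Counter(r for r in residues if r != "-")
    if not counts:
        return None, 0.0
    residue, n = counts.most_common(1)[0]
    return residue, n / len(residues)


# Hypothetical toy alignment: four sequences, six columns.
toy_alignment = [
    "GWNT-Y",
    "GWDTAY",
    "AWNS-F",
    "GFNTAY",
]

for col in range(len(toy_alignment[0])):
    residue, fraction = column_conservation(toy_alignment, col)
    print(col, residue, f"{fraction:.0%}")
```

In this toy case the second column carries tryptophan in three of four sequences, with phenylalanine as the substitution, which mirrors the kind of aromatic replacement discussed for the binding sites.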
The corresponding evolutionary tree (Fig. 2), containing 33 sequences from the largest family, CBM20, eight from CBM21, 16 from CBM48, and three from the new family, CBM53, highlights the relationship. This set of sequences was created in an effort to cover a wide spectrum of known CBM20s, i.e. from microbial amylolytic enzymes of GH13, GH14, and GH15, the plant GWD3 [18], and the mammalian proteins laforin (EC 3.1.3.16/48) [39] and genethonin-1 [40].

Fig. 1. Amino acid sequence alignment of CBM families 20 (green), 21 (red), 48 (blue), and 53 (magenta). The abbreviations of the source proteins are given in Table 1. Eleven CBM20 conserved residues [37] are highlighted as follows: two tryptophans of binding site 1 in blue and substitutions by phenylalanine or tyrosine in turquoise; one tryptophan of binding site 2 in red, substitutions by phenylalanine or tyrosine in magenta; the remaining eight residues are in yellow. The tyrosines or phenylalanines at CBM21 binding site 2 are in green. The additional well-conserved phenylalanine in CBM20s and CBM48s is in black, and the invariant CBM21 lysine is in grey. The alignment was performed by CLUSTALW at the European Bioinformatics Institute's server (http://www.ebi.ac.uk/) and then manually adjusted.

CBM21s from GH13 and GH15 and regulatory subunits of protein phosphatases [41] proposed to form the common CBM20–CBM21 clan [34] were later extended to include CBM48s from the GH13 pullulanase (EC 3.2.1.41) subfamily [42], regulatory domains of mammalian AMP-activated protein kinase (AMPK) [43] and the plant starch excess 4 (SEX4, EC 3.1.3.48) protein [36,44]. Moreover, three tandem SBD repeats from Ar.
thaliana starch synthase III (EC 2.4.1.21) [45,46], recently placed into the new CBM53 family, were included to broaden the evolutionary comparison of SBDs belonging to these four CBM families (Table 1).

Individual SBD families display subtle binding-site differences

Alignment of amino acid sequences of SBDs and SBD-like motifs from CBM20s, CBM21s, CBM48s and CBM53s (Fig. 1) reveals evolutionary relationships, especially concerning positions occupied by aromatic residues identified in binding sites 1 and 2 in CBM20s [8]. It is evident that the 11 conserved residues in CBM20s [37] are not strongly conserved in CBM21s or CBM48s, and even some CBM20s lack certain consensus residues (Fig. 1). One prominent example is CBM20 of acarviose transferase from Actinoplanes sp., which has been shown to bind to a starch resin [47], and has glycine replacing the conserved tryptophan in binding site 2 (Fig. 1). This tryptophan, however, is found in CBM21s from GH13 and GH15 amylases, whereas it may be replaced by tyrosine or phenylalanine in CBM20s and CBM21s of nonamylolytic enzymes (Fig. 1). Most CBM21s contain the two tryptophans at CBM20 binding site 1, even though this site in CBM21 from R. oryzae GH15 GA is located at a different position in the sequence, owing to a slightly different topology of this CBM family, where a strand must be shifted for overlap with CBM20 topology [38]. With respect to CBM48s, one of the two tryptophans in CBM20 binding site 1 (W543 in A. niger GA SBD) is lacking in malto-oligosyltrehalose hydrolases (EC 3.2.1.141), pullulanases and isoamylases of GH13. By contrast, CBM48s in GH13 branching enzymes and in the regulatory proteins AMPK [43], AKIN [48], and SEX4 [44], as well as CBM53 repeats from starch synthase III, are likely to have a functional binding site 1 in which a tyrosine corresponds to Trp590 in A. niger GA SBD (Fig. 1).

Fig. 2. Evolutionary tree of CBM families 20 (green), 21 (red), 48 (blue), and 53 (magenta). The abbreviations of the source proteins are given in Table 1. The tree is based on the alignment of complete CBM sequences (shown in Fig. 1), including the gaps. It was calculated as a PHYLIP tree type using the neighbour-joining method (http://www.ebi.ac.uk/) and displayed by the program TREEVIEW [146].

Table 1. SBD and/or SBD-like sequences from CBM families 20, 21, 48, and 53.

Family | Abbreviation | Specificity | EC | Source | GenPept | Length
20 | AMY_Aspka | α-Amylase | 3.2.1.1 | Aspergillus kawachii | BAA22993 | 640
20 | AMY_Bacsp | α-Amylase | 3.2.1.1 | Bacillus sp. TS-23 | AAA63900 | 613
20 | AMY_Crcsp | α-Amylase | 3.2.1.1 | Cryptococcus sp. S-2 | BAA12010 | 631
20 | AMY_Strgr | α-Amylase | 3.2.1.1 | Streptomyces griseus | CAA40798 | 566
20 | MGA_Bacst | Maltogenic α-amylase | 3.2.1.133 | Bacillus stearothermophilus | AAA22233 | 719
20 | M3H_Brasp | Maltotriohydrolase | 3.2.1.116 | Brachybacterium sp. LB25 | BAE94180 | 615
20 | M4H_Psest | Maltotetraohydrolase | 3.2.1.60 | Pseudomonas stutzeri | AAA25707 | 548
20 | M5H_Psesp | Maltopentaohydrolase | 3.2.1.– | Pseudomonas sp. KO-8940 | BAA01600 | 614
20 | CGT_Bacci | Cyclodextrin glucanotransferase | 2.4.1.19 | Bacillus circulans strain 251 | CAA55023 | 713
20 | CGT_Klepn | Cyclodextrin glucanotransferase | 2.4.1.19 | Klebsiella pneumoniae | AAA25059 | 655
20 | CGT_Thbth | Cyclodextrin glucanotransferase | 2.4.1.19 | Thermoanaerobacterium thermosulfurigenes | AAB00845 | 710
20 | CGT_Thcsp | Cyclodextrin glucanotransferase | 2.4.1.19 | Thermococcus sp. B1001 | BAA88217 | 739
20 | CGT_Hafme | Cyclodextrin glucanotransferase | 2.4.1.19 | Haloferax mediterranei | CAI46245 | 713
20 | ACT_Actsp | Acarviose transferase | 2.4.1.19 | Actinoplanes sp. 50/110 | AAE37556 | 724
20 | BMY_Bacce | β-Amylase | 3.2.1.2 | Bacillus cereus | BAA34650 | 546
20 | BMY_Bacme | β-Amylase | 3.2.1.2 | Bacillus megaterium | CAB61483 | 545
20 | BMY_Thbth | β-Amylase | 3.2.1.2 | Thermoanaerobacterium thermosulfurigenes | AAA23204 | 515
20 | GMY_Aspka | Glucoamylase | 3.2.1.3 | Aspergillus kawachii | BAA00331 | 639
20 | GMY_Aspni | Glucoamylase | 3.2.1.3 | Aspergillus niger | AAB59296 | 640
20 | GMY_Hypje | Glucoamylase | 3.2.1.3 | Hypocrea jecorina | 2VN7_A | 599
20 | GMY_Lened | Glucoamylase | 3.2.1.3 | Lentinula edodes | AAF75523 | 571
20 | GMY_Neucr | Glucoamylase | 3.2.1.3 | Neurospora crassa | AAE15056 | 626
20 | 6AGT_Artgl | 6-α-Glucosyltransferase | 2.4.1.– | Arthrobacter globiformis | BAD34980 | 965
20 | 4AGT_Soltu | 4-α-Glucanotransferase | 2.4.1.25 | Solanum tuberosum | AAR99599 | 948
20 | GWD3_Arath | α-Glucan, water dikinase | 2.7.9.45 | Arabidopsis thaliana | AY747068 | 1196
20 | GEN_Homsa | Genethonin-1 | – | Homo sapiens | AAC78827 | 358
20 | LAF_Homsa | Laforin | 3.1.3.48/16 | Homo sapiens | AAG18377 | 331
20 | GPD_Homsa | Glycerophosphodiester phosphodiesterase | 3.1.–.– | Homo sapiens | AAH27588 | 672
20 | APU_Bacst | Amylopullulanase | 3.2.1.1/41 | Bacillus stearothermophilus | AAG44799 | 2018
20 | APU_Thbth | Amylopullulanase | 3.2.1.1/41 | Thermoanaerobacterium thermosulfurigenes | AAB00841 | 1861
20 | IGT_Bacci | Isocyclomaltooligosaccharide glucanotransferase | 2.4.1.– | Bacillus circulans | BAF37283 | 995
20 | CE1_Pyrfu | Carbohydrate esterase 1 | – | Pyrococcus furiosus | AAL81232 | 404
20 | CE1_Thcko | Carbohydrate esterase 1 | – | Thermococcus kodakarensis | BAD84711 | 449
21 | AMY_Lipko | α-Amylase | 3.2.1.1 | Lipomyces kononenkoae | AAC49622 | 624
21 | AMY_Lipst | α-Amylase | 3.2.1.1 | Lipomyces starkeyi | AAN75021 | 647
21 | GMY_Arxad | Glucoamylase | 3.2.1.3 | Arxula adeninivorans | CAA86997 | 624
21 | GMY_Mucci | Glucoamylase | 3.2.1.3 | Mucor circinelloides | AAN85206 | 609
21 | GMY_Rhior | Glucoamylase | 3.2.1.3 | Rhizopus oryzae | AAQ18643 | 604
21 | PPRS_Cloac | Protein phosphatase 1 regulatory subunit | – | Clostridium acetobutylicum | AAK76874 | 247
21 | PPRS_Homsa | Protein phosphatase 1 regulatory subunit | – | Homo sapiens (human brain) | AAH47502 | 299
21 | PPRS_Sacce | Protein phosphatase 1 regulatory subunit | – | Saccharomyces cerevisiae | CAA86906 | 538
48 | AMPK1_Ratno | AMPK β1 subunit | – | Rattus norvegicus | AAH62008 | 270
48 | AKIN1_Zeama | AKIN-β-γ-1 protein | – | Zea mays | AF276085 | 497
48 | SEX4_Arath | Starch excess 4 protein | 3.1.3.48 | Arabidopsis thaliana | AAN28817 | 379
48 | SNF1_Orysa | SNF1-related regulatory subunit β1 | – | Oryza sativa | ABF95644 | 295
48 | GSs_Grija | Glycogen synthase subunit | 2.4.1.21 | Griffithsia japonica | AAM93999 | 201
48 | GBE_Escco | Glycogen branching enzyme | 2.4.1.18 | Escherichia coli | AAA23872 | 728
48 | SBE_Horvu | Starch branching enzyme | 2.4.1.18 | Hordeum vulgare | AAP72268 | 775
48 | GBE_Sacce | Glycogen branching enzyme | 2.4.1.18 | Saccharomyces cerevisiae | AAA34632 | 704
48 | GBE_Homsa | Glycogen branching enzyme | 2.4.1.18 | Homo sapiens | AAA58642 | 702
48 | MOTH_Brehe | Malto-oligosyl trehalohydrolase | 3.2.1.141 | Brevibacterium helvolum | AAB95369 | 589

CBMs only partly cluster according to the appended catalytic domains

Sequence features apparent in the CBM alignment are reflected in the corresponding evolutionary tree (Fig. 2). Remarkably, CBM20s from amylases cluster with the starch-binding/glycogen-binding CBM20 from laforin. In the tree, the CBM20s described as possible 'intermediates' [34], i.e., from GH13 amylopullulanases (EC 3.2.1.1/41) and carbohydrate esterases (EC 3.1.1.–), are between those CBM48s that are most intimately related to the CBM20s and CBM21s. The starch synthase III CBM53 repeats are most closely related to CBM21s and are positioned on its border with the other, CBM48, group, which is more distant from CBM20s [36]. Thus sequences classified in CBM48 do not appear in a common cluster (Fig.
2), the CBM48s from the GH13 pullulanase subfamily [42] clustering together on a branch adjacent to CBM21s, which are mostly encountered in GAs, whereas the remaining CBM48s from AMPK, AKIN1, SEX4 and related regulatory proteins group next to CBM20s from genethonin-1, GWD3, 4-α-glucanotransferase, and phosphodiesterase 5 (Fig. 2). This relationship, first shown before family CBM48 was defined [36], justifies CBM48s being placed in a clan with CBM20s and CBM21s.

CBM20 molecular structure

Conserved structural features of CBM20s

Three-dimensional structures have been reported for seven of the nine SBD families defined so far: CBM20 [6,10]; CBM21 [38]; CBM25 and CBM26 [49]; CBM34 [50]; CBM41 [51]; and CBM48 [43]. No structures are available for CBM45 and CBM53. All solved SBDs show a β-sandwich fold with an immunoglobulin-like topology. Ten CBM20 structures, including those of both isolated CBMs and intact proteins possessing CBM20s, have been determined using NMR or X-ray crystallography (Table 2). The best characterized is CBM20 from A. niger GA (GA SBD), which is used here as the main representative of CBM20. Its structure was determined by NMR in both a free and a β-cyclodextrin-complexed state, and shows a well-defined β-sandwich fold with eight β-strands distributed in two β-sheets (Fig. 3) [6,8]. One has five antiparallel β-strands, whereas the other is made from one parallel β-strand and an antiparallel β-strand pair [52]. This fold makes an open-sided distorted β-barrel with six loops of significant length, four of which are well defined. β-Strand 3 is absent in CGTases [53–55] and in maltogenic α-amylase [11]. The approximate dimensions of GA SBD are 42 × 38 × 31 Å. One of the GA SBD structures has β-cyclodextrin bound as a starch mimic at both binding site 1 and binding site 2 [6], demonstrating the bivalent nature of this CBM20. The N-terminus and C-terminus are located at opposite ends of the longest axis of GA SBD (Fig. 3).
A disulfide bond (Cys509–Cys604) between the N-terminal cysteine and the loop connecting β-strands 7 and 8 contributes to structural stability, and mutations of both cysteines to glycine or serine resulted in destabilization, measured as a 10 °C reduction in unfolding temperature (Tm) and loss of 10 kJ·mol⁻¹ of free energy, largely owing to an unfavourable change in entropy [56].

Table 1. Continued.

Family | Abbreviation | Specificity | EC | Source | GenPept | Length
48 | MOTH_Sulso | Malto-oligosyl trehalohydrolase | 3.2.1.141 | Sulfolobus solfataricus | BAA11010 | 558
48 | PUL_Klepn | Pullulanase | 3.2.1.41 | Klebsiella pneumoniae | AAA25124 | 1102
48 | PUL_Horvu | Pullulanase (limit dextrinase) | 3.2.1.41 | Hordeum vulgare | AAD04189 | 904
48 | ISO_Pseam | Isoamylase | 3.2.1.68 | Pseudomonas amyloderamosa | AAA25854 | 771
48 | ISO_Sulso | Isoamylase | 3.2.1.68 | Sulfolobus solfataricus | AAK42273 | 718
48 | ISO_Orysa | Isoamylase | 3.2.1.68 | Oryza sativa | BAA29041 | 733
53 | SS3a_Arath | Starch-synthase III – copy 1 | 2.4.1.21 | Arabidopsis thaliana | NP_172637 | 1025
53 | SS3b_Arath | Starch-synthase III – copy 2 | 2.4.1.21 | Arabidopsis thaliana | NP_172637 | 1025
53 | SS3c_Arath | Starch-synthase III – copy 3 | 2.4.1.21 | Arabidopsis thaliana | NP_172637 | 1025

The architecture and dynamics of the binding sites

Binding site 1 consists of Trp543, Lys578, Trp590, Glu591, and Asn595, and the indole rings of Trp543 and Trp590 form the central part of a carbohydrate-binding platform. This shallow and solvent-exposed binding site is characterized by a small ligand contact area, and undergoes very little structural change upon binding [8]. By contrast, binding site 2, defined by Thr526, Tyr527, Gly528, Glu529, Asn530, Asp554, Tyr556, and Trp563, is more extended and undergoes a large conformational rearrangement upon binding of β-cyclodextrin, indicating that it has higher structural plasticity than binding site 1 [8].
The main change is observed in loop regions close to the C-terminal end (Fig. 3). This part of binding site 2 is composed of a flexible loop (Fig. 3), and in the GA SBD β-cyclodextrin complex, Tyr556 approaches Asp554 and Lys555, inducing a substantial change in the position of Asp560 and resulting in a more than 13 Å movement of the Cα atom of this residue [8]. At binding site 2, carbohydrate–protein contacts are dominated by van der Waals stacking interactions, primarily provided by Tyr527, Tyr556 and, to a lesser extent, Trp563. The involvement of aromatic side chains in ligand binding was confirmed by site-directed mutagenesis and UV difference spectroscopy [7]. Structural plasticity at binding site 2 seems to be a general property of CBM20s, and significant conformational changes were also observed in the loop of residues 460–465 in β-amylase from B. cereus (Protein Data Bank code: 1B9Z), and the loop of residues 627–636 in CGTase from B. circulans strain 251 (Protein Data Bank code: 1CDG), when the proteins were crystallized in complex with maltose [9,12].

Two maltose molecules were bound at the surface of CBM20 of B. circulans strain 251 CGTase, and a third was bound on the catalytic domain [9]. Binding site 1 includes Trp616 and Trp662, which stack onto both glucose rings of the maltose. Direct hydrogen bonds with Lys651 and Asn667 and water-mediated hydrogen bonds with main chain carbonyl oxygens of Trp616 and Glu663 further strengthen the maltose binding. In binding site 2, Tyr633 stacks on one of the glucose rings of maltose, whereas the side chains of Thr598 and Asn627, the main chain carbonyl oxygen of Ala599 and the main chain amide nitrogen of Gly601 form direct hydrogen bonds with maltose. A water-mediated hydrogen bond is observed between maltose and the Asn603 side chain. Binding site 2 is situated at the entry of the groove leading to the active site (Fig.
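4), indicating that its function may be a combination of starch binding and sequestering single glucan chains into the active site. Indeed, the side chain of Leu600 in the B. circulans strain 251 CGTase, which is a part of binding site 2, points into the solvent and sterically confines the accessibility of this site to single carbohydrate chains (Fig. 4B). In other CBM20s, phenylalanine, tyrosine or other bulky residues at this position may serve a similar purpose.

Table 2. A summary of CBM20 three-dimensional structures. WT, wild type.

Specificity | Source | Protein Data Bank code | Ligand | Form | References
Cyclodextrin glucanotransferase GH13 | Bacillus circulans 8 (a) | 1CGT | Free | WT | [53]
Cyclodextrin glucanotransferase GH13 | Bacillus circulans 8 (a) | 1CGU | β-Cyclodextrin | WT (b) | [139]
Cyclodextrin glucanotransferase GH13 | Bacillus circulans 8 (a) | 5CGT | Maltotriose | WT (b) | [140]
Cyclodextrin glucanotransferase GH13 | Bacillus circulans strain 251 (a) | 1CXI | Free | WT | [54]
Cyclodextrin glucanotransferase GH13 | Bacillus circulans strain 251 (a) | 1CDG | Maltose | WT | [9]
Cyclodextrin glucanotransferase GH13 | Bacillus circulans strain 251 (a) | 1CXH | Maltoheptaose | WT | [54]
Cyclodextrin glucanotransferase GH13 | Bacillus circulans strain 251 (a) | 1CXE | α-Cyclodextrin | WT | [54]
Cyclodextrin glucanotransferase GH13 | Bacillus circulans strain 251 (a) | 1CXK | γ-Cyclodextrin | WT (b) | [141]
Cyclodextrin glucanotransferase GH13 | Bacillus circulans strain 251 (a) | 1TCM | Free | W616A | [10]
Cyclodextrin glucanotransferase GH13 | Bacillus sp. 1011 (b) | 1PAM | Free | WT | [55]
Cyclodextrin glucanotransferase GH13 | Bacillus stearothermophilus no. 2 | 1CYG | Free | WT | Not available
Cyclodextrin glucanotransferase GH13 | Thermoanaerobacterium thermosulfurigenes strain EM1 (a) | 1CIU | Free | WT | [142]
Cyclodextrin glucanotransferase GH13 | Thermoanaerobacterium thermosulfurigenes strain EM1 (a) | 1A47 | Maltohexaose inhibitor | WT | [143]
Maltogenic α-amylase GH13 | Geobacillus stearothermophilus strain C599 | 1QHP | Maltose | WT | [11]
β-Amylase GH14 | Bacillus cereus (a) | 1B90 | Free | WT | [12]
β-Amylase GH14 | Bacillus cereus (a) | 1B9Z | Maltose | WT | [12]
β-Amylase GH14 | Bacillus cereus (a) | 1CQY | Free | SBD (418–516) | [114]
Glucoamylase GH15 | Aspergillus niger | 1KUL | Free | SBD (509–616) | [6]
Glucoamylase GH15 | Aspergillus niger | 1AC0 | β-Cyclodextrin | SBD (509–616) | [8]
Glucoamylase GH15 | Hypocrea jecorina | 2VN4 | Free | WT | [62]
FLJ11085 (c) | Homo sapiens | 2Z0B | Free | WT | Not available

(a) For these enzymes representative structure entries are listed from the many available.
(b) Mutation was in the catalytic domain.
(c) Putative glycerolphosphodiester phosphodiesterase.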
This architecture, where an aromatic or a bulky residue defines a binding site accessible to single carbohydrate chains, is also observed in the recently determined structure of GA CBM21 from R. oryzae [57], and is reminiscent of the surface binding site observed on the C-terminal domain of barley α-amylase 1 [58,59]. Interestingly, barley α-amylase has a second surface binding site formed by two consecutive tryptophan residues (Trp278 and Trp279 in the low-pI barley α-amylase; Protein Data Bank code: 1HT6), which bears a close resemblance to binding site 1 in CBM20. It remains to be explored whether these architectural similarities confer similar functionalities in starch binding in these proteins.

A few more CBM20s form complex structures with bound ligands (Table 2): B. circulans strain 8 CGTase with β-cyclodextrin (Protein Data Bank code: 1CGU) or maltotriose (Protein Data Bank code: 5CGT) and G. stearothermophilus strain C599 maltogenic α-amylase with maltose (Protein Data Bank code: 1QHP) highlight the importance of van der Waals contacts with the indole groups of the conserved Trp543 and Trp590 in binding site 1 (A. niger GA numbering). Moreover, other polar residues, such as Asn595 and Lys598 (A. niger GA numbering) in binding site 1, are likely to form direct or solvent-mediated hydrogen bonds with larger ligands such as starch, and the conserved Lys598 packs on Trp543, thus contributing to a more rigid conformation of the aromatic platform. Binding site 2 is structurally more diverse, and shows differences in sequence as well as with respect to the residues involved in hydrogen bonding to ligands. Notably, no bound carbohydrate was observed at binding site 2 in the G. stearothermophilus strain C599 maltogenic α-amylase and in B. cereus β-amylase maltose complexes [12,60]. A new conserved carbohydrate-binding site was identified on the catalytic domain only in bacterial β-amylases [61].
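
The same binding-site residues carry different numbers in different enzymes: the binding site 1 tryptophans are Trp543 and Trp590 in A. niger GA but Trp616 and Trp662 in B. circulans strain 251 CGTase, which is why positions above are quoted in "A. niger GA numbering". Converting between numbering schemes amounts to walking along a pairwise alignment and counting non-gap positions in each sequence. The Python sketch below shows one way to do this. The function name, the toy aligned fragments and their start numbers are hypothetical and serve only to illustrate the bookkeeping; they are not the real sequences.

```python
def map_residue(aligned_a, start_a, aligned_b, start_b, residue_a):
    """Map a residue number in sequence A to its aligned partner in B.

    aligned_a / aligned_b: equal-length aligned strings with '-' for gaps.
    start_a / start_b: residue number of the first non-gap residue in each.
    Returns None if residue_a is absent from A or is aligned to a gap in B.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    num_a, num_b = start_a - 1, start_b - 1
    for char_a, char_b in zip(aligned_a, aligned_b):
        if char_a != "-":
            num_a += 1
        if char_b != "-":
            num_b += 1
        if char_a != "-" and num_a == residue_a:
            return num_b if char_b != "-" else None
    return None


# Hypothetical aligned fragments (not real GA or CGTase sequence).
frag_a = "TW-GK"   # numbered 100..103 in protein A
frag_b = "SWAG-"   # numbered 200..203 in protein B

print(map_residue(frag_a, 100, frag_b, 200, 101))  # W101 in A maps to 201 in B
print(map_residue(frag_a, 100, frag_b, 200, 103))  # K103 in A faces a gap in B
```

A residue aligned to a gap has no counterpart, which is the situation described above for CBM20s that lack a given consensus residue.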
Linker regions and interaction with the catalytic domain

SBDs are connected to their catalytic domains in different ways. Most GAs have linker regions separating the SBD from the catalytic domain, whereas, for example, CGTases lack such linkers. The A. niger GA1 form, including both the catalytic domain and the polypeptide linker-connected SBD, has not been crystallized. However, the recently solved structure of Hypocrea jecorina GA [62] provides information on the spatial arrangement of the catalytic module relative to the CBM20. The SBD of H. jecorina GA is quite similar to the A. niger GA SBD determined by NMR [6], and the two structures show an rmsd of 1.7 Å for 99 aligned Cα atoms. In H. jecorina GA, binding site 1 is located on the SBD on the side opposite to the variable loop region and the catalytic domain, whereas binding site 2 is near the catalytic domain. This juxtaposition of the SBD relative to the catalytic domain is similar to the architecture of CGTase from B. circulans strain 251 [9], and seems also to be valid for several full-length enzymes possessing CBM20s, e.g. maltogenic α-amylase from G. stearothermophilus strain C599, and CGTases from Bacillus sp. 1011 and Thermoanaerobacterium thermosulfurigenes strain EM1 (Table 2), suggesting a conserved architecture for a phylogenetically diverse group of enzymes.

Fig. 3. A cartoon representation of the A. niger GA CBM20 (Protein Data Bank code: 1KUL) showing binding sites 1 (A) and 2 (B). The cartoon is coated by a transparent molecular surface representation to give a topological perspective of the binding sites. The N-terminus and the C-terminus are in yellow and blue, respectively. The flexible loop showing the largest conformational change upon ligand binding is in red. Ligand-binding aromatic residues and other selected residues implicated in ligand interactions are shown as sticks. The view in (B) is rotated by about 180° around the long axis of the molecule.

The extent of interaction between the catalytic modules of GAs and the CBM20s joined to them by polypeptide linkers remains an open question. The H. jecorina GA structure reveals an intimate interaction between the SBD and the catalytic domain, and the linker region has a well-defined electron density and extends as a random coil interacting with the catalytic module through side chain contacts [62]. The rather compact conformation and the orientation of the SBD relative to the catalytic domain suggest that the SBD is important in directing the enzyme to regions where the starch granular structure is disrupted. By contrast, the low-resolution structure of the intact GA1 from A. niger in solution, recently determined with the aid of small-angle X-ray scattering, reveals an extended conformation where the highly O-glycosylated polypeptide linker separates the two domains of the enzyme [63]. Interestingly, the linker of the A. niger GA is 22 amino acids longer than that of H. jecorina. These two examples indicate that the modules of fungal GAs vary in structural organization and in flexibility of the domains relative to each other, promoting a fine-tuned mode of action towards the natural substrates.

Ligand binding and role of CBM20s in enzyme catalysis

Binding site topology classification

On the basis of the topology of ligand-binding sites, CBMs are classified into three distinct types, the A-type, B-type, and C-type. The A-type has a planar hydrophobic binding surface that recognizes highly crystalline polysaccharides such as cellulose and chitin [64]. The B-type, in contrast, has a binding cleft or a groove with at least two subsites accommodating a single polysaccharide chain [65–67]. CBM20s belong to this type, together with the majority of identified CBMs.
Typically, the binding affinity of B-type CBMs strongly depends on ligand size. Thus, the affinity of GA SBD was shown to increase with ligand length up to maltononaose, whereas its interaction with oligosaccharides of degree of polymerization (DP) < 4 was negligible [64,68]. B-type CBMs do not bind to planar surfaces such as those found in highly crystalline polysaccharides such as cellulose. Moreover, in B-type as opposed to A-type binding, direct hydrogen bonds play a key role in defining affinity and ligand specificity [67,69]. C-type CBMs recognize termini of polysaccharides in a solvent-exposed binding pocket or a blind canyon, and have a preference for small sugars, optimally binding monosaccharides, disaccharides, and trisaccharides.

The B-type binding thermodynamics in CBM20

According to the binding site topology classification of CBMs [64], ligand binding to B-type modules is accompanied by an unfavourable change of entropy, which is compensated for by a large, favourable enthalpic contribution that dominates the binding free energy change. This pattern is indeed corroborated by the energetics of β-cyclodextrin binding to A. niger GA SBD, giving ΔG and ΔH of −26.7 kJ·mol⁻¹ and −58 kJ·mol⁻¹, respectively, at pH 7.0 and 25 °C [56], in [...]

Fig. 4. (A) Surface representation of CGTase from B. circulans strain 251 in complex with maltose (Protein Data Bank code: 1CDG). This structure illustrates the close proximity of CBM20 binding site 2, represented by Tyr633 (yellow) and Leu600 (red), to the active site cleft and the catalytic nucleophile Glu257 (green). Three bound maltose molecules are shown as sticks at binding sites 1 and 2, and at a third site on the catalytic domain of CGTases (upper part of the molecule). (B) Close-up revealing architectural features of CBM20 binding site 2, with Leu600 protruding into the solvent and restricting the access to this site to single α-glucan chains.
The structure was rendered using PyMOL v0.99 software (DeLano Scientific LLC, Palo Alto, CA).

[...]

... that the SBD is an integral part of the CGTase structure, and that the intimate interactions and native spatial alignment relative to the catalytic module are crucial for the stability and catalytic performance, as opposed to GA SBD [107] and α-amylase SBD [102]. This is further corroborated by the structure of G. stearothermophilus CGTase, where SBD binding site 2 is close to the active site on the catalytic...

... rendering their function and structural stability much more dependent on the integral architecture. This is evident from the loss of activity upon the swapping of CBM20 of B. macerans CGTase with that of A. awamori GA. Another intriguing point is the role of the two binding sites present in most CBM20s. The different affinities of these binding sites and their spatial arrangement with respect to the catalytic...

... compared with the wild type. The role of the CGTase SBD was further investigated by swapping it with the homologous (60% similarity) A. awamori GA SBD [106]. Although both domains were shown to retain their binding activities as independent domains, the replacement...

... instrumental for a better understanding of their physiological roles.

CBM20s and bioengineering

CBM20-containing fusion proteins

Exploiting the CBM20 binding functionality in protein engineering

As CBM20s generally maintain their structural fold and affinity for starch, they have been explored as affinity purification tags in protein fusions. CBM20s of Bacillus macerans CGTase and A. awamori GA thus retain strong...
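As a rough check of the enthalpy-entropy compensation described above for B-type binding, the following minimal Python sketch takes the ΔG and ΔH values reported for β-cyclodextrin binding to A. niger GA SBD at 25 °C [56] and derives the entropic term and an apparent dissociation constant. It assumes the standard relations ΔG = ΔH − TΔS and ΔG = RT ln K_d with a 1 M standard state; the function and variable names are illustrative and are not taken from the review.

```python
import math

# Molar gas constant in kJ·mol^-1·K^-1 and the temperature quoted in the text.
R_KJ = 8.314462618e-3
T_KELVIN = 298.15  # 25 °C

# Values quoted in the review for β-cyclodextrin binding to A. niger GA SBD.
DELTA_G = -26.7  # kJ·mol^-1
DELTA_H = -58.0  # kJ·mol^-1


def entropic_term(delta_g, delta_h):
    """Return -TΔS (kJ·mol^-1), rearranged from ΔG = ΔH - TΔS."""
    return delta_g - delta_h


def dissociation_constant(delta_g, temperature=T_KELVIN):
    """Return K_d (M) from ΔG = RT ln K_d, assuming a 1 M standard state."""
    return math.exp(delta_g / (R_KJ * temperature))


minus_t_delta_s = entropic_term(DELTA_G, DELTA_H)
# Convert -TΔS (kJ·mol^-1) to ΔS in J·mol^-1·K^-1.
delta_s = -minus_t_delta_s / T_KELVIN * 1000.0
kd = dissociation_constant(DELTA_G)

print(f"-TΔS = {minus_t_delta_s:.1f} kJ/mol (unfavourable)")
print(f"ΔS   = {delta_s:.0f} J/(mol·K)")
print(f"K_d  = {kd * 1e6:.0f} µM (assumed 1 M standard state)")
```

Under these assumptions the entropic penalty −TΔS is about +31 kJ·mol⁻¹, roughly half the magnitude of the favourable enthalpy, and the apparent K_d is on the order of 20 µM. This is only an arithmetic illustration of the enthalpy-driven pattern and is not a value reported in the source.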
... other CBM20s discussed above. Hence, cooperativity between SBD binding sites and the catalytic domain is conceivable, and has been verified by mutational analysis of the two sites in CGTase from B. circulans strain 251, resulting in the Hill coefficient being reduced from 1.78 for the wild type to 1.3 and 1.05 for the enzyme with W616A and Y633A mutations at binding sites 1 and 2, respectively [10]. CBM20...

... affinity of binding to insoluble starch can also be described using linear adsorption isotherms.

Table 3. Binding of CBM20s to soluble oligosaccharides. Columns: CBM20 (expression host); ligand and method. GA SBD (A. niger): maltoseᵃ, maltoheptaoseᵃ, maltododecaoseᵃ...

... Lys576 and Trp588 with either leucine or isoleucine significantly reduced the raw starch-binding capacity [31]. Thus, adsorption levels for single mutants were ~60%, as compared with ~74% for the C-terminally truncated protein, which resembles the wild type in this case. Double and triple mutations resulted in modest further reductions in adsorption to ~20–50%, with the...

... in the bioengineering of starches. The design of hydrolases for noncook raw starch processes for bioethanol industries is an urgent and important task, where CBM20s and other SBD types have the potential to increase hydrolytic efficiency and rates. Finally, the continuous updating of databases with new sequences has enabled a more robust analysis of the evolutionary relationships within CBM20s and in the...
context of the related families CBM21, CBM48, and CBM53. The increase in the number of the sequences, however, both reveals the challenges of unambiguous family assignment and represents a source for future discoveries of new functionalities and applications.

Acknowledgements

This work was supported by the Danish Natural Science Research Council, the Danish Research Council for Technology and Production... Abou Hachem), the Carlsberg Foundation, and a FOBI PhD scholarship (C. Christiansen). Š. Janeček thanks the Slovak grant agency VEGA for grant No. 2/0114/08 and the Ministry of Education of the Slovak Republic for project AV-4/2023/08.

References

1 Martin C & Smith AM (1995) Starch biosynthesis. Plant Cell 7, 971–985.
2 Tester RF, Karkalas J & Qi X (2004) Starch ...

Date posted: 30/03/2014, 01:20
