Scientific report: Alternative splicing: global insights


MINIREVIEW

Alternative splicing: global insights

Martina Hallegger*, Miriam Llorian* and Christopher W. J. Smith
Department of Biochemistry, University of Cambridge, UK

Keywords: alternative splicing; microarray; RNA-Seq

Correspondence: C. W. J. Smith, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK. Fax: +44 1223 766002. Tel: +44 1223 333655. E-mail: cwjs1@cam.ac.uk

*These authors contributed equally to this work.

(Received 26 August 2009, accepted 22 October 2009)

Following the original reports of pre-mRNA splicing in 1977, it was quickly realized that splicing together of different combinations of splice sites – alternative splicing – allows individual genes to generate more than one mRNA isoform. The full extent of alternative splicing only began to be revealed once large-scale genome and transcriptome sequencing projects began, rapidly revealing that alternative splicing is the rule rather than the exception. Recent technical innovations have facilitated the investigation of alternative splicing at a global scale. Splice-sensitive microarray platforms and deep sequencing allow quantitative profiling of very large numbers of alternative splicing events, whereas global analysis of the targets of RNA binding proteins reveals the regulatory networks involved in post-transcriptional gene control. Combined with sophisticated computational analysis, these new approaches are beginning to reveal the so-called ‘RNA code’ that underlies tissue and developmentally regulated alternative splicing, and that can be disrupted by disease-causing mutations.

doi:10.1111/j.1742-4658.2009.07521.x

Introduction

Alternative splicing allows individual genes to produce two or more variant mRNAs, which in many cases encode functionally distinct proteins. With the progressive generation of ever larger sequence datasets, the proportion of multi-exon human genes that are known to be alternatively spliced has expanded to 92–94%, of which 85% have a minor
isoform frequency of at least 15% [1,2]. Despite some debate about the extent to which all of this alternative splicing is functionally important [3], there is no disputing that alternative splicing is a major contributor to the diverse repertoire of transcriptomes and proteomes. Its importance is underscored by the fact that misregulated alternative splicing can lead to human disease [4,5]. As part of the overarching effort to understand how the information encrypted within genomes is used to generate fully functional organisms, it is therefore necessary to decipher the ‘RNA codes’ underlying regulated patterns of alternative splicing.

Traditionally, research on alternative splicing regulation focused on the study of minigene models in vitro or in vivo. The picture that emerged is that regulation of alternative splicing occurs via the action of numerous RNA binding proteins expressed at variable levels between tissues. These activators and repressors often mediate their effects by binding to enhancer and silencer elements within or surrounding alternatively spliced exons (reviewed in [6]). Although much progress has been made using model systems, a drawback is that even when a model alternative splicing event has been thoroughly characterized it is not immediately obvious which of its features are generally shared by coregulated alternative splicing events as part of a common regulatory programme, and which features are oddities of the particular model system.

Abbreviations: CLIP, UV cross-linking and immunoprecipitation; CELF, CUGBP and ETR3 like family (of RNA binding proteins); CUGBP, CUG binding protein; miRNA, micro-RNA; RNP, ribonucleoprotein; MBNL, muscleblind like; PTB, polypyrimidine tract binding protein; SELEX, selective evolution of ligands by exponential enrichment; SR protein, serine-arginine rich protein.

FEBS Journal 277 (2010) 856–866 © 2010 The Authors. Journal compilation © 2010 FEBS

Over recent years new high-throughput methodologies
have allowed the analysis of thousands of alternative splicing events in parallel. These tools – principally splice-sensitive microarrays, but also medium-throughput automated RT-PCR, and increasingly deep sequencing – allow large-scale quantitative profiling of splice variants. This is important in allowing the generation of large datasets of coregulated splicing events – a prerequisite for defining RNA codes. Biomedically, these approaches can facilitate the identification of splicing signatures that are associated with pathologies [7]. At the same time, improved methods for defining the full cellular complement of RNAs to which a particular protein binds – for example, CLIP (UV cross-linking and immunoprecipitation [8]) and its ‘next generation’ derivative HITS-CLIP [9] or CLIP-Seq [10] – as well as a global analysis of alternative splicing changes produced as a result of splicing factor knockdown or knockout, provide additional ‘factor-centric’ datasets that can contribute to defining the codes.

Several recent reviews have covered different aspects of these global analyses [11–15]. The aim of this minireview is to highlight some of the recently published information that contributes towards breaking the RNA code by the application of high-throughput methodology, mainly focusing upon work in mammalian systems. We start by providing a brief review of the enabling technologies, and move on to discuss the insights they have allowed and possible future developments.

Analogue and digital transcriptome profiling

Early microarrays typically contained probes consisting of full-length cDNAs or oligonucleotide probes located towards the 3′ end of transcripts, and were unable to distinguish alternatively spliced isoforms. However, a number of current array designs, in different ‘flavours’ depending on the location of the probes, can distinguish between splice variants (Fig. 1A, Table 1): (a) tiling arrays, with overlapping probes across a known genomic sequence (a chromosome or an
entire genome) [16]; (b) exon-body arrays, in which probes are located within exons (for example, the Affymetrix human ExonArray includes 1.4 million probe sets corresponding to all known human exons, ranging from the well annotated to more speculative computational predictions [17–20]); (c) splice-junction arrays, which contain probes crossing spliced junctions [21]; or (d) exon-junction arrays, which contain probes within exons as well as across exon junctions. Among the exon-junction arrays that have been used successfully are human and mouse arrays interrogating 3100 and 3700 cassette exons, respectively [22,23]. A similar design has been used to interrogate 8315 alternative splicing events in Drosophila [24–26]. Finally, a ‘whole transcript’ microarray monitoring 203 672 exons and 178 351 exon junctions has allowed the identification of more than 24 000 human alternative splicing events [27]. Such arrays have been applied successfully to study changes in alternative splicing under different conditions, including tissue-specific changes [17,27,28], cancer-associated splicing [19,29], signal-activated splicing [26,30] and developmentally regulated splicing [20,31], as well as to define functional targets by splicing factor depletion [18,25,32–34] and alternative splicing events linked to nonsense-mediated decay [35].

Although splice-sensitive microarrays have been applied with great success (see Table 1), they have some limitations, including cross-hybridization problems, limited dynamic range and a low signal-to-noise ratio due to background. In particular, many of the normal rules for optimal probe design have to be relaxed or ignored in the case of exon-junction probes. Finally, arrays are not an ideal platform for discovering new alternative splicing events, including, for example, inclusion of pseudo-exons (see accompanying review by Dhir and Buratti [36]), and they are limited to organisms with sequenced genomes.
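To make the exon-junction design concrete, the following Python sketch builds the three junction probes for a single cassette exon: two that report inclusion (C1-A and A-C2) and one that reports skipping (C1-C2). It is a minimal illustration and not any published array design; the function names, the toy exon sequences and the probe half-length are all assumptions chosen for the example.

```python
from typing import Dict


def junction_probe(upstream_exon: str, downstream_exon: str, half: int) -> str:
    """Return a probe made of the last `half` nt of the upstream exon
    joined to the first `half` nt of the downstream exon."""
    if half <= 0:
        raise ValueError("half must be positive")
    if len(upstream_exon) < half or len(downstream_exon) < half:
        raise ValueError("exon shorter than the probe half-length")
    return upstream_exon[-half:] + downstream_exon[:half]


def cassette_exon_probes(c1: str, alt: str, c2: str, half: int) -> Dict[str, str]:
    """Junction probes for a cassette exon `alt` flanked by constitutive
    exons `c1` (upstream) and `c2` (downstream)."""
    return {
        "C1-A (inclusion)": junction_probe(c1, alt, half),
        "A-C2 (inclusion)": junction_probe(alt, c2, half),
        "C1-C2 (skipping)": junction_probe(c1, c2, half),
    }


# Hypothetical toy exons; real probes would be far longer than 2 x 4 nt.
probes = cassette_exon_probes("AAAACCCC", "GGGGTTTT", "ACGTACGT", half=4)
for name, seq in probes.items():
    print(name, seq)
```

Exon-body probes would simply be windows taken from within each exon. Because the position of a junction probe is fixed by the junction itself, a sketch like this also suggests why the usual probe-design rules leave little room for manoeuvre in that case.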
Sequence-based methods, including small tags, such as expressed sequence tags, cap analysis of gene expression [37], serial analysis of gene expression [38], as well as full-length cDNAs [39,40], have been used to obtain digital counts of transcript abundance, but they have suffered from bias introduced in the sample preparation, inability to detect lowly expressed genes and low statistical power. The development of high-throughput DNA sequencing technologies [10,41,42] circumvents many of these previous barriers [1,43–48]. RNA-Seq has the capacity to generate millions of short sequence reads (25–30 or 200–400 nucleotides depending on the sequencing technology) of cDNAs derived from polyA-enriched mRNA [45]. Reads are then mapped on to unique locations on the genome and annotated transcriptome (for splice-junction reads), providing a digital count of expressed sequences (exons). Differences in read densities across genes in different conditions allow for quantification of gene expression [2,43]. Comparison with microarray or RT-PCR data shows that read counts give an accurate estimate of relative gene expression levels across a very broad dynamic range [1,2].

Fig. 1. High-throughput methods for global analyses of alternative splicing. (A) Schematic representation of different splice-sensitive microarrays (adapted from [27]). Exon arrays, typically Affymetrix Exon Arrays, contain oligonucleotide probe sets for every known and predicted exon. Junction arrays, typically used in [21], contain probes spanning exon junctions across annotated genes. Exon-junction arrays typically contain both exon-body and exon-junction probes. The coverage of these arrays varies from a few thousand cassette exons [22,23] to all annotated alternatively spliced genes in Drosophila [24–26] or every single annotated exon and exon junction in ~18 000 human genes
[27]. The bottom panel shows an example of differential exon usage for a typical cassette exon by means of the differential hybridization signals. (B) RNA-Seq. The genomic structure for a typical cassette exon is depicted in the middle of the panel, where constitutive exons are shown in purple and the alternative cassette exon in blue. Sequence reads obtained from the high-throughput method are represented in colour-coded rectangles (see inset) and are mapped within the genomic sequence. The counting of reads corresponding to inclusion (upper) and skipping (bottom) allows for the estimation of ‘inclusion ratios’ for the different alternatively spliced isoforms.

Because many sequence reads span exon–exon junctions, RNA-Seq can identify novel splicing events. The discovery of new alternative splicing events and mRNA isoforms is an area where the new sequencing technologies will have an immediate impact. However, a greater challenge is to harness RNA-Seq for digital quantitative profiling of alternative splicing (Fig. 1B). In principle, changes in alternative splicing between two conditions can be quantitated by comparing the number of reads mapping to reciprocal events (e.g. exon inclusion versus skipping) [2], or by normalizing the number of reads mapping to a particular splice junction or exon by the number of reads across the gene. In practice, large-amplitude changes in alternative splicing events within genes that are themselves highly expressed are readily detected (e.g. the ‘switch-like’ events reported in [2]). Wang et al. [2] detected reciprocal reads for only one-third of 105 000 annotated alternative splicing events; for these events, tissue-specific differential splicing could be quantified using a minimum threshold of 10% change in inclusion ratio between tissues. However, more subtle changes in alternative splicing within genes for which few reads are available will evade detection [49]. Recent estimates suggest that 200 million reads would be required to quantitate
accurately the splicing levels in 80% of genes [15]. In the future, the progressively decreasing cost and increasing read lengths and volume of high-throughput sequencing can only advance the ability of RNA-Seq to profile alternative splicing quantitatively. Methods to ‘focus’ sequence reads on to splice junctions, such as RNA-mediated annealing, selection, extension and ligation [50] or preselection by customized capture arrays [51], might enable more cost-effective quantitative profiling of a large number of alternative splicing events. In the meantime, some of the splice-sensitive microarray platforms will remain competitive.

Table 1. Summary of splice-sensitive microarray analyses. Column headings: Array design; Experiment; Species; Validation rate (events tested); Reference.

Array design: 203 672 exons / 178 351 exon junctions; 110 367 exons / 93 382 exon junctions; ~125 000 junction probes; 40 443 exon-junction probe sets; Affymetrix Exon Array, probe sets for million exons; Exon Array and array featuring exon-body and exon-junction probe sets; 3126 cassette exons; 3707 cassette exons; > 5000 cassette exons; 3055 cassette exons; 1300 exons; 8315 mRNAs / 9868 alt junction probes.

Experiment: 48 tissues and cell lines; time course of heart development; 52 tissues and cell lines; Nova-2 knockout brains; colon, bladder, prostate cancer tissues.
Species: Human; Mouse; Human; Mouse; Human.
Validation rate: 74% (23 events tested); not mentioned; 58%; 100% (49/49); 66.67% (10/15).
Reference: [27]; [31]; [21]; [72]; [29].

Experiment: 11 human tissues; colon cancer; lymphoblastoid cell lines; hnRNPLL knockdown in T cells; mid-fetal brain; hnRNPL knockdown; PTB knockdown in N2A cells; erythropoiesis.
Species: Human; Human; Human; Human; Mouse; Human; Mouse; Human.
Validation rate: 86%; ~33%; 78% (25/32); not mentioned; 95% (65/68); 22% (11/50); 27/30 events validated.
Reference: [17]; [19]; [16]; [18]; [28]; [32]; [34]; [20].

Experiment: 10 adult mouse tissues; 27 tissues and cell lines; activation of Jurkat cells; knockdown of UPF1, UPF2, UPF3 in HeLa.
Species: Mouse; Mouse; Human; Human.
Validation rate: not mentioned; 68% (17/25); 83%.
Reference: [23]; [22]; [30]; [35].

Experiment: knockdown of Sam68; knockdown of SR and hnRNP proteins in S2 cells; knockdown of hnRNP proteins in S2 cells; alternative splicing changes upon insulin or Wingless stimulation.
Species: Mouse; Drosophila; Drosophila; Drosophila.
Validation rate: 68.5% (24/35); 100% (6/6); 70%; 70% (11/15).
Reference: [33]; [24]; [25]; [26].

Surveying splicing regulator targets

Cataloguing the targets of RNA binding proteins that are known splicing regulators provides a complementary entry point for unravelling RNA codes. ‘Functional targets’ can be classified as the set of alternative splicing events that are affected by perturbing the levels of a splicing regulator, by knockdown, knockout or overexpression. These targets can be identified by global transcriptome profiling tools, such as splice-sensitive microarrays [18,25,32–34], medium-throughput RT-PCR [52], RNA-Seq or even quantitative proteomics [53]. However, apparent functional targets can include indirect secondary targets.

A complementary approach is to identify direct RNA ‘binding targets’. Selective evolution of ligands by exponential enrichment (SELEX) is an initial fully in vitro approach that defines the optimal binding site, typically short variably degenerate motifs, for an RNA binding protein by iterative selection from an initially fully degenerate sequence pool [54]. A variant approach, genomic SELEX, uses RNA transcribed from genomic DNA as the starting pool for selection [55]. SELEX is a useful, although not obligatory, precursor to methods that catalogue the actual RNA species (mRNA or pre-mRNA) bound by a splicing regulatory protein. Direct immunoprecipitation without prior cross-linking (RNP immunoprecipitation) followed by hybridization to arrays can be a useful approach [25]. However, a more powerful approach for identifying binding targets is CLIP (Fig. 2), which was originally developed to identify targets of the neuron-specific NOVA proteins [8,56]. RNA is first
cross-linked in vivo to bound protein by UV irradiation, fragmented to ~100 nucleotide tags, isolated by immunoprecipitation, reverse transcribed and then sequenced. A key feature of CLIP is that UV induces ‘zero-length’ cross-links only between RNA and directly bound proteins, thereby allowing enrichment of specifically bound sequences by immunoprecipitation under stringent conditions.

Fig. 2. HITS-CLIP. Intact tissue or tissue culture cells are UV irradiated to induce covalent cross-links between RNA and RNA binding proteins. Cells are lysed under very stringent conditions, treated with DNase and partially digested with RNases. The RNA–RNP complex is pulled down by immunoprecipitation. The RNA is radioactively 5′ labelled and ligated to a 5′ RNA linker. The sample is run on SDS/PAGE with neutral pH and blotted. Only RNA cross-linked to protein will be transferred on to the membrane. A small fragment of membrane is isolated at a position that corresponds to the protein plus RNA between 50 and 100 nucleotides. After proteinase K digestion, the RNA is recovered from the membrane and ligated on its 3′ end to an RNA adapter with complementarity to the RT primer. The following PCR step with primer complementary to ligated linkers also allows the addition of appropriate HITS-specific primer sequences (adapted from [76]).

The original CLIP procedure has now been modified, with direct high-throughput sequencing of reverse transcribed tags [9,10]. The so-called HITS-CLIP [9] or CLIP-Seq [10] protocols allow saturated coverage of binding targets, giving a truly global view of the RNP landscape of individual proteins, and suggesting possible novel functions. This ‘next generation’ CLIP approach has already been applied to the splicing regulators NOVA [57], FOX2 [58], SFRS1 (better known as SF2/ASF) [59,60], as well as the miRNA-associated
protein, argonaute [61]. The comprehensive view afforded by this approach reveals additional, nonsplicing-related, roles for these RNA binding proteins. For example, a surprising new function for NOVA2 in alternative poly(A)-site choice was discovered. Neuronal cells in general tend to process at promoter-distal poly(A)-sites and the NOVA2 targets follow this trend. Proliferating cells produce shorter 3′ UTRs and therefore reduce the potential of miRNA regulation [62]. By the same token, neuronal transcripts with long UTRs are potentially more prone to regulatory inputs from both miRNAs and 3′ UTR binding proteins.

In practice, methods to define functional and binding targets are complementary. A comprehensive global analysis of the Drosophila homologues of the mammalian hnRNPA/B proteins, hrp36, hrp38, hrp40 and hrp48, involved analysis by a splice-sensitive array of alterations in alternative splicing upon knockdown, determination of SELEX motifs in vitro and direct immunoprecipitation without prior cross-linking followed by hybridization to arrays using a whole genome tiling array [25]. This provided many insights into the functional redundancy and specialization of this family, and provided hints about their probable mechanism of action. Perhaps most surprisingly, in view of popular models about antagonism between the two families of proteins, very few alternative splicing events were found to be regulated by both hnRNP and SR proteins [24,25].

Tissue and individual variations in alternative splicing

Over the last year, several reports have focussed on the global analysis of transcript isoform differences between human tissues [1,2,16,27,28,47,63,64], mouse tissues [31,63], normal and cancer tissues [64], in response to specific signalling pathways in Drosophila [26], or developmental transitions in human brain [28], mouse heart [31] and mouse stem cells [63]. The combination of these approaches has revealed extensive transcript complexity. Sequencing approaches show
that many transcripts extend beyond the previously annotated 5′ and 3′ gene boundaries [1,2,63]. Moreover, there has been a substantial increase in the number of known alternative splicing events, with the number of newly discovered splice junctions ranging from 1400 in one study [63] to between 4294 and 11 099 in another [1]. The majority of detected alternative splicing events, including those newly discovered, show clear tissue specificity, demonstrating the importance of alternative splicing in tissue-specific programmes of gene expression. In one study alone, involving 400 million 32 base reads from 15 human tissues and cell lines, 22 000 tissue-specific alternative transcript events were identified [2]. A group of alternative splicing events that shows extreme changes between tissues – so-called ‘switch-like’ events – is associated with the regulation of highly tissue-specific functions by switching between distinct full-length isoforms [2]. Perhaps unsurprisingly, some of these switch-like alternative splicing events within highly expressed genes (e.g. TPM1) have been used for many years as model systems of regulated alternative splicing.

Interestingly, although in many cases alternative splicing regulates functionally coherent groups of genes, there is no significant overlap between those genes that are differentially transcribed and those that are differentially spliced within the same tissue or within specific cell programmes [27,30,31,42,65]. For example, upon T cell activation, genes related to the immunological response are affected at the level of transcription, whereas cell cycle genes are differentially spliced [30]. These findings build upon the original observations of Pan et al. [23] suggesting that overall programmes of tissue-specific gene expression involve independent subprogrammes operating on different subsets of genes at the levels of transcription
and splicing [66]. On the other hand, in response to certain signalling pathways in Drosophila melanogaster cells, a 40% overlap was found between genes that undergo both transcriptional and splicing changes, suggesting that transcriptional and post-transcriptional co-ordination could be important to deploy quick responses upon certain stimuli [26].

Sequencing and array studies have also provided fascinating glimpses at the degree to which alternative splicing varies between individuals. RNA-Seq of seven cerebellar cortex samples [2] and exon tiling array analysis of 57 lymphoblastoid cell lines [16] both showed a significant association between genomic variations (single nucleotide polymorphisms) and alternative splicing patterns. Happily (for those working on mechanisms of tissue-specific splicing), both studies indicated that although alternative splicing variation between individuals is common, it is secondary to tissue-specific alternative splicing.

Motifs and maps

RNA-Seq and microarray analysis on tissues have generated a genome-scale catalogue of isoform expression profiles [2,17,27,31]. These data provide a resource to identify the RNA sequences involved in the regulation of tissue-specific alternative splicing by motif enrichment analysis. In some cases, the motifs associated with tissue-specific alternative splicing hint at the involvement of ‘usual suspects’ – well-known splicing regulators with defined binding sequences. By microarray profiling 48 human tissues and systematically screening for 4-mer to 7-mer RNA ‘words’ associated with 24 426 alternatively spliced exons, Castle et al. [27] identified 143 motifs enriched near tissue-specific exons. Interestingly, the two most frequent motifs, UCUCU and UGCAUG, coincide with binding consensus sequences for PTB/nPTB and FOX splicing factors, and show a distinct pattern of genomic localization. Similar observations were made based on RNA-Seq reads from
15 human tissues and cell lines [2]. UCUCU motifs were enriched within a 200 nucleotide region upstream of cassette exons that are upregulated in brain and striated muscle. The extent to which these exons are spliced correlates inversely with PTB expression levels [2,27], consistent with PTB’s well-known role as a splicing repressor [67].

The Castle et al. [27] junction array was also used to analyse alternative splicing during development of the mouse heart, resulting in the identification of 63 developmentally regulated alternative splicing events, falling into three temporal groups. More than half of these events were regulated similarly during development of the chicken heart [31]. Enriched motifs included binding sites for the CUGBP, MBNL, FOX, STAR and PTB families of splicing factors. Forty-four of these alternative splicing events were further investigated in hearts from transgenic animals that overexpressed CUGBP1 or were depleted of MBNL1. Of the 24 exons with altered inclusion levels, 13 were regulated by CUGBP1, five by MBNL1 and six antagonistically by both [31]. The switch in relative activities of CUGBP and MBNL proteins during development appears to explain a large subset of splicing transitions detected during postnatal heart development.

Observation of enriched motifs in the cases above allowed inferences to be drawn about the probable cognate binding proteins, e.g. Fox, PTB, MBNL and CELF proteins. However, there are more than 300 RNA binding proteins encoded in mammalian genomes [68], which have the potential to act as splicing regulators, but for most little or nothing is known about their binding specificity. Traditional SELEX to determine their binding specificity would be laborious. However, a new array-based procedure may provide the capability to rapidly derive the optimal binding motifs for many of these
proteins [69], which would assist in future attempts to link factors with enriched motifs.

NOVA and FOX maps

In the case of two families of mammalian proteins, the FOX and NOVA proteins, a variety of techniques, culminating in HITS-CLIP analysis, have converged on very similar RNA maps, in which the precise location of binding sites for the cognate proteins is predictive of their action as either activators or repressors of alternatively spliced exons.

The NOVA proteins are neuron-specific RNA binding proteins that are targets of a neuronal autoimmune response associated with cancer. Analysis of these proteins in the Darnell laboratory has led the way in the global analysis of RNA binding protein function [70]. SELEX analysis indicated that the optimal binding site for NOVA consisted of clusters of three YCAY motifs [71], and importantly a cluster of such motifs matched a cis element crucial for NOVA-regulated alternative splicing of an exon in the GABAA gene. Analysis of alterations in alternative splicing in the neocortex of wild-type and Nova2−/− mice using an Affymetrix prototype junction array with ~40 000 probe sets allowed the identification of ~50 alternative splicing events that were NOVA regulated [72]. The genes affected by NOVA-dependent alternative splicing were highly enriched for proteins involved in synaptic function, emphasizing the fact that alternative splicing targets functionally coherent groups of genes. The CLIP method was originally developed to analyse in vivo NOVA binding RNAs by conventional cloning and sequencing of purified RNA tags. Of the moderate number of sequence tags identified, only ~20% contained clusters of YCAY motifs, but in these cases the tags were often associated with NOVA-regulated alternative splicing events [56]. On the basis of the accumulated group of validated NOVA targets, a bioinformatic exercise was carried out to identify clusters of YCAY motifs within 200 nucleotides of alternative exons or their flanking constitutive
exons, and moreover to predict whether these clusters would act as enhancers or silencers [73]. The resulting NOVA RNA map contained various intronic and exonic silencers, as well as intronic enhancers. NOVA clusters within the downstream intron were invariably enhancers, whereas within the exon and most positions in the upstream intron they were silencers. Most recently, the NOVA RNA map has been refined by high-throughput sequencing (using the Roche 454 platform) of NOVA2 CLIP tags from mouse neocortex, with confirmation of splicing outcomes by splice-junction array comparison of wild-type and Nova2−/− mice [57]. As expected, the comprehensive HITS-CLIP approach rediscovered many of the previously known NOVA targets, as well as many new ones. The refined NOVA map showed that NOVA binding clusters within 500 nucleotides of the alternative 5′ splice site or constitutive 3′ splice site acted as enhancers, whereas NOVA binding within 500 nucleotides of the constitutive 5′ splice site or surrounding the NOVA-regulated exon was inhibitory.

The FOX1 and -2 proteins are alternative splicing regulators that have a single RNA binding domain with an unusual degree of specificity for the cognate UGCAUG binding site [74]. In a number of recent global transcriptome profiling studies, FOX binding motifs were found to be associated with exons regulated in striated muscle and neurons [2,27,75], consistent with the expression patterns of FOX1 and -2. Analysis of breast and ovarian cancer using an RT-PCR panel of alternative splicing events indicated that one-third of cases of increased exon skipping in cancer were associated with downstream FOX sites. Moreover, FOX2 expression is lower in breast cancer and its own alternative splicing is altered in ovarian cancer [64]. Closer analysis of the various FOX datasets showed an interesting position-dependent effect, reminiscent of the NOVA map [57,73]. When located downstream of alternative splicing exons, FOX binding sites act as enhancers,
whereas on the upstream side they act as repressors (Fig. 3). The FOX ‘RNA map’ was also converged upon by two additional approaches that used FOX binding sites and mRNA targets as the starting point. The long and nondegenerate nature of the FOX binding site allowed Zhang et al. [75] to conduct a computational search for positionally conserved UGCAUG motifs within 200 nucleotides of internal exons across 28 vertebrate genomes. Comparing the bioinformatics with data collected from the Castle et al. [27] custom exon-junction array for alternative splicing in 47 different tissues and cell lines, they identified the position dependency of FOX binding sites. Finally, CLIP-Seq analysis was carried out for FOX2 binding sites in human embryonic stem cells [58]. Of 5.3 million 36 nucleotide reads, 4.4 million mapped to unique genomic locations, leading to the identification of > 3500 clusters representing genuine FOX2 binding events. Surprisingly, although the UGCAUG motif was highly enriched, an exact match was only found in 22% of clusters, and even the core GCAUG pentamer was present in only 33%, indicating that FOX2 can bind to other sites, perhaps in co-operation with other proteins.

Fig. 3. Position-dependent activity of FOX proteins. (A) Enrichment of UGCAUG motifs in the downstream intron is associated with increased exon inclusion in heart, skeletal muscle, brain and cerebellar cortex. Higher motif frequency in the upstream intron is associated with reduced inclusion in skeletal muscle. Adapted from [2]. (B) Enrichment of FOX binding sites on the upstream side of alternatively spliced exons, indicated by the blue line, is associated with FOX-dependent exon skipping, whereas enrichment on the downstream side (red line) is associated with FOX-dependent inclusion. Adapted from [2,58,64,75].

FOX2 sites were highly enriched around alternative splicing exons and a similar
position-dependent FOX2 activity map was deduced. Interestingly, it appears that FOX2 is a key player in a splicing regulatory network in human embryonic stem cells. The alternative splicing events regulated by FOX2 were highly enriched in genes encoding splicing regulatory proteins, including numerous hnRNP and SR proteins, and included an autoregulatory splicing event in the FOX2 gene itself [58]. In contrast, different sets of FOX2 targets were identified in neural progenitors, with the major functional enrichment being for cytoskeletal proteins, consistent with other reports [27,58,64,75].

Towards a predictive splicing map

Global alternative splicing profiling points towards the association of some sequence motifs and their cognate binding proteins with some tissue-specific splicing programmes, whereas the NOVA and FOX splicing maps indicate the position-dependent activity of some splicing regulators. But even the activity of FOX and NOVA when bound at particular locations is dependent upon the binding and activity of other factors. There is still some way to go before a full tissue-specific splicing code, with the ability to predict the consequences of mutations, is deciphered.

A recent study highlighted one of the important future directions. The Frey and Blencowe groups have developed a machine-learning approach in which the tissue-specific splicing profiles of 3707 mouse cassette exons, gathered using a custom junction-array platform [22], have been combined with over a thousand separate ‘RNA features’ in order to generate a ‘splicing code’ that predicts changes in exon inclusion between tissues. The features include known protein binding sequences (including FOX, NOVA and PTB/nPTB), motifs with predicted silencer or enhancer activity, secondary structures, conservation, exon and intron size, and whether exon inclusion or skipping introduces a premature termination codon. Using this approach, distinct combinations of features are found to be
predictive of five different tissue categories of alternative splicing: central nervous system, muscle, embryo, ‘digestive organs’ (including liver, kidney, gut) and tissue independent (B. Frey, personal communication). This pioneering study is based upon a moderate number of cassette exons and 27 tissue-specific datasets, but it provides a clear direction for future endeavours. Further refinement of the splicing code will be readily achieved by a combination of additional tissue datasets and analysis of transcriptomes of defined cell types (most tissues contain a variety of differentiated cell types), together with larger numbers and different categories of alternative splicing events. The ability to sequence the transcriptomes of single cells [63] will also be enormously helpful as improved methods for sequencing-based quantitative profiling of alternative splicing are developed.

Of course, defining the logic of the code will pose many questions about the underlying mechanisms. For example, why do FOX and NOVA proteins inhibit from an upstream position, but activate from downstream of an alternative exon?
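
The position-dependent rule in question can be made concrete with a small sketch. The Python below is an illustration only, not the method of any cited study: the function names, the output labels and the use of a single 200-nucleotide window on both sides of the exon are assumptions made for this example (the window size is borrowed from the motif search described for [75]). It scans the intronic flanks of one cassette exon for the FOX motif UGCAUG and reports a naive call based purely on where the motif lies. Real analyses also relied on conservation, array data or CLIP clusters, and the text notes that FOX2 often binds without an exact motif match.

```python
import re

FOX_MOTIF = "UGCAUG"  # FOX1/2 binding site named in the text
WINDOW = 200          # assumed flank size for this sketch


def find_motifs(seq, motif=FOX_MOTIF):
    """Return 0-based start positions of every (possibly overlapping) hit."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    # A lookahead pattern lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def classify_exon(upstream_intron, downstream_intron, window=WINDOW):
    """Toy call for one cassette exon from motif positions alone.

    upstream_intron ends at the exon's 3' splice site and
    downstream_intron starts at its 5' splice site. Only the `window`
    nucleotides nearest the exon on each side are scanned.
    """
    up_hits = find_motifs(upstream_intron[-window:])
    down_hits = find_motifs(downstream_intron[:window])
    if down_hits and not up_hits:
        call = "enhanced"   # downstream sites: associated with inclusion
    elif up_hits and not down_hits:
        call = "repressed"  # upstream sites: associated with skipping
    elif up_hits and down_hits:
        call = "ambiguous"
    else:
        call = "no_site"
    return {"upstream": len(up_hits), "downstream": len(down_hits), "call": call}
```

For instance, `classify_exon("A" * 50, "GGU" + "UGCAUG" + "A" * 30)` finds one downstream hit and no upstream hit, so the call is "enhanced". Such a rule describes where sites act, but says nothing about why, which is the mechanistic question raised here.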
As the details of the splicing codes are revealed, there will be scope for a great deal of further mechanistic dissection at the molecular level. However, in contrast to earlier work on alternative splicing mechanisms, experimentalists will know in advance that they are revealing the mechanisms of generally applicable programmes.

Acknowledgements

We thank Brendan Frey for comments on the manuscript and for communicating unpublished data. Work in the CWJS laboratory is funded by the Wellcome Trust (programme grant 077877) and by EC grant EURASNET-LSHG-CT-2005-518238.

References

1 Pan Q, Shai O, Lee LJ, Frey BJ & Blencowe BJ (2008) Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet 40, 1413–1415.
2 Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP & Burge CB (2008) Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476.
3 Melamud E & Moult J (2009) Stochastic noise in splicing machinery. Nucleic Acids Res 37, 4873–4886.
4 Raponi M & Baralle D (2009) Alternative splicing: good and bad effects of translationally silent substitutions. FEBS J 277, doi:10.1111/j.1742-4658.2009.07519.x.
5 Faustino NA & Cooper TA (2003) Pre-mRNA splicing and human disease. Genes Dev 17, 419–437.
6 Matlin AJ, Clark F & Smith CW (2005) Understanding alternative splicing: towards a cellular code. Nat Rev Mol Cell Biol 6, 386–398.
7 Soreq L, Gilboa-Geffen A, Berrih-Aknin S, Lacoste P, Darvasi A, Soreq E, Bergman H & Soreq H (2008) Identifying alternative hyper-splicing signatures in MG-thymoma by exon arrays. PLoS ONE 3, e2392.
8 Ule J, Jensen K, Mele A & Darnell RB (2005) CLIP: a method for identifying protein-RNA interaction sites in living cells. Methods 37, 376–386.
9 Jensen KB & Darnell RB (2008) CLIP: crosslinking and immunoprecipitation of in vivo RNA targets of RNA-binding proteins. Methods Mol Biol 488, 85–98.
10 Wang Z, Gerstein M & Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10, 57–63.
11 Ben-Dov C, Hartmann B, Lundgren J & Valcarcel J (2008) Genome-wide analysis of alternative pre-mRNA splicing. J Biol Chem 283, 1229–1233.
12 Hartmann B & Valcarcel J (2009) Decrypting the genome’s alternative messages. Curr Opin Cell Biol 21, 377–386.
13 Moore MJ & Silver PA (2008) Global analysis of mRNA splicing. RNA 14, 197–203.
14 Wang Z & Burge CB (2008) Splicing regulation: from a parts list of regulatory elements to an integrated splicing code. RNA 14, 802–813.
15 Blencowe BJ, Ahmad S & Lee LJ (2009) Current-generation high-throughput sequencing: deepening insights into mammalian transcriptomes. Genes Dev 23, 1379–1386.
16 Kwan T, Benovoy D, Dias C, Gurd S, Provencher C, Beaulieu P, Hudson TJ, Sladek R & Majewski J (2008) Genome-wide analysis of transcript isoform variation in humans. Nat Genet 40, 225–231.
17 Clark TA, Schweitzer AC, Chen TX, Staples MK, Lu G, Wang H, Williams A & Blume JE (2007) Discovery of tissue-specific exons using comprehensive human exon microarrays. Genome Biol 8, R64.
18 Oberdoerffer S, Moita LF, Neems D, Freitas RP, Hacohen N & Rao A (2008) Regulation of CD45 alternative splicing by heterogeneous ribonucleoprotein, hnRNPLL. Science 321, 686–691.
19 Gardina PJ, Clark TA, Shimada B, Staples MK, Yang Q, Veitch J, Schweitzer A, Awad T, Sugnet C, Dee S et al. (2006) Alternative splicing and differential gene expression in colon cancer detected by a whole genome exon array. BMC Genomics 7, 325.
20 Yamamoto ML, Clark TA, Gee SL, Kang JA, Schweitzer AC, Wickrema A & Conboy JG (2009) Alternative pre-mRNA splicing switches modulate gene expression in late erythropoiesis. Blood 113, 3363–3370.
21 Johnson JM, Castle J, Garrett-Engele P, Kan Z, Loerch PM, Armour CD, Santos R, Schadt EE, Stoughton R & Shoemaker DD (2003) Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science 302, 2141–2144.
22 Fagnani M, Barash Y, Ip JY, Misquitta C, Pan Q, Saltzman AL, Shai O, Lee L, Rozenhek A, Mohammad N et al. (2007) Functional coordination of alternative splicing in the mammalian central nervous system. Genome Biol 8, R108.
23 Pan Q, Shai O, Misquitta C, Zhang W, Saltzman AL, Mohammad N, Babak T, Siu H, Hughes TR, Morris QD et al. (2004) Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform. Mol Cell 16, 929–941.
24 Blanchette M, Green RE, Brenner SE & Rio DC (2005) Global analysis of positive and negative pre-mRNA splicing regulators in Drosophila. Genes Dev 19, 1306–1314.
25 Blanchette M, Green RE, MacArthur S, Brooks AN, Brenner SE, Eisen MB & Rio DC (2009) Genome-wide analysis of alternative pre-mRNA splicing and RNA-binding specificities of the Drosophila hnRNP A/B family members. Mol Cell 33, 438–449.
26 Hartmann B, Castelo R, Blanchette M, Boue S, Rio DC & Valcarcel J (2009) Global analysis of alternative splicing regulation by insulin and wingless signaling in Drosophila cells. Genome Biol 10, R11.
27 Castle JC, Zhang C, Shah JK, Kulkarni AV, Kalsotra A, Cooper TA & Johnson JM (2008) Expression of 24,426 human alternative splicing events and predicted cis regulation in 48 tissues and cell lines. Nat Genet 40, 1416–1425.
28 Johnson MB, Kawasawa YI, Mason CE, Krsnik Z, Coppola G, Bogdanovic D, Geschwind DH, Mane SM, State MW & Sestan N (2009) Functional and evolutionary insights into human brain development through global transcriptome analysis. Neuron 62, 494–509.
29 Thorsen K, Sorensen KD, Brems-Eskildsen AS, Modin C, Gaustadnes M, Hein AM, Kruhoffer M, Laurberg S, Borre M, Wang K et al. (2008) Alternative splicing in colon, bladder, and prostate cancer identified by exon array analysis. Mol Cell Proteomics 7, 1214–1224.
30 Ip JY, Tong A, Pan Q, Topp JD, Blencowe BJ & Lynch KW (2007) Global analysis of alternative splicing during T-cell activation. RNA 13, 563–572.
31 Kalsotra A, Xiao X, Ward AJ, Castle JC, Johnson JM, Burge CB & Cooper TA (2008) A postnatal switch of CELF and MBNL proteins reprograms alternative splicing in the developing heart. Proc Natl Acad Sci USA 105, 20333–20338.
32 Hung LH, Heiner M, Hui J, Schreiner S, Benes V & Bindereif A (2008) Diverse roles of hnRNP L in mammalian mRNA processing: a combined microarray and RNAi analysis. RNA 14, 284–296.
33 Chawla G, Lin CH, Han A, Shiue L, Ares M Jr & Black DL (2009) Sam68 regulates a set of alternatively spliced exons during neurogenesis. Mol Cell Biol 29, 201–213.
34 Xing Y, Stoilov P, Kapur K, Han A, Jiang H, Shen S, Black DL & Wong WH (2008) MADS: a new and improved method for analysis of differential alternative splicing by exon-tiling microarrays. RNA 14, 1470–1479.
35 Saltzman AL, Kim YK, Pan Q, Fagnani MM, Maquat LE & Blencowe BJ (2008) Regulation of multiple core spliceosomal proteins by alternative splicing-coupled nonsense-mediated mRNA decay. Mol Cell Biol 28, 4320–4330.
36 Dhir A & Buratti E (2009) Alternative splicing: role of pseudoexons in human disease and potential therapeutic strategies. FEBS J 277, doi:10.1111/j.1742-4658.2009.07520.x.
37 Shiraki T, Kondo S, Katayama S, Waki K, Kasukawa T, Kawaji H, Kodzius R, Watahiki A, Nakamura M, Arakawa T et al. (2003) Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc Natl Acad Sci USA 100, 15776–15781.
38 Velculescu VE, Zhang L, Vogelstein B & Kinzler KW (1995) Serial analysis of gene expression. Science 270, 484–487.
39 Iida K, Fukami-Kobayashi K, Toyoda A, Sakaki Y, Kobayashi M, Seki M & Shinozaki K (2009) Analysis of multiple occurrences of alternative splicing events in Arabidopsis thaliana using novel sequenced full-length cDNAs. DNA Res 15, 155–164.
40 Kim YC, Wu Q, Chen J, Xuan Z, Jung YC, Zhang MQ, Rowley JD & Wang SM (2009) The transcriptome of human CD34+ hematopoietic stem-progenitor cells. Proc Natl Acad Sci USA 106, 8278–8283.
41 Ansorge WJ (2009) Next-generation DNA sequencing techniques. N Biotechnol 25, 195–203.
42 Calarco JA, Saltzman AL, Ip JY & Blencowe BJ (2007) Technologies for the global discovery and analysis of alternative splicing. Adv Exp Med Biol 623, 64–84.
43 Mortazavi A, Williams BA, McCue K, Schaeffer L & Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Meth 5, 621–628.
44 Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M & Snyder M (2008) The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344–1349.
45 Wilhelm BT, Marguerat S, Watt S, Schubert F, Wood V, Goodhead I, Penkett CJ, Rogers J & Bahler J (2008) Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 453, 1239–1243.
46 Lister R, O’Malley RC, Tonti-Filippini J, Gregory BD, Berry CC, Millar AH & Ecker JR (2008) Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell 133, 523–536.
47 Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, Seifert M, Borodina T, Soldatov A, Parkhomchuk D et al. (2008) A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321, 956–960.
48 Cloonan N, Forrest AR, Kolle G, Gardiner BB, Faulkner GJ, Brown MK, Taylor DF, Steptoe AL, Wani S, Bethel G et al. (2008) Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nat Meth 5, 613–619.
49 Li H, Lovci MT, Kwon YS, Rosenfeld MG, Fu XD & Yeo GW (2008) Determination of tag density required for digital transcriptome analysis: application to an androgen-sensitive prostate cancer model. Proc Natl Acad Sci USA 105, 20179–20184.
50 Yeakley JM, Fan JB, Doucet D, Luo L, Wickham E, Ye Z, Chee MS & Fu XD (2002) Profiling alternative splicing on fiber-optic arrays. Nat Biotechnol 20, 353–358.
51 Hodges E, Xuan Z, Balija V, Kramer M, Molla MN, Smith SW, Middle CM, Rodesch MJ, Albert TJ, Hannon GJ et al. (2007) Genome-wide in situ exon capture for selective resequencing. Nat Genet 39, 1522–1527.
52 Venables JP, Koh CS, Froehlich U, Lapointe E, Couture S, Inkel L, Bramard A, Paquet ER, Watier V, Durand M et al. (2008) Multiple and specific mRNA processing targets for the major human hnRNP proteins. Mol Cell Biol 28, 6033–6043.
53 Spellman R, Llorian M & Smith CW (2007) Crossregulation and functional redundancy between the splicing regulator PTB and its paralogs nPTB and ROD1. Mol Cell 27, 420–434.
54 Tuerk C & Gold L (1990) Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249, 505–510.
55 Kim S, Shi H, Lee DK & Lis JT (2003) Specific SR protein-dependent splicing substrates identified through genomic SELEX. Nucleic Acids Res 31, 1955–1961.
56 Ule J, Jensen KB, Ruggiu M, Mele A, Ule A & Darnell RB (2003) CLIP identifies Nova-regulated RNA networks in the brain. Science 302, 1212–1215.
57 Licatalosi DD, Mele A, Fak JJ, Ule J, Kayikci M, Chi SW, Clark TA, Schweitzer AC, Blume JE, Wang X et al. (2008) HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456, 464–469.
58 Yeo GW, Coufal NG, Liang TY, Peng GE, Fu XD & Gage FH (2009) An RNA code for the FOX2 splicing regulator revealed by mapping RNA-protein interactions in stem cells. Nat Struct Mol Biol 16, 130–137.
59 Sanford JR, Coutinho P, Hackett JA, Wang X, Ranahan W & Caceres JF (2008) Identification of nuclear and cytoplasmic mRNA targets for the shuttling protein SF2/ASF. PLoS ONE 3, e3369.
60 Sanford JR, Wang X, Mort M, Vanduyn N, Cooper DN, Mooney SD, Edenberg HJ & Liu Y (2009) Splicing factor SFRS1 recognizes a functionally diverse landscape of RNA transcripts. Genome Res 19, 381–394.
61 Chi SW, Zang JB, Mele A & Darnell RB (2009) Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature 460, 479–486.
62 Sandberg R, Neilson JR, Sarma A, Sharp PA & Burge CB (2008) Proliferating cells express mRNAs with shortened 3′ untranslated regions and fewer microRNA target sites. Science 320, 1643–1647.
63 Tang F, Barbacioru C, Wang Y, Nordman E, Lee C, Xu N, Wang X, Bodeau J, Tuch BB, Siddiqui A et al. (2009) mRNA-Seq whole-transcriptome analysis of a single cell. Nat Meth 6, 377–382.
64 Venables JP, Klinck R, Koh C, Gervais-Bird J, Bramard A, Inkel L, Durand M, Couture S, Froehlich U, Lapointe E et al. (2009) Cancer-associated regulation of alternative splicing. Nat Struct Mol Biol 16, 670–676.
65 Calarco JA, Xing Y, Caceres M, Calarco JP, Xiao X, Pan Q, Lee C, Preuss TM & Blencowe BJ (2007) Global analysis of alternative splicing differences between humans and chimpanzees. Genes Dev 21, 2963–2975.
66 Blencowe BJ (2006) Alternative splicing: new insights from global analyses. Cell 126, 37–47.
67 Spellman R & Smith CW (2006) Novel modes of splicing repression by PTB. Trends Biochem Sci 31, 73–76.
68 McKee AE, Minet E, Stern C, Riahi S, Stiles CD & Silver PA (2005) A genome-wide in situ hybridization map of RNA-binding proteins reveals anatomically restricted expression in the developing mouse brain. BMC Dev Biol 5, 14.
69 Ray D, Kazan H, Chan ET, Castillo LP, Chaudhry S, Talukder S, Blencowe BJ, Morris Q & Hughes TR (2009) Rapid and systematic analysis of the RNA recognition specificities of RNA-binding proteins. Nat Biotechnol 27, 667–670.
70 Ule J & Darnell RB (2007) Functional and mechanistic insights from genome-wide studies of splicing regulation in the brain. Adv Exp Med Biol 623, 148–160.
71 Buckanovich RJ & Darnell RB (1997) The neuronal RNA binding protein Nova-1 recognizes specific RNA targets in vitro and in vivo. Mol Cell Biol 17, 3194–3201.
72 Ule J, Ule A, Spencer J, Williams A, Hu JS, Cline M, Wang H, Clark T, Fraser C, Ruggiu M et al. (2005) Nova regulates brain-specific splicing to shape the synapse. Nat Genet 37, 844–852.
73 Ule J, Stefani G, Mele A, Ruggiu M, Wang X, Taneri B, Gaasterland T, Blencowe BJ & Darnell RB (2006) An RNA map predicting Nova-dependent splicing regulation. Nature 444, 580–586.
74 Auweter SD, Fasan R, Reymond L, Underwood JG, Black DL, Pitsch S & Allain FH (2006) Molecular basis of RNA recognition by the human alternative splicing factor Fox-1. EMBO J 25, 163–173.
75 Zhang C, Zhang Z, Castle J, Sun S, Johnson J, Krainer AR & Zhang MQ (2008) Defining the regulatory network of the tissue-specific splicing factors Fox-1 and Fox-2. Genes Dev 22, 2550–2563.
76 Wang Z, Tollervey J, Briese M, Turner D & Ule J (2009) CLIP: construction of cDNA libraries for high-throughput sequencing from RNAs cross-linked to proteins in vivo. Methods 48, 287–293.
