Cloning and characterization of a novel kelch-like gene in zebrafish

CLONING AND CHARACTERIZATION OF A NOVEL KELCH-LIKE GENE IN ZEBRAFISH

BY
WU YI LIAN (BSc. Hons)

A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF BIOLOGICAL SCIENCES
NATIONAL UNIVERSITY OF SINGAPORE
2003

ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my supervisor, A/P Gong Zhiyuan, for his invaluable guidance, unwavering patience and mentorship in the course of my research. I am especially grateful for the many opportunities that have been given to me to explore both the research and management fields, which have made my experience in the lab an enriching and rewarding one.

I am also thankful to past and present members of the laboratory, Chen Mingru, Ju Bensheng, Ke Zhiyuan, Kee Peck Wai, Liu Xingjun, Pan Xiufang, Safia SR, Shan Tao, Simon Lim, Sudha PM, Tay Tuan Leng, Tong Yan, Wan Haiyan, Wang Hai, Wang Xukun, Yan Tie, Zeng Sheng and Zeng Zhiqiang, for their invaluable advice and help. Life-long friendships have been forged even though we’re no longer working together, and I enjoy our little get-togethers every few months.

I also want to thank Aaron, Ka-Leng, Sandra Tan, Chen Sufen and Jacqueline Tan for your friendship and all the laughter that we’ve shared. Especially to Ka-Leng, for always providing a listening ear, even when you’re miles away. Special thanks also go to Sandra, my first “Shifu” in the laboratory, for all the patience and guidance over the years. Special thanks also go to Lay Hua, for her support and all the great times spent.

To my parents, thank you for your unconditional love and support in all the decisions and paths that I have chosen to take in my life. Thank you also for your prayers and for always reminding me to look to the Lord Jesus.

Most of all, I am eternally grateful to God, without whom nothing would be possible.
I thank Him for the many blessings in my life, and for being my unfailing source of strength, help and hope throughout these years and this thesis.

TABLE OF CONTENTS

Acknowledgements
Table of Contents
List of Figures and Tables
List of Abbreviations
Summary

Chapter I. Introduction
1. Beyond the Genome: Turning Data into Knowledge
   1.1 The Human Genome Unveiled
   1.2 Gene Annotation
   1.3 Comparative Genomics
   1.4 Expressed Sequence Tags (ESTs) and In Silico Analysis
   1.5 Generation of Functional Data using Model Organisms
2. Zebrafish in the Context of the Human Genome Project
   2.1 Zebrafish as an Experimental System
   2.2 Mutagenesis Screens
   2.3 Genomic Infrastructure
   2.4 The Syntenic Relationship of the Zebrafish and Human Genomes
   2.5 Experimental Tractability
   2.6 Zebrafish: From Disease Modelling to Drug Discovery
3. Rationale of the Project

Chapter II. Materials and Methods
1. Cloning of Full Length Zebrafish klhl cDNA
   1.1 Rapid Amplification of cDNA Ends (RACE)-PCR
   1.2 Recovery of DNA Fragments from Agarose Gel
   1.3 Ligation
   1.4 Transformation
       1.4.1 Preparation of Competent Cells
       1.4.2 Transformation
   1.5 Colony Screening
   1.6 Isolation and Purification of Plasmid DNA
   1.7 Automated Sequencing
   1.8 Sequence Homology Search
2. Characterization of Zebrafish klhl Expression
   2.1 Northern Hybridisation
       2.1.1 Isolation of Total RNA
       2.1.2 Formaldehyde RNA Gel Electrophoresis and Blotting
       2.1.3 Labelling of Radioactive Probe
       2.1.4 Hybridisation
       2.1.5 Washes and Autoradiography
       2.1.6 Membrane Stripping
   2.2 Whole-Mount In Situ Hybridisation on Zebrafish Embryos
       2.2.1 Probe Synthesis
       2.2.2 Preparation of Staged Zebrafish Embryos
       2.2.3 In Situ Hybridisation
       2.2.4 Incubation with Preabsorbed Antibodies
       2.2.5 Staining
       2.2.6 Mounting and Photography
   2.3 Two-Colour Whole Mount In Situ Hybridisation
3. Characterization of Human Ortholog KLHL
   3.1 Identification of Human Orthologous Gene KLHL
   3.2 Cloning of KLHL Fragment
   3.3 Northern Blot Analysis

Chapter III. Results
1. Identification of ES34 as a Putative Kelch Repeat Protein
2. Molecular Cloning of Zebrafish klhl
3. Sequence Analysis of Zebrafish klhl
4. klhl is Conserved Across Zebrafish, Human, Mouse and Rat
5. Genome Mapping of klhl
6. Developmental Accumulation of klhl
7. Tissue Distribution Analysis of klhl in Adult Zebrafish
8. Expression of klhl is Similar in Human, Rat and Zebrafish
9. Ontogenetic Expression of klhl during Somitogenesis
10. Expression of klhl in Fast and Slow Muscle
11. Expression of klhl during Cardiac Morphogenesis
12. Expression of klhl in Cranial Muscle Development

Chapter IV. Discussion
1. Zebrafish as a Model for Vertebrate Biology
2. klhl is a Member of the Kelch Family of Proteins
3. klhl is Expressed in the Somites and Cardiac Muscles
4. Role of klhl in Muscle Structure and Function
5. Comparative Genomics, a Look into Evolutionary History
6. Rapid In Silico Cloning of Genes
7. Future Directions

References

LIST OF FIGURES AND TABLES

Fig. 1 Map of pBK-CMV vector
Fig. 2 Map of pT7Blue vector
Fig. 3 Nucleotide and predicted amino acid sequence of zebrafish klhl cDNA
Fig. 4 Alignment of the kelch repeats of zebrafish klhl and human KLHL
Fig. 5 Amino acid sequence alignment of zebrafish klhl, Fugu klhl, human KLHL, mouse (m) Klhl and rat (r) Klhl proteins
Fig. 6 Genome mapping of klhl
Fig. 7 Expression of klhl in developing zebrafish embryos in comparison to two other MSP genes, tpma and mylz2
Fig. 8 Tissue distribution of klhl mRNAs in comparison with tpma and mylz2 mRNAs in adult zebrafish
Fig. 9 Northern blot analysis of KLHL mRNA in human tissues
Fig. 10 Expression of klhl and tpma in zebrafish embryos
Fig. 11 Ontogenetic expression of klhl, tpma and mylz2 during the various stages of somitogenesis
Fig. 12 Comparison of expression of klhl, tpma, desmin and smbpc in 36 hpf embryos
Fig. 13 klhl expression during cardiac morphogenesis
Fig. 14 Localization of klhl transcripts in 72 hpf embryos
Fig. 15 A schematic overview of cytoskeletal linkages in striated muscle
Fig. 16 Schematic model of the cytoskeletal filament linkages at the sarcolemma of striated muscle
Table 1 Summary of EST clones homologous to klhl

LIST OF ABBREVIATIONS

aa      amino acid
AP      alkaline phosphatase
arp     acidic ribosomal protein gene
BAC     bacterial artificial chromosome
BCIP    5-bromo-4-chloro-3-indolyl phosphate
bp      base pair
BTB     broad-complex, tramtrack, bric-a-brac
cDNA    DNA complementary to RNA
cmlc2   cardiac myosin light chain 2
cpm     counts per minute
DEPC    diethyl pyrocarbonate
DIG     digoxigenin
DNA     deoxyribonucleic acid
dNTP    deoxyribonucleotide triphosphate
EDTA    ethylenediaminetetraacetic acid
ENU     ethylnitrosourea
ES      embryonic subtractive
EST     expressed sequence tag
FCS     fetal calf serum
GFP     green fluorescent protein
HGP     human genome project
hpf     hours post fertilization
kb      kilo base pair
klhl    kelch-like gene
LB      Luria-Bertani medium
LG      linkage group
MA      maleic acid
MGI     Merck Gene Index
MOPS    3-(N-morpholino)propanesulfonic acid
mRNA    messenger ribonucleic acid
MSP     muscle specific protein
MTN     multiple tissue blot
mya     million years ago
mylz2   myosin, light polypeptide 2, fast skeletal muscle gene
NBT     nitroblue tetrazolium
nt      nucleotide
ORF     open reading frame
PAC     P1-derived artificial chromosome
PBS     phosphate buffered saline
PBST    PBS, 0.1% Tween 20
PCR     polymerase chain reaction
PFA     paraformaldehyde
POZ     poxvirus and zinc finger
RACE    rapid amplification of cDNA ends
RAPD    randomly amplified polymorphic DNA
RH      radiation hybrid
RNA     ribonucleic acid
SAGE    serial analysis of gene expression
SDS     sodium dodecyl sulfate
smbpc   slow myosin binding protein C
SSC     sodium chloride-trisodium citrate solution
SSCT    sodium chloride-trisodium citrate solution, 0.1% Tween 20
tpma    alpha tropomyosin gene
UTR     untranslated region
vhmc    ventricular myosin heavy chain
YAC     yeast artificial chromosome

SUMMARY

The completion of the human genome project brings with it the task of deciphering and interpreting the sequence, carrying it from sequence to function. The zebrafish has rapidly emerged as the forerunner for scientists riding the next wave of genome exploration, being uniquely positioned for the study of vertebrate development. In this study, zebrafish was used as the model to isolate and characterize a novel gene, kelch-like (klhl), that we had identified in an earlier screen for important genes involved in embryogenesis. klhl was found to be a member of the kelch-repeat superfamily, containing two evolutionarily conserved domains: a BTB/POZ domain and six kelch repeats. Many members of the kelch-repeat superfamily have been shown to be involved in the organization of cell shape and function. Database mining revealed the presence of putative orthologues of klhl in human, mouse, rat and pufferfish. klhl was mapped to zebrafish linkage group 13 and was found to be syntenic with the proposed orthologs of klhl in human, mouse and rat. In an effort to elucidate the function of klhl, its expression was characterized by northern and in situ hybridization. klhl is specifically expressed in fast skeletal and cardiac muscle. Comparisons of klhl with the previously identified muscle genes tpma and mylz2 indicated that klhl is expressed from around 10 hpf and is one of the earliest genes to be expressed in the somitogenic pathway. Northern blot analyses show that the human ortholog, KLHL, is also specifically expressed in skeletal muscle and heart. In silico analyses of rat EST clones corresponding to the rat Klhl ortholog indicate that its expression pattern is conserved in rat as well, suggesting an evolutionarily conserved role for klhl.
The expression pattern of klhl, together with the presence of the kelch repeats, indicates a possible role for klhl in the organization of striated muscle cytoarchitecture.

Chapter I

Introduction

1. Beyond the Genome: Turning Data into Knowledge

1.1 The Human Genome Unveiled

April 2003 marked the fiftieth anniversary of the discovery of the double helix by James Watson and Francis Crick. A momentous event in the history of biology, the 1953 breakthrough opened a new chapter in science and the door to many avenues of exploration that have since occupied researchers all over the world. April 2003 also marked the completion of one of the most important and ambitious scientific projects in history: the sequencing of the human genome (Pennisi, 2003), which may prove a fitting close to the chapter opened some fifty years before. Involving the coordinated effort of 20 laboratories and hundreds of people around the world, the human genome project (HGP) was an impressive technical and logistical feat, and the sequence represents an enormous opportunity to understand biology and accelerate biomedical research. However, this represents just the data acquisition phase. Faced with an avalanche of sequence data, researchers now have the daunting task of deciphering and interpreting the data and extracting more biology from the sequences.
Indeed, as the International Human Genome Sequencing Consortium put it in its paper on the draft genome (Lander et al., 2001), “the human genome project is but the latest increment in a remarkable scientific program whose origins stretch back a hundred years to the rediscovery of Mendel’s laws and whose end is nowhere in sight.”

1.2 Gene Annotation

The human genome was not the first to be sequenced: over 45 completely sequenced genomes, including those of the worm Caenorhabditis elegans and the fly Drosophila melanogaster, had been completed by the time the draft sequence was released in February 2001 (Bernel et al., 2001). Nevertheless, it represented a new challenge to researchers, with the ultimate goal of compiling a complete list of all human genes and their encoded proteins (Lander et al., 2001; Shoemaker et al., 2001). Gene identification is particularly difficult in human DNA owing to the large size of the genome. One reason the human genome is larger than that of the worm or fly is that its introns are much longer (about 50 kb versus 5 kb). The exons, on the other hand, are roughly the same size (Birney et al., 2001; Lander et al., 2001). Thus, the density of genes in the human genome was much lower than in any other genome sequenced as of 2001 (Bork and Copley, 2001).

For the most part, gene prediction is done computationally. A combination of three basic approaches was employed in the sequencing projects to predict genes (Lander et al., 2001; Venter et al., 2001). The first approach is ab initio prediction of exons based on compositional signals found in the DNA sequence. Groups of exons are identified by computational algorithms that gather statistical information about, for example, splice junctions and exon and intron lengths (Birney et al., 2001; Lander et al., 1998).
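As a minimal illustration of what a "compositional signal" can mean, the sketch below scans a DNA string for open reading frames (an ATG followed in frame by a stop codon) on the forward strand. It is only a toy and not the method used by the genome projects: real gene finders also model splice junctions, exon and intron lengths and codon usage statistically, and this scan ignores introns altogether. The function name find_orfs, the min_codons threshold and the example sequence are hypothetical and chosen purely for illustration.

```python
# Toy ab initio scan: report open reading frames (ATG ... stop) on the
# forward strand. Hypothetical helper for illustration only; it ignores
# introns, the reverse strand and all statistical modelling.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each ORF of at least min_codons codons.

    start and end are 0-based and end-exclusive; the stop codon is included
    in both the coordinates and the codon count.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Made-up sequence containing two short ORFs in different frames.
example = "CCATGGCTTGAAATGAAACCCGGGTAGTT"
for start, end, frame in find_orfs(example):
    print(frame, start, end, example[start:end])
```

Even this toy hints at why long introns hurt prediction: the more non-coding sequence there is, the more ATG-to-stop stretches arise by chance, so a simple signal such as this produces many false candidates.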
While ab initio predictions were quite accurate in the fly (Reese et al., 2000) and worm, they would not be so reliable for the human draft sequence. The low signal (exon) to noise (intron) ratio leads to misprediction by computational gene-finding strategies. In addition, gaps and errors within the draft sequence would give rise to frameshifts, in which the reading frame of the gene is disrupted by the addition or removal of bases (Birney et al., 2001).

The second approach is based on direct experimental evidence of transcription provided by expressed sequence tags (ESTs), short sequences of DNA corresponding to a fragment of a complementary DNA (cDNA). Analysing genomic sequences in the context of ESTs provides a more accurate resource for resolving gene structure against the vast genomic background. This method is, however, subject to artefactual and contaminant sequences from heterogeneous nuclear RNA, genomic DNA and vector sequences. Estimates of gene number based on ESTs have varied from 35,000 to 120,000 genes (Ewing and Green, 2000; Liang et al., 2000).

The third approach uses indirect evidence based on sequence similarity to previously identified genes and proteins in humans and other organisms. This approach, while effective in identifying genes, cannot differentiate between a functional gene and a non-functional one (a pseudogene). A pseudogene is a non-functional copy that is very similar to a normal gene but has been altered slightly so that it is not expressed. Also, novel genes cannot be identified by this method.

Following the release of the draft sequence, the gene number was put at 30,000 to 40,000 (Lander et al., 2001; Venter et al., 2001), a far cry from the 80,000 to 100,000 genes thought to exist at one time (Gardiner-Garden and Frommer, 1987; Levin, 1990).
Of these, ~15,000 were known genes and the remaining 10,000 to 20,000 were gene predictions of lower confidence, supported only by the bioinformatic approaches of sequence homology and ab initio prediction (Lander et al., 2001; Saha et al., 2002). Even today, following the completion of the human genome sequence, the number of human genes has not been determined conclusively, with Francis Collins, director of the National Human Genome Research Institute (NHGRI), putting it at a little under 30,000 (Pennisi, 2003).

1.3 Comparative Genomics

One tool for gene identification that will become more powerful with the completion of more genome projects is comparative genomics. The science of comparative genomics has a long and fruitful history in biology. It has its roots in Aristotle, who understood that the commonalities among species would facilitate comprehension of the underlying “differentiae” that distinguish animals with common features. Comparing the human genome with those of other species would not only help us understand what makes us genetically different, it may also help us understand our genes, their regulation and expression and their complex interactions (Murphy et al., 2001).

One of the most startling things to emerge from the draft sequence was the fact that the human genome, despite being about 30 times larger than the fly and worm genomes, contained only about twice the number of genes (Lander et al., 2001; Venter et al., 2001). It was clear that physical and behavioural differences between species were not simply a consequence of gene number. Comparative studies between human and the fly, and between human and the worm, revealed that the biggest difference lay in the complexity of the proteins: more domains per protein and novel combinations of domains (Baltimore, 2001). About 60% of fly proteins and 40% of worm proteins have sequence similarity to predicted human proteins.
Yet more than 90% of the domains identified in human proteins were also present in the fly or worm proteins (Lander et al., 2001; Venter et al., 2001). The story is one of new architectures built from old pieces, with the shuffling of domains creating new permutations.

While the value of comparative analysis of distantly related organisms is beyond dispute, comparison of closely related genomes would be more important in resolving the issue at hand: identifying the genes and their functions. Comparing conserved sequence regions between two related organisms would allow us to identify genes and other important regions in both organisms with no previous knowledge of the gene content of either. This is because, thanks to natural selection, genes are more likely to retain their sequences through evolution than the DNA surrounding them. However, there are limitations to functional inferences based on interspecies comparisons of anciently diverged coding sequences (Makalowski and Boguski, 1998). Furthermore, gene regulatory elements are not amenable to comparisons across vast evolutionary distances as they are more divergent (Makalowski and Boguski, 1998). As succinctly put by Rubin (2001), “the ideal species for comparison are those whose form, physiology and behaviour are as similar as possible, but whose genomes have evolved sufficiently that non-functional sequences have had time to diverge”. However, he also warns that in practice there is no ideal species, because different genes and regulatory sites evolve at different rates.

In what is seen as a pilot project to evaluate which genome sequences would be most appropriate to aid in the annotation of the human genome and the understanding of vertebrate genome evolution (phylogenomics), the National Institutes of Health (NIH) Intramural Sequencing Centre is mapping and sequencing segments of 11 vertebrate genomes orthologous to six regions on human chromosome 7
(http://www.nisc.nih.gov; Thomas and Touchman, 2002). (The 11 genomes are mouse, rat, pig, cow, dog, cat, baboon, chimpanzee, chicken, zebrafish and pufferfish.)

The power of comparative sequence analysis with related organisms at suitable evolutionary distances to identify genes has been exemplified in many cases. Crollius and colleagues (2000) reported successes in comparisons between the human genome and that of the pufferfish Tetraodon nigroviridis. With a genome eight times more compact than that of human, the pufferfish proved valuable in identifying potential exons in the human genome (Crollius et al., 2000). Through alignment of mouse DNA related to human chromosome 19, Stubbs and her group identified exons, regulatory elements and candidate genes that were missed by other predictive methods (Dehal et al., 2001).

Recently, the draft sequences of the Fugu and mouse genomes, together with comparative analyses against the human sequence, were published in August 2002 and December 2002 respectively (Aparicio et al., 2002; Waterston et al., 2002). Preliminary analysis of the pufferfish genome by Aparicio and colleagues suggests that the Fugu gene dataset may help uncover as many as 1000 novel genes in the human genome. Conserved gene order, or synteny, was also discovered between the human and Fugu genes. Findings from the mouse genome support the notion that there are only about 30,000 genes in a typical mammalian genome, 99% of which have a sequence match in the human genome. 96% of these genes lie within syntenic regions of mouse and human chromosomes (Waterston et al., 2002).

The comprehensive conservation of linkage between the human and mouse genomes (http://www.ncbi.nlm.nih.gov/Homology) has several practical applications. First, the comparative maps allow the rapid identification of gene orthologs.
Two genes are orthologous if they diverged after a speciation event, when a new species forms from an existing one; two genes are paralogous if they diverged after a gene duplication event. The identification of orthologs is particularly useful when investigating disease phenotypes (Watkins-Chow et al., 1997; Lander et al., 2001), allowing the correlation of mouse models and human disease. This also facilitates the positional cloning of disease genes. Second, the study of conserved segments among genomes provides insights into the rates and patterns of chromosomal evolution, as well as into the forces that help to shape the genomes of modern-day animals (O’Brien et al., 1999; Lander et al., 2001; Murphy et al., 2001). Third, cross-referencing of the human and mouse genomes aids in the assembly of the mouse sequence using the human sequence as a scaffold (Lander et al., 2001).

Indeed, it seems that for the immediate future, the most dramatic developments in eukaryotic genome biology are likely to be in comparative genomics (Taylor, 2001). Advanced technologies of the HGP have been harnessed to describe the complexities of genome organization not only in mammalian species (mouse, rat, dog, chimp) but also in other vertebrates such as the pufferfish and zebrafish. Each of these whole genome shotgun sequences is expected to fill in a piece of the evolutionary history, providing us with a better insight into the laboratory notebook of evolution.

1.4 Expressed Sequence Tags (ESTs) and In Silico Analysis

Playing a complementary role to the genome sequencing projects are the EST sequencing projects. In the 1990s, Brenner (1990) and other investigators advocated the large-scale sequencing of the transcription products of genes, in the form of cDNAs, as a prelude to genomic DNA sequencing. The rationale was that this would be more useful and cost-effective, as the protein-coding regions of our genes make up only ~3% of the entire genome.
The remaining 97% is of unknown function and often referred to as “junk DNA”.

The era of high-throughput cDNA sequencing was initiated in 1991 by a landmark paper by Adams and colleagues (1991) demonstrating the richness of data that could be derived from an EST sequencing project. The basic strategy involved the random selection of cDNA clones, after which single-pass sequencing was performed. This sequencing could be from the 5’ and/or 3’ end of the clone, and the sequence is not checked for errors or artefacts. In their article, they generated partial sequences from 609 randomly selected cDNA clones from a human brain library. Of these 609 sequences, 197 (32%) matched human sequences, 48 (8%) matched entries from other organisms and 230 (38%) had no significant matches. The results demonstrated that 150 to 400 bases of nucleotide sequence from a single sequencing run contained sufficient information for preliminary identification of the cDNA. In addition, they revealed the utility of ESTs for novel gene discovery.

The use of ESTs in the identification of genes has been exemplified in numerous studies. A recent example is the use of ESTs in the prediction of genes on human chromosome 21 (Hattori et al., 2000). Of the 225 genes identified on chromosome 21, 42 were identified only with the use of ESTs (Yuan et al., 2001). In other words, 18.7% of the genes identified relied on EST sequences.

Besides their use in gene identification and the annotation of genomic sequences, ESTs have assumed important roles in the construction of gene-based physical maps of several genomes, including that of human (Schuler et al., 1996). In this application, PCR or hybridisation assays developed from ESTs can be used to identify bacterial artificial chromosomes (BACs), or other types of large insert clones, from which genome physical maps are constructed.
Placement of ESTs onto a physical map immediately identifies the genomic intervals that contain the sequences for the gene (Marra et al., 1998).

Since then, EST projects have been initiated on a diverse collection of organisms that include C. elegans, D. melanogaster, rat, mouse and zebrafish. For many of these organisms, the ESTs can be subdivided further by tissue type. The EST database, dbEST, is the fastest growing division of GenBank (Pandey and Lewitter, 1999). To date, 18,762,324 sequences from 594 species have been reported in the database (dbEST release 3 October 2003, http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html). While this large dataset of DNA sequences is data rich, it is unfortunately information poor, lacking additional correlative data. The sequences generated are generally of poor quality, with misreads and with artefacts of library construction and sequencing (Yuan et al., 2001).

This situation led to the development of EST gene indices such as UniGene (Boguski and Schuler, 1995; Schuler et al., 1996), the Merck Gene Index (MGI) (Eckman et al., 1998) and the TIGR Gene Index (Quackenbush et al., 2000). The goal of all gene indices is to reduce the vast amount of data into an organized catalogue from which one can determine how many unique transcripts exist and whether a new sequence falls into any of the existing EST clusters (Yuan et al., 2001). The UniGene database (http://www.ncbi.nlm.nih.gov/UniGene) categorizes GenBank sequences into a nonredundant set of gene-oriented clusters, where a single cluster represents all the ESTs that correspond to a unique gene. Related information, such as the tissue types in which the gene is expressed and its map location, is also provided. Currently, the UniGene database contains 13 data sets, eight of which belong to animals. The eight organisms are human, mouse, rat, fly, zebrafish, clawed frog, cow and mosquito.
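The idea of a gene index, in which overlapping ESTs are gathered into one cluster per gene and annotated with their tissue of origin, can be sketched in a few lines. The example below is a simplified illustration and not the actual UniGene, MGI or TIGR procedure: it joins two ESTs whenever they share any exact 8-base substring (single-linkage clustering with a union-find structure), then counts the ESTs per tissue within a cluster. The function names, the value of k, the sequences and the tissue labels are all made up for illustration.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_ests(ests, k=8):
    """Single-linkage clustering: ESTs sharing any k-mer join one cluster.

    ests maps a (hypothetical) EST id to its sequence.
    Returns a sorted list of sorted id lists, one list per cluster.
    """
    parent = {est_id: est_id for est_id in ests}

    def find(x):
        # Follow parent links to the cluster root, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # k-mer -> id of the first EST that contained it
    for est_id, seq in ests.items():
        for kmer in kmers(seq.upper(), k):
            if kmer in first_seen:
                parent[find(est_id)] = find(first_seen[kmer])
            else:
                first_seen[kmer] = est_id

    clusters = defaultdict(list)
    for est_id in ests:
        clusters[find(est_id)].append(est_id)
    return sorted(sorted(members) for members in clusters.values())


def tissue_counts(cluster, tissue_of):
    """Count the ESTs from each source tissue within one cluster."""
    return Counter(tissue_of[est_id] for est_id in cluster)


# Made-up reads: est1 and est2 overlap by 11 bases, est3 is unrelated.
ests = {
    "est1": "ACGTTGCAAGGCTTACGGAT",
    "est2": "GGCTTACGGATCCATGGTCA",
    "est3": "TTTTTCCCCCAAAAAGGGGG",
}
tissue_of = {"est1": "muscle", "est2": "heart", "est3": "brain"}

clusters = cluster_ests(ests)
print(clusters)
print(tissue_counts(clusters[0], tissue_of))
```

Exact k-mer sharing is used here only because it is easy to follow. With single-pass reads that contain misreads and artefacts, a real index would need error-tolerant alignment and screening for vector and chimeric sequences before clustering.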
Large-scale sequence comparisons have also been used to cross-reference the sequence clusters of the various organisms. The HomoloGene database (http://www.ncbi.nlm.nih.gov/HomoloGene) displays curated and calculated orthologs and homologs for nucleotide sequences represented in UniGene.

The advent of such databases ushers in a new era in which classical biological analyses that were once performed at the bench are now performed rapidly in silico (Pandey and Lewitter, 1999). Since one gene is often represented by multiple ESTs, it is possible to generate a contiguous sequence by assembling ESTs that overlap. Such in silico cloning methods are now used regularly to complete an mRNA sequence or to identify novel gene orthologs and homologs. In addition, in silico expression data, obtained by simply counting the frequency of ESTs, often accompany papers reporting the cloning of a new gene (Ko, 2001).

1.5 Generation of Functional Data using Model Organisms

With the large amount of data accumulating from the genome project, it is no surprise that in silico analysis is very much in evidence. There is a heightened expectation that increasingly powerful computational analyses of today's databases would be sufficient to take us from sequence to function. Indeed, much of what we know about the function of human genes is inferred computationally. To address this limitation, studies are underway to generate functional data in model organisms.

Annotation by sequence similarity or domain structure is usually the first step performed in many studies, but such predictions can sometimes be unreliable and misleading. Genes of similar sequence may have acquired new functions during evolution. This is particularly true for duplicated genes.
In their study of the triplicate Drosophila genes paired, gooseberry and gooseberry-neuro, Li and Noll (1994) suggested that following duplication, genes acquire new functions through changes in their regulatory regions that generate an altered expression. Adaptation of the protein is “secondary and a necessary consequence of its expression in the newly acquired context of this function” (Xue et al., 2001). Further studies by Xue et al. (2001) also implied that while the C-terminal portions of paired and gooseberry are divergent in their primary sequences, they were qualitatively the same. Such results led Noll’s group to question the validity of amino acid similarity as a general measure of functional equivalence in homologous proteins (Xue and Noll, 1996; Xue et al., 2001). Thus, information in databases is not, by itself, sufficient to determine biological function, but serves as a foundation for the design of detailed experimental studies to establish the actual function of the molecules.

Much more information about gene function can be obtained from expression patterns and from gain- or loss-of-function studies, and model organisms would feature heavily in this respect. Such studies can realistically be done only in model organisms, not only because of ethical and social issues but, more importantly, because the sophisticated genetic and transgenic experimentation needed to resolve complex biological networks is not available in humans. Genome-wide initiatives to assess expression and function are underway for all model organisms. The Berkeley Drosophila genome project, for one, is surveying the expression of all Drosophila genes by whole-mount in situ hybridisation in embryos and creating a catalogue of gene mutations by insertion of P elements or Gal4 activation domains into many different sites in the genome (Spradling et al., 1995; 1999; Kopczynski et al., 1998).
The question to be asked at this point, however, is the extent to which genes are functionally interchangeable among the different organisms. Over the years, it has emerged from studies in many animal models that not only individual protein domains and proteins, but entire biochemical pathways, are conserved throughout evolution (Miklos and Rubin, 1996). In the Ras and Notch signalling cascades, for example, many of the protein components are conserved between yeasts, flies, worms and humans (Artavanis-Tsakonas et al., 1995; Wasserman et al., 1995). Knowledge of the biological role of a shared protein in one organism can then be transferred to other organisms. The extent to which a disease or biological process can reasonably be modelled in an organism phylogenetically different from us must be critically examined, otherwise we run the risk of creating interesting but useless information which might confound the issue (Margolin, 2001). The genome projects in each of the model organisms would greatly facilitate this work and, together with the human genome sequence, allow the speedy transfer of knowledge to human biology.

2. Zebrafish in the Context of the Human Genome Project

One of the most promising model organisms to emerge in light of the HGP is the zebrafish (Danio rerio), a small tropical freshwater teleost fish. It is “a dream system for scientists riding the next wave for genome-wide exploration” (Fishman, 2001). A combination of factors ensures that the zebrafish will have an important role in the functional analysis of the human genome. These factors, which range from its tractability in mutagenesis screens to the availability of genomic resources, are elaborated in the following sections.

2.1 Zebrafish as an Experimental System

Originating from the Ganges river in India, the zebrafish first emerged as a model system for the study of developmental biology in the 1980s.
Pioneering the use of this inexpensive fish were George Streisinger and colleagues (1981) at the University of Oregon, who recognized the many virtues of this experimental system for genetic analyses. These virtues include its short generation time, large brood size and the external development of clear, transparent embryos, which makes zebrafish embryos experimentally accessible. Development is rapid, and within 12 hours after fertilization one can visualize the establishment of a body plan that is typically vertebrate (Westerfield, 1989). By 5 days after fertilization, most organs, or at least their primordia, are in place (Kimmel et al., 1995). Laboratory methods for its husbandry are well established (Westerfield, 1994) and the stages of embryonic development have been thoroughly described and characterized (Kimmel et al., 1995). While the significance of Streisinger’s work with zebrafish was not widely recognized at that time, it marked the birth of a new animal model system that has since risen to become a pre-eminent model in biomedical research (Beier, 1998; Grunwald and Eisen, 2002; for recent reviews, see Shin and Fishman, 2002, Ackermann and Paw, 2003 and Rubinstein, 2003).

2.2 Mutagenesis Screens

The ability to carry out classical forward genetic analyses with zebrafish was largely responsible for its rise in prominence. Since its early days as a research organism, the appeal of the zebrafish has relied on its potential use in genetic screens, which was unique among vertebrate model organisms. Today, no other vertebrate can rival the repertoire of zebrafish mutagenesis tools, breeding strategies and screening methods (Malicki et al., 2002). Previously, saturation mutagenesis of Drosophila had been used successfully by Nüsslein-Volhard and Wieschaus to uncover more than 200 genes involved in pattern formation and to unravel the regulatory cascade of molecular events (Nüsslein-Volhard and Wieschaus, 1980; Kalthoff, 1996).
The results of such studies had been extrapolated successfully to vertebrates, with mutations in the vertebrate homologue of a gene having profound developmental consequences. This demonstrated the conservation of pathways even in highly divergent organisms like Drosophila and the mouse. Despite this, several features characterize vertebrates that are not present in invertebrates, specifically with respect to organ form and function. Some examples include the development and function of the notochord, kidneys and multi-chambered heart, which are unique to vertebrates (Driever and Fishman, 1996; Fishman, 1999; Dooley and Zon, 2000). Within vertebrates, these processes have been well conserved. Little, however, was known about them. A similar analysis was thus proposed in vertebrates to uncover loci of developmental importance, especially those important in organ form and function, which were not scored in Drosophila screens (Nüsslein-Volhard, 1994). Saturation mutagenesis screening had previously been applied only to invertebrates, as the large number of animals needed for screens made them prohibitively expensive for vertebrates other than the zebrafish. The zebrafish possessed some advantages over the other more established vertebrate models such as the mouse and Xenopus, neither of which breeds prolifically or has readily observable embryos, making them unsuitable for the long, laborious screening process (Kahn, 1994). All these factors led to the zebrafish becoming the vertebrate of choice for random, genome-wide, large-scale mutagenesis of genes crucial for vertebrate development (Driever et al., 1996; Haffter et al., 1996; Schulte-Merker, 2000). The first large-scale genetic screens in vertebrates were carried out in zebrafish in 1996 using the chemical mutagen ethylnitrosourea (ENU).
Undertaken by groups at Massachusetts General Hospital, Boston, and the Max Planck Institute, Tübingen, the two screens, conducted in parallel, identified more than 2,000 mutants involved in embryonic development (Driever et al., 1996; Haffter et al., 1996). The screens were an outgrowth of the work that had previously been done in Drosophila (Nüsslein-Volhard and Wieschaus, 1980). Random mutations were induced by treating the male fish with ENU, which was known to be an efficient germ-line mutagen in mice. ENU generates single-nucleotide mutations in the germ-line principally by alkylating guanine residues with consequent GC→AT transitions (Solnica-Krezel et al., 1994). The levels of ENU administered had been titrated to generate one to two mutations per haploid genome (Mullins et al., 1994; Solnica-Krezel et al., 1994). The mutants were then bred to homozygosity in a three-generation scheme (Driever et al., 1996; Haffter et al., 1996). The main tool for identification of mutant phenotypes was detailed visual inspection of the embryos under the dissecting microscope (Driever et al., 1996; Haffter et al., 1996). This inspection was performed at five different stages during embryonic and early larval development. By the time the studies were performed, the development of the zebrafish embryo had been studied in detail, from the pre-gastrula and gastrula stages to the pharyngula stages through to the early larval period (Kimmel et al., 1995), providing a strong base of knowledge for the identification of mutant phenotypes. The mutations are believed to have affected more than 500 genetic loci, affecting an impressive range of targets: eye, pigment, kidney, notochord, muscle, brain and fins, just to name a few (Warren and Fishman, 1998).
The screens and the mutants uncovered were the subject of an entire issue of the journal Development (December 1996, volume 123) and the study was described in Science as “an accomplishment of historic proportions” (Grunwald, 1996). However, these first screens were not saturating, and concentrated on the identification of genes involved in early development (Driever et al., 1996; Haffter et al., 1996). The Tübingen group has undertaken a second saturation mutagenesis screen of the zebrafish, Tübingen 2000, in collaboration with Artemis Pharmaceuticals, and this second screen is aimed more at the later stages of organogenesis (Schulte-Merker, 2000). The expectation that the zebrafish model would introduce screens as a standard tool of vertebrate genetics has been fulfilled. In addition to the large-scale screens, a number of smaller screens have been conducted in zebrafish, identifying numerous other loci required for different physiological processes. The utility of zebrafish in such screens is due largely to the establishment of techniques allowing the manipulation of the ploidy and parental origin of genes in zebrafish (Streisinger et al., 1981; Kimmel, 1989). The ability to generate haploid embryos, for example, facilitates genetic screens by eliminating a generation or more from crossing schemes (Kimmel, 1989; Walker, 1999). Such genetic screens, based on analysis of zebrafish haploid or parthenogenetic diploid embryos, have been used to identify genes required during embryogenesis (Henion et al., 1996; Alexander et al., 1998; Beattie et al., 1999). Besides the different screening methods, there are also several means by which mutations can be induced in the zebrafish germ-line, mainly chemical mutagenesis, radiation methods and insertional mutagenesis (Knapik, 2000). Chemical mutagenesis using ENU is by far the most widely employed method in zebrafish as it is effective and easily administered by incubating the fish in ENU.
Other chemicals that have been used include EMS and TMP, which cause small deletions. Radiation methods using X-rays and gamma rays are routinely performed in zebrafish laboratories to induce genome-wide mutations. Because it causes large multigene lesions, this method is not useful for the annotation of genes by function. The last method, insertional mutagenesis, involves the insertion and integration of exogenous DNA sequences into the genome, disrupting the genes at the site of insertion. While insertional mutagens have been shown to be less efficient than chemicals (Spradling et al., 1995; Schier et al., 1996), this system shows extraordinary potential as the inserted DNA serves as a tag to clone the mutated gene. This greatly speeds up the normally laborious cloning process inherent in the use of chemical mutagens. The average time taken to clone a gene responsible for an ENU-induced mutation is about 1.5 years, although it is expected to decrease to 9 months following completion of the zebrafish genome project (Chen et al., 2002). At the moment, the genes underlying only about 50 mutants have been reported out of the hundreds of mutants uncovered in the mutagenesis screens (Golling et al., 2002). Many of these genes have been previously described as important developmental genes in other species. Efficient methods of insertional mutagenesis would thus contribute significantly to the task of assigning functions to genes. Several advances have been made towards the use of insertional mutagenesis in zebrafish with the use of retroviruses. In 1994, Nancy Hopkins and her group identified a pseudotyped retroviral vector that could infect the zebrafish germ-line (Lin et al., 1994).
The pseudotyped retrovirus system was found to be able to generate a large number of insertions at different loci very efficiently (Gaiano et al., 1996a) and this has made it possible for large-scale insertional mutagenesis to be performed (Gaiano et al., 1996b; Amsterdam et al., 1999; Golling et al., 2002). Several genes have been identified using this technology (Allende et al., 1996; Becker et al., 1998; Kawakami et al., 2000a; Golling et al., 2002). More noteworthy is the fact that it takes as little as two weeks to identify the retrovirally mutated gene (Golling et al., 2002). In addition, many of the genes identified using insertional mutagenesis are novel genes without known biological or biochemical functions. The number of genes cloned by insertional mutagenesis is expected to rise quickly with the development of a high-titer retrovirus producer cell line, circumventing the problem of reproducibly making high-titer, non-toxic virus preparations (Chen et al., 2002). According to Chen et al. (2002), preparations from this line allowed the generation of about 500,000 germ-line-transmissible insertions in a population of 25,000 founder fish in about 2 months. Transposons have also been evaluated for their efficacy and use in insertional mutagenesis systems in zebrafish (Ivics et al., 1999). While still in their infancy, several transposon systems show great potential as tools for insertional mutagenesis. Some examples include the Tol2 element from medaka (Kawakami et al., 2000b) and the synthetic Sleeping Beauty (SB) transposon systems (Ivics et al., 1997; Hackett et al., 2001). In particular, the SB system has been used for insertional mutagenesis employing both gene-traps and enhancer-traps (Hackett et al., 2001).

2.3 Genomic Infrastructure

Another virtue of the zebrafish lies in the wide availability of zebrafish genetic and genomic resources.
Zebrafish mutations identified in the screens define the function of hundreds of essential genes in the vertebrate genome. For these mutants to be useful, cloning of the mutated genes is essential to allow the elucidation of the molecular mechanisms underlying cellular function (reviewed in Postlethwait and Talbot, 1997). The two main approaches to cloning mutated genes, positional cloning and the candidate gene approach, have benefited greatly from the recent advances in zebrafish genomic infrastructure (reviewed in Talbot and Hopkins, 2000; Malicki et al., 2002). The efficient identification of genes disrupted by mutation in zebrafish requires dense maps of the genome. Prior to 1994, there was no genetic map for zebrafish and the paucity of resources such as large-insert genomic libraries rendered the task virtually impossible (Malicki et al., 2002). Today, a full array of genomic and molecular genetic tools is available. Large-insert genomic libraries needed for positional cloning have been generated. To date, two zebrafish yeast artificial chromosome (YAC) libraries, one bacterial artificial chromosome (BAC) library, and one P1-derived artificial chromosome (PAC) library have been constructed (Zhong et al., 1998; Amemiya et al., 1999) and used successfully to isolate known genes and/or genomic regions (Amemiya et al., 1999). Several genetic linkage maps have been developed which cover essentially the entire genome (see Talbot and Hopkins, 2000) and in which each chromosome is represented by a single linkage group (Johnson et al., 1996). Among vertebrates, only human, mouse, rat, and zebrafish have closed linkage maps. More than 3845 microsatellite (CA) repeats have been meiotically mapped since the last update in July 2001, providing an average resolution sufficient to initiate positional cloning (Shimoda et al., 1999; http://zebrafish.mgh.harvard.edu).
Published genetic linkage maps have also localized ~1500 cloned genes and ESTs (Postlethwait et al., 1998; Gates et al., 1999; Kelly et al., 2000; Woods et al., 2000). Radiation hybrid (RH) maps with markers which include simple sequence length polymorphisms (SSLPs), cloned genes and ESTs have been developed for zebrafish (Kwok et al., 1998; Geisler et al., 1999; Hukriede et al., 1999, 2001). The two zebrafish RH maps, LN54 and Goodfellow T51, together cover >90% of the zebrafish genome (Talbot and Hopkins, 2000) and will provide a framework for the EST sequencing and mapping projects currently underway. As of the dbEST release of 3 October 2003, the zebrafish EST sequences deposited in GenBank number 362,362, making it the eighth highest species in a list of 594 species. Efforts have also been initiated to obtain the complete sequence of the zebrafish genome, a feat that will undoubtedly increase the usefulness of the genetic and genomic tools in the fish. While the finished zebrafish genome is expected to be completed only in 2005 by the Sanger Institute, sequences from the whole genome shotgun and clone sequencing project are made available online (http://www.sanger.ac.uk/Projects/D_rerio/). Zebrafish sequences are also available through the Ensembl website, which features the zebrafish whole genome shotgun assembly sequence version 2 as released on 3 April 2003 (http://www.ensembl.org/Danio_rerio/). Last but not least, the utility of the genomic infrastructure to the community of zebrafish investigators is heavily dependent upon the existence of mechanisms that facilitate access to this information. As more labs started working with the zebrafish, the Zebrafish Information Network (ZFIN) (http://zfin.org) was set up to cope with the phenomenal rate of increase of information.
ZFIN is a centralized database for zebrafish researchers, providing links and information about zebrafish genes, mutations, genetic maps, etc. (Westerfield et al., 1999a,b; Sprague et al., 2003). In addition, zebrafish resources are also available from the NCBI site (http://www.ncbi.nlm.nih.gov/genome/guide/D_rerio.html).

2.4 The Syntenic Relationship of the Zebrafish and Human Genomes

The third virtue of the system is the conservation of synteny between zebrafish and human genomes. Besides facilitating the identification of mutants by positional cloning and the candidate gene approach, the genetic maps have been useful in comparative studies between zebrafish and other vertebrate genomes. By comparing the map positions of zebrafish genes and their mammalian orthologs, Postlethwait et al. (1998) discovered that a significant fraction of genes show conserved synteny between the genomes, that is, they lie in conserved chromosome segments. In general, the likelihood that a syntenic relationship will be disrupted correlates with the physical distance between the loci and the evolutionary distance between the species. Despite the 450 million years of evolutionary distance between zebrafish and human (Kumar and Hedges, 1998), analyses have identified 167 conserved syntenies involving two or more putatively orthologous genes (Gates et al., 1999; Woods et al., 2000). Furthermore, the analyses also identified 136 orthologous pairs that were not members of conserved syntenies. While these may reflect errors in mapping or in orthology determination, they may also nucleate additional synteny groups as additional genes are mapped. A minimum of ~300 conserved synteny groups was thus estimated between the zebrafish and human genomes (Woods et al., 2000). Similar results were obtained in another study done at the same time (Barbazuk et al., 2000).
Analyses of mouse and human, as well as zebrafish and human, synteny groups have also led to the conclusion that mouse and human, which diverged ~112 million years ago (mya), show greater conservation than zebrafish and human (Gates et al., 1999; Woods et al., 2000). Despite the current gaps in the zebrafish-human comparative map, conservation of synteny between the two has had several uses. First, such analyses have been valuable in defining candidate genes for zebrafish mutants (Karlstrom et al., 1999; Schmid et al., 2000). For example, the yot locus was mapped to linkage group 9 (LG9), which had been shown to be syntenic to human chromosome 2. A survey of genes on human chromosome 2, together with an inference that yot mutations affected Hedgehog signalling, led to the identification of gli2 as a candidate for yot (Karlstrom et al., 1999). Second, the correspondence between the zebrafish and human genomes may be used to predict orthologous gene relationships (Barbazuk et al., 2000). While orthologs are best identified by branching patterns on phylogenetic trees, this approach is not feasible for many of the ESTs (Woods et al., 2000). Sequence-based prediction of gene orthology is, however, sometimes unreliable, particularly in the case of multigene families. A synteny-based approach might be useful in resolving the issue. Based on the syntenic correspondence of zebrafish and human genomes, Barbazuk et al. (2000) suggested human orthologs for 20 genes or ESTs out of 32 whose ortholog relationships could not be confidently identified by BLAST. Third, zebrafish comparative maps can help in the understanding of the vertebrate genome, particularly as a valuable outgroup, distinguishing shared features of mammalian genomes from those derived from ancestral genomes (Postlethwait et al., 1998, 2000; Gates et al., 1999; Woods et al., 2000).
Comparative mapping data suggest that a genome duplication event occurred early in the lineage leading to zebrafish following its divergence from the tetrapods. Numerous studies reveal that teleost gene families often contain more members than the equivalent families in mammals (reviewed in Wittbrodt et al., 1998). For example, there are four engrailed genes in zebrafish while tetrapods have only two members (Force et al., 1999). Mapping studies also suggest that these events were the result of whole-genome duplication instead of tandem duplications, as zebrafish has two copies of large chromosome segments surrounding the engrailed genes that are syntenic to mammalian genomes. The findings on the engrailed genes were corroborated by similar studies (Amores et al., 1998; Postlethwait et al., 1998; Gates et al., 1999). Evidence in other teleosts like medaka and pufferfish suggests that this event occurred early in the evolution of the teleost lineage (Wittbrodt et al., 1998; Smith et al., 2002). The data from such studies can also help clear up the origin of the human genome. In their analysis of zebrafish comparative maps, Postlethwait et al. (2000) have put forward some intriguing hypotheses addressing whether certain mammalian chromosomes may have been part of larger composite chromosomes that subsequently underwent chromosome fission in different mammalian lineages. Following the whole genome duplication in zebrafish after its divergence from the tetrapods, zebrafish should have twice as many chromosomes as humans in the absence of chromosome rearrangements. Zebrafish, however, has only 25 chromosomes in the haploid set, 2 more than humans. By examining the loci in zebrafish and the various tetrapods (human, mouse and cat), Postlethwait et al. (2000) suggest that tetrapods and fish both had a low-numbered ancestral vertebrate karyotype, possibly 12 or 13 chromosomes in the haploid set.
In the single round of duplication leading to the teleost lineage, these would have doubled to the 24 or so chromosomes characterizing most fish genomes while in mammals, these would have broken apart into the high-numbered karyotypes defining many mammalian genomes.

2.5 Experimental Tractability

Another virtue of the zebrafish is the array of cellular, molecular and genetic techniques available in the zebrafish system. Methods of introducing DNA into zebrafish embryos have included microinjection, electroporation and the use of microprojectiles. The microinjection of plasmid DNA has proven to be the most reliable method of producing transgenic zebrafish. Transgenic zebrafish carrying green fluorescent protein (GFP) derivatives have been successfully generated for many studies, including cell lineage tracing experiments, promoter studies and tissue-specific transgene expression (reviewed in Gong et al., 2001). Such GFP transgenic fish under the control of tissue-specific promoters may prove useful in future mutagenesis studies targeting specific tissues and organs. Other types of transgenics have also been developed in zebrafish, including the GAL4-UAS (Scheer and Campos-Ortega, 1999) and Cre-loxP systems, which allow one to express a gene product in a directed stage- and tissue-specific manner. Such systems allow the function of a gene product to be determined in any given process, particularly in cases where its function in later stages is obscured by phenotypic consequences accrued in the early stages of embryogenesis. More recently, Ando et al. (2001) reported a new method of conditional gene expression in zebrafish involving photo-mediated activation of caged mRNA. This method is simple, rapid and economical, not requiring the generation of any transgenic lines.
It involves the chemical modification of RNA by a synthetic compound, 6-bromo-4-diazomethyl-7-hydroxycoumarin (Bhc-diazo), which forms a covalent bond with the phosphate group on the backbone of RNA, inactivating or caging the RNA. This Bhc-caged mRNA is reactivated by photoillumination with long-wave ultraviolet (UV) light (350-365 nm) as Bhc undergoes photolysis, uncaging the RNA. Using this method, Ando et al. (2001) showed that Bhc-caged Gfp mRNA had severely reduced translational activity in vitro, whereas illumination of Bhc-caged mRNA with UV light led to partial recovery of translational activity. Besides gain-of-function analyses using the ectopic expression of genes, loss-of-function analyses are also important to fully determine the function of a gene in vivo. While reverse genetics approaches such as gene knockouts used to be severely lacking in zebrafish, or rather in all vertebrate systems other than the mouse, recent advances have improved the prospects in zebrafish. Ma et al. (2000) demonstrated that zebrafish cells obtained from short-term cell cultures could generate germ-line chimeras following their introduction into a host embryo. Shuo Lin and colleagues reported nuclear transfer in zebrafish using long-term-cultured donor cells (Lee et al., 2002), holding promise for gene targeting in zebrafish. Recently, Wienholds and colleagues reported the first successful generation of a fish mutant for rag-1 by reverse genetics (Wienholds et al., 2002). In this method, male fish were first mutagenized by ENU and crossed with wild-type females. Sperm was then collected from individual F1 fish. After nested PCR amplification screening for a mutation in a gene of interest, they recovered and bred "target-selected" zebrafish. Although further steps are still required to develop the gene knockout methodology, the work reported in these studies shows promise for introducing targeted mutations into zebrafish in the future.
While gene knockout technology is still not available, the advent of translation-blocking morpholino oligonucleotides has led to a method of sequence-specific gene inactivation in zebrafish (Nasevicius and Ekker, 2000; Ekker and Larson, 2001; Malicki et al., 2002). Morpholinos have been shown to effectively and specifically induce phenotypes similar to those of chemically induced loss-of-function mutations (Nasevicius and Ekker, 2000). More recently, a new reverse genetics tool was described in zebrafish using modified peptide nucleic acids (MPNA) to selectively shut down the production of individual proteins (Jesuthasan, 2002; Urtishak, 2002). As a variant of a reverse genetic screen, large-scale whole-mount in situ hybridisation screens are feasible in zebrafish owing to the transparency of the embryos. Such screens have been used successfully to identify important genes involved in embryonic development (Meng et al., 1999; Kudoh et al., 2001).

2.6 Zebrafish: From Disease Modelling to Drug Discovery

The repertoire of techniques available in zebrafish has added to its sheer elegance as a model organism, and the zebrafish is uniquely positioned to bridge the gap between its vertebrate and invertebrate counterparts in studies of development and genetics. In addition to its developmental advantages, recent studies indicate that the zebrafish has great potential to serve as a model for human diseases that range from heart failure and vascular disease to fields as diverse as osteoporosis, renal failure, Parkinson’s disease, diabetes and cancer (for recent reviews, see Shin and Fishman, 2002; Ackermann and Paw, 2003). Many of the mutant phenotypes identified in the mutagenesis screens are reminiscent of human clinical disorders.
The validity of using the zebrafish as a model for human disease is illustrated by the various examples of zebrafish mutant phenotypes with clinical relevance in the fields of haematopoiesis (Brownlie et al., 1998; Wang et al., 1998) and cardiac and renal development (reviewed in Dooley and Zon, 2000; Ward and Lieschke, 2002), among others. The study of the biology of the phenotypes has provided new insights into the pathophysiology of the disease. For example, the work of Brownlie et al. (1998) in identifying the sauternes (sau) mutant yielded the first animal model of congenital sideroblastic anaemia (CSA) in humans. The sau mutant is characterized by delayed erythroid maturation and abnormal globin gene expression, resulting in a microcytic, hypochromic anaemia. Positional cloning identified the mutant gene as encoding an erythroid-specific enzyme, δ-aminolevulinate synthase (ALAS2), required for haem biosynthesis. In humans, mutations in ALAS2 cause CSA. More recently, Langenau et al. (2003) reported the induction of clonally derived T cell acute lymphoblastic leukemia in transgenic zebrafish expressing mouse c-myc under control of the zebrafish Rag2 promoter. Such transgenic oncofish may be used in drug screens for prevention and treatment of tumours as well as in genetic screens for identifying mutations that suppress or enhance tumorigenesis. The current momentum behind the zebrafish as a model organism augurs well not only for developmental biologists, but also for those dissecting the genetic components of human disease. The ex utero development of transparent zebrafish embryos also lends itself to the search for drugs and novel therapeutic approaches in a ‘chemical genetic’ approach (Peterson et al., 2000; Shin and Fishman, 2002; Kidd and Weinstein, 2003; Langheinrich, 2003). The zebrafish embryo is permeable to many small molecules.
This feature, together with the small size of the zebrafish embryo, allows for the simultaneous screening of a large number of drugs following exposure of the embryos to a library of low molecular weight compounds in 96-well plates. In an elegant study by Peterson et al. (2000), the effects of ~1000 small molecules on zebrafish development were screened simultaneously by monitoring whole zebrafish embryos for anatomic alterations at frequent intervals. Peterson and his colleagues were able to identify several small molecules that modulated various aspects of vertebrate ontogeny. In particular, their results allowed them to dissect the logic of melanocyte and otolith development and identify the critical periods for these events. Such results indicate the unexplored potential of chemical screening to dissect developmental processes and identify novel genes in vertebrate development. Thus, such studies hold promise for preclinical drug discovery as well as toxicological evaluation.

3. Rationale of the Project

With the aim of identifying novel zebrafish genes important in embryonic development, we had previously performed a small-scale in situ hybridisation screen in zebrafish embryos with 75 unidentified clones derived from a subtracted embryonic cDNA library (Wu, 1999). Our focus was on genes whose expression is spatially and temporally regulated during development, as many genes with developmental regulatory function are expressed in a regionalized fashion. Screens of this nature have been carried out in Xenopus, Drosophila, mouse and zebrafish embryos, yielding a large selection of genes with highly regulated expression patterns (Gawantka et al., 1998; Kopczynski et al., 1998; Neidhardt et al., 2000; Kudoh et al., 2001). Such studies supplement mutagenesis screens, which require the laborious process of moving from mutant to gene.
Moreover, as mutagenesis screens rely heavily on a “phenotype first” approach, genes with subtle loss-of-function phenotypes or genes whose function can be compensated for by other genes or pathways are unlikely to be found. In our screen, we found that 19 out of the 75 (25.3%) clones presented a restricted expression pattern. Six of these clones were sequenced completely and we found two of them to encode novel proteins. In particular, one clone, ES34, was expressed specifically in the somites and possessed an evolutionarily conserved protein domain known as the kelch motif. The kelch motif was first discovered as a sixfold tandem element in the Drosophila kelch protein that is essential for oogenesis (Xue and Cooley, 1993). It is a segment of 44-56 amino acids in length, and multiple sequence alignment reveals eight key conserved residues, including four hydrophobic residues followed by a double glycine element, separated from two characteristically spaced aromatic residues (Adams et al., 2000). Proteins containing kelch repeats appear to play fundamental roles in cellular activities, as evident from the pathological consequences of mutations in kelch repeats that have been found in humans and mouse (Bomont et al., 2000; Nemes et al., 2000; Bradybrook et al., 2001; VanHouten et al., 2001). For example, Bomont et al. (2000) found that a kelch protein, gigaxonin, is mutated in giant axonal neuropathy, which corresponds to a generalized disorganization of the cytoskeletal intermediate filaments. This report is in agreement with other studies in which kelch proteins are emerging as key links between microfilaments and a variety of cellular structures and functions (reviewed in Adams et al., 2001). Considering the roles this family of proteins may play in human health and disease, it is of interest to isolate the full-length cDNA clone of this gene from zebrafish.
This would allow us to deduce the complete amino acid sequence for comparison with its human ortholog. Further study of its expression pattern in zebrafish will help predict the expression and function of the novel human orthologous gene.

Chapter II Materials and Methods

1. Cloning of Full Length Zebrafish klhl cDNA

1.1 Rapid Amplification of cDNA Ends (RACE)-PCR

Polymerase chain reaction (PCR) is a powerful tool to amplify DNA fragments millions of times using a thermostable DNA polymerase and a pair of primers. The RACE procedure, or one-sided PCR, is a method by which the PCR technique can be used to amplify the 3’ and 5’ ends of a cDNA using a small stretch of known sequence within the gene. The full-length 5’ cDNA sequence of ES34 was obtained using the RACE-PCR method from a cDNA library made from 24 hpf embryos (generously provided by Dr Vladimir Korzh, Fish Developmental Biology, Institute of Molecular Agrobiology) constructed in pBK-CMV (Fig. 1) using the Lambda Uni-Zap XR cloning system (Stratagene, USA). The cDNAs were cloned uni-directionally between the EcoRI and XhoI sites (5’→3’) of pBK-CMV. Two gene-specific primers, KR1 (5’-CAGCATCTAGGGACTTCCAT-3’) and KR2 (5’-TTTGCCACTGGTTTGAGGAT-3’), and a vector antisense primer, T3, were used for amplification. The components of this PCR (50 µl) included 5 µl of 10X PCR buffer (0.5 M KCl; 0.1 M Tris-HCl, pH 8.8; 15 mM MgCl2; 1% Triton X-100), 2.5 µl of 2 mM dNTP, 0.5 µl of 0.2 µg/µl sense primer, 0.5 µl of 0.2 µg/µl antisense primer, 0.2 µl of 5 U/µl Taq polymerase and 1 µl template DNA. The cycling conditions were as follows: 94 °C/5 min, 30 cycles of 94 °C/30 sec, 55 °C/1 min, and 72 °C/1 min, and finally 72 °C/5 min. The amplification was carried out in a Hybaid PCR Express thermal cycler.
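As a quick arithmetic check on the reaction described above, the following Python sketch derives the final concentrations in the 50 µl reaction and the total programmed hold time of the cycling profile. It is an illustration only, not part of the original protocol: the variable and function names are hypothetical, the remaining volume is assumed to be made up with water (the text does not state this), and the time estimate ignores ramping between temperatures.

```python
# Illustrative check of the RACE-PCR reaction set-up described in the text.
# All names are hypothetical; stock concentrations and volumes are from the text.

TOTAL_UL = 50.0  # total reaction volume in microlitres


def final_conc(stock, volume_ul, total_ul=TOTAL_UL):
    """Simple dilution: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


# Volumes (in microlitres) added to each reaction.
volumes_ul = {
    "10X buffer": 5.0,
    "dNTP (2 mM)": 2.5,
    "sense primer (0.2 ug/ul)": 0.5,
    "antisense primer (0.2 ug/ul)": 0.5,
    "Taq (5 U/ul)": 0.2,
    "template": 1.0,
}

# Assumption: the balance of the 50 ul reaction is water.
water_ul = TOTAL_UL - sum(volumes_ul.values())

# Final concentrations contributed by 5 ul of 10X buffer, and by the dNTP stock.
final_mM = {
    "KCl": final_conc(500.0, 5.0),    # 0.5 M stock
    "Tris-HCl": final_conc(100.0, 5.0),  # 0.1 M stock
    "MgCl2": final_conc(15.0, 5.0),   # 15 mM stock
    "dNTP": final_conc(2.0, 2.5),     # 2 mM stock
}

primer_ug = 0.2 * 0.5  # micrograms of each primer per reaction
taq_units = 5.0 * 0.2  # units of Taq polymerase per reaction

# Programmed hold time in minutes (ramping between temperatures not included).
hold_minutes = 5 + 30 * (0.5 + 1 + 1) + 5

if __name__ == "__main__":
    print(f"water to add: {water_ul:.1f} ul")
    for name, conc in final_mM.items():
        print(f"{name}: {conc:g} mM final")
    print(f"primer: {primer_ug:g} ug each, Taq: {taq_units:g} U")
    print(f"programmed hold time: {hold_minutes:g} min")
```

Under these assumptions the buffer is at 1X (1.5 mM MgCl2), each reaction contains 1 U of Taq polymerase, and the programme holds for 85 minutes in total.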
All PCR products were run on a 1% agarose gel containing 0.5 µg/ml ethidium bromide in 1X TAE buffer and visualized on a 312 nm UV box (Model TF-35M UV transilluminator, Vilber Lourmat, France).

Fig. 1. Map of pBK-CMV vector (reproduced from Stratagene catalogue)

1.2 Recovery of DNA Fragments from Agarose Gel

The QIAquick Gel Extraction Kit (Qiagen, USA) was used to recover the DNA fragments of interest from the agarose gel according to the manufacturer’s instructions. Briefly, the gel slice containing the DNA band was cut from the gel, melted at 50°C in Buffer QX1 for 10 minutes and then loaded into a QIAquick spin column. The volume of Buffer QX1 used was approximately three times the gel slice volume. The column was centrifuged at 14,000 rpm for 1 minute, washed by adding 0.75 ml of Buffer PE, and spun again. After residual Buffer PE was removed by spinning at 14,000 rpm for 1 minute, 30 µl of H2O was added to the centre of the column. The column was incubated at room temperature for 1 minute and the DNA fragment was eluted into a 1.5-ml centrifuge tube by centrifugation at 14,000 rpm for 1 minute.

1.3 Ligation

The recovered PCR products were cloned into the pT7Blue T-vector system (Novagen, USA) (Fig. 2). The pT7Blue T-vector was prepared by the manufacturer by cutting the vector with EcoRV and adding a 3’ terminal thymidine to both ends. These single 3’-T overhangs at the insertion site greatly improve the efficiency of ligation of PCR products into the vector because Taq DNA polymerase generates a 3’ adenine overhang on the PCR products. The ligation reaction was carried out in a 20 µl reaction volume containing 2 µl of 10X ligation buffer (0.3 M Tris-HCl, pH 7.8; 0.1 M MgCl2; 0.1 M DTT and 5 mM ATP), insert DNA, vector DNA and 1 unit of T4 DNA ligase (Gibco BRL, USA). The molar ratio of insert to vector DNA was usually 3:1. The ligation reaction was incubated at 16°C overnight.

Fig. 2.
Map of pT7Blue vector (reproduced from Novagen catalogue)

1.4 Transformation

1.4.1 Preparation of Competent Cells

For the preparation of competent bacterial cells, 2 ml of LB broth was inoculated with a single fresh colony of Escherichia coli strain DH5α and incubated overnight at 37°C with shaking at 250 rpm. The following morning, 0.5 ml of the culture was re-inoculated into a 250 ml flask containing 50 ml of LB broth and shaken at 250 rpm at 37°C until the OD600 reached around 0.5. The culture was transferred into 50 ml Falcon 2070 tubes and chilled on ice for 15 minutes. Cells were pelleted by centrifugation at 1,000 g at 4°C for 15 minutes. The pelleted cells were drained thoroughly and resuspended in 1/3 of the starting culture volume of RF1 (100 mM RbCl; 50 mM MnCl2; 30 mM potassium acetate; 10 mM CaCl2 and 15% glycerol). After incubation on ice for 15 minutes, the cells were spun down and resuspended in 1/12.5 of the original volume of RF2 (10 mM MOPS; 10 mM RbCl; 75 mM CaCl2; 15% glycerol). After another 15 minutes of incubation on ice, the competent cells were transferred into 1.5-ml microcentrifuge tubes in aliquots of 100 µl and fast-frozen in liquid nitrogen. These aliquots can be stored at -80°C for several months.

1.4.2 Transformation

Normally, 10 µl of ligation reaction was added to 100 µl of E. coli DH5α competent cells and incubated on ice for 30 minutes. This was followed by a heat shock at 37°C for 90 seconds, after which the tube was immediately placed on ice for 2 minutes. 900 µl of LB medium was added to the transformation mixture, which was then incubated at 37°C for 1 hour with shaking at 200 rpm. 1/10 and 9/10 of the transformation mixture were spread onto two separate LB plates supplemented with ampicillin (50 µg/ml) so that at least one plate would carry a suitable density of transformant colonies. The plates were incubated at 37°C overnight.
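The buffer volumes in section 1.4.1 are given as fractions of the starting culture volume, so they are easy to tabulate for any culture size. The following Python sketch does this for the 50 ml culture described above. The function name is hypothetical, and the aliquot count is only an upper estimate because it ignores the volume of the cell pellet and pipetting losses.

```python
def competent_cell_volumes(culture_ml, aliquot_ul=100.0):
    """Return (RF1 ml, RF2 ml, number of aliquots) for a given culture volume.

    RF1 is used at 1/3 and RF2 at 1/12.5 of the starting culture volume,
    as described in section 1.4.1.
    """
    rf1_ml = culture_ml / 3.0
    rf2_ml = culture_ml / 12.5
    # Upper estimate: assumes the final suspension volume equals the RF2 volume.
    n_aliquots = int(rf2_ml * 1000.0 // aliquot_ul)
    return rf1_ml, rf2_ml, n_aliquots


rf1_ml, rf2_ml, n_aliquots = competent_cell_volumes(50.0)
print(f"RF1: {rf1_ml:.1f} ml, RF2: {rf2_ml:.1f} ml, aliquots of 100 µl: {n_aliquots}")
```

For the 50 ml culture this corresponds to roughly 16.7 ml of RF1 and 4 ml of RF2.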
1.5 Colony Screening

PCR can be applied to screen for the correct recombinant DNA directly from bacterial colonies, as DNA is effectively released from the bacterial cells by the repeated high temperatures during PCR. A pair of vector primers flanking the cloned insert reveals the size of the insert in a single PCR. For PCR screening, the colonies to be examined were marked in numerical order. A toothpick was used to touch each colony and the attached bacteria were spread on the bottom of a PCR tube preloaded with 20 µl of PCR mixture containing 0.6 units of Taq DNA polymerase, 2 µl of 10X PCR buffer, 1 µl of 2 mM dNTP mix and 0.2 µg each of the sense and antisense primers. T7 and U19 primers were used for the PCR. The PCR program included an initial denaturation at 94°C for 5 minutes, followed by 30 cycles of denaturation at 94°C for 30 seconds, annealing at 55°C for 45 seconds and elongation at 72°C for 1.5 minutes. The PCR products were examined on a 1% agarose gel. Colonies that yielded PCR products of the expected size were inoculated for plasmid DNA preparation.

1.6 Isolation and Purification of Plasmid DNA

Small-scale preparation of plasmid DNA was carried out using the Wizard Plus SV Miniprep Kit (Promega, USA) according to the centrifugation protocol described by the manufacturer. This protocol involved alkaline lysis and binding of the plasmid to a spin column, followed by elution of the DNA with water. 3 ml of overnight bacterial culture in LB-ampicillin (50 µg/ml) medium was harvested by centrifugation at 10,000 g for one minute using the 5417C centrifuge (Eppendorf, Germany). The bacterial pellet was resuspended in 250 µl of Cell Resuspension Solution (50 mM Tris-HCl, pH 7.5; 10 mM EDTA; 100 µg/ml RNase A). 250 µl of Cell Lysis Solution (0.2 M NaOH, 1% SDS) was added to the bacterial suspension and mixed by gently inverting the tube several times.
10 µl of alkaline protease (25 µg/µl) was then added and the mixture was incubated for 5 minutes at room temperature. 350 µl of neutralisation solution (4.09 M guanidine hydrochloride; 0.759 M potassium acetate; 2.12 M glacial acetic acid) was added to neutralize the mixture. After centrifugation at 14,000 rpm for 10 minutes, the clear lysate was transferred to a spin column in a collection tube and centrifuged for 1 minute at 14,000 g. The flow-through was discarded and the column was re-inserted into the collection tube. 750 µl of wash solution (60 mM potassium acetate; 10 mM Tris-HCl, pH 7.5; 60% ethanol) was added to the spin column, which was centrifuged at 14,000 rpm for 1 minute. This step was repeated with 250 µl of wash solution and centrifugation at 14,000 rpm for 2 minutes. The spin column was next transferred to a sterile 1.5-ml microcentrifuge tube, and 50 µl of sterile water was applied and left to stand for one minute. Plasmid DNA was eluted by centrifugation at 14,000 g for one minute.

1.7 Automated Sequencing

Automated sequencing reactions were carried out using the ABI Prism Big Dye Terminator Cycle Sequencing Ready Reaction Kit (Perkin Elmer Applied Biosystems, USA). The gene-specific primers used to obtain the full-length sequence were KR-int (5’-GACCCTGTCTCTATACACCA-3’) and ES34-int (5’-CGGTCAGCGCAGGCCGTCCG-3’). Each sequencing reaction (10 µl) contained 4 µl of Terminator Ready Reaction Mix (Perkin Elmer, USA), 200-500 ng of double-stranded DNA and 3.2 pmol of primer. PCR was performed using the GeneAmp PCR System 9600 (Perkin Elmer) with 25 cycles of 96°C/10 seconds, 50°C/5 seconds and 60°C/4 minutes, followed by a hold at 4°C. Ethanol precipitation was carried out to purify the extension products. 2 µl of 3 M NaOAc (pH 4.6) and 50 µl of 95% ethanol were mixed with the 20 µl reaction mix and incubated at room temperature for 15 minutes. The tube was spun for 20 minutes at 14,000 rpm, 4°C. The pellet was rinsed with 250 µl of 70% ethanol and air-dried.
The DNA pellet was dissolved in 4 µl of loading dye (50 ml contains 1 ml of 25 mM EDTA, pH 8.0; 10 ml of deionised formamide; 50 mg of Dextran blue and 39 ml of H2O) and heated to 92°C for 3 minutes. Samples were then chilled on ice for 2 minutes before being loaded into the wells of a 6% polyacrylamide sequencing gel (50 ml of gel mix contains 5 ml of Long Ranger gel solution; 5 ml of 10X TBE; 26 ml of H2O; 18 g of urea; 250 µl of 10% APS and 35 µl of TEMED). Electrophoresis was carried out at 1,690 volts for 5-9 hours. The sequencing ladders were analysed automatically by an ABI PRISM 377 DNA sequencer system and software.

1.8 Sequence Homology Search

DNA sequences were submitted to FASTA (http://www2.ebi.ac.uk/fasta3/) for sequence homology searches against all DNA entries in the EMBL database (most recent release plus new entries). Motif searches were conducted using Pfam (http://www.sanger.ac.uk/Software/Pfam/search.shtml). Sequence alignments were performed using the CLUSTAL program.

2. Characterization of Zebrafish klhl Expression

2.1 Northern Hybridisation

2.1.1 Isolation of Total RNA

Total RNA from zebrafish embryos and different adult tissues was extracted using TRIzol reagent (Gibco BRL). Briefly, about 200 embryos or 100 mg of tissue were frozen in liquid nitrogen and homogenized in 1 ml of TRIzol reagent. The homogenate was incubated at room temperature for 5 minutes to allow nucleoproteins to dissociate before chloroform was added. The mixture was shaken vigorously by hand for 15 seconds and incubated for another 5 minutes. This was followed by centrifugation at 12,000 g for 15 minutes at 4°C to separate the aqueous and organic phases. 500 µl of the aqueous phase was then transferred to a new tube and an equal volume of isopropanol was added. The RNA was precipitated by incubation at room temperature for 10 minutes, after which it was pelleted by centrifugation at 12,000 g for 10 minutes at 4°C and washed with 1 ml of 70% ethanol.
The RNA pellet was then dissolved in 15 µl of DEPC (diethyl pyrocarbonate)-treated water. RNA was quantified by optical density readings at 260 nm and 280 nm using a UV-1601 spectrophotometer (Shimadzu, Japan). One OD260 unit is equivalent to 40 µg/ml of RNA, and an OD260:OD280 ratio >2.0 was taken to indicate good-quality RNA.

2.1.2 Formaldehyde RNA Gel Electrophoresis and Blotting

10 µg of total RNA was fractionated on a 1.2% denaturing agarose gel (100 ml of gel contains 1.2 g of agarose, 10 ml of 10X MOPS, 73 ml of H2O and 17 ml of 37% formaldehyde). Each RNA sample contained 50% formamide, 1X MOPS, 7% formaldehyde and 0.1 mg/ml ethidium bromide, and was heated at 65°C for 10 minutes before loading with 1X loading buffer (0.4% bromophenol blue; 6% sucrose in water). The gel was run at 75 volts in running buffer containing 1X MOPS and 3% formaldehyde until the dye was near the end of the gel. After electrophoresis, the gel was rinsed in distilled water and photographed alongside a ruler to record the migration distances of the bands. The RNA was transferred to Hybond™-N nylon membrane (Amersham, USA) overnight using 20X SSC (3 M NaCl; 0.3 M sodium citrate, pH 7.0) as the transfer buffer. The membrane was then air-dried and cross-linked by UV irradiation on a 312 nm UV box for 3 minutes.

2.1.3 Labelling of Radioactive Probe

The cDNA fragments were amplified by PCR using the vector primers and used as templates for probe labelling with the Random Primers DNA Labelling System (Gibco BRL, USA). Random priming involves the use of random hexamers that anneal at random sites along the template. Klenow polymerase was used to extend the hexamers and, at the same time, to incorporate the labelled nucleotide, [α-32P] dATP (specific activity 3000 Ci/mmol; 10 mCi/ml in aqueous solution) (Amersham, USA), into the probe. 25 ng of DNA in 10.5 µl of MilliQ water was denatured at 100°C for 5 minutes and cooled on ice immediately.
1 µl each of dGTP, dTTP and dCTP, 7.5 µl of Random Primers Buffer Mixture and 2.5 µl of [α-32P] dATP were added, giving a total volume of 25 µl after the addition of 0.5 µl of Klenow Fragment. The reaction was incubated at room temperature for 1 hour and terminated by adding 2.5 µl of the STOP solution to the reaction mixture. After labelling, a NICK column (Pharmacia, Sweden) was used to remove unincorporated radiolabelled nucleotides, as this gives a higher signal-to-background ratio in the hybridisation results. The column contains Sephadex® G-50 DNA Grade, which serves as a gel filtration matrix. Large fragments that do not interact with the matrix are eluted first, while unincorporated nucleotides that are trapped in the pores of the matrix are eluted later. First, the cover of the column was removed to allow the column buffer to drain out. The column was then washed with 2 ml of TE buffer (10 mM Tris; 10 mM EDTA, pH 8.0). After that, the labelling reaction was loaded onto the top of the column. The labelled DNA fragments were eluted three times with 400 µl of TE buffer each. The eluates were collected in three separate Eppendorf tubes and monitored for radioactivity by liquid scintillation counting: 2 µl of each eluate was mixed with 3 ml of scintillation fluid (BCS, Amersham) and counted for 120 seconds on a Wallac Guardian 1414 Liquid Scintillation Counter (Biolaboratories, USA). Normally the second eluate, which had the highest radioactivity, was used as the probe, at 1 x 10^6 cpm per ml of hybridisation buffer.

2.1.4 Hybridisation

Prehybridisation was conducted to prevent non-specific hybridisation of the probe. The denatured salmon sperm DNA acted as a blocking reagent that helped to reduce the background signal. The membranes were placed in a hybridisation-rolling bottle (HB-OV-BS, Hybaid) with the DNA side facing inwards.
The bottle contained 5 ml of hybridisation buffer (50% formamide; 5X Denhardt’s solution; 4X SET; 0.2% NaPPi; 25 mM phosphate buffer; 0.5% SDS; 100 µg/ml denatured salmon sperm DNA and 10% w/v dextran sulfate). The 20X SET stock solution consisted of 3 M NaCl; 0.6 M Tris, pH 8.0 and 40 mM EDTA. Salmon sperm DNA stock (10 mg/ml) was denatured in boiling water for 5 minutes and then kept on ice for 5 minutes. The bottle was then transferred to a hybridisation incubator (Mini Oven MKII, Hybaid). Prehybridisation was carried out at 42°C for more than 2 hours with a rotation speed of 7 rpm. Labelled probe was denatured at 92°C for 5 minutes in a heat-block (type 17600, Thermolyne, USA) and then immediately chilled on ice for 5 minutes. The hybridisation bottle was taken out of the incubator and the probe was added to the buffer to a final concentration of 1 x 10^6 cpm/ml. Hybridisation was performed at 42°C for 16 hours.

2.1.5 Washes and Autoradiography

After hybridisation, the buffer was discarded and 20 ml of washing solution (2X SET; 0.2% NaPPi and 0.5% SDS) was added. The hybridisation bottle was agitated by gentle shaking at room temperature for 15 minutes with two changes of solution, after which two further washes were performed at 65°C for 20 minutes each. Pre-warmed wash solution was used and the incubator was preset at 65°C. A final stringent wash with final wash solution (0.2X SET; 0.5% SDS) at 65°C for 20 minutes was carried out only if the radioactivity count was still too high. The membrane was wrapped in Saran wrap to keep it moist. An X-Omat film (Kodak) was placed on the membrane and exposed at -80°C overnight. The autoradiogram was developed using the M35 X-Omat developer (Kodak, USA).

2.1.6 Membrane Stripping

The probe hybridised to the membrane was stripped away by washing the membrane in stripping solution (0.05X SET, 0.1% SDS) at 80°C for 30 minutes.
The membrane was then air-dried, ready for reprobing.

2.2 Whole-Mount In Situ Hybridisation on Zebrafish Embryos

2.2.1 Probe synthesis

5 µg of plasmid DNA was linearized at the 5’ end of the cDNA insert by SmaI digestion at 37°C for 45 minutes. The digestion reaction was stopped by phenol/chloroform extraction, followed by ethanol precipitation. The linearized DNA was resuspended in 20 µl of water. 1 µg of linearized DNA was used to synthesise the digoxigenin (DIG)/Fluorescein probe. The reaction was performed at 37°C for 2 hours in a total volume of 20 µl containing 4 µl of 5X transcription buffer (Stratagene), 2 µl of DIG/Fluorescein-NTP mix [10 mM ATP, 10 mM CTP, 10 mM GTP, 6.5 mM UTP and 3.5 mM DIG/Fluorescein-UTP (Boehringer, Germany)], 1 µl of RNase inhibitor (40 U/µl) (Promega) and 1 µl of T7 RNA polymerase (50 U/µl) (Promega). Following the reaction, 2 µl of RNase-free DNase I (Promega) was used to digest the DNA template at 37°C for 15 minutes. Digestion was stopped by adding 1 µl of 0.5 M EDTA (pH 8.0). 2.5 µl of 4 M LiCl and 75 µl of cold 100% ethanol were added to precipitate the RNA. After washing with 70% ethanol, the RNA probe was resuspended in 100 µl of DEPC-treated water. The probe was purified using Chroma Spin-100 DEPC-H2O columns (Clontech, USA) by spinning at 700 g for 5 minutes to remove impurities and small RNA fragments.

2.2.2 Preparation of staged zebrafish embryos

Zebrafish (Danio rerio) were purchased from local aquarium fish farms, and their embryos were staged according to 'The Zebrafish Book' (Westerfield, 1995) and presented as hours post fertilization (hpf) at 28.5°C. To avoid pigment development in later-stage (>30 hpf) embryos, 0.003% PTU (1-phenyl-2-thiourea, Sigma) was added when the embryos were 10-16 hpf. Staged embryos were fixed in 4% PFA (paraformaldehyde)/PBS (phosphate buffered saline: 0.8% NaCl; 0.02% KCl; 0.0144% Na2HPO4; 0.024% KH2PO4, pH 7.4) for 12 to 24 hours at room temperature or 4°C.
Embryos younger than 16 hpf were fixed before dechorionation and the chorion was removed afterwards. Embryos older than 16 hpf were dechorionated before fixation. Older embryos with tails were chilled on ice for 15 minutes before fixation to prevent curling of the tails. After fixation, the embryos were washed in PBST (PBS, 0.1% Tween 20) twice for one minute and four times for 30 minutes at room temperature on a nutator (Clay Adams, Becton Dickinson, USA). Embryos at 24 hpf and older were treated with proteinase K (10 µg/ml, Boehringer, Germany). The time of exposure depended upon the embryonic stage and the specific activity of the proteinase K, which varied from batch to batch. In most cases, the times below were adopted.

Embryonic stage    Proteinase K treatment
16-24 hpf          3-4 minutes
24-32 hpf          5-6 minutes
32-50 hpf          10-20 minutes
50-72 hpf          20-40 minutes

To stop the proteinase K reaction, the proteinase K solution was removed completely and the embryos were fixed again in 4% PFA/PBS for 20 minutes at room temperature. Embryos were then washed in PBST twice for 1 minute each and four times for 20 minutes each.

2.2.3 In Situ Hybridisation

Embryos were prehybridised in 500 µl of PBST + 500 µl of hybridisation buffer (50% formamide, 5X SSC, 50 µg/ml heparin, 500 µg/ml yeast tRNA, 0.1% Tween 20, 10 mM citric acid, pH 6.0) at room temperature for an hour. The solution was replaced with 1 ml of fresh hybridisation buffer and the embryos were prehybridised at 70°C for 2-5 hours. 1 µl of DIG/Fluorescein probe was diluted in 200 µl of hybridisation buffer and denatured at 80°C for 5 minutes, followed by 5 minutes on ice. Embryos of different stages were selected and mixed together, and the buffer containing the probe was added. Hybridisation was performed at 70°C in a circulating water bath overnight. The following morning, the probe solution was removed and replaced with prewarmed 5X SSC. The embryos were washed in this solution at 70°C for two hours, followed by 0.2X SSC at 70°C for two hours.
The embryos were then washed at room temperature twice for five minutes in PBS.

2.2.4 Incubation with Preabsorbed Antibodies

Commercial anti-DIG and anti-fluorescein alkaline phosphatase (AP)-conjugated antibodies (Boehringer) should be preincubated with biological tissue, preferably of the same stage as the samples that will later be used for signal detection, in order to decrease background staining and increase the signal-to-noise ratio. In this study, anti-DIG and Fluor-AP were diluted to 1:500 and 1:50 respectively in 10% FCS/PBS (fetal calf serum in PBS) and incubated with 50 zebrafish embryos of any stage on a nutator at 4°C overnight. After that, the antibody solutions were transferred to new tubes and diluted to 1:5000 and 1:500 respectively with 10% FCS/PBS. 10 µl of 0.5 M EDTA (pH 8.0) and 5 µl of 10% sodium azide were added per 10 ml of antibody solution to prevent bacterial growth. The preabsorbed antibody was stored at 4°C and can be used repeatedly. Following hybridisation, the embryos were incubated in 10% FCS/PBS for two hours at room temperature to block non-specific binding sites for the antibody. The blocking solution was replaced with alkaline phosphatase (AP)-coupled anti-DIG Fab fragments, and the embryos were incubated at 4°C overnight.

2.2.5 Staining

The next day, embryos were washed on a nutator at room temperature as follows: in PBST twice for one minute and four times for 30 minutes, then in PBS twice for five minutes, and finally in buffer 9.5 (0.1 M Tris-HCl, pH 9.5; 50 mM MgCl2; 10 mM NaCl; 0.1% Tween 20) once for 30 seconds and twice for 10 minutes. For staining, 4.5 µl of NBT (50 mg/ml nitroblue tetrazolium in 70% dimethyl formamide, Boehringer Mannheim, Germany) and 3.5 µl of BCIP (5-bromo-4-chloro-3-indolyl phosphate, Boehringer Mannheim, Germany) were added to 1 ml of buffer 9.5, and the embryos were stained in the dark. Staining was stopped by washing the embryos twice for 10 minutes in PBS. Embryos were then transferred to 4% PFA/PBS and kept at 4°C.
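The dilution steps in sections 2.2.4 and 2.2.5 can be checked with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and it assumes that the whole preabsorption mixture is diluted to reach the 10 ml of working antibody solution mentioned above, so the preabsorption volume is derived rather than taken from the text. The NBT figure accounts for the small volume added by the NBT and BCIP stocks.

```python
def antibody_plan(preabsorb_dilution, working_dilution, final_ml=10.0):
    """Volumes for a two-step dilution (e.g. 1:500, then 1:5000).

    Returns (fold, preabsorbed_ml, diluent_ml, stock_ul), where stock_ul is
    the volume of undiluted antibody needed for the preabsorption step.
    """
    fold = working_dilution / preabsorb_dilution        # second-step dilution factor
    preabsorbed_ml = final_ml / fold                    # volume carried over
    diluent_ml = final_ml - preabsorbed_ml              # 10% FCS/PBS to add
    stock_ul = preabsorbed_ml * 1000.0 / preabsorb_dilution
    return fold, preabsorbed_ml, diluent_ml, stock_ul


anti_dig = antibody_plan(500, 5000)    # anti-DIG-AP: 1:500 then 1:5000
anti_fluor = antibody_plan(50, 500)    # Fluor-AP: 1:50 then 1:500

# Final NBT concentration: 4.5 µl of 50 mg/ml stock in 1 ml buffer 9.5 plus 3.5 µl BCIP.
nbt_mg_per_ml = 50.0 * 4.5 / (1000.0 + 4.5 + 3.5)

print("anti-DIG:", anti_dig)
print("Fluor-AP:", anti_fluor)
print(f"NBT final concentration: {nbt_mg_per_ml:.3f} mg/ml")
```

Both antibodies undergo the same tenfold second dilution; only the amount of undiluted stock needed for preabsorption differs.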
2.2.6 Mounting and photography

Selected embryos were washed with PBS twice for 10 minutes each, transferred to 50% glycerol/PBS and equilibrated at room temperature for a couple of hours. For whole-mounts, a single chamber was made by placing stacks of 2-3 small cover glasses on both sides of a 25.4 x 76.2 mm microscope slide. The stacks become solid about 1 hour after a drop of Permount is placed between the cover glasses. A selected embryo was transferred to the chamber in a small drop of 50% glycerol/PBS and oriented with a needle. A 22 x 44 mm cover glass carrying a small drop of the same buffer was placed over the embryo. The orientation of the embryo can be adjusted by gently moving the cover glass. For flat specimens, the yolk of the selected embryo was removed completely with needles. The deyolked embryo was then placed on a slide in a small drop of 50% glycerol/PBS and adjusted to the proper orientation by removing excess liquid and using needles. A fragment of cover glass (as small as possible) was placed over the embryo. Care was taken to avoid bubbles, and a drop of 50% glycerol/PBS was added to fill the space under the cover glass. The specimen was sealed with nail polish along the edge of the cover glass to prevent it from drying. For cross-sections, some of the stained embryos were embedded in a 1.5% agarose/sucrose block and equilibrated in 30% sucrose overnight. On the following day, the embedded embryos were mounted on a holder using tissue freezing medium (Jung) and sectioned with a cryostat microtome (Leica CM 1900) in transverse orientation (15 µm). Sections were placed on Fisherbrand Superfrost/Plus microscope slides and mounted using glycerol/PBS (1:1) solution. The slides were sealed with nail varnish. Photographs were taken using a camera mounted on an Olympus AX-70 microscope (Olympus, Japan). The film used was Kodak Gold 200 ASA.
2.3 Two-Colour Whole-mount In Situ Hybridisation

In two-colour whole-mount in situ hybridisation, two different RNA probes, labelled with DIG and Fluorescein respectively, are used on the same embryos. Fluorescein labelling is performed following the same procedure as DIG labelling, except that Fluor-UTP is used in place of DIG-UTP. For hybridisation, the two probes are added to the same tube at a Fluorescein-to-DIG ratio of 2:1 (Fluorescein is detected less sensitively than DIG). After incubation at 68°C overnight, the probes were removed by washing in 2X SSCT (2X SSC + 0.1% Tween 20) at 68°C for 2 hours, followed by another wash in 0.2X SSCT (0.2X SSC + 0.1% Tween 20) at 68°C for 2 hours. The DIG detection was carried out first, as described in the previous sections (see 2.2.4 and 2.2.5). Following the DIG staining with NBT/BCIP, the embryos were washed with MA buffer (0.15 M maleic acid; 0.1 M NaCl; pH 7.5) twice for 10 minutes each. To remove the phosphatase activity of the first antibody, the embryos were incubated with 0.1 M glycine (pH 2.2) for 30 minutes at room temperature. After that, the embryos were washed in PBST four times for 10 minutes each and then incubated in blocking buffer (5% Blocking Reagent in MA buffer, Boehringer) for 2 hours at room temperature. Subsequently, the embryos were incubated with Anti-Fluorescein-AP (see section 2.2.4) at 4°C overnight. To detect the fluorescein signal, the embryos were first washed with MA buffer 4 times for 1 hour each, followed by washes with Buffer 8.2 (0.1 M Tris-HCl, pH 8.2; 50 mM MgCl2; 10 mM NaCl; and 0.1% Tween 20) three times for 5 minutes each at room temperature.
Embryos were then stained for 1-3 hours in staining buffer, a 1:1 mixture of Fast Red solution (made by dissolving half a Fast Red tablet [Boehringer] in 1 ml of Buffer 8.2, spinning down undissolved particles and transferring the supernatant to a new tube) and NAMP solution (a 1:100 dilution of NAMP stock [50 mg/ml Naphthol AS-MX, Sigma] in Buffer 8.2). The stained embryos were washed in PBST twice for 10 minutes each and can be stored in 4% PFA/PBS for several months.

3. Characterization of Human Ortholog KLHL

3.1 Identification of Human Orthologous Gene KLHL

The putative human homolog of zebrafish klhl was identified by FASTA searches of the sequence databases using the DNA and amino acid sequences of klhl. This orthology was confirmed by synteny analysis of the zebrafish and human genomes. Mapping positions of zebrafish ESTs were identified from the publicly available RH panel maps found at http://zfin.org, http://wwwmap.tuebingen.mpg.de and http://www.ncbi.nlm.nih.gov/genome/guide/D_rerio.html. The map positions of human genes were obtained from http://www.ncbi.nlm.nih.gov/LocusLink and http://www.ensembl.org.

3.2 Cloning of KLHL Fragment

Based on the sequence annotated in Ensembl (http://www.ensembl.org), we designed two gene-specific primers that spanned the second exon of the gene, HC34F (5’-ACAGATGCTTCTTATGGCCC-3’) and HC34R (5’-GAAATCCATCCATCACAGCC-3’), to amplify a 0.9 kb fragment of KLHL from human DNA. The template genomic DNA had been extracted from the leucocytes of peripheral blood and was generously provided by a member of the laboratory (Tong, 2001). The reaction conditions were as previously described in section 1.1. The cloned fragment was sequenced as described in 1.7 to confirm that the desired DNA fragment had been cloned.
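Basic properties of the two primers in section 3.2 can be checked with a short Python sketch. It computes length, GC content and a rough melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C). The Wallace rule is a general rule of thumb for short oligonucleotides and is used here purely for illustration; the thesis does not say how the primers were evaluated, and the function name is hypothetical.

```python
def primer_stats(seq):
    """Return length, GC content (%) and Wallace-rule Tm (°C) for a primer."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return {
        "length": len(seq),
        "gc_percent": 100.0 * gc / len(seq),
        "wallace_tm_c": 2 * at + 4 * gc,  # rough estimate, short oligos only
    }


# Primer sequences as given in section 3.2 (5' to 3').
primers = {
    "HC34F": "ACAGATGCTTCTTATGGCCC",
    "HC34R": "GAAATCCATCCATCACAGCC",
}

stats = {name: primer_stats(seq) for name, seq in primers.items()}
for name, s in stats.items():
    print(f"{name}: {s['length']} nt, GC {s['gc_percent']:.0f}%, Tm ~{s['wallace_tm_c']} °C")
```

Both primers are 20-mers with matched GC content, and the estimated Tm lies a few degrees above the 55°C annealing step of the cycling conditions in section 1.1.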
3.3 Northern Blot Analysis

An MTN blot containing 2 µg of poly A+ RNA isolated from a variety of human tissues (Clontech #7760-1, USA) was hybridised to a 32P-radiolabelled KLHL probe according to the manufacturer’s instructions. The probe was generated as previously described in 2.1.3. Briefly, the MTN blot was prehybridised in 10 ml of ExpressHyb Solution (Clontech) for 30 minutes at 68°C with rolling in a hybridisation incubator. The radioactively labelled probe was heat-denatured at 95-100°C for 5 minutes and immediately chilled on ice. The prehybridisation solution was replaced with fresh ExpressHyb solution containing the probe at a final concentration of 1 x 10^6 cpm/ml, and hybridisation was carried out at 68°C for 1 hour. Following hybridisation, the solution was discarded and 10 ml of wash solution 1 (2X SSC, 0.05% SDS) was added. The blot was washed for 40 minutes with continuous agitation at room temperature with 2 changes of solution. A further two washes were performed with wash solution 2 (0.1X SSC, 0.1% SDS) at 50°C. The blot was removed from the bottle with forceps and immediately covered with Saran wrap. It was subsequently exposed to X-ray film at -80°C overnight. The probe was stripped away by washing the blot with 0.5% SDS preheated to 90-100°C for 10 minutes, after which the blot was left to air-dry. The blot was subsequently hybridised with a β-actin cDNA probe, provided by the manufacturer of the MTN blot (Clontech), as an RNA loading and transfer control.

Chapter III Results

1. Identification of ES34 as a Putative Kelch Repeat Protein

We identified ES34 through a systematic search for novel genes with tissue-specific expression patterns during zebrafish embryogenesis (Wu, 1999). The ES34 clone, selected for its somite-specific expression pattern, contained an insert of 948 bp with a major open reading frame (ORF) encoding a putative polypeptide of 188 amino acids.
At the DNA level, apart from identifying a number of ESTs and genomic sequences, sequence homology searches did not reveal significant matches to any known gene in the database. Analysis of the translated amino acid sequence, however, revealed that it contained a kelch motif and showed ~25% homology to members of a protein family called the kelch-repeat superfamily (reviewed in Adams et al., 2000). The kelch motif is an ancient and evolutionarily conserved sequence motif of 44-56 amino acids in length. Since the motif was first identified in Drosophila kelch (Xue and Cooley, 1993), over 28 kelch-containing proteins have been isolated and characterized (Adams et al., 2002) in organisms ranging from viruses and fungi to mouse and human. These proteins typically contain a series of four to seven motifs that form kelch repeats and have been grouped together into the kelch-repeat superfamily. Members of this superfamily are structurally diverse and have biological roles that range from actin binding to cell regulation, but they are grouped together based on the presence of the kelch repeat. A search of the Pfam motif database (http://www.sanger.ac.uk/Software/Pfam/) with the putative polypeptide encoded by ES34 revealed that the protein contained only 3 kelch repeats. In addition, the ORF of the predicted gene sequence did not have an in-frame stop codon upstream of the first ATG. Thus, ES34 may be a partial cDNA clone, and additional sequence was sought to obtain the full-length clone of this gene. This gene will henceforth be termed klhl, after the various human kelch-like genes belonging to the same family (Soltysik-Espanola et al., 1999; Lai et al., 2000; Nemes et al., 2000; Braybrook et al., 2001; Wang et al., 2001).

2. Molecular Cloning of Zebrafish klhl

Vector- and gene-specific primers were used to amplify the missing 5’ cDNA from a cDNA library made from 24 hpf embryos constructed using the vector pBK-CMV (Fig. 1).
The T3 vector primer and KR1 (5’-CAGCATCTAGGGACTTCCAT-3’) were used for the primary PCR. The products were then subjected to secondary PCR with T3 and the nested gene-specific primer KR2 (5’-TTTGCCACTGGTTTGAGGAT-3’). The size of the PCR product was estimated to be about 1.5 kb. The PCR products were gel-extracted for direct ligation into the pT7Blue vector (Fig. 2). Clones carrying the correct insert size (~1.5 kb) were sequenced from both ends using the vector primers U19 and T7. As the sequences obtained were relatively short and had no overlapping regions, internal primers were designed for further sequencing. The sequences were aligned using the DNAMAN program and were found to be continuous with the original ES34 clone. The assembly of the sequences from the two clones is shown in Fig. 3, and the total length of the nucleotide sequence is 2326 bp.

3. Sequence Analysis of Zebrafish klhl

The 2326-bp sequence was translated and found to contain an ORF encoding a 635 amino acid product (Fig. 3). We infer the ATG codon at nucleotide (nt) residue 76 to be the true start site of translation because it begins the longest reading frame and is preceded by numerous stop codons in the 5’ untranslated region (UTR) in all three reading frames. In addition, the putative methionine initiation codon occurs in the

Fig. 3. Nucleotide and predicted amino acid sequence of zebrafish klhl cDNA. The 2326 bp sequence was assembled from the 5’ RACE fragment and the ES34 clone. The start nucleotide of ES34 is indicated by an arrowhead. The proposed start codon is highlighted in bold. The stop codon is indicated by an asterisk, and the potential polyadenylation signal AATAAA is doubly underlined. The BTB/POZ domain is underlined, and the six kelch repeats are boxed. Gene-specific PCR primers used for 5’ RACE and sequencing are shown in bold and indicated by overhead arrows.
54 Results 1 GAGGTGCAAGTCTGTCTTTCTCCTGCACCCACTTCATCTCCACATCGGTCTTGGCTGTGA 61 1 CCCATCACTACAGTCATGGCACCCAAAAAGAACAAGGCGGCTAAGAAGAGCAAAGCCGAT M A P K K N K A A K K S K A D 121 16 ATCAACGAGATGACGATCATGGTCGAGGACAGCCCCTCCAACAAAATCAACGGGCTCAAC I N E M T I M V E D S P S N K I N G L N 181 36 ACGCTCCTGGAGGGCGGAAACGGCTTTAGTTGCATCTCCACCGAAGTCACCGACCCCGTC T L L E G G N G F S C I S T E V T D P V 241 56 TATGCACCAAACCTCCTGGAGGGTCTGGGCCACATGAGGCAGGACAGCTTCCTCTGTGAC Y A P N L L E G L G H M R Q D S F L C D 301 76 CTCACGGTAGCAACCAAATCCAAGTCCTTCGACGTTCACAAAGTAGTGATGGCATCCTGC L T V A T K S K S F D V H K V V M A S C 361 96 481 136 AGCGAGTACATCCAAAACATGCTCCGGAAGGATCCGTCTCTAAAGAAGATTGAGCTCAGT S E Y I Q N M L R K D P S L K K I E L S KR-int GATTTATCCCCAGTTGGTTTGGCTACAGTCATCACTTATGCCTATTCTGGAAAACTGACC D L S P V G L A T V I T Y A Y S G K L T KR-int CTGTCTCTATACACCATCGGCAGCACTATATCTGCAGCCTTGCTCCTCCAGATCCACACT L S L Y T I G S T I S A A L L L Q I H T 541 156 TTGGTGAAAATGTGTAGTGATTTCTTAATGCGGGAGACTAGCGTGGAGAATTGCATGTAT L V K M C S D F L M R E T S V E N C M Y 601 176 GTGGTCAACATTGCCGACACGTACAATCTAAAAGAGACGAAGGAAGCTGCTCAGAAGTTC V V N I A D T Y N L K E T K E A A Q K F 661 196 ATGCGAGAGAACTTCATTGAGTTCTCCGAGATGGAGCAGTTCCTCAAACTCACCTACGAG M R E N F I E F S E M E Q F L K L T Y E 721 216 CAAATCAACGAGTTCCTCACAGACGACTCACTTCAGTTGCCTTCAGAGCTCACGGCTTTC Q I N E F L T D D S L Q L P S E L T A F 781 236 CAGATCGCAGTCAAGTGGTTGGATTTTGATGAAAAGAGGTTGAAGTACGCTCCTGATCTG Q I A V K W L D F D E K R L K Y A P D L 841 256 CTGTCCAACATCCGTTTTGGCACCATCACCCCCCAGGATCTTGTTAGTCACGTGCAGAAC L S N I R F G T I T P Q D L V S H V Q N 901 276 GTTCCCAGGATGATGCAGGATGCCGAGTGCCACCGTTTGCTGGTCGACGCCATGAATTAC V P R M M Q D A E C H R L L V D A M N Y 961 296 CACCTGCTGCCGTTCCAGCAGAACATCCTTCAATCCCGGAGGACCAAAGTTCGTGGAGGT H L L P F Q Q N I L Q S R R T K V R G G ES34-int CTCCGAGTTCTGCTTACTGTTGGCGGACGGCCTGCGCTGACCGAAAAGTCTCTCAGCAAG L R V L L T V G G R P A L T E K S L S K Kelch R1 GACATTCTCTACAGGGACGAGGATAATGTCTGGAACAAGCTGACGGAGATGCCTGCTAAG D I L Y R D E 
D N V W N K L T E M P A K 421 116 1021 316 1081 336 1141 356 1201 376 AGCTTCAATCAGTGTGTGGCCGTTTTGGACGGTTTCCTTTACGTGGCTGGAGGAGAAGAC S F N Q C V A V L D G F L Y V A G G E D Kelch R2 CAGAATGATGCAAGAAACCAGGCAAAGCATGCAGTCAGCAATTTCAGCAGATACGACCCC Q N D A R N Q A K H A V S N F S R Y D P 55 Results 1261 396 CGATTCAACACGTGGATCCACCTAGCCAACATGATTCAGAAGCGTACTCATTTCAGCCTC R F N T W I H L A N M I Q K R T H F S L 1321 416 1501 476 AACACCTTCAATGGTCTGCTCTTCGCCGTCGGGGGCCGTAATTCTGACGGCTGCCAGGCG N T F N G L L F A V G G R N S D G C Q A ES34 Kelch R3 KR2 KR1 TCTGTCGAGTGCTACGTCCCATCCTCAAACCAGTGGCAAATGAAAGCCCCAATGGAAGTC S V E C Y V P S S N Q W Q M K A P M E V KR1 CCTAGATGCTGCCATGCCAGCTCAGTTATCGATGGCAAGATCTTGGTTAGCGGTGGTTAC P R C C H A S S V I D G K I L V S G G Y Kelch R4 ATTAACAACGCCTACTCTCGAGCCGTCTGTTCCTACGACCCATCCACTGATAGCTGGCAG I N N A Y S R A V C S Y D P S T D S W Q 1561 496 GATAAAAACAGCCTGAGCAGCCCGAGAGGATGGCACTGTTCGGTGACCGTCGGAGATCGT D K N S L S S P R G W H C S V T V G D R 1621 GCTTACGTGCTCGGCGGCAGTCAACTGGGCGGACGTGGAGAGAGAGTAGACGTCTTGCCT 516 A Y V L G G S Q L G G R G E R V D V L P Kelch R5 GTTGAATGCTATAACCCTCACTCTGGCCAGTGGAGCTACGTTGCCCCCTTGCTGACGGGA V E C Y N P H S G Q W S Y V A P L L T G 1381 436 1441 456 1681 536 1741 556 1801 576 GTGAGCACTGCAGGCGCTGCCACCTTGAATAACAAGATCTACCTCTTGGGCGGCTGGAAT V S T A G A A T L N N K I Y L L G G W N Kelch R6 GAGATTGAGAAGAAGTACAAGAAATGCATTCAGGTTTATAATCCTGATCTTAACGAATGG E I E K K Y K K C I Q V Y N P D L N E W 1861 596 ACTGAAGATGACGAATTGCCAGAGGCTACGGTTGGTATCTCGTGTTGTGTCGTCACCATC T E D D E L P E A T V G I S C C V V T I 1921 616 CCCACACGCAAAACACGAGAGTCGAGGGCCAGCTCGGTGTCATCCGCACCAGTTAGTATA P T R K T R E S R A S S V S S A P V S I 1981 TAAGCAGAGAGAGAGAGAGTGGTGGGTAAATGTATTTGAGTTGCTAAAGGTCAATTTATA * 2041 CTTCTGCGTCAAGTAGGTAGCACAGATCCGGCAAAGCTTCATCACACACTTTGGTCGTGC 2101 ACACTTCACCATACCAAATAAATGCAACTACATATTTCCGCGATGTGGAATGCAAGGTCT 2161 GTGATTGGTCAGATTTGGTAGAGATGACAAAATGTGGGCGGGGCCGACAGTTGTGAGAGA 2221 GAGGCAAGTGTTTAACAAGTGTCAAGTCCTATGGAGGAGCACTGTATGGATACGTTTGTT 
2281 TTGTTTACTCTGTGATTAAAGTTATTAAACGTTAGAAAAAAAAAAA 56 Results context (5’-GTCATGG-3’) of a nearly perfect Kozak sequence (5’-A/GCCATGG-3’) (Kozak, 1991). An AATAAA sequence at nt 2116-2122 may serve as a polyadenylation signal. Analysis of the amino acid sequence of klhl with protein domain identification software revealed the presence of two conserved domains, kelch repeats and the BTB/POZ domain (Fig. 3). The kelch repeat domain (amino acids 318-615) is contained within the carboxy-terminal half of klhl and consists of six repeats of approximately 50 amino acids in length. Comparison of the various kelch repeats in klhl reveal that while the sequence identity between individual repeats is low, multiple alignment of the kelch repeats from klhl shows a conserved pattern of residues, including a double glycine element (GG) and a tyrosine (Y) separated from a tryptophan (W) by precisely six residues, suggesting that the sequence and tandem arrangement of kelch repeats in klhl is similar to those found in the kelch protein family (Fig. 4). Towards the amino-terminal portion of the clone lies the BTB (broadcomplex, tramtrack, bric-a-brac) (Godt et al., 1993) or POZ (poxvirus and zinc finger) domain (Bardwell and Treisman, 1994; Albagli et al., 1995). The BTB/POZ domain is a ~120 aa motif that has been identified in several actin binding proteins having a kelch motif, as well as in several C2H2-type zinc finger transcription factors (Albagli et al., 1995; Collins et al., 2001). This domain is believed to be important for proteinprotein interaction and has been shown to mediate both homo- and heterodimeric protein-protein interactions in vitro and the formation of multimeric complexes in vivo (Bardwell and Treisman, 1994; Robinson and Cooley, 1997). 57 Results klhl klhl klhl klhl klhl klhl R1 R2 R3 R4 R5 R6 318 367 421 468 515 567 KLHL KLHL KLHL KLHL KLHL KLHL R1 R2 R3 R4 R5 R6 317 366 420 467 514 566 ** * 366 420 467 514 566 615 365 419 466 513 565 614 Fig. 4. 
Alignment of the kelch repeats of zebrafish klhl and human KLHL. Conserved residues are highlighted in dark blue, and three/two identical residues in four sequences are highlighted in mauve and light blue respectively. Dots represent gaps inserted for maximal alignment. The beginning and end residues for the kelch repeats are indicated. The invariant glycine doublet and tryptophan residues are identified by asterisks. Sequence alignment was performed using the multiple alignment program in the DNAMAN software package. The multiple alignment of the six repeats from Klhl and KLHL was based on the same criteria used by Lai et al. (2000).

4. klhl is Conserved Across Zebrafish, Human, Mouse and Rat

The BLAST search of klhl against the non-redundant GenBank protein database revealed that klhl showed remarkable homology (72% identity) to a hypothetical protein on human chromosome 6 (accession number CAC16284) and also to hypothetical mouse (NP_766513) and rat (XP_236428) proteins, sharing about 71% identity with both. Of lower significance were hits to known members of the kelch-repeat superfamily of proteins, such as human kelch-like 5 protein (AAL08584) and rat actinfilin (AAM74154), sharing only about 26% identity. The low conservation between klhl and the reported kelch-like proteins suggests that klhl is most likely a novel member of the kelch-repeat superfamily rather than a zebrafish ortholog of the various known kelch-like proteins (kelch-like proteins 1-6, X, ENC-1, etc.) (Hernandez et al., 1997; Soltysik-Espanola et al., 1999; Lai et al., 2000; Nemes et al., 2000; Braybrook et al., 2001; Wang et al., 2001). Multiple alignment of our predicted klhl protein and the hypothetical mammalian proteins identified in the BLAST search showed remarkable conservation spanning the entire length of the proteins (Fig. 5), with klhl sharing around 74-76% identity with the various mammalian Klhl proteins.
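Identity percentages such as those quoted above depend on how aligned columns are counted. The sketch below shows one common convention: identical residues divided by the number of columns in which neither sequence has a gap. This is an assumption for illustration only; BLAST and DNAMAN may use different denominators, which could account for small differences between the pairwise and multiple-alignment figures. The two short aligned strings are hypothetical and are not taken from the database entries.

```python
# Minimal sketch of a percent-identity calculation over a pairwise alignment.
# Convention assumed here: gapped columns are excluded from the denominator.

def percent_identity(aligned_a: str, aligned_b: str, gap_chars: str = ".-") -> float:
    """Percent of ungapped alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments; "." marks a gap, as in the figures here.
seq_a = "MAPKKNKAAK"
seq_b = "MAPKK.KTAK"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

In this toy pair, nine columns are ungapped and eight of them match.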
Human KLHL shares around 92% identity with the two rodent Klhl proteins, while the mouse and rat Klhl sequences share 97% identity. The high sequence identity between klhl and the mammalian proteins strongly suggests that the genes encoding these putative proteins could well be the mammalian orthologs of klhl. We designated the human, mouse and rat hypothetical proteins as KLHL, m-Klhl and r-Klhl respectively.

A putative ortholog of klhl was also identified in pufferfish (Ensembl peptide ID SINFRUP00000164410) using a BLAST search against Ensembl peptides (http://www.ensembl.org/Multi/blastview). The Ensembl database provides up-to-date and comprehensive sequence data from several metazoan organisms including human, mouse, rat, zebrafish and the pufferfish Fugu rubripes (Clamp et al., 2003). Zebrafish and Fugu klhl share about 87% identity, compared with the 74-76% identity shared between zebrafish and the mammalian sequences, reflecting the closer evolutionary relationship between the two teleost fish. Fugu klhl shares 73-75% identity with the mammalian Klhls (Fig. 5).

Fig. 5. Amino acid sequence alignment of zebrafish klhl, Fugu klhl, human KLHL, mouse (m) Klhl and rat (r) Klhl proteins. Conserved residues are highlighted in light blue, and three/two identical residues in four sequences are highlighted in purple and pink respectively. Dots represent gaps inserted for maximal alignment. Sequence alignment was performed using the multiple alignment program in the DNAMAN software package. The sequence of Fugu klhl (Ensembl peptide ID SINFRUP00000164410) was obtained from Ensembl, and the sequences of KLHL (GenBank accession number CAC16284), m-Klhl (NP_766513) and r-Klhl (XP_236428) were obtained from GenBank.

[Alignment panel of Fig. 5 not reproduced.]

5. Genome Mapping of klhl

It is well known that conserved correspondence between two genomes may be used for the prediction of gene orthologs (Barbazuk et al., 2000). We thus sought to confirm the gene orthology relationship through conserved synteny analysis. Findings from the release of the draft sequence of the mouse genome show that as much as 96% of genes present in both the mouse and human genomes lie in syntenic regions.
Work by Postlethwait and others (1998) has revealed that extensive contiguous blocks of synteny exist between the zebrafish and human genomes. A search of the zebrafish EST database (http://www.genetics.wustl.edu/fish_lab/frank/cgi-bin/fish/) identified several clones that are identical to klhl. Among them was clone fb95e08, which had previously been mapped to linkage group (LG) 13 using both the LN54 and T51 radiation hybrid (RH) panels and a heat shock (HS) meiotic mapping panel (Kwok et al., 1998; Geisler et al., 1999; Geisler and Jorg, unpublished; Hukriede et al., 1999; Kelly et al., 2000) (Fig. 6A). The gene encoding the putative human KLHL protein, KLHL (GenBank accession no. Q9H511), was found to map to human 6p12.2 (http://www.ensembl.org). The genes encoding the mouse (GenBank accession no. NM_172925) and rat Klhl (Ensembl gene ID ENSRNOG00000006224) proteins were found to be present on mouse chromosome 9 and rat chromosome 8 respectively (http://www.ensembl.org). A syntenic relationship between human chromosome 6 and mouse chromosome 9 had previously been described (http://www.ncbi.nlm.nih.gov/Homology) (Fig. 6C).

Fig. 6. Genome mapping of klhl. (A) The partial linkage map of zebrafish LG13, indicating the position of the klhl gene in relation to other markers. Map positions of klhl (shown in bold) and markers on the T51 and LN54 radiation hybrid panels and the HS meiotic panel were obtained from ZFIN (http://zfin.org/ZFIN) and the Tübingen zebrafish genome map web page (http://wwwmap.tuebingen.mpg.de). (B) Map location of klhl, gsta3, bmp5 and bpag1 on zebrafish LG13 and their orthologs on human chromosome 6 (Hsa6). Intrachromosomal rearrangements have altered the gene order in fish and mammalian chromosomes. The relative chromosomal locations for the human orthologs were obtained from the LocusLink database (http://www.ncbi.nlm.nih.gov/LocusLink) and from Ensembl (http://www.ensembl.org). The maps are not drawn to scale. (C) Conservation of synteny between zebrafish LG13, Hsa6, mouse chromosome 9 (Mmu9) and rat chromosome 8. The apparent orthologs are arranged in the same column.

[Map panels of Fig. 6 not reproduced. Human positions shown in panel B: GSTA 6p12.2, KLHL 6p12.2, BMP5 6p12.1, BPAG1 6p12-p11. Orthologs listed in panel C: zebrafish LG13: gsta3 (fk08d06, zf-es31), bpag1 (fc38a05), bmp5 (zehl0669), klhl (fb95e08); human Hsa6p12-11: GSTA3, BPAG1, BMP5, KLHL; mouse Mmu9: Gsta3, Bmp5, Klhl; rat Rn8: Gsta3, Klhl.]

Woods et al. (2000) had previously identified a syntenic relationship between zebrafish LG13 and human chromosome 6. Zebrafish LG13 and human chromosome 6 share at least 5 pairs of orthologous genes: mekk4/MEKK4, kiaa0796/KIAA0796, gsta3/GSTA3, fbx5/FBX5 and dld/DLL1. However, many of these genes are not located in the same region as klhl. Only one gene, glutathione S-transferase 3, gsta3 (fk08d06 and zf-es31), is located close to klhl (Fig. 6A). Consistent with this, GSTA3 maps close to KLHL (6p12.2) (http://www.ncbi.nlm.nih.gov/LocusLink) (Fig. 6B, C). The region containing klhl and gsta3 in zebrafish LG13 might thus be syntenic to the region containing KLHL and GSTA3 on human chromosome 6. In mouse, Gsta3 has also been mapped close to Klhl on chromosome 9 (Fig. 6C).
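The synteny argument above amounts to counting how many orthologous gene pairs fall on the same pair of chromosomes in the two species. The sketch below is a minimal illustration that uses only the gene pairs and chromosome assignments named in the text. The dictionary and function names are hypothetical, and a real analysis would also use map positions and a statistical test, which are not attempted here.

```python
# Minimal sketch: tally ortholog pairs by (zebrafish linkage group, human
# chromosome). Gene assignments are taken from the text; names are illustrative.
from collections import Counter

zebrafish_location = {
    "mekk4": "LG13", "kiaa0796": "LG13", "gsta3": "LG13",
    "fbx5": "LG13", "dld": "LG13", "klhl": "LG13",
}
human_location = {
    "MEKK4": "Hsa6", "KIAA0796": "Hsa6", "GSTA3": "Hsa6",
    "FBX5": "Hsa6", "DLL1": "Hsa6", "KLHL": "Hsa6",
}
ortholog_pairs = [
    ("mekk4", "MEKK4"), ("kiaa0796", "KIAA0796"), ("gsta3", "GSTA3"),
    ("fbx5", "FBX5"), ("dld", "DLL1"), ("klhl", "KLHL"),
]


def synteny_counts(pairs, loc_a, loc_b):
    """Count ortholog pairs for each (chromosome in A, chromosome in B).

    Pairs with an unmapped member are skipped rather than guessed.
    """
    return Counter(
        (loc_a[a], loc_b[b])
        for a, b in pairs
        if a in loc_a and b in loc_b
    )


for (lg, chrom), n in synteny_counts(
    ortholog_pairs, zebrafish_location, human_location
).items():
    print(f"{lg} / {chrom}: {n} ortholog pairs")
```

A tally of this kind only shows co-location on the same chromosomes. Whether the genes lie in the same local region, as argued for klhl and gsta3, has to be judged from the map positions in Fig. 6.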
To further strengthen this relationship, ESTs mapped close to klhl and gsta3 were examined, and an EST (zehl0669) coding for the bone morphogenetic protein 5 (bmp5) gene was identified. The ortholog BMP5 has been mapped near the locus of KLHL, at position 6p12.1 (Fig. 6B, C), while mouse Bmp5 has also been mapped to a nearby region on chromosome 9 (Fig. 6C). We also found a zebrafish bullous pemphigoid antigen 1 gene, bpag1 (fc38a05), near klhl, mirroring the presence of the human ortholog, BPAG1 (6p12-p11), near KLHL (Fig. 6B, C). These results confirm that zebrafish LG13 is syntenic to human chromosome 6 and mouse chromosome 9, and further support the conclusion that KLHL, m-Klhl and r-Klhl are orthologs of klhl in human, mouse and rat respectively.

However, it is noteworthy that the genes are not arranged in the same order on the zebrafish and human chromosomes. As seen in Fig. 6B, the gene order is rearranged between the two chromosomes. This result is not unexpected and is consistent with Postlethwait et al. (2000), who noted that while large blocks of conserved synteny were observed between zebrafish and humans, gene orders were frequently inverted and transposed.

6. Developmental Accumulation of klhl

To examine the temporal expression of klhl, northern blot hybridization was carried out using total RNA extracted from zebrafish embryos at different stages and from adult fish. As klhl had previously been found to be expressed specifically in the somites of developing zebrafish embryos (Wu, 1999), the embryonic stages chosen ranged from 12 hpf (6-somite stage), the beginning of somitogenesis, to hatched fry (72 hpf). To position klhl in the somitogenic pathway, we compared its expression pattern with those of two muscle-specific protein (MSP) cDNA clones, α-tropomyosin (tpma) and fast skeletal muscle myosin light polypeptide 2 (mylz2), which had previously been characterized in our laboratory (Xu et al., 2000).
The expression pattern of tpma and mylz2 obtained in this study was consistent with that of Xu et al. (2000), with tpma transcripts first becoming detectable at ~12 hpf and mylz2 transcripts at ~18 hpf (Fig. 7). The expression of the two genes increased rapidly during development and was maintained at high levels through to 72 hpf. Both genes were also expressed in the adult fish, albeit at a lower level than in developing embryos. As shown in Fig. 7, klhl was expressed at about the same time as mylz2, with its transcripts detectable at ~18 hpf. Its expression pattern was similar to that of the other two genes, with an increase in expression level during development and weaker expression in the adult fish. The size of the klhl transcript detected was about 4.0 kb. The disparity in size between our predicted 2.4 kb full-length clone and the 4.0 kb mRNA suggests that the klhl gene might have an unusually long 5’ UTR or poly(A) tail. A ubiquitous acidic ribosomal phosphoprotein P0 (arp) cDNA probe was hybridized to the same RNA blot to monitor the quality of the RNA and to confirm even loading of all RNA samples (Ju et al., 1999).

Fig. 7. Expression of klhl in developing zebrafish embryos in comparison to two other MSP genes, tpma and mylz2. A northern blot containing 10 µg/lane of total RNA isolated from zebrafish embryos of various stages from the beginning of somitogenesis (12 hpf) to hatched fry (72 hpf) was hybridized with individual cDNA probes as indicated on the left of each panel. The stages of embryos are indicated at the top of each lane; adult, RNA prepared from whole fish. The sizes of major hybridized transcripts are indicated on the right. The same blot was hybridized with a ubiquitously expressed acidic ribosomal phosphoprotein (arp) probe to monitor the quantity and quality of the RNAs.

7. Tissue Distribution Analysis of klhl in Adult Zebrafish

To examine whether the restricted pattern of expression was maintained in the adult zebrafish, northern blot hybridization was carried out using total RNA extracted from several adult tissues. The two MSP cDNA clones, tpma and mylz2, were again included in the study. As shown in Fig. 8, a single klhl transcript was detected in the heart and skeletal muscle lanes: klhl mRNA was expressed strongly in the trunk skeletal muscle and weakly in the heart. This expression pattern was similar to that of tpma mRNA, which was also detected in the heart and skeletal muscle. mylz2 was expressed specifically in the trunk skeletal muscle. arp was used as a control to monitor the quality and quantity of the RNAs.

Fig. 8. Tissue distribution of klhl mRNAs in comparison with tpma and mylz2 mRNAs in adult zebrafish. A northern blot containing 10 µg/lane of total RNA isolated from eight different zebrafish tissues was hybridized with individual cDNA probes as indicated on the left of each panel. The names of tissues are indicated at the top of the lanes. Abbreviations: B, brain; E, eyes; G, gill; H, heart; I, intestine; L, liver; M, trunk skeletal muscle; O, ovary. The same blot was hybridized with a ubiquitously expressed acidic ribosomal phosphoprotein (arp) probe to monitor the quantity and quality of the RNAs.

8. Expression of klhl is Similar in Human, Rat and Zebrafish

To determine if KLHL expression is similar to that of klhl, we designed KLHL-specific primers to amplify a 0.9 kb DNA fragment from human DNA. The fragment was radiolabeled and used as a probe in northern blot analysis of poly(A)+ RNA from human adult tissues. As seen in Fig. 9, a single transcript of ~7.0 kb was detected in the heart and skeletal muscle tissues.

Fig. 9. Northern blot analysis of KLHL mRNA in human tissues. The human blot was hybridized with a radiolabelled KLHL-specific probe and with a control human β-actin probe (bottom panel). Positions of molecular size markers are indicated on the left.

Based on the results of searches in public human, mouse and rat EST databases, we were also able to deduce that klhl is expressed in similar tissues in human and rat (Table 1). We identified two human EST clones that were identical to KLHL. One was isolated from a head cDNA library and the other from a fetal heart cDNA library, indicating that KLHL is expressed in the fetal heart. We also identified three mouse and 10 rat EST clones that showed amino acid sequence homology with zebrafish klhl. While we were not able to extract much information about Klhl expression in the mouse, all the rat EST clones identified were isolated from cDNA libraries constructed from fetal heart or muscle mRNA (Table 1). Thus, it is likely that klhl is expressed in a similar fashion in human, rat and zebrafish and that its function is conserved among the different species.

9. Ontogenetic Expression of klhl during Somitogenesis

To reveal further the temporal and spatial expression pattern of klhl, whole-mount in situ hybridization was carried out with embryos at various stages of somitogenesis. In zebrafish, somite formation begins at about 10 hpf and ends at 24 hpf, with one pair of somites forming approximately every half hour (Hanneman and Westerfield, 1989; Kimmel et al., 1995). Probes for tpma and mylz2 were also included for comparison. Xu et al. (2000) had previously studied the expression of MSP genes in zebrafish embryos by whole-mount in situ hybridization. Based on the timing and pattern of their expression during somitogenesis, they classified the MSP genes into three groups: early, intermediate and late. tpma belongs to the early gene group.
Table 1. Summary of EST clones homologous to klhl
Organism / Unigene Cluster / GenBank Accession No. / Library Source
Human Mouse Mm.208843 BF824983 head R57328 fetal heart AA672987 myotubules AV174355 8-day embryo BB618579 BB528731 15 days embryo head BB663561 Rat Rn.22511 BB369350 16 days embryo head AW254032 rat atrium at 16.5 dpc, ventricle at 16.5 dpc, AV canal at 16.5 dpc, atrium at 15 dpc, ventricle at 15 dpc, AV canal at 15 AW525772 dpc and ventricle at 13 dpc. rat atrium at 15 dpc, ventricle at 16.5 BF408492 dpc, atrium at 16.5 dpc, ventricle at 13 dpc, ventricle at 15 dpc, AV canal at 15 BF414303 dpc. BF543966 Normalized rat atrium AA849356 Normalized rat muscle AI171017 AI171183

Fig. 10. Expression of klhl (A, C) and tpma (B, D) in zebrafish embryos. (A, B) Dorsal views of 2-somite stage embryos, anterior to the left. (C, D) Dorsal views of flat-mounted 12 hpf (6-somite) embryos, anterior to the left.

First detected around 10 hpf, tpma is expressed in the adaxial cells prior to somite formation (Fig. 10B). From 10 to 12 hpf, when the first six somites are formed, the tpma transcript signal intensifies and extends posteriorly as the gene is activated in a rostral-caudal manner (Fig. 10B, D; Fig. 11F, G). This signal becomes stronger in older embryos as the amount of transcript increases with the number of somites (Fig. 11H-J). tpma is expressed in all the formed somites. mylz2, on the other hand, is activated later than tpma, with its transcripts first detected at 16 hpf (Xu et al., 2000) (Fig. 11K). As an example of an intermediate gene, mylz2 is not expressed in all formed somites; its transcripts are detected only in the older somites. For example, by 16 hpf, when 14 somites had formed, mylz2 transcripts were found only in the anterior 10 somites (Fig. 11K), while tpma transcripts were detected in all 14 somites.
This appears to be the case for the intermediate and late gene groups, in which expression is absent in the last 2-6 formed somites (Xu et al., 2000). Based on the result of the northern hybridization, we expected klhl to be expressed in a similar fashion to mylz2. On examination of klhl expression in 16 hpf embryos, however, we found strong expression in all 14 formed somites (Fig. 11D), similar to that of tpma (Fig. 11I). In addition, another domain of expression was found in bilateral stripes of cells just adjacent to the future hindbrain (boxed in Fig. 11D; Fig. 13A). We went on to investigate the expression of klhl in younger embryos and found that it displayed an expression pattern similar to that of tpma. First detected at the 2-somite stage, klhl expression lags about 30 minutes behind that of tpma. Expression of klhl first occurs in the adaxial cells on both sides of the notochord (Fig. 10A). By 12 hpf (6-somite), klhl expression was detected in the formed somites and in the posterior adaxial cells in the unsegmented region (Fig. 10C), with its level of expression increasing with the development of somites (Fig. 11B-E). Expressed in all formed somites, klhl is a typical early MSP gene as defined by Xu et al. (2000).

Fig. 11. Ontogenetic expression of klhl (A-E), tpma (F-J) and mylz2 (K-L) during the various stages of somitogenesis. Vertical panel columns show embryos at the same stage: 11 hpf (A, F), 12 hpf (B, G), 14 hpf (C, H), 16 hpf (D, I, K) and 18 hpf (E, J, L). The additional domain of expression of klhl is boxed in (D). All panels show lateral views of embryos, anterior to the left.

The inconsistency between the results of the northern blot hybridization and in situ hybridization could be accounted for if the initial expression of klhl was too low to be detected by northern blotting.
This explanation is supported by the in situ hybridization results, where a much longer staining time was required to detect klhl expression than tpma expression, even though similar probe concentrations were used.

10. Expression of klhl in Fast and Slow Muscle

To verify the muscle specificity of klhl, trunk-level tissue sections of 36 hpf embryos were examined. In the adult, zebrafish muscle fibers can be subdivided into slow and fast muscle fibers, and precursors of these two fiber types can be identified very early in development. By 24 hpf, fast muscle fibers are found in the deep cells of the myotome, whereas slow muscle fibers form a superficial monolayer on the surface of the myotome (Devoto et al., 1996; Stickney et al., 2000). To determine gene expression in fast and slow muscles, two-color in situ hybridization was carried out using the klhl probe and a slow muscle myosin probe, smbpc. smbpc, encoding a slow myosin binding protein C, is expressed only in the slow muscle (Fig. 12J-K) (Xu et al., 2000). Compared with the specific staining of smbpc in slow muscles, klhl transcripts were fast-muscle-specific (Fig. 12A-C). No expression was detected in the superficial slow muscle cells defined by smbpc mRNA expression. This was confirmed by comparing the expression of tpma and smbpc mRNAs, and of desmin and smbpc mRNAs. tpma mRNA had previously been shown to be expressed only in the fast skeletal muscle (Xu et al., 2000), and its expression is similar to that of klhl (Fig. 12D-F). desmin, on the other hand, is expressed in both fast and slow muscles (Xu et al., 2000) (Fig. 12G-I).

Fig. 12. Comparison of expression of klhl (A-C), tpma (D-F), desmin (G-I) and smbpc (J-L) in 36 hpf embryos. (A, D, G, J) Lateral view of whole mount 36 hpf embryos. (B, E, H, K) Cross sections of embryos at the trunk position as indicated by arrowheads in A, D, G, and J respectively.
(C, F, I) Cross sections of embryos doubly stained with fluorescein-labeled smbpc antisense riboprobe (magenta) and digoxigenin-labeled klhl (C), tpma (F) and desmin (I) antisense riboprobes (purple). (L) Cross section of embryo stained with smbpc antisense riboprobe only.

11. Expression of klhl during Cardiac Morphogenesis

We went on to study in more detail the additional domain of klhl expression that we had observed earlier. First appearing at the 13-somite stage (15.5 hpf), klhl transcripts are present as bilateral stripes of cells, lying in close proximity to the yolk and adjacent to the future hindbrain (Fig. 13A and data not shown). As development proceeds, klhl-expressing cells rapidly migrate towards the midline, appearing in a butterfly-shaped configuration at 18 hpf (Fig. 13B) and finally fusing to become a ring structure by 19.5 hpf (Fig. 13C), which is transformed into the heart tube by 26 hpf (Fig. 13D). The ring structure is actually a shallow cardiac cone, formed as the cells posterior to the bridge fuse, followed by an anterior closure that creates a central lumen. The apex of the cone is raised dorsally around the lumen (Yelon et al., 1999; Yelon, 2001). Cardiac morphogenesis continues to be highly dynamic after the heart tube has formed. By 30 hpf, klhl transcripts are present as a curved tube as cardiac looping takes place (Lee et al., 1996; Yelon et al., 1999; Yelon, 2001) (Fig. 13E-G). Anatomic chamber differentiation also takes place at the same time, and constriction is evident at the atrioventricular boundary by 48 hpf (Fig. 13H). klhl transcripts persist in the embryonic heart at least through 48 hpf and are also present in the adult heart (Fig. 8). klhl is expressed uniformly throughout the myocardium, and its expression reveals the entire progression of heart formation.

Fig. 13. klhl expression during cardiac morphogenesis. (A-F) Dorsal views of embryos at (A) 16 hpf, (B) 18 hpf, (C) 19.5 hpf, (D) 24 hpf, (E) 26 hpf, (F) 30 hpf, anterior at the top. (G, H) Head-on views of 36 hpf (G) and 48 hpf (H) embryos. The embryo’s left side is to the right of the figure. The ventricle is indicated with an arrowhead and the atrium is indicated with an arrow in (H).

12. Expression of klhl in Cranial Muscle Development

We also analyzed klhl expression in late embryogenesis to determine if klhl was expressed similarly to tpma in the cranial skeletal muscles. Like trunk skeletal muscle, cranial skeletal muscles in vertebrates are derived from paraxial mesoderm and express myogenic regulatory genes that activate structural genes that give muscles their contractile function (Schilling and Kimmel, 1997). As seen in Fig. 14, klhl, like MSP genes, is expressed in cranial muscles. The pattern of expression observed is essentially similar to that of tpma (Schilling and Kimmel, 1997) and mylz2 (Xu et al., 2000).

Fig. 14. Localization of klhl transcripts in 72 hpf embryos. (A) Ventral view of embryo, anterior to the left. (B) Lateral view of left side of the embryo. Abbreviations: ah, adductor; am, adductor mandibulae; ao, adductor operculi; do, dilator operculi; hh, hyohyoideus; ih, interhyal; ima, intermandibularis anterior; imp, intermandibularis posterior; io, inferior oblique; lap, levator arcus palatini; mr, medial rectus; PFB, pectoral fin bud; psm, presomite muscle; so, superior oblique; sr, superior rectus; vam, ventral abdominal muscle. The nomenclature for cranial skeletal and muscle elements is based on Schilling and Kimmel (1997).

Chapter IV Discussion

1. Zebrafish as a Model for Vertebrate Biology

The zebrafish has become a model of choice for many scientists seeking to understand vertebrate development.
This has largely been a result of the successes of the large-scale mutagenesis screens as well as the advances in zebrafish genome research. My study has focused on the isolation and characterization of a novel zebrafish gene, klhl, expressed in skeletal and cardiac muscles. This study is an offshoot of an earlier project to identify novel genes important in zebrafish embryogenesis using whole mount RNA in situ hybridisation (Wu, 1999). Identification of the expression pattern of novel genes can often shed light on their function. Such applications are especially useful in light of the millions of mouse, rat and human ESTs that have accumulated in the genome projects. However, large-scale expression studies using higher vertebrates are both cumbersome and cost-prohibitive for such a vast collection of genes. Zebrafish provide an alternative approach to this problem. Large numbers of zebrafish embryos can be produced inexpensively. Hundreds of whole mount in situ hybridisations to staged embryos using zebrafish EST sequences can be performed simultaneously to reveal their expression patterns. Such an approach is promising given that many zebrafish genes have mammalian orthologs in EST databases. This is exemplified by the identification of human, mouse and rat orthologs of klhl in my study. Furthermore, many zebrafish mutant phenotypes have been found to resemble certain human diseases. Thus, the zebrafish could present an alternative vertebrate model of human diseases (Dodd et al., 2000). The molecular and functional characterization of genes in zebrafish might be useful for assigning functions to human genes that are identified by the HGP but known only by sequence.

2. klhl is a Member of the Kelch Family of Proteins

To this end, I report here the cloning and developmental analysis of a novel zebrafish gene, klhl.
The 1905 bp ORF of klhl is predicted to encode a 635 aa protein that possesses two evolutionarily conserved domains: an N-terminal BTB/POZ domain followed by six kelch repeats. BTB/POZ is found in two main contexts: one is in zinc finger-containing proteins, in which it mediates a transcriptional repression activity (Chang et al., 1996); the second is in association with 4-6 kelch repeats (Adams et al., 2000). The BTB/POZ domain has been shown to mediate homodimerization or heterodimerization of proteins that contain it (Bardwell and Treisman, 1994). Database mining and sequence analyses of various available genomes (from human, Drosophila, C. elegans, S. cerevisiae, etc.) using the kelch motif consensus (Pfam01344) as a query sequence identify the domain as an evolutionarily conserved one, found throughout phylogeny (Adams et al., 2000; Prag and Adams, 2003). Besides the BTB/POZ domain, the kelch motif has been found in association with other protein motifs such as discoidin, F-box and coiled-coil, but it can also be found alone, with the number of repeats ranging from four to seven. Whatever the association, the sets of repeated kelch motifs are predicted to form a β-propeller structure based on the crystal structure determined for a single kelch-repeat protein, fungal galactose oxidase (PDB 1GOF) (Bork and Doolittle, 1994; Adams et al., 2000). Considering that the kelch motif appears in so many different contexts, it is no surprise that kelch-repeat proteins have multiple potential binding interactions and perform functionally diverse activities in the cell (for a list of the interactions and cellular functions of kelch-repeat proteins, see Adams et al., 2000). Besides klhl, both the BTB/POZ and kelch domains have been found in a number of proteins that constitute a subfamily of the kelch superfamily, termed the N-dimer, C-propeller proteins (Adams et al., 2000).
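The ORF and protein lengths quoted at the start of this section can be checked with simple codon arithmetic. The sketch below is illustrative only: the function name is hypothetical, and whether the quoted 1905 bp includes the stop codon is an assumption, since the text does not say.

```python
CODON_LENGTH = 3  # nucleotides per codon


def predicted_protein_length(orf_bp: int, includes_stop_codon: bool) -> int:
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp <= 0 or orf_bp % CODON_LENGTH != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // CODON_LENGTH
    # A stop codon occupies one codon but contributes no residue.
    return codons - 1 if includes_stop_codon else codons


# Assumption: the 1905 bp figure excludes the stop codon.
print(predicted_protein_length(1905, includes_stop_codon=False))
# Alternative convention, with the stop codon counted in the ORF length.
print(predicted_protein_length(1905, includes_stop_codon=True))
```

Under the first convention, 1905 bp corresponds to 635 codons and therefore 635 residues, which matches the predicted protein length given in the text.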
Members of this family contain an N-terminal BTB/POZ domain and four to six kelch motifs located within the C-terminal region. Examples of proteins belonging to this subgroup include Drosophila kelch (Robinson and Cooley, 1997; Kelso et al., 2002), Calicin (von Bulow et al., 1995), ENC-1 (Hernandez et al., 1997), Keap1 (Itoh et al., 1999) and Mayven (Soltysik-Espanola et al., 1999). One recurrent function among this group of proteins concerns sub-cellular organization, in particular actin binding. Drosophila kelch is a structural component of actin-rich ring canals that regulate nutrient transport from the nurse cells to the oocyte. Mayven and ENC-1 are predominantly expressed in the human brain and the mouse neural system respectively and are predicted to be important in the organization of the actin cytoskeleton by functioning as actin-binding proteins (Hernandez et al., 1997; Soltysik-Espanola et al., 1999). These proteins have been shown to bind directly to actin through their kelch repeats. There are, however, other kelch-repeat proteins that affect cell morphology and organization but do not themselves bind directly to, or colocalize with, actin. Calicin is one example; it is responsible for the organization of an actin-negative structure, the sperm calyx (von Bulow et al., 1995). Even more functionally diverse, Keap1 plays a role in gene expression by sequestering the transcription factor Nrf2 in the cytoplasm under normal conditions. This interaction is downregulated in the presence of electrophilic agents, which stimulate translocation of Nrf2 to the nucleus to counteract these agents (Itoh et al., 1999). A recent study, however, indicates that Keap1 might also have a function in cell morphology and organization. Velichkova and colleagues (2002) have found that Keap1 binds to the SH3 domain of myosin-VIIa and associates with it in specialized adhesion junctions. At the heart of this association was the kelch repeat domain.
It is apparent from the above that we cannot assign a specific function to klhl based upon its sequence similarity to the kelch family, as different members appear to have diverse functions.

3. klhl is Expressed in the Somites and Cardiac Muscles

Northern blot analyses showed that klhl and its human homolog KLHL are both specifically expressed in the skeletal muscles and heart. We also noted that the rat Klhl EST clones were isolated from cDNA libraries constructed using mRNA from fetal heart or muscle. This suggests that the role klhl plays has been conserved through evolution. Previously, another kelch protein gene, human Sarcosin, had been found to be expressed in the muscles as well (Taylor et al., 1998). Detailed analysis using a blot containing various human muscle tissues revealed that expression of Sarcosin was restricted to sarcomeric muscle, i.e. skeletal muscle and heart muscle (Taylor et al., 1998). The rat ortholog of Sarcosin, Kelch-related protein 1 (Krp1), was also found to be expressed in the heart and skeletal muscle (Spence et al., 2000). Expression of klhl and Sarcosin mRNAs thus appears to be limited to the striated muscles (i.e. skeletal and cardiac muscles). While cardiac muscle can be considered a third class of fiber besides smooth and striated muscles, it resembles striated muscles in many aspects (Darnell et al., 1990). To gain more insight into the functional role of klhl, we examined the expression of klhl during zebrafish development. klhl was first detected in embryos at the 2-somite stage, around 10.5 hpf. At this stage, it is expressed in the first two somites and also in the adaxial cells adjacent to the notochord. These adaxial cells later migrate and differentiate into the slow muscle fibers of the adult fish (Devoto et al., 1996).
Striated muscle fibers in zebrafish, like those of most vertebrates, can be broadly classified into either fast or slow muscle depending on contraction speeds and metabolic activities (Darnell et al., 1990; Devoto et al., 1996). Slow muscle fibers contract and relax slowly, and they can create and maintain tension for long periods of time. Fast muscle fibers contract fast and fatigue fast and are used to generate rapid movements by sudden bursts of contraction. However, a survey of klhl expression in formed zebrafish somites showed that it was not expressed in the slow muscles. It was expressed only in the fast muscle. Similar results have been obtained with a few other fast muscle genes, like tpma and troponin C, that are also expressed in adaxial cells (Xu et al., 2000). Such observations indicate that the differentiating slow muscles may cease the expression of these genes during or after cell migration to the superficial layer (Xu et al., 2000). The expression of klhl in embryos of the early segmentation period makes it one of the earliest genes to be expressed in the somitogenic pathway. Examples of MSP genes expressed at this early stage include desmin and tpma (Xu et al., 2000). In fact, the expression pattern of klhl closely correlates with that of tpma in the skeletal muscles. Like tpma, klhl expression increases with the number of somites, and klhl is expressed throughout somitogenesis and in the adult muscles. Expression of klhl in other skeletal muscles such as head muscles is also similar to that of tpma. We also detected expression of klhl in myocardial precursors. While skeletal and cardiac tissues have a similar sarcomeric organization and function, they have distinct cellular origins, arising from separate populations of mesodermally derived progenitor cells (Gregorio and Antin, 2000). klhl mRNA was first detected at the 13-somite stage in bilateral stripes of myocardial cells.
The initiation of klhl expression at this stage corresponds to the expression of two muscle-specific contractile protein genes, cmlc2 and ventricular myosin heavy chain (vmhc), in zebrafish (Yelon et al., 2000). In the chick embryo, muscle-specific contractile protein genes are first detected in cardiac progenitors at the one- to four-somite stage, just as the most anterior regions of the two heart primordia begin to fuse (Gregorio and Antin, 2000). This suggests that the expression of MSP genes takes place at approximately the same time in both zebrafish and chicks. Cardiac fusion in zebrafish takes place shortly after the first detection of cmlc2 and vmhc at the 17-somite stage (Yelon et al., 2000).

4. Role of klhl in Muscle Structure and Function

The striated muscle machinery contains a complex interconnected cytoskeletal network critical for its contractile activity (Clark et al., 2002) (Fig. 15). The basic contractile unit of the striated muscle, the sarcomere, contains four filament systems: actin-containing thin filaments that span the I-band and overlap with myosin-containing thick filaments in the A-band, titin and nebulin. The individual sarcomeres are bordered by Z-lines, where the thin filaments, titin and nebulin are anchored (Fig. 15).

Fig. 15. A schematic overview of cytoskeletal linkages in striated muscle (Adapted from Clark et al., 2002).

The precise assembly and alignment of the various filament systems is critical for muscular contraction. While the filament systems have been intensely studied and the molecular interactions between actin and myosin that generate the motion of contraction are well known, the mechanisms by which they become organized during myofibril assembly are still poorly understood, as is evident from the continual discovery of novel proteins that are associated with the contractile apparatus (Clark et al., 2002).
Deciphering the precise relationships among striated muscle components is important considering the diverse and complex range of muscle myopathies that result directly from mutations in contractile and associated proteins. In all likelihood, the domain organization of klhl and its evolutionarily conserved, restricted expression pattern in the striated muscles of zebrafish and the various mammalian species suggest that it plays an important role in the assembly of the striated muscle. The spatial and temporal expression pattern of klhl as well as its domain organization suggest that klhl may play a role in the organization of the striated muscle cytoarchitecture. Some insights into the function of klhl in striated muscle can be obtained from examining another kelch-related protein that is also specifically expressed in the striated muscle, sarcosin (Taylor et al., 1998), also called Krp1 (Spence et al., 2000; Lu et al., 2003). klhl and Krp1 share only about 20% sequence identity. The work of Spence and colleagues (2000) indicates a possible role of Krp1 in defining the structure and processes that occur at cell tips. Krp1 was found to colocalize with F-actin at the membrane-ruffle-like structures in the tips of pseudopodia of rat fibroblasts, although Krp1 and actin were not found to interact. In addition, overexpression of Krp1 in transformed rat fibroblasts was found to dramatically elongate pseudopodia, while the expression of truncated Krp1 polypeptides, BTB/POZ domain or kelch repeats only, resulted in a reduction of pseudopodia length, presumably by acting as dominant negative mutants. The entire protein is thus required for the interaction with the necessary processes. More recently, Krp1 was picked up in another screen, by Lu and colleagues (2003), for N-RAP binding partners. N-RAP is an actin-binding LIM protein, concentrated at myotendinous junctions (MTJ) in skeletal muscle and intercalated disks in cardiac muscle (Fig. 16).
The MTJ and intercalated disks are the sites of mechanical coupling between the myofibrils and the cell membrane, necessary to effectively transmit force.

Fig. 16. Schematic model of the cytoskeletal filament linkages at the sarcolemma of striated muscle (Adapted from Clark et al., 2002).

N-RAP serves as a link between the terminal actin and protein complexes at the cell membrane, interacting with actin through its C-terminus and binding to talin through its N-terminal LIM domain. The C-terminus of talin may also associate with vinculin. The multiple potential interactions with talin, actin and vinculin at the MTJ provide a stable mechanical link between the contractile cytoskeleton and the sarcolemma because it is the focal point for much of the force generated during contraction (Clark et al., 2002). As in the earlier study by Spence et al. (2000), Krp1 was found at the periphery of cells, this time at the periphery of mature myofibrils that appeared to be joining laterally with narrow myofibrils in cultured chicken cardiomyocytes. This lateral fusion transforms myofibril precursors into mature myofibrils with broad Z-lines. Krp1 is postulated to be involved late in myofibril assembly, catalysing the lateral fusion of myofibril precursors. It has been suggested that the order of expression of MSP genes may reflect the assembly sequence of the myofibril. One prominent model of myofibrillogenesis proposes that thick and thin filaments assemble independently in muscle cells. In this model, thick filaments appear later and are incorporated into preformed structures containing α-actinin, sarcomeric actin and titin (Holtzer et al., 1997; Gregorio and Antin, 2000). In their study of zebrafish MSP genes, Xu et al. (2000) also noted that the genes for thin filament proteins are expressed earlier than the genes for thick filament proteins.
klhl, like tpma, is expressed earlier than zebrafish skeletal α-actin, which is the predominant isoform of actin found in adult striated muscles (Xu et al., 2000). The early expression of klhl in the somitogenic pathway and in myocardial precursors suggests its importance for the assembly of the myofibril structure. A search of the Ensembl database with the human sarcosin sequence (Genbank accession number AAH06534) indicates that there might be a zebrafish ortholog of sarcosin (ENSDARP00000012238) too, sharing about 60% identity. It would be interesting to isolate this proposed ortholog of Krp1 in zebrafish and characterize its temporal expression pattern to determine if the gene is activated later than klhl.

5. Comparative Genomics, a Look into Evolutionary History

The usefulness of the zebrafish as a model to elucidate the function of human genes has been heightened by the construction of genetic linkage and RH maps by various groups. Despite the 450 million years of evolutionary distance between zebrafish and human (Kumar and Hedges, 1998), analyses of human and zebrafish gene maps reveal extensive conservation of synteny between the two species, i.e. genes that are on the same chromosome in human tend to be located in the same chromosome fragment in zebrafish (Postlethwait et al., 1998). In this study, we were able to map klhl to LG13. In humans, the proposed ortholog KLHL was mapped to the short arm of chromosome 6 at position 6p12.2, while in mouse, the gene is located on the long arm of chromosome 9. The proposed rat ortholog has been mapped to chromosome 8. Comparative mapping suggests that zebrafish klhl is located in a genomic organization similar to that in the human, mouse and rat genomes, being clustered together with the bmp5 and gsta3 genes. The membership of the klhl genes in the LG13-Hsa6-Mmu9-Rn8 conserved synteny group adds support to the predicted orthology.
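The reasoning behind a conserved synteny group can be made concrete with a small Python sketch that groups genes sharing the same chromosome assignment in every species compared. The klhl locations are those given above; assigning bmp5 and gsta3 to the same chromosomes is an assumption drawn from the statement that they cluster with klhl, and the control gene, its locations and the function name are hypothetical.

```python
from collections import defaultdict

SPECIES = ("zebrafish", "human", "mouse", "rat")

# klhl: as reported in the text. bmp5 and gsta3: ASSUMED to share these
# chromosomes because the text describes them as clustered with klhl.
# control_gene: purely hypothetical, included to show what is filtered out.
locations = {
    "klhl": {"zebrafish": "LG13", "human": "Hsa6", "mouse": "Mmu9", "rat": "Rn8"},
    "bmp5": {"zebrafish": "LG13", "human": "Hsa6", "mouse": "Mmu9", "rat": "Rn8"},
    "gsta3": {"zebrafish": "LG13", "human": "Hsa6", "mouse": "Mmu9", "rat": "Rn8"},
    "control_gene": {"zebrafish": "LG1", "human": "Hsa4", "mouse": "Mmu5", "rat": "Rn14"},
}


def conserved_synteny_groups(gene_locations, species=SPECIES):
    """Group genes by their chromosome in every species; keep groups of two or more genes."""
    groups = defaultdict(list)
    for gene, chrom_by_species in gene_locations.items():
        # Skip genes that are not mapped in all of the species compared.
        if all(s in chrom_by_species for s in species):
            key = tuple(chrom_by_species[s] for s in species)
            groups[key].append(gene)
    return {key: sorted(genes) for key, genes in groups.items() if len(genes) >= 2}


for chromosomes, genes in conserved_synteny_groups(locations).items():
    print("-".join(chromosomes), genes)
```

A gene that maps to a different chromosome in even one species forms its own key and drops out of the group, which is how a lineage-specific rearrangement would show up in such a comparison.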
We were also able to identify another pair of orthologous genes that are part of the LG13-Hsa6 conserved synteny group, bpag1/BPAG1. Interestingly, however, the mouse ortholog of this gene, dld, is part of another conserved synteny group, between Hsa6 and Mmu1. Postlethwait et al. (2000), in their study of vertebrate chromosomes, suggested that mammalian genomes derive from the fission of large ancestral chromosomes and that these were broken up differently in different tetrapod lineages. Taken together, the data suggest that klhl, gsta3, bmp5 and bpag1 were located on a single chromosome in the last common ancestor of mammals and zebrafish, and that this hypothesized chromosome broke apart differently in the human and mouse lineages. Our study of gene order between LG13 and Hsa6 also reveals that while LG13 and Hsa6 are syntenic to each other, multiple intrachromosomal rearrangements have altered gene orders in zebrafish and humans. Besides being used for the prediction of gene orthology, such comparative gene maps are useful for the identification of mutant genes, facilitating both positional cloning and the candidate gene approach. Comparing the map locations of zebrafish genes and their human counterparts, for example, could suggest candidates for zebrafish mutations. Unfortunately, we have not been able to identify any disease locus associated with either klhl or its mammalian homologs.

6. Rapid In Silico Cloning of Genes

Our study also reflects the emerging trend of using nucleotide sequence databases to clone or identify a relevant gene. Nowadays, many analyses previously limited to the bench can be performed with a computer. Such in silico analyses reduce the time and effort needed to identify or clone a gene. The cornerstone of molecular biological approaches in the genome project is the identification of expressed genes by brute force sequencing, that is, the generation of ESTs (Schimenti, 1998).
Despite their fragmentary nature, ESTs have proven to be useful in analysing gene expression, in cloning genes and as markers on chromosomes (Schuler, 1997; Gong, 1999). This has led to the development of EST gene indices like Unigene. Using automated procedures, ESTs are partitioned into sets or clusters that are very likely to represent distinct genes. In addition to strong sequence similarity, clone identifiers can be used to group ESTs derived from the same cDNA even when their sequences do not overlap. Such a database has proven useful in our study in the identification of the partial sequence for rat Klhl from various ESTs that belong to Unigene cluster Rn.22511. We had earlier used the sequence of zebrafish klhl to query the rat EST database to find rat EST clones similar to klhl. The result of this search was similar to the clones in the Unigene cluster, indicating the utility of the database in the assembly of individual ESTs into a consensus sequence (Zhuo et al., 2001). Besides the identification of orthologous genes, in silico analyses provide a fast and inexpensive way of obtaining information about gene expression. For example, by looking at the cDNA libraries from which the ESTs were obtained, we suggest that rat Klhl is expressed in a similar fashion to its orthologs in zebrafish and human. Such a computational approach to expression analysis can even be extended to determine the transcriptional profile of tissues, as performed by Bortoluzzi and colleagues (2000). Another genome database that was useful in our study was the Ensembl genome database project (Hubbard et al., 2000; Clamp et al., 2003). Ensembl annotates known genes and predicts novel genes, providing a database of human genome annotation. According to Hubbard et al.
(2002), “Ensembl genes are regarded as being accurate predicted gene structures with a low false positive rate” as they are supported by experimental evidence of at least one form via sequence homology. Using Ensembl, we were able to identify the putative human homolog of klhl located on chromosome 6. Besides identifying two ESTs that match this putative gene, we also have experimental evidence that this gene is expressed in human skeletal muscles and heart. In addition to annotating the human genome, the Ensembl database provides up-to-date and comprehensive sequence data from several metazoan organisms including human, mouse, rat, zebrafish and the pufferfish Fugu rubripes (Clamp et al., 2003). We were able to identify the putative mouse, rat and pufferfish orthologs of klhl through this function of Ensembl. Again, we also managed to identify several ESTs matching the predicted mouse and rat genes, providing evidence that they are expressed. The advent and demonstrated utility of such databases suggest that the rigorous cloning of genes might be a thing of the past.

7. Future Directions

With the increasing information on gene sequences from genome projects, the problem of elucidating the function of genes has spawned a new area of research that is being called “functional genomics”. At the forefront of this new era is a small tropical freshwater fish originating from India, the zebrafish. A variety of factors have made the zebrafish an excellent system for the analysis of the vertebrate genome. In this study, a novel zebrafish kelch gene with orthologs in human, mouse and rat as well as pufferfish was identified and characterized. Elucidation of the role klhl plays in the zebrafish would have implications for the human system. The analysis of the expression pattern of klhl as well as its domain organization suggests several possibilities for its function.
Continued effort is still required to determine the exact role this protein might play in muscle structure and function. To gain more insight into the function of klhl, functional analyses that perturb klhl gene expression must be performed. There are two broad classes of functional analysis, namely gain-of-function and loss-of-function analyses. Gain-of-function studies involve the overexpression of a gene, while loss-of-function analyses are carried out by destroying or inhibiting the gene. Gain-of-function studies can be easily achieved in zebrafish by the microinjection of klhl RNA into the zebrafish embryo or by transgenesis. Loss-of-function analyses are, however, not as straightforward. The nature of zebrafish enables the application of chemical, deletional and insertional mutagenesis approaches to knock out genes. However, these methods are unspecific. In recent years much effort has been devoted to the development of new methods for loss-of-function study, including dominant negative and RNA interference experiments (Hunter, 2000). While the efficacy of such methods in zebrafish is still under debate (Li et al., 2000; Oates et al., 2000), the recent development of morpholino-based gene knock-down technology opens the door to the genome-wide assignment of function based on sequence in zebrafish (Nasevicius and Ekker, 2000). Morpholinos are chemically modified oligonucleotides with base-stacking abilities similar to those of natural genetic material. In zebrafish, they have been used widely and have proven effective in blocking gene function (refer to the Genesis special morpholino issue, Vol. 30, Issue 3, 2001). Morpholinos work by interfering with the translation initiation process. In addition to the functional analyses, it will be necessary to identify the subcellular localization and potential binding partners of klhl in zebrafish.
A GFP-tagged klhl cDNA construct can be introduced into zebrafish embryos or transfected into muscle cell lines to determine its cellular localization, while in vitro coprecipitation assays can be performed with glutathione S-transferase (GST) fusions of klhl and potential binding partners such as talin. It would also be interesting to isolate and characterize the zebrafish ortholog of krp1 and determine its temporal pattern of expression in zebrafish embryos. Such a study would provide us with a better understanding of myofibrillar assembly and its associated myopathic disorders, and hopefully this will one day translate into therapeutic treatments. As we move into a future where studies of the relationship between gene function and protein structure will become increasingly important, further insights can be obtained from the determination of the klhl crystal structure.

References

Ackermann, G.E. and Paw, B.H. (2003). Zebrafish: a genetic model for vertebrate organogenesis and human disorders. Front Biosci. 8, d1227-53.
Adams, J., Kelso, R. and Cooley, L. (2000). The kelch repeat superfamily of proteins: propellers of cell function. Trends Cell Biol. 10, 17-24.
Albagli, O., Dhordain, P., Deweindt, C., Lecocq, G. and Leprince, D. (1995). The BTB/POZ domain: a new protein-protein interaction motif common to DNA- and actin-binding proteins. Cell Growth & Differ. 6, 1193-1198.
Alexander, J., Stainier, D.Y.R. and Yelon, D. (1998). Screening mosaic F1 females for mutations affecting zebrafish heart induction and patterning. Dev. Genet. 22, 288-299.
Allende, M., Amsterdam, A., Becker, T., Kawakami, K., Gaiano, N. and Hopkins, N. (1996). Insertional mutagenesis in zebrafish identifies two novel genes, pescadillo and dead eye, essential for embryonic development. Genes Dev. 10, 3141-3155.
Amemiya, C., Zhong, T.P., Silverman, G.A., Fishman, M.C. and Zon, L.I. (1999). Zebrafish YAC, BAC, and PAC genomic libraries. Methods Cell Biol. 60, 235-258.
Amsterdam, A., Burgess, S., Golling, G., Chen, W., Sun, Z., Townsend, K., Farrington, S., Haldi, M. and Hopkins, N. (1999). A large-scale insertional mutagenesis screen in zebrafish. Genes Dev. 13, 2713-2724.
Ando, H., Furuta, T., Tsien, R.Y. and Okamoto, H. (2001). Photo-mediated gene activation using caged RNA/DNA in zebrafish embryos. Nature Genet. 28, 317-325.
Aparicio, S., Chapman, J., Stupka, E., Putnam, N., Chia, J.M., Dehal, P., Christoffels, A., Rash, S., Hoon, S., Smit, A., Gelpke, M.D., Roach, J., Oh, T., Ho, I.Y., Wong, M., Detter, C., Verhoef, F., Predki, P., Tay, A., Lucas, S., Richardson, P., Smith, S.F., Clark, M.S., Edwards, Y.J., Doggett, N., Zharkikh, A., Tavtigian, S.V., Pruss, D., Barnstead, M., Evans, C., Baden, H., Powell, J., Glusman, G., Rowen, L., Hood, L., Tan, Y.H., Elgar, G., Hawkins, T., Venkatesh, B., Rokhsar, D. and Brenner, S. (2002). Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 297, 1301-1310.
Artavanis-Tsakonas, S., Matsuno, K. and Fortini, M.E. (1995). Notch signaling. Science 268, 225-232.
Ayscough, K.R. (1998). In vivo functions of actin-binding proteins. Curr. Opin. Cell Biol. 10, 102-111.
Baltimore, D. (2001). Our genome unveiled. Nature 409, 814-816.
Barbazuk, W.B., Korf, I., Kadavi, C., Heyen, J., Tate, S., Wun, E., Bedell, J.A., McPherson, H.D. and Johnson, S.L. (2000). The syntenic relationship of the zebrafish and human genomes. Genome Res. 10, 1351-1358.
Bardwell, V.J. and Treisman, R. (1994). The POZ domain: a conserved protein-protein interaction motif. Genes Dev. 8, 1664-1677.
Barut, B.A. and Zon, L.I. (2000). Realizing the potential of zebrafish as a model for human disease. Physiol. Genomics 2, 49-51.
Beattie, C.E., Raible, D.W., Henion, P.D. and Eisen, J.S. (1999). Early pressure screens. Methods Cell Biol. 60, 71-86.
Becker, T.S., Burgess, S.M., Amsterdam, A.H., Allende, M.L. and Hopkins, N. (1998). not really finished is crucial for development of the zebrafish outer retina and encodes a transcription factor highly homologous to human nuclear respiratory factor-1 and avian initiation binding repressor. Development 125, 4369-4378.
Beier, D.R. (1998). Zebrafish: genomics on the fast track. Genome Res. 8, 9-17.
Bernal, A., Ear, U. and Kyrpides, N. (2001). Genomes OnLine Database (GOLD): a monitor of genome projects world-wide. Nucleic Acids Res. 29, 126-127.
Birney, E., Bateman, A., Clamp, M.E. and Hubbard, T.J. (2001). Mining the draft human genome. Nature 409, 827-831.
Boguski, M.S. and Schuler, G.D. (1995). ESTablishing a human transcript map. Nature Genet. 10, 369-371.
Bomont, P., Cavalier, L., Blondeau, F., Hamida, C.B., Belal, S., Tazir, M., Demir, E., Topaloglu, H., Korinthenberg, R., Tuysuz, B., Landrieu, P., Hentati, F. and Koenig, M. (2000). The gene encoding gigaxonin, a new member of the cytoskeletal BTB/kelch repeat family, is mutated in giant axonal neuropathy. Nat. Genet. 26, 370-374.
Bork, P. and Copley, R. (2001). Filling in the gaps. Nature 409, 818-820.
Bork, P. and Doolittle, R.F. (1994). Drosophila kelch motif is derived from a common enzyme fold. J. Mol. Biol. 236, 1277-1282.
Bortoluzzi, S., d’Alessi, F., Romualdi, C. and Danieli, G.D. (2000). The human adult skeletal muscle transcriptional profile reconstructed by a novel computational approach. Genome Res. 10, 344-349.
Braybrook, C., Warry, G., Howell, G., Arnason, A., Bjornsson, A., Moore, G.E., Ross, M.T. and Stanier, P. (2001). Identification and characterization of KLHL4, a novel human homologue of the Drosophila Kelch gene that maps within the X-linked cleft palate and Ankyloglossia (CPX) critical region. Genomics 72, 128-136.
Brenner, S. (1990). The human genome: the nature of the enterprise. Ciba Found. Symp. 146, 6-12.
Brownlie, A., Donovan, A., Pratt, S.J., Paw, B.H., Oates, A.C., Brugnara, C., Witkowska, H.E., Sassa, S. and Zon, L.I. (1998). Positional cloning of the zebrafish sauternes gene: a model for congenital sideroblastic anaemia. Nature Genet. 20, 244-250.
Burge, C. and Karlin, S. (1997). Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78-94.
Chang, C.C., Ye, B.H., Chaganti, R.S. and Dalla-Favera, R. (1996). BCL-6, a POZ/zinc-finger protein, is a sequence-specific transcriptional repressor. Proc. Natl. Acad. Sci. USA 93, 6947-6952.
Chen, J. and Fishman, M.C. (1996). Zebrafish tinman homolog demarcates the heart field and initiates myocardial differentiation. Development 122, 3809-3816.
Chen, W., Burgess, S., Golling, G., Amsterdam, A. and Hopkins, N. (2002). High-throughput selection of retrovirus producer cell lines leads to markedly improved efficiency of germ line-transmissible insertions in zebrafish. J. Virol. 76, 2192-2198.
Clamp, M., Andrews, D., Barker, D., Bevan, P., Cameron, G., Chen, Y., Clark, L., Cox, T., Cuff, J., Curwen, V., Down, T., Durbin, R., Eyras, E., Gilbert, J., Hammond, M., Hubbard, T., Kasprzyk, A., Keefe, D., Lehvaslaiho, H., Iyer, V., Melsopp, C., Mongin, E., Pettett, R., Potter, S., Rust, A., Schmidt, E., Searle, S., Slater, G., Smith, J., Spooner, W., Stabenau, A., Stalker, J., Stupka, E., Ureta-Vidal, A., Vastrik, I. and Birney, E. (2003). Ensembl 2002: accommodating comparative genomics. Nucleic Acids Res. 31, 38-42.
Clark, K.A., McElhinny, A.S., Beckerle, M.C. and Gregorio, C.C. (2002). Striated muscle cytoarchitecture: an intricate web of form and function. Ann. Rev. Cell Dev. Biol. 18, 637-706.
Collins, T., Stone, J.R. and Williams, A.J. (2001). All in the family: the BTB/POZ, KRAB, and SCAN Domains. Mol. Cell. Biol. 21, 3609-3615.
Darnell, J., Lodish, H. and Baltimore, D. (1990). Molecular Cell Biology, 859-902. Scientific American Books, Inc. 2nd Edition.
Dehal, P., Predki, P., Olsen, A.S., Kobayashi, A., Folta, P., Lucas, S., Land, M., Terry, A., Eacle Zhou, C.L., Rash, S., Zhang, Q., Gordon, L., Kim, J., Elkin, C., Pollard, M.J., Richardson, P., Rokhsar, D., Uberbacher, E., Hawkins, T., Branscomb, E. and Stubbs, L. (2001). Human chromosome 19 and related regions in mouse: conservative and lineage-specific evolution. Science 293, 104-111.
Devoto, S.H., Melancon, E., Eisen, J.S. and Westerfield, M. (1996). Identification of separate slow and fast muscle precursor cells in vivo, prior to somite formation. Development 122, 3371-3380.
Dib, C., Faure, S., Fizames, C., Samson, D., Drouot, N., Vignal, A., Maillasseau, P., Marc, S., Hazan, J., Seboun, E., Lathrop, M., Gyapay, G., Morissette, J. and Weissenbach, J. (1996). A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature 380, 152-154.
Dietrich, W.F., Miller, J., Steen, R., Merchant, M.A., Damron-Boles, D., Husain, A., Dredge, R., Daly, M.J., Ingalls, K.A., O’Connor, T.J., Evans, C.A., DeAngelis, M.M., Levinson, D.M., Kruglyak, L., Goodman, N., Copeland, N., Jenkins, N.A., Hawkins, T.L., Stein, L., Page, D.C. and Lander, E.S. (1996). A comprehensive genetic map of the mouse genome. Nature 380, 149-152.
Dodd, A., Curtis, P.M., Williams, L.C. and Love, D.R. (2000). Zebrafish: bridging the gap between development and disease. Hum. Mol. Genet. 9, 2443-2449.
Dooley, K. and Zon, L.I. (2000). Zebrafish: a model system for the study of human disease. Curr. Opin. Genet. Dev. 10, 252-256.
Driever, W. and Fishman, M.C. (1996). The zebrafish: heritable disorders in transparent embryos. J. Clin. Invest. 97, 1788-1794.
Driever, W., Solnica-Krezel, L., Schier, A.F., Neuhauss, S.C.F., Malicki, J., Stemple, D.L., Stainier, D.Y.R., Zwartkruis, F., Abdelilah, S., Rangini, Z., Belak, J. and Boggs, C. (1996). A genetic screen for mutations affecting embryogenesis in zebrafish. Development 123, 37-46.
Ekker, S.C. and Larson, J.D. (2001). Morphant technology in model developmental systems. Genesis 30, 89-93.
Ewing, B. and Green, P. (2000). Analysis of expressed sequence tags indicates 35,000 human genes. Nature Genet. 25, 232-234.
Fishman, M.C. (1999). Zebrafish genetics: the enigma of arrival. Proc. Natl. Acad. Sci. USA 96, 10554-10556.
Fishman, M.C. (2001). Zebrafish- the canonical vertebrate. Science 294, 1290-1291.
Force, A., Lynch, M., Pickett, F.B., Amores, A., Yan, Y. and Postlethwait, J.H. (1999). Preservation of duplicate genes by complementary degenerative mutations. Genetics 151, 1531-1545.
Gaiano, N., Allende, M., Amsterdam, A., Kawakami, K. and Hopkins, N. (1996a). Highly efficient germ-line transmission of proviral insertions in zebrafish. Proc. Natl. Acad. Sci. USA 93, 7777-7782.
Gaiano, N., Amsterdam, A., Kawakami, K., Allende, M., Becker, T. and Hopkins, N. (1996b). Insertional mutagenesis and rapid cloning of essential genes in zebrafish. Nature 383, 829-832.
Gardiner-Garden, M. and Frommer, M. (1987). CpG islands in vertebrate genomes. J. Mol. Biol. 196, 261-282.
Gates, M.A., Kim, L., Egan, E.S., Cardozo, T., Sirotkin, H.I., Dougan, S.T., Lashkari, D., Abagyan, R., Schier, A.F. and Talbot, W.S. (1999). A genetic linkage map for zebrafish: comparative analysis and localization of genes and expressed sequences. Genome Res. 9, 334-347.
Gawantka, V., Pollet, N., Delius, H., Vingron, M., Pfister, R., Nitsch, R., Blumenstock, C. and Niehrs, C. (1998). Gene expression screening in Xenopus identifies molecular pathways, predicts gene function and provides a global view of embryonic patterning. Mech. Dev. 77, 95-141.
Geisler, R., Rauch, G.J., Baier, H., van Bebber, F., Brobeta, L., Dekens, M.P., Finger, K., Fricke, C., Gates, M.A., Geiger, H. et al. (1999). A radiation hybrid map of the zebrafish genome. Nat. Genet. 23, 86-89.
Godt, D., Couderc, J.L., Cramton, S.E. and Laski, F.A. (1993). Pattern formation in the limbs of Drosophila: bric a brac is expressed in both a gradient and a wave-like pattern and is required for specification and proper segmentation of the tarsus. Development 119, 799-812.
Golling, G., Amsterdam, A., Sun, Z., Antonelli, M., Maldonado, E., Chen, W., Burgess, S., Haldi, M., Artzt, K., Farrington, S., Lin, S., Nissen, R.M. and Hopkins, N. (2002). Insertional mutagenesis in zebrafish rapidly identifies genes essential for early vertebrate development. Nat. Genet. 31, 135-140.
Gong, Z. (1999). Zebrafish expressed sequence tags and their applications. Methods Cell Biol. 60, 213-233.
Gong, Z., Ju, B. and Wan, H. (2001). Green fluorescent protein (GFP) transgenic fish and their applications. Genetica 111, 213-225.
Gregorio, C.C. and Antin, P.B. (2000). To the heart of myofibril assembly. Trends Cell Biol. 10, 355-362.
Grunwald, D.J. (1996). A fin-de siecle achievement: charting new waters in vertebrate biology. Science 274, 1634-1635.
Grunwald, D.J. and Eisen, J. (2002). Headwaters of the zebrafish- emergence of a new model vertebrate. Nature Rev. Genet. 3, 717-724.
Hackett, P., Clark, K., Cui, Z., Guerts, A., Liu, G., Davidson, A.L. and Ekker, S.C. (2001). Applications of the sleeping beauty transposon system in fish. 3rd IUBS symposium on molecular aspect of fish genomes and development conference abstract, 63. National University of Singapore, Department of Biological Sciences.
Haffter, P., Granato, M., Brand, M., Mullins, M.C., Hammerschmidt, M., Kane, D.A., Odenthal, J., van Eeden, F.J.M., Jiang, Y., Heisenberg, C.P., Kelsh, R.N., Furutani-Seiki, M., Vogelsang, E., Beuchle, D., Schach, U., Fabian, C. and Nüsslein-Volhard, C. (1996). The identification of genes with unique and essential functions in the development of the zebrafish, Danio rerio. Development 123, 1-36.
Hanneman, E. and Westerfield, M. (1989). Early expression of acetylcholinesterase activity in functionally distinct neurons of the zebrafish. J. Comp. Neurol. 284, 350-361.
Hattori, M., Fujiyama, A., Taylor, T.D., Watanabe, H., Yada, T., Park, H.S., Toyoda, A., Ishii, K., Totoki, Y., Choi, D.K., Soeda, E., Ohki, M., Takagi, T., Sakaki, Y., Taudien, S., Blechschmidt, K., Polley, A., Menzel, U., Delabar, J., Kumpf, K., Lehmann, R., Patterson, D., Reichwald, K., Rump, A., Schillhabel, M., Schudy, A., Zimmermann, W., Rosenthal, A., Kudoh, J., Shibuya, K., Kawasaki, K., Asakawa, S., Shintani, A., Sasaki, T., Nagamine, K., Mitsuyama, S., Antonarakis, S.E., Minoshima, S., Shimizu, N., Nordsiek, G., Hornischer, K., Brandt, P. et al. (2000). The DNA sequence of human chromosome 21. Nature 405, 311-319.
Hedges, S.B. and Kumar, S. (2002). Vertebrate genomes compared. Science 297, 1283-1285.
Henion, P.D., Raible, D., Beattie, C., Stoesser, K., Weston, J.A. and Eisen, J.S. (1996). Screen for mutations affecting development of zebrafish neural crest. Dev. Genet. 18, 11-17.
Hernandez, M.C., Andres-Barquin, P.J., Martinez, S., Bulfone, A., Rubenstein, J.L. and Israel, M.A. (1997). ENC-1: a novel mammalian kelch-related gene specifically expressed in the nervous system encodes an actin-binding protein. J. Neurosci. 17, 3038-3051.
Hodgkinson, J.L. (2000). Actin and the smooth muscle regulatory proteins: a structural perspective. J. Muscle Res. Cell Motil. 21, 115-130.
Holtzer, H., Hijikata, T., Lin, Z.X., Zhang, Z.Q., Holtzer, S., Protasi, F., Franzini-Armstrong, C. and Sweeney, H.L. (1997). Independent assembly of 1.6 µm long bipolar MHC filaments and I-Z-I bodies. Cell Struct. Funct. 22, 83-93.
Hubbard, T., Barker, D., Birney, E., Cameron, G., Chen, Y., Clark, L., Cox, T., Cuff, J., Curwen, V., Down, T., Durbin, R., Eyras, E., Gilbert, J., Hammond, M., Huminiecki, L., Kasprzyk, A., Lehvaslaiho, H., Lijnzaad, P., Melsopp, C., Mongin, E., Pettett, R., Pocock, M., Potter, S., Rust, A., Schmidt, E., Searle, S., Slater, G., Smith, J., Spooner, W., Stabenau, A., Stalker, J., Stupka, E., Ureta-Vidal, A., Vastrik, I. and Clamp, M. (2002). The Ensembl genome database project. Nucleic Acids Res. 30, 38-41.
Hukriede, N., Fisher, D., Epstein, J., Joly, L., Tellis, P., Zhou, Y., Barbazuk, B., Cox, K., Fenton-Noriega, L., Hersey, C., Miles, J., Sheng, X., Song, A., Waterman, R., Johnson, S.L., Dawid, I.B., Chevrette, M., Zon, L.I., McPherson, J. and Ekker, M. (2001). The LN54 radiation hybrid map of zebrafish expressed sequences. Genome Res. 11, 2127-2132.
Hukriede, N.A., Joly, L., Tsang, M., Miles, J., Tellis, P., Epstein, J.A., Barbazuk, W.B., Li, F.N., Paw, B., Postlethwait, J.H., Hudson, T.J., Zon, L.I., McPherson, J.D., Chevrette, M., Dawid, I.B., Johnson, S.L. and Ekker, M. (1999). Radiation hybrid mapping of the zebrafish genome. Proc. Natl. Acad. Sci. USA 96, 9745-9750.
Hunter, C.P. (2000). Shrinking the black box of RNAi. Curr. Biol. 10, R137-R140.
Ito, N., Philips, S.E.V., Yadav, K.D.S. and Knowles, P.F. (1994). Crystal structure of a free radical enzyme, galactose oxidase. J. Mol. Biol. 238, 794-814.
Itoh, K., Wakabayashi, N., Katoh, Y., Ishii, T., Igarashi, K., Engel, J.D. and Yamamoto, M. (1999). Keap1 represses nuclear activation of antioxidant responsive elements by Nrf2 through binding to the amino-terminal Neh2 domain. Genes Dev. 13, 76-86.
Ivics, Z., Hackett, P.B., Plasterk, R.H. and Izsvák, Z. (1997). Molecular reconstruction of Sleeping Beauty, a Tc1-like transposon system from fish and its transposition in human cells. Cell 91, 501-510.
Ivics, Z., Izsvák, Z. and Hackett, P.B. (1997). Genetic applications of transposons and other repetitive elements in zebrafish. Methods Cell Biol. 60, 99-131.
Jesuthasan, S. (2002). Zebrafish in the spotlight. Science 297, 1484-1485.
Johnson, S.L., Gates, M.A., Johnson, M., Talbot, W.S., Horne, S., Baik, K., Rude, S., Wong, J.R. and Postlethwait, J.H. (1996). Centromere-linkage analysis and consolidation of the zebrafish genetic map. Genetics 142, 1277-1288.
Ju, B., Xu, Y., He, J., Liao, J., Yan, T., Hew, C.L., Lam, T.J. and Gong, Z. (1999). Faithful expression of green fluorescent protein (GFP) in transgenic zebrafish embryos under control of zebrafish gene promoters. Dev. Genet. 25, 158-167.
Kahn, P. (1994). Zebrafish hit the big time. Science 264, 904-905.
Kalthoff, K. (1996). Analysis of Biological Development, 493-541. McGraw-Hill, Inc.
Kapranov, P., Cawley, S.E., Drenkow, J., Bekiranov, S., Strausberg, R.L., Fodor, S.P.A. and Gingeras, T.R. (2002). Large-scale transcriptional activity in chromosomes 21 and 22. Science 296, 916-919.
Karlstrom, R.O., Talbot, W.S. and Schier, A.F. (1999). Comparative synteny cloning of zebrafish you-too: mutations in the Hedgehog target gli2 affect ventral forebrain patterning. Genes Dev. 13, 388-393.
Kawakami, K.A., Amsterdam, A., Shimoda, N., Becker, T., Mugg, J., Shima, A. and Hopkins, N. (2000a). Proviral insertions in the zebrafish hagoromo gene, encoding an F-box/WD40-repeat protein, cause stripe pattern anomalies. Curr. Biol. 10, 463-466.
Kay, B.K., Williamson, M.P. and Sudol, M. (2000). The importance of being proline: the interaction of proline-rich motifs in signaling proteins with their cognate domains. FASEB J. 14, 234-241.
Kelly, P.D., Chu, F., Woods, I.G., Ngo-Hazelett, P., Cardozo, T., Huang, H., Kimm, F., Liao, L., Yan, Y.L., Zhou, Y., Johnson, S.L., Abagyan, R., Schier, A.F., Postlethwait, J.H. and Talbot, W.S. (2000). Genetic linkage mapping of zebrafish genes and ESTs. Genome Res. 10, 558-567.
Kelso, R.J., Hudson, A.M. and Cooley, L. (2002). Drosophila kelch regulates actin organization via Src64-dependent tyrosine phosphorylation. J. Cell Biol. 156, 703-713.
Kidd, K.R. and Weinstein, B.M. (2003). Fishing for novel angiogenic therapies. Br. J. Pharmacol. 140, 585-594.
Kim, I.R., Mohammadi, E. and Huang, R.C.C. (1999). Isolation and characterization of IPP, a novel human gene encoding an actin-binding, kelch-like protein. Gene 228, 73-83.
Kimmel, C.B. (1989). Genetics and early development of zebrafish. Trends Genet. 5, 283-288.
Kimmel, C.B., Ballard, W.W., Kimmel, S.R., Ullmann, B. and Schilling, T.F. (1995). Stages of embryonic development of the zebrafish. Dev. Dyn. 203, 253-310.
Knapik, E.W. (2000). ENU mutagenesis in zebrafish- from genes to complex diseases. Mamm. Genome 11, 511-519.
Knowles, B.A. and Cooley, L. (1994). The specialized cytoskeleton of the Drosophila egg chamber. Trends Genet. 10, 235-241.
Ko, M.S.H. (2001). Embryogenomics: developmental biology meets genomics. Trends Biotech. 19, 511-518.
Kopczynski, C.C., Noordermeer, J.N., Serano, T.L., Chen, W., Pendleton, J.D., Lewis, S., Goodman, C.S. and Rubin, G.M. (1998). A high throughput screen to identify secreted and transmembrane proteins involved in Drosophila embryogenesis. Proc. Natl. Acad. Sci. USA 95, 9973-9978.
Kozak, M. (1991). Structural features in eukaryotic mRNAs that modulate the initiation of translation. J. Biol. Chem. 266, 19867-19870.
Kudoh, T., Tsang, M., Hukriede, N.A., Chen, X., Dedekian, M., Clarke, C.J., Kiang, A., Schultz, S., Epstein, J.A., Toyama, R. and Dawid, I.B. (2001). A gene expression screen in zebrafish embryogenesis. Genome Res. 11, 1979-1987.
Kumar, S. and Hedges, S.B. (1998). A molecular timescale for vertebrate evolution. Nature 392, 917-920.
Kwok, C., Korn, R., Davis, M., Burt, D., Critcher, R., McCarthy, L., Paw, B., Zon, L., Goodfellow, P. and Schmitt, K. (1998). Characterization of whole genome radiation hybrid mapping resources for non-mammalian vertebrates. Nucleic Acids Res. 26, 3562-3566.
Lai, F., Orelli, B.J., Till, B.G., Godley, L.A., Fernald, A.A., Pamintuan, L. and Le Beau, M.M. (2000). Molecular characterization of KLHL3, a human homologue of the Drosophila kelch gene. Genomics 66, 65-75.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., Harris, K., Heaford, A., Howland, J., Kann Lehoczky, J., LeVine, R., McEwan, P., McKernan, K., Meldrim, J., Mesirov, J.P., Miranda, C., Morris, W., Naylor, J., Raymond, C., Rosetti, M., Santos, R., Sheridan, A., Sougnez, C., Stange-Thomann, N., Stojanovic, N., Subramanian, A., Wyman, D., Rogers, J., Sulston, J., Ainscough, R., Beck, S., Bentley, D., Burton, J., Clee, C., Carter, N., Coulson, A., Deadman, R., Deloukas, P., Dunham, A., Dunham, I., Durbin, R., French, L., Grafham, D., Gregory, S., Hubbard, T. et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860-921.
Langheinrich, U. (2003). Zebrafish: a new model on the pharmaceutical catwalk. BioEssays 25, 904-912.
Lee, K., Huang, H., Ju, B. and Lin, S. (2002). Cloned zebrafish by nuclear transfer from long-term-cultured cells. Nature Biotech. 20, 795-799.
Lee, K., Xu, Q. and Breitbart, R.E. (1996). A new tinman-related gene, nkx2.7, anticipates the expression of nkx2.5 and nkx2.3 in zebrafish heart and pharyngeal endoderm. Dev. Biol. 180, 722-731.
Lewin, B. (1990). Genes IV, 466-481. Oxford Univ. Press, Oxford.
Li, X. and Noll, M. (1994). Evolution of distinct developmental functions of three Drosophila genes by acquisition of different cis-regulatory regions. Nature 367, 83-87.
Li, Y.X., Farrell, M.J., Liu, R., Mohanty, N. and Kirby, M.L. (2000). Double-stranded RNA injection produces null phenotypes in zebrafish. Dev. Biol. 217, 394-405.
Liang, F., Holt, I., Pertea, G., Karamycheva, S., Salzberg, S.L. and Quackenbush, J. (2000). Gene index analysis of the human genome estimates approximately 120,000 genes. Nature Genet. 25, 239-240.
Lin, S., Gaiano, N., Culp, P., Burns, J., Friedmann, T., Yee, J.K. and Hopkins, N. (1994). Integration and germ-line transmission of a pseudotyped retroviral vector in zebrafish. Science 265, 666-669.
Lu, S., Carroll, C.L., Herrera, A.H., Ozanne, B. and Horowits, R. (2003). New NRAP-binding partners α-actinin, filamin and Krp1 detected by yeast two-hybrid screening: implications for myofibril assembly. J. Cell Sci. 116, 2169-2178.
Luo, G., Herrera, A.H. and Horowits, R. (1999). Molecular interactions of N-RAP, a nebulin related protein of striated muscle myotendon junctions and intercalated disks. Biochemistry 38, 6135-6143.
Luo, G., Zhang, J.Q., Nguyen, T., Herrera, A.H., Paterson, B. and Horowits, R. (1997). Complete cDNA sequence and tissue localization of N-RAP, a novel nebulin-related protein of striated muscle. Cell Motil. Cytoskeleton 38, 75-90.
Ma, C., Fan, L., Ganassin, R., Bols, N. and Collodi, P. (2001). Production of zebrafish germ-line chimeras from embryo cell cultures. Proc. Natl. Acad. Sci. USA 98, 2461-2466.
Makalowski, W. and Boguski, M.S. (1998). Evolutionary parameters of the transcribed mammalian genome: an analysis of 2,820 orthologous rodent and human sequences. Proc. Natl. Acad. Sci. USA 95, 9407-9412.
Malicki, J.J., Pujic, Z., Thisse, C., Thisse, B. and Wei, X. (2002). Forward and reverse genetic approaches to the analysis of eye development in zebrafish. Vision Res. 42, 527-533.
Margolin, J. (2001). From comparative and functional genomics to practical decisions in the clinic: a view from the trenches. Genome Res. 11, 923-925.
Marra, M.A., Hillier, L. and Waterston, R.H. (1998). Expressed sequence tags - ESTablishing bridges between genomes. Trends Genet. 14, 4-7.
Meng, A., Moore, B., Tang, H., Yuan, B. and Lin, S. (1999). A Drosophila doublesex-related gene, terra, is involved in somitogenesis in vertebrates. Development 126, 1259-1268.
Miklos, G.L.G. and Rubin, G.M. (1996). The role of the genome project in determining gene function: insights from model organisms. Cell 86, 521-529.
Mullins, M.C., Hammerschmidt, M., Haffter, P. and Nüsslein-Volhard, C. (1994). Large-scale mutagenesis in the zebrafish: in search of genes controlling development in a vertebrate. Curr. Biol. 4, 189-202.
Murphy, W.J., Stanyon, R. and O’Brien, S.J. (2001). Evolution of mammalian genome organization inferred from comparative gene mapping. Genome Biology 2, reviews 0005.1-0005.8.
Nasevicius, A. and Ekker, S.C. (2000). Effective targeted gene ‘knockdown’ in zebrafish. Nature Genet. 26, 216-220.
Neidhardt, L., Gasca, S., Wertz, K., Obermayr, F., Worpenberg, S., Lehrach, H. and Herrmann, B.G. (2000). Large-scale screen for genes controlling mammalian embryogenesis, using high-throughput gene expression analysis in mouse embryos. Mech. Dev. 98, 77-93.
Nemes, J.P., Benzow, K.A., Moseley, M.L., Ranum, L.P. and Koob, M.D. (2000). The SCA8 transcript is an antisense RNA to a brain-specific transcript encoding a novel actin-binding protein (KLHL1). Hum. Mol. Genet. 9, 1543-1551.
Nüsslein-Volhard, C. (1994). Of flies and fishes. Science 266, 572-574.
Nüsslein-Volhard, C. and Wieschaus, E. (1980). Mutations affecting segment number and polarity in Drosophila. Nature (Lond.) 287, 795-801.
O’Brien, S.J., Menotti-Raymond, M., Murphy, W.J., Nash, W.G., Wienberg, J., Stanyon, R., Copeland, N.G., Jenkins, N.A., Womack, J.E. and Graves, J.A.M. (1999). The promise of comparative genomics in mammals. Science 286, 458-481.
Oates, A.C., Bruce, A.E. and Ho, R.K. (2000). Too much interference: injection of double-stranded RNA has nonspecific effects in the zebrafish embryo. Dev. Biol. 224, 20-28.
Pandey, A. and Lewitter, F. (1999). Nucleotide sequence databases: a goldmine for biologists. Trends Biochem. Sci. 24, 276-280.
Pennisi, E. (2003). Human genome: reaching their goal early, sequencing labs celebrate. Science 300, 409.
Pesko, G.A. (2000). From sequence to consequence. Genome Biol. 1, reports 406.1.
Peterson, R.T., Link, B.A., Dowling, J.E. and Schreiber, S.L. (2000). Small molecule developmental screens reveal the logic and timing of vertebrate development. Proc. Natl. Acad. Sci. USA 97, 12965-12969.
Postlethwait, J.H. and Talbot, W.S. (1997). Zebrafish genomics: from mutants to genes. Trends Genet. 13, 183-190.
Postlethwait, J.H., Woods, I.G., Ngo-Hazelett, P., Yan, Y.L., Kelly, P.D., Chu, F., Huang, H., Hill-Force, A. and Talbot, W.S. (2000). Zebrafish comparative genomics and the origins of vertebrate chromosomes. Genome Res. 10, 1890-1902.
Postlethwait, J.H., Yan, Y.L., Gates, M.A., Horne, S., Amores, A., Brownlie, A., Donovan, A., Egan, E.S., Force, A., Gong, Z., Goutel, C., Fritz, A., Kelsh, R., Knapik, E., Liao, E., Paw, B., Ransom, D., Singer, A., Thomson, M., Abduljabbar, T.S., Yelick, P., Beier, D., Joly, J.S., Larhammar, D., Rosa, F., Westerfield, M., Zon, L., Johnson, S.L. and Talbot, W.S. (1998). Vertebrate genome evolution and the zebrafish gene map. Nat. Genet. 18, 345-349.
Prag, S. and Adams, J.C. (2003). Molecular phylogeny of the kelch-repeat superfamily reveals expansion of BTB/kelch proteins in animals. BMC Bioinformatics. Sep. Epublished before print.
Quackenbush, J., Liang, F., Holt, I., Pertea, G. and Upton, J. (2000). The TIGR gene indices: reconstruction and representation of expressed gene sequences. Nucleic Acids Res. 28, 141-145.
Reese, M.G., Kulp, D., Tammana, H. and Haussler, D. (2000). Genie- gene finding in Drosophila melanogaster. Genome Res. 10, 529-538.
Robinson, D.N. and Cooley, L. (1997). Drosophila kelch is an oligomeric ring canal actin organizer. J. Cell Biol. 141, 553-566.
Rubin, G.M. (2001). Comparing species. Nature 409, 820-821.
Rubin, G.M., Yandell, M.D., Wortman, J.R., Gabor Miklos, G.L., Nelson, C.R., Hariharan, I.K., Fortini, M.E., Li, P.W., Apweiler, R., Fleischmann, W., Cherry, J.M., Henikoff, S., Skupski, M.P., Misra, S., Ashburner, M., Birney, E., Boguski, M.S., Brody, T., Brokstein, P., Celniker, S.E., Chervitz, S.A., Coates, D., Cravchik, A., Gabrielian, A., Galle, R.F., Gelbart, W.M., George, R.A., Goldstein, L.S., Gong, F., Guan, P., Harris, N.L., Hay, B.A., Hoskins, R.A., Li, J., Li, Z., Hynes, R.O., Jones, S.J., Kuehl, P.M., Lemaitre, B., Littleton, J.T., Morrison, D.K., Mungall, C., O'Farrell, P.H., Pickeral, O.K., Shue, C., Vosshall, L.B., Zhang, J., Zhao, Q., Zheng, X.H. and Lewis, S. (2000). Comparative genomics of the eukaryotes. Science 287, 2204-2215.
Rubinstein, A.L. (2003). Zebrafish: from disease modeling to drug discovery. Curr. Opin. Drug Discov. Devel. 6, 218-223.
Saha, S., Sparks, A.B., Rago, C., Akmaev, V., Wang, C.J., Vogelstein, B., Kinzler, K.W. and Velculescu, V.E. (2002). Using the transcriptome to annotate the genome. Nat. Biotechnol. 508-512.
Schier, A.F., Joyner, A.L., Lehmann, R. and Talbot, W.S. (1996). From screens to genes: prospects for insertional mutagenesis in zebrafish. Genes Dev. 10, 3077-3080.
Schilling, T.F. and Kimmel, C.B. (1997). Musculoskeletal patterning in the pharyngeal segments of the zebrafish embryo. Development 124, 2945-2960.
Schimenti, J. (1998). Global analysis of gene function in mammals: integration of physical, mutational and expression strategies. Electronic Journal of Biotechnology 1.
Schmid, B., Furthauer, M., Connors, S.A., Trout, J., Thisse, C. and Mullins, M.C. (2000). Equivalent genetic roles for bmp7/snailhouse and bmp2/swirl in dorsoventral pattern formation. Development 127, 957-967.
Schmid, M.F., Agris, J.M., Jakana, J., Matsudaira, P. and Chiu, W. (1994). Three-dimensional structure of a single filament in the Limulus acrosomal bundle: scruin binds to homologous helix-loop-beta motifs in actin. J. Cell Biol. 124, 341-350.
Schuler, G.D. (1997). Pieces of the puzzle: expressed sequence tags and the catalog of human genes. J. Mol. Med. 75, 694-698.
Schuler, G.D., Boguski, M.S., Stewart, E.A., Stein, L.D., Gyapay, G., Rice, K., White, R.E., Rodriguez-Tome, P., Aggarwal, A., Bajorek, E., Bentolila, S., Birren, B.B., Butler, A., Castle, A.B., Chiannilkulchai, N., Chu, A., Clee, C., Cowles, S., Day, P.J., Dibling, T., Drouot, N., Dunham, I., Duprat, S., East, C., Hudson, T.J. et al. (1996). A gene map of the human genome. Science 274, 540-546.
Schulte-Merker, S. (2000). Zebrafish functional genomics. Interview by Joanne Wixon. Yeast 17, 232-234.
Scheer, N. and Campos-Ortega, J.A. (1999). Use of the Gal4-UAS technique for targeted gene expression in the zebrafish. Mech. Dev. 80, 153-158.
Shin, J.T. and Fishman, M.C. (2002). From zebrafish to human: modular medical models. Annu. Rev. Genomics Hum. Genet. 3, 311-40.
Shoemaker, D.D., Schadt, E.E., Armour, C.D., He, Y.D., Garrett-Engele, P., McDonagh, P.D., Loerch, P.M., Leonardson, A., Lum, P.Y., Cavet, G., Wu, L.F., Altschuler, S.J., Edwards, S., King, J., Tsang, J.S., Schimmack, G., Schelter, J.M., Koch, J., Ziman, M., Marton, M.J., Li, B., Cundiff, P., Ward, T., Castle, J., Krolewski, M., Meyer, M.R., Mao, M., Burchard, J., Kidd, M.J., Dai, H., Phillips, J.W., Linsley, P.S., Stoughton, R., Scherer, S. and Boguski, M.S. (2001). Experimental annotation of the human genome using microarray technology. Nature 409, 922-927.
Smith, S.F., Snell, P., Gruetzner, F., Bench, A.J., Haaf, T., Metcalfe, J.A., Green, A.R. and Elgar, G. (2002). Analyses of the extent of shared synteny and conserved gene orders between the genome of Fugu rubripes and human 20q. Genome Res. 12, 776-784.
Solnica-Krezel, L., Schier, A.F. and Driever, W. (1994). Efficient recovery of ENU-induced mutations from the zebrafish germline. Genetics 136, 1401-1420.
Soltysik-Espanola, M., Rogers, R.A., Jiang, S., Kim, T.A., Gaedigk, R., White, R.A., Avraham, H. and Avraham, S. (1999). Characterization of Mayven, a novel actin-binding protein predominantly expressed in brain. Mol. Biol. Cell 10, 2361-2375.
Spence, H.J., Johnston, I., Ewart, K., Buchanan, S.J., Fitzgerald, U. and Ozanne, B.W. (2000). Krp1, a novel kelch related protein that is involved in pseudopod elongation in transformed cells. Oncogene 19, 1266-1276.
Spradling, A.C., Stern, D., Beaton, A., Rhem, E.J., Laverty, T., Mozden, N., Misra, S. and Rubin, G.M. (1999). The Berkeley Drosophila genome project gene disruption project: single P-element insertions mutating 25% of vital Drosophila genes. Genetics 153, 135-177.
Spradling, A.C., Stern, D.M., Kiss, I., Roote, J., Laverty, T. and Rubin, G.M. (1995). Gene disruptions using P transposable elements: an integral component of the Drosophila genome project. Proc. Natl. Acad. Sci. USA 92, 10824-10830.
Sprague, J., Clements, D., Conlin, T., Edwards, P., Frazer, K., Schaper, K., Segerdell, E., Song, P., Sprunger, B. and Westerfield, M. (2003). The Zebrafish Information Network (ZFIN): the zebrafish model organism database. Nucleic Acids Res. 31, 241-3.
Stainier, D.Y.R., Lee, R.K. and Fishman, M.C. (1993). Cardiovascular development in the zebrafish. Development 119, 31-40.
Stickney, H.L., Barresi, M.J.F. and Devoto, S.H. (2000). Somite development in zebrafish. Dev. Dyn. 219, 287-303.
Streisinger, G., Walker, C., Dower, N., Knauber, D. and Singer, F. (1981). Production of clones of homozygous diploid zebra fish (Brachydanio rerio). Nature 291, 293-296.
Sun, S., Footer, M. and Matsudaira, P. (1997). Modification of Cys-837 identifies an actin binding site in the β-propeller protein scruin. Mol. Biol. Cell 8, 421-430.
T’Jampens, D., Devriendt, L., De Corte, V., Vandekerckhove, J. and Gettemans, J. (2002). Selected BTB/POZ-kelch proteins bind ATP. FEBS Lett. 516, 20-26.
Talbot, W.S. and Hopkins, N. (2000). Zebrafish mutations and functional analysis of the vertebrate genome. Genes Dev. 14, 755-762.
Talbot, W.S. and Schier, A.F. (1999). Positional cloning of mutated zebrafish genes. Methods Cell Biol. 60, 259-286.
Taylor, A., Obholz, K., Linden, G., Sadiev, S., Klaus, S. and Carlson, K.D. (1998). DNA sequence and muscle-specific expression of human sarcosin transcripts. Mol. Cell. Biochem. 183, 105-112.
Taylor, M.S. (2001). More biology from the sequence. Genome Biology 1, reports 4018.1-4018.5.
Thomas, J.W. and Touchman, J.W. (2002). Vertebrate genome sequencing: building a backbone for comparative genomics. Trends Genet. 18, 104-108.
Tong, Y. (2001). Gene variants of follicle stimulating hormone and its receptor in subfertile patients. NUS Masters Thesis.
Urtishak, K.A. (2002). Modified peptide nucleic acids: an alternative to morpholinos for targeted gene disruption in zebrafish larvae. Abstract, 5th International Meeting on Zebrafish Development and Genetics, Madison, Wis.
van Troys, M., Vandekerckhove, J. and Ampe, C. (1999). Structural modules in actin-binding proteins: towards a new classification. Biochim. Biophys. Acta 1448, 323-348.
VanHouten, J.N., Asch, H.L. and Asch, B.B. (2001). Cloning and characterization of ectopically expressed transcripts for the actin-binding protein MIPP in mouse mammary carcinomas. Oncogene 20, 5366-5372.
Varkey, J.P., Muhlrad, P.J., Minniti, A.N., Do, Bao. and Ward, S. (1995). The Caenorhabditis elegans spe-26 gene is necessary to form spermatids and encodes a protein similar to the actin-associated proteins kelch and scruin. Genes Dev. 9, 1074-1086.
Velichkova, M., Guttman, J., Warren, C., Eng, L., Kline, K., Bogl, A.W. and Hasson, T. (2002). A human homologue of Drosophila Kelch associates with myosin-VIIa in specialized adhesion junctions. Cell Motil. Cytoskeleton 51, 147-164.
Venkatesh, B., Gilligan, P. and Brenner, S. (2000). Fugu: a compact vertebrate reference genome. FEBS Lett. 472, 3-7.
Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., Gocayne, J.D., Amanatides, P., Ballew, R.M., Huson, D.H., Wortman, J.R., Zhang, Q., Kodira, C.D., Zheng, X.H., Chen, L., Skupski, M., Subramanian, G., Thomas, P.D., Zhang, J., Gabor Miklos, G.L., Nelson, C., Broder, S., Clark, A.G., Nadeau, J., McKusick, V.A., Zinder, N., Levine, A.J., Roberts, R.J., Simon, M., Slayman, C., Hunkapiller, M., Bolanos, R. et al. (2001). The sequencing of the human genome. Science 291, 1304-1351.
von Bulow, M., Heid, H., Hess, H. and Franke, W.W. (1995). Molecular nature of Calicin, a major basic protein of the mammalian sperm head cytoskeleton. Exp. Cell Res. 219, 407-413.
Walker, C. (1999). Haploid screens and gamma-ray mutagenesis. Methods Cell Biol. 60, 43-70.
Wang, H., Long, Q., Marty, S.D., Sassa, S. and Lin, S. (1998). A zebrafish model for hepatoerythropoietic porphyria. Nature Genet. 20, 239-243.
Wang, S., Zhou, Z., Ying, K., Tang, R., Huang, Y., Wu, C., Xie, Y. and Mao, Y. (2001). Cloning and characterization of KLHL5, a novel human gene encoding a kelch-related protein with a BTB domain. Biochem. Genet. 39, 227-238.
Ward, A.C. and Lieschke, G.J. (2002). The zebrafish as a model system for human disease. Frontiers Biosci. 7, d827-d833.
Warren, K.S. and Fishman, M.C. (1998). “Physiological genomics”: mutant screens in zebrafish. Am. J. Physiol. 275, H1-H7.
Wassarman, D.A., Therrien, M. and Rubin, G.M. (1995). The Ras signaling pathway in Drosophila. Curr. Opin. Genet. Dev. 5, 44-50.
Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., An, P., Antonarakis, S.E., Attwood, J., Baertsch, R., Bailey, J., Barlow, K., Beck, S., Berry, E., Birren, B., Bloom, T., Bork, P., Botcherby, M., Bray, N., Brent, M.R., Brown, D.G., Brown, S.D., Bult, C., Burton, J., Butler, J., Campbell, R.D., Carninci, P., Cawley, S., Chiaromonte, F., Chinwalla, A.T. et al. (2002). Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520-562.
Watkins-Chow, D.E., Buckwalter, M.S., Newhouse, M.M., Lossie, A.C., Brinkmeier, M.L. and Camper, S.A. (1997). Genetic mapping of 21 genes on mouse chromosome 11 reveals disruptions in linkage conservation with human chromosome 5. Genomics 40, 114-122.
Way, M., Sanders, M., Garcia, C., Sakai, J. and Matsudaira, P. (1995). Sequence and domain organization of scruin, an actin-cross-linking protein in the acrosomal process of Limulus sperm. J. Cell Biol. 128, 51-60.
Westerfield, M. (1995). The zebrafish book: a guide for the laboratory use of the zebrafish (Danio rerio). Eugene, OR: University of Oregon, Institute of Neuroscience.
Westerfield, M., Doerry, E. and Douglas, S. (1999a). Zebrafish in the Net. Trends Genet. 15, 248-249.
Westerfield, M., Doerry, E., Kirkpatrick, A.E. and Douglas, S. (1999b). Zebrafish informatics and the ZFIN database. Methods Cell Biol. 60, 339-355.
Wienholds, E., Schulte-Merker, S., Walderich, B. and Plasterk, R.H. (2002). Target-selected inactivation of the zebrafish rag1 gene. Science 297, 99-102.
Woods, I.G., Kelly, P.D., Chu, F., Ngo-Hazelett, P., Yan, Y.L., Huang, H., Postlethwait, J.H. and Talbot, W.S. (2000). A comparative map of the zebrafish genome. Genome Res. 10, 1903-1914.
Wu, Y.L. (1999). In situ hybridization screen for novel zebrafish genes differentially expressed during embryonic development. NUS Honours Thesis.
Xu, Y., He, J., Wang, X., Lim, T.M. and Gong, Z. (2000). Asynchronous activation of 10 muscle-specific protein (MSP) genes during zebrafish somitogenesis. Dev. Dyn. 219, 201-215.
Xue, F. and Cooley, L. (1993). Kelch encodes a component of intercellular bridges in Drosophila egg chambers. Cell 72, 681-693.
Xue, L. and Noll, M. (1996). The functional conservation of proteins in evolutionary alleles and the dominant role of enhancers in evolution. EMBO J. 15, 3722-3731.
Xue, L., Li, X. and Noll, M. (2001). Multiple protein functions of paired in Drosophila development and their conservation in the gooseberry and Pax3 homologs. Development 128, 395-405.
Yamada, J., Kuramoto, T. and Serikawa, T. (1994). A rat genetic linkage map and comparative maps for mouse or human homologous rat genes. Mamm. Genome 5, 63-83.
Yeh, J.R. and Crews, C.M. (2003). Chemical genetics: adding to the developmental biology toolbox. Dev. Cell 5, 11-19.
Yelon, D. (2001). Cardiac patterning and morphogenesis in zebrafish. Dev. Dyn. 222, 552-563.
Yelon, D., Horne, S.A. and Stainier, D.Y.R. (1999). Restricted expression of cardiac myosin genes reveals regulated aspects of heart tube assembly in zebrafish. Dev. Biol. 214, 23-37.
Yuan, J., Liu, Y., Wang, Y., Xie, G. and Blevins, R. (2001). Genome analysis with gene-indexing databases. Pharmacol. Ther. 91, 115-132.
Zhong, T.P., Kaphingst, K., Akella, U., Haldi, M., Lander, E.S. and Fishman, M.C. (1998). Zebrafish genomic library in yeast artificial chromosomes. Genomics 48, 136-138.
Zhong, T.P., Rosenberg, M., Mohideen, M.P.K., Weinstein, B. and Fishman, M.C. (2000). gridlock, an HLH gene required for assembly of the aorta in zebrafish. Science 287, 1820-1824.
Zhuo, D., Zhao, W.D., Wright, F.A., Yang, H., Wang, J., Sears, R., Baer, T., Kwon, D., Gordon, D., Gibbs, S., Dai, D., Yang, Q., Spitzner, J., Krahe, R., Stredney, D., Stutz, A. and Yuan, B. (2001). Assembly, annotation, and integration of UNIGENE clusters into the human genome draft. Genome Res. 11, 904-918.
[...]
the zebrafish, the Zebrafish Information Network (ZFIN) (http://zfin.org) was set up to cope with the phenomenal rate of increase of information. ZFIN is a centralized database for zebrafish researchers, providing links and information about zebrafish genes, mutations, genetic maps etc. (Westerfield et al., 1999a, b; Sprague et al., 2003). In addition, zebrafish resources are also available from...
Postlethwait and Talbot, 1997). The two main approaches to cloning mutated genes, positional cloning and the candidate gene approach, have benefited greatly from the recent advances in zebrafish genomic infrastructure (reviewed in Talbot and Hopkins, 2000; Malicki et al., 2002). The efficient identification of genes disrupted by mutation in zebrafish requires dense maps of the genome. Prior to 1994, there was no genetic...
bridge the gap between its vertebrate and invertebrate counterparts in studies of development and genetics. In addition to its developmental advantages, recent studies indicate that the zebrafish has great potential to serve as a model for human diseases that range from heart failure and vascular disease to fields as diverse as osteoporosis, renal failure, Parkinson’s disease, diabetes and cancer (for...
Relationship of the Zebrafish and Human Genomes
The third virtue of the system is the conservation of synteny between zebrafish and human genomes. Besides facilitating the identification of mutants by positional cloning and the candidate gene approach, the genetic maps have been useful in comparative studies between zebrafish and other vertebrate genomes. By comparing the map positions of zebrafish genes...
years ago (mya), have greater conservation than zebrafish and human (Gates et al., 1999; Woods et al., 2000). Despite the current gaps in the zebrafish-human comparative map, conservation of synteny between the two has had several uses. First, such analyses have been valuable in defining candidate genes for zebrafish mutants (Karlstrom et al., 1999; Schmid et al., 2000). For example, the yot locus was mapped...
genetic map for zebrafish and the paucity of resources such as large-insert genomic libraries rendered the task virtually impossible (Malicki et al., 2002). Today, a full array of genomic and molecular genetic tools is available. Large-insert genomic libraries needed for positional cloning have been generated. To date, two zebrafish yeast artificial chromosome (YAC) libraries, one bacterial artificial chromosome...
groups as additional genes are mapped. A minimum of ~300 conserved synteny groups was thus estimated between the zebrafish and human genomes (Woods et al., 2000). Similar results were obtained in another study done at the same time (Barbazuk et al., 2000). Analyses of mouse and human, as well as zebrafish and human synteny groups have also led to the conclusion that mouse and human,...
single linkage group (Johnson et al., 1996). Among vertebrates, only human, mouse, rat, and zebrafish have closed linkage maps. More than 3845 microsatellite (CA) repeats have been meiotically mapped since the last update in July 2001, providing an average resolution sufficient to initiate positional cloning (Shimoda et al., 1999; http://zebrafish.mgh.harvard.edu). Published genetic linkage maps have also...
mutagenesis (Knapik, 2000). Chemical mutagenesis using ENU is by far the most widely employed method in zebrafish, as it is effective and easily administered by incubating the fish in ENU. Other chemicals that have been used include EMS and TMP, which cause small deletions. Radiation methods using X-rays and gamma rays are routinely performed in zebrafish laboratories to induce genome-wide mutations. Causing large...
see Shin and Fishman, 2002; Ackermann and Paw, 2003). Many of the mutant phenotypes identified in the mutagenesis screens are reminiscent of human clinical disorders. The validity of using the zebrafish as a model for human disease is illustrated by the various examples of zebrafish mutant phenotypes with clinical relevance in the fields of haematopoiesis (Brownlie et al., 1998; Wang et al., 1998),...
represents just the data acquisition phase. Confronted with an avalanche of sequence data, researchers now face the daunting task of deciphering and interpreting the data and get more biology...
cytoskeletal filament linkages at the sarcolemma of striated muscle
Table Summary of EST clones homologous to klhl
LIST OF ABBREVIATIONS
aa amino acid
AP alkaline phosphatase
