Current Protocols in Bioinformatics


Current Protocols Library

CURRENT PROTOCOLS IN BIOINFORMATICS
FRONT MATTER
PUBLICATION INFORMATION

EDITORIAL BOARD

Andreas D. Baxevanis (Editor-in-Chief)
National Human Genome Research Institute
National Institutes of Health
Bethesda, Maryland

Daniel B. Davison (Editor-in-Chief)
Bristol-Myers Squibb Pharmaceutical Research Institute
Hopewell, New Jersey

Roderic D. M. Page
University of Glasgow
Glasgow, Scotland

Gregory A. Petsko
Brandeis University
Waltham, Massachusetts

Lincoln D. Stein
Cold Spring Harbor Laboratory
Cold Spring Harbor, New York

Gary D. Stormo
Washington University School of Medicine
St. Louis, Missouri

SERIES EDITOR

Shonda Leonard
Rockville, Maryland

http://www.mrw2.interscience.wiley.com/cponline/ d=0&matchNum=0&getSearchResults=0-0&numMatches=0 (1 / 2) [2002-12-19 20:30:23]

Copyright © 2002 by John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of any part of this work beyond that permitted by Section 107 or 108 of the 1976 United States Copyright Act without the permission of the copyright owner is unlawful. Requests for permission or further information should be addressed to the Permissions Department, John Wiley & Sons, Inc.

While the authors, editors, and publisher believe that the specification and usage of reagents, equipment, and devices, as set forth in this book, are in accord with current recommendations and practice at the time of publication, they accept no legal responsibility for any errors or omissions, and make no warranty, express or implied, with respect to material contained herein. Moreover, the information presented herein is not a substitute for professional judgment.
In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. This is particularly important in regard to new or infrequently employed chemicals or experimental reagents.

Library of Congress Cataloging in Publication Data:
Current protocols in bioinformatics / editorial board Andreas Baxevanis (editor-in-chief) and Daniel B. Davison (editor-in-chief) [et al.].
v. ; cm.
Includes index.
ISBN 0-471-25093-7 (cloth : alk. paper)

From Current Protocols in Bioinformatics Online
Copyright © 2002 John Wiley & Sons, Inc. All rights reserved.

FOREWORD

During the last 25 years, computers have moved from being an esoteric tool of the mathematicians and physicists into the mainstream of our daily existence. Increasingly, they are an essential component of modern living. Nowhere is this more apparent than in biology, where the combination of vast databases of information and clever computer programs to manipulate and mine that data now permeates the practice of our science. The new discipline of bioinformatics has not only gained credibility, but is being offered in courses throughout our colleges and universities. In some forward-looking institutions, whole departments dedicated to bioinformatics are springing up.
Despite this move to the mainstream, for many molecular biologists, some of whom I will charitably call "more mature," bioinformatics remains something of an enigma. Not quite sure what it means and being unable or unwilling to tinker with a computer themselves, they have nevertheless realized its importance for their research. They have been happy to harness the computer-savvy graduate student in their group, who prefers to sit behind a terminal rather than stand over a lab bench. However, they have often been frustrated by their lack of ability to either perform the analyses themselves or even to know the limitations of the results. Fortunately, help is at hand. Now, anyone who needs to know more about bioinformatics, and especially how to do it themselves, should find this book Current Protocols in Bioinformatics, and its constant updates, to be especially valuable.

Because bioinformatics is very much a hands-on subject, this latest addition to the Current Protocols series will be much welcomed. Both the novice user and the more knowledgeable, but occasional, user will find the information in this book to be well presented and most helpful. While not a tutorial, the examples chosen for inclusion introduce the reader to all of the essentials of bioinformatics in a format that will make it easy for even the most mature professor to work through. When that eager graduate student finally produces the sequence of your favorite gene, you will be able to retreat to your office. There you will be able to consult this book and undertake a comprehensive bioinformatics analysis yourself, merely by following the protocols. If you are lucky you may even be able to impress that graduate student with your own erudition, when you discover some novel property of the gene that was predicted by one of the tools illustrated.
Since the landmark publication in 1995 of the first complete sequence of a free-living organism, the bacterium Haemophilus influenzae, genomic biology has flourished. By using DNA sequence to serve as a framework upon which to think about the workings of organisms, a rigor has entered biology that had previously been reserved for the "hard" sciences. Most remarkably, in the last seven years we have learned how little we know about biology and just how much remains to be discovered. Thanks to bioinformatics, we are beginning to make inroads in our understanding of DNA sequences and are making progress in predicting the biological properties of the organisms with which we share this planet. Properly used, as illustrated in the protocols of this book, bioinformatics can be a wonderful generator of hypotheses. As a discovery tool it is unparalleled. To the biologists of the twenty-first century, a good working knowledge of bioinformatics may be more important than learning how to run a centrifuge. But do not abandon that centrifuge just yet. The very best biologists will combine their knowledge of bioinformatics with the skepticism that demands those hypotheses be tested experimentally. In this way we can be assured that bioinformatics and biological reality will keep in step.

Richard J. Roberts
New England Biolabs
Beverly, Massachusetts

PREFACE

INTRODUCTION

The field of bioinformatics has come into full view recently, primarily because of the significant advances made by the Human Genome Project and other systematic sequencing projects, and the necessity for all biologists to be able to apply, at some level, these techniques to their own research.
It may come as a surprise to most readers that the origins of the field of bioinformatics go well back into the 1960s, with the pioneering work performed by Margaret Dayhoff and her colleagues, who looked at a then limited number of protein sequences. The work performed by Dayhoff and her colleagues set the stage for the field as we know it today. Bioinformatics occupies a unique niche amongst the sciences, lying at the intersection of biology, genetics, biochemistry, computer science, mathematics, statistics, and numerous other allied fields. The inherent strength of the field of bioinformatics comes from the relationships between investigators in these allied fields; collaborations between these individuals have led to (and will continue to lead to) the development of novel methods and approaches, furthering advances in each of these areas. Such collaborations also set the stage for the piloting of experiments on computers, followed by the verification of the computational results in the laboratory.

The central role of bioinformatics has been highlighted by numerous studies, including one by the Biomedical Information Science and Technology Initiative (BISTI; http://www.nih.gov/about/director/060399.htm). This task force underscored the importance of bioinformatics support and education and its critical role in the advancement of modern science; without bioinformatics-based techniques, the scientific community would not be able to extract, view, or analyze the data being generated by any type of large-scale study, whether it be at the genomic, transcriptomic, or proteomic level. It becomes quite apparent that, regardless of the area of expertise of any given biologist, a firm grasp of basic bioinformatic techniques will become an essential, and indispensable, part of the "scientific arsenal" in tackling biological problems from now on.
OVERVIEW AND PHILOSOPHY

Current Protocols in Bioinformatics is designed to provide the experimentalist with insight into the types of data and protocols required to perform basic tasks in the area of bioinformatics. More importantly, it provides insight into understanding and properly interpreting the data produced by these methods. The Current Protocols series is known for its fast and timely publication of valuable and cutting-edge methods; this book takes that mandate one step further. Initial online installments are being offered in advance of the publication of the print manual. This enables us to deliver much needed methods as soon as they are available. The topics described below reflect the planned content for the first year's worth of installments.

One of the most important things that the Editors and individual authors contributing to this work can do is to drive home the importance of manually inspecting the data produced by these methods: even though a particular method may produce a result, that result may not actually be biologically relevant or make any sort of sense in the context of the experiment being performed. There is never any substitute for manual inspection of results, with sophisticated users keeping their "biology hat" on as they peruse the results provided by the computer.

The overall organization of Current Protocols in Bioinformatics is the product of a significant amount of discussion between the Editors, who have brought to bear their own individual experience from both research and teaching in how to best convey a logical, workflow-based path throughout the various concepts presented herein.
Current Protocols in Bioinformatics begins with a discussion of the most commonly used sources of public data, giving the reader an appreciation for the types of questions that can be answered using publicly available databases (Chapter 1). With this as a basis, the book then marches through the major topics within the field of bioinformatics. First, the reader is introduced to methods allowing for the recognition of functional domains (Chapter 2), both at the nucleotide and protein level. These concepts are expanded upon in the following chapter, devoted to similarity searching and the inference of homology, providing the reader useful information regarding the differences between the types of available search algorithms and the reasons for finding homologs (Chapter 3). One of the major goals of the Human Genome Project is to identify all genes within the genome, and Chapter 4 is devoted to methods on this front, as well as to gene-finding strategies and cautions. Moving up in complexity, Chapter 5 will cover topics related to molecular modeling, including methods such as homology model building and visualization of molecular models. Chapter 6 invokes the interrelationships between proteins from an evolutionary standpoint, providing the reader with an understanding of the concepts behind both conservation and evolution of function within the cell. Chapters 7 and 8 will provide the reader with an appreciation for the interrelatedness of molecular processes; in Chapter 7, this is presented from the standpoint of gene expression and the analysis of gene expression patterns, while in Chapter 8 it is presented from the standpoint of intermolecular interactions.
Since so much of bioinformatics and computational biology is dependent upon databases, a thorough treatment of the construction of databases is included (Chapter 9). While this may seem outside the scope of what some biologists would do themselves, more and more biologists are actively involved in the creation of databases for the warehousing of data generated by their own laboratories. Chapters 10 and 11 will deal with large data sets, in respect to both assembling massive amounts of sequence-based data and then performing comparisons between such large data sets. Finally, we will cover the computations behind the application of mass spectrometry to relevant biological questions (Chapter 12), as well as the techniques that can be used at the RNA level (Chapter 13), methods that are unfortunately often overlooked.

HOW TO USE THIS MANUAL

Format and Organization

This publication, currently available online, will be published in the traditional Current Protocols looseleaf and CD-ROM formats by the end of the fourth installment. Each chapter in this work represents a general subject area, with individual protocols contained in units within each chapter. In general, each unit describes a method and includes one or more protocols. Each protocol provides information on required resources, steps and annotations, data interpretation, and commentaries on the "hows" and "whys" of the method. In addition, each chapter has an overview unit, providing a broad perspective on the general subject area, as well as any theoretical discussion that the reader will need as a foundation for the material covered in the individual units within that chapter. Since this field is Web-intensive, links to useful resources are provided in each unit.
Introductory and Explanatory Information

Since this publication is, first and foremost, a compilation of techniques in bioinformatics, explanatory information aimed at giving the reader an intuitive grasp of the procedures is included. As stated above, chapters begin with overview units that provide biological context for the procedures that follow in that chapter. Each unit contains an Introduction that describes how the protocols that follow connect to one another, and annotations within the protocol itself describe the particulars of each step in the method. Where relevant, the unit authors have provided sample data sets that the reader can use to reproduce the output presented in their units. Readers are strongly encouraged to make use of these data sets (found on the Current Protocols Web site), both from the standpoint of understanding how to structure their own raw data, as well as to gain first-hand experience with the methods themselves.

As one can imagine, none of this material is of any use in the absence of an explanation of how one should interpret the output from any given method. Each protocol-based unit provides a separate section on Guidelines for Understanding Results. The individual authors, experts in their respective fields, have taken great care to provide the user with a basic understanding of how to interpret their results. In some cases, examples of bad or misleading results are also given, thereby helping the reader develop a critical perspective on the use of these methods. Finally, each protocol-based unit closes with a Commentary, giving background information regarding the historical and theoretical development of the method, as well as alternative approaches, the importance of critical parameters used in the protocol, and different approaches that could accomplish the same end.
All units contain references to the primary literature, which the user is encouraged to read to gain a better appreciation for the methods described in the protocols.

Protocols

Many units in Current Protocols in Bioinformatics contain groups of protocols, each presented as a discrete series of steps. The Basic Protocol, presented first in each unit, is the generally recommended or most universally applicable approach. Alternate Protocols are provided where variations on the Basic Protocol can be employed to achieve similar ends, or where requirements for the end result vary from those for the Basic Protocol. Support Protocols describe additional steps that are required to perform the Basic or Alternate Protocols and that stand alone as "subroutines."

A series of appendices is provided, with information on concepts that are applicable across the individual chapters and units. These appendices include examples of common file formats, the interconversion between common file formats, basic Unix commands, and the use of X-Windows. In order to remain accessible to the typical biologist, a strong emphasis has been placed on Web-based solutions. In many cases, though, a Unix-based method may be described, either because it is the only type of solution available, or because it provides distinct and significant advantages over any available Web-based version of the same program.

Most of the protocols included in this manual are used by our own research groups as a routine part of our everyday work. As such, we have learned many of the intricacies of the programs, and have made an effort to share this information with the readers of Current Protocols in Bioinformatics.
Critical steps and parameters are annotated where this is appropriate, providing the reader with a "troubleshooting guide" as well as an insight into "tricks of the trade."

Reader Feedback

The successful evolution of this manual into a resource that meets the needs of its readership depends not only upon the perspective and expertise of our colleagues, but upon the observations, experiences, and suggestions of our readership. A reader-response survey can be found on the Current Protocols in Bioinformatics Web page, and we strongly encourage our readers to use this survey to provide us with their constructive comments.

Acknowledgements

There are many individuals whom we must thank, without whose efforts this work would not have become a reality. First and foremost, our thanks go to all of the authors whose individual contributions make up this work. The expertise and professional viewpoints that these individuals bring to bear go a long way in making this work's content as strong as it is. We also thank our Senior Editor, Ann Boyle, as well as our Developmental Editor, Shonda Leonard, for their wisdom, patience, and support in helping to shape Current Protocols in Bioinformatics into a strong, valuable resource for the biological community. We are fortunate to have them on our team, and look forward to continuing our work with them as this work continues to grow and evolve. Other skilled members of the Current Protocols staff who contributed to the success of this project include Scott Holmes, Tom Cannon Jr., Michael Gates, and Joseph White. The extensive copyediting required to produce an accurate protocols manual was ably handled by Allen Ranz, Tom Downey, and Susan Lieberman.

Andreas D. Baxevanis, Daniel B. Davison, Roderic D. M. Page, Gregory A. Petsko, Lincoln D. Stein, and Gary D.
Stormo

[...]

CHAPTER 1 USING BIOLOGICAL DATABASES
UNIT 1.2 Searching Online Mendelian Inheritance in Man (OMIM) for Information for Genetic Loci Involved in Human Disease

Acknowledgments

The author thanks Daniel W. Sink for his assistance in developing the synuclein example.

... sequencing, as well as the development of new sequencing technologies (cf. Collins et al., 1998).

... Hopkins University Press, Baltimore, Maryland
... understand how to find sequence data of interest as a basis for the more advanced analyses presented in this work.

... College of Medicine
Houston, Texas

Michael Q. Zhang
Cold Spring Harbor Laboratory
Cold Spring Harbor, New York

BASIC PROTOCOL: SEARCHING OMIM OVER THE INTERNET

OMIM may be accessed directly from the...
polymorphism information, and mammalian homologies of the gene in the OMIM database

CONTRIBUTORS AND INTRODUCTION

Contributed by Andreas D. Baxevanis
National Human Genome Research Institute
National Institutes of Health

Date posted: 11/04/2014, 09:39
