Computational Biology and Applied Bioinformatics

COMPUTATIONAL BIOLOGY AND APPLIED BIOINFORMATICS

Edited by Heitor Silvério Lopes and Leonardo Magalhães Cruz

Published by InTech
Janeza Trdine 9, 51000 Rijeka, Croatia

Copyright © 2011 InTech

All chapters are Open Access articles distributed under the Creative Commons Non Commercial Share Alike Attribution 3.0 license, which permits copying, distributing, transmitting, and adapting the work in any medium, so long as the original work is properly cited. After this work has been published by InTech, authors have the right to republish it, in whole or in part, in any publication of which they are the author, and to make other personal use of the work. Any republication, referencing or personal use of the work must explicitly identify the original source.

Statements and opinions expressed in the chapters are those of the individual contributors and not necessarily those of the editors or publisher. No responsibility is accepted for the accuracy of information contained in the published articles. The publisher assumes no responsibility for any damage or injury to persons or property arising out of the use of any materials, instructions, methods or ideas contained in the book.

Publishing Process Manager: Davor Vidic
Technical Editor: Teodora Smiljanic
Cover Designer: Jan Hyrat
Image Copyright Gencay M. Emin, 2010. Used under license from Shutterstock.com

First published August, 2011
Printed in Croatia

A free online edition of this book is available at www.intechopen.com
Additional hard copies can be obtained from orders@intechweb.org

Computational Biology and Applied Bioinformatics, Edited by Heitor Silvério Lopes and Leonardo Magalhães Cruz
p. cm.
ISBN 978-953-307-629-4

Free online editions of InTech Books and Journals can be found at www.intechopen.com

Contents

Preface

Part 1 Reviews

Chapter 1 Molecular Evolution & Phylogeny: What, When, Why & How?
Pandurang Kolekar, Mohan Kale and Urmila Kulkarni-Kale

Chapter 2 Understanding Protein Function - The Disparity Between Bioinformatics and Molecular Methods
Katarzyna Hupert-Kocurek and Jon M. Kaguni

Chapter 3 In Silico Identification of Regulatory Elements in Promoters
Vikrant Nain, Shakti Sahi and Polumetla Ananda Kumar

Chapter 4 In Silico Analysis of Golgi Glycosyltransferases: A Case Study on the LARGE-Like Protein Family
Kuo-Yuan Hwa, Wan-Man Lin and Boopathi Subramani

Chapter 5 MicroArray Technology - Expression Profiling of MRNA and MicroRNA in Breast Cancer
Aoife Lowery, Christophe Lemetre, Graham Ball and Michael Kerin

Chapter 6 Computational Tools for Identification of microRNAs in Deep Sequencing Data Sets
Manuel A. S. Santos and Ana Raquel Soares

Chapter 7 Computational Methods in Mass Spectrometry-Based Protein 3D Studies
Rosa M. Vitale, Giovanni Renzone, Andrea Scaloni and Pietro Amodeo

Chapter 8 Synthetic Biology & Bioinformatics Prospects in the Cancer Arena
Lígia R. Rodrigues and Leon D. Kluskens

Chapter 9 An Overview of Hardware-Based Acceleration of Biological Sequence Alignment
Laiq Hasan and Zaid Al-Ars

Part 2 Case Studies

Chapter 10 Retrieving and Categorizing Bioinformatics Publications through a MultiAgent System
Andrea Addis, Giuliano Armano, Eloisa Vargiu and Andrea Manconi

Chapter 11 GRID Computing and Computational Immunology
Ferdinando Chiacchio and Francesco Pappalardo

Chapter 12 A Comparative Study of Machine Learning and Evolutionary Computation Approaches for Protein Secondary Structure Classification
César Manuel Vargas Benítez, Chidambaram Chidambaram, Fernanda Hembecker and Heitor Silvério Lopes

Chapter 13 Functional Analysis of the Cervical Carcinoma Transcriptome: Networks and New Genes Associated to Cancer
Mauricio Salcedo, Sergio Juarez-Mendez, Vanessa Villegas-Ruiz, Hugo Arreola, Oscar Perez, Guillermo Gómez, Edgar Roman-Bassaure, Pablo Romero and Raúl Peralta

Chapter 14 Number Distribution of Transmembrane Helices in Prokaryote Genomes
Ryusuke Sawada and Shigeki Mitaku

Chapter 15 Classifying TIM Barrel Protein Domain Structure by an Alignment Approach Using Best Hit Strategy and PSI-BLAST
Chia-Han Chu, Chun Yuan Lin, Cheng-Wen Chang, Chihan Lee and Chuan Yi Tang

Chapter 16 Identification of Functional Diversity in the Enolase Superfamily Proteins
Kaiser Jamil and M. Sabeena

Chapter 17 Contributions of Structure Comparison Methods to the Protein Structure Prediction Field
David Piedra, Marco d'Abramo and Xavier de la Cruz

Chapter 18 Functional Analysis of Intergenic Regions for Gene Discovery
Li M. Fu

Chapter 19 Prediction of Transcriptional Regulatory Networks for Retinal Development
Ying Li, Haiyan Huang and Li Cai

Chapter 20 The Use of Functional Genomics in Synthetic Promoter Design
Michael L. Roberts

Chapter 21 Analysis of Transcriptomic and Proteomic Data in Immune-Mediated Diseases
Sergey Bruskin, Alex Ishkin, Yuri Nikolsky, Tatiana Nikolskaya and Eleonora Piruzian

Chapter 22 Emergence of the Diversified Short ORFeome by Mass Spectrometry-Based Proteomics
Hiroko Ao-Kondo, Hiroko Kozuka-Hata and Masaaki Oyama

Chapter 23 Acrylamide Binding to Its Cellular Targets: Insights from Computational Studies
Emmanuela Ferreira de Lima and Paolo Carloni

Preface

Nowadays it is difficult to imagine an area of knowledge that can continue developing without the use of computers and informatics. Biology is no exception: it has grown at an unforeseen pace in recent decades, with the rise of a new discipline, bioinformatics, bringing together molecular biology, biotechnology and information technology. More recently, the development of high-throughput techniques, such as microarrays, mass spectrometry and DNA sequencing, has increased the need for computational support to collect, store, retrieve, analyze, and correlate huge data sets of complex information. On the other hand, the growth of computational power for processing and storage has also increased the need for deeper knowledge in the field. The development of bioinformatics has now allowed the emergence of systems biology, the study of the interactions between the components of a biological system and of how these interactions give rise to the function and behavior of a living being.
Bioinformatics is a cross-disciplinary field, and its birth in the sixties and seventies depended on discoveries and developments in different fields, such as: the double-helix model of DNA proposed by Watson and Crick in 1953 from X-ray data obtained by Franklin and Wilkins; the development of a method to solve the phase problem in protein crystallography by Perutz's group in 1954; the sequencing of the first protein by Sanger in 1955; the creation of the ARPANET in 1969, linking UCLA and the Stanford Research Institute; the publication of the Needleman-Wunsch algorithm for sequence comparison in 1970; the first recombinant DNA molecule, created by Paul Berg and his group in 1972; the announcement of the Brookhaven Protein Data Bank in 1973; the establishment of Ethernet by Robert Metcalfe in the same year; and the concept of computer networking and the development of the Transmission Control Protocol (TCP) by Vint Cerf and Robert Kahn in 1974, to cite just some of the landmarks that allowed the rise of bioinformatics. Later, the Human Genome Project (HGP), started in 1990, was also very important in pushing the development of bioinformatics and related methods for the analysis of large amounts of data.

This book presents some theoretical issues, reviews, and a variety of bioinformatics applications. For better understanding, the chapters were grouped into two parts. It was not an easy task to select chapters for these parts, since most chapters provide a mix of review and case study, and all chapters contain extensive biological and computational information. In Part I, the chapters are more oriented towards literature review and theoretical issues. Part II consists of application-oriented chapters that report case studies in which a specific biological problem is treated with bioinformatics tools.
Molecular phylogeny analysis has become a routine technique not only to understand the sequence-structure-function relationship of biomolecules but also to assist in their classification. The first chapter of Part I, by Kolekar et al., presents the theoretical basis, discusses the fundamentals of phylogenetic analysis, and offers a particular view of the steps and methods used in the analysis.

Methods for studying protein function and gene expression are briefly reviewed in Hupert-Kocurek and Kaguni's chapter, and contrasted with the traditional approach of mapping a gene via the phenotype of a mutation and deducing the function of the gene product from its biochemical analysis in concert with physiological studies. An example of an experimental approach is provided that expands the current understanding of the role of ATP binding and its hydrolysis by DnaC during the initiation of DNA replication. This is contrasted with approaches that yield large sets of data, providing a different perspective on understanding the functions of sets of genes or proteins and how they act in a network of biochemical pathways of the cell.

Due to the importance of transcriptional regulation, one of the main goals in the post-genomic era is to predict how the expression of a given gene is regulated, based on the presence of transcription factor binding sites in the adjacent genomic regions. Nain et al. review different computational approaches for modeling and identification of regulatory elements, as well as recent advances and the current challenges.

Hwa et al. propose an approach to group proteins into putative functional groups by designing a workflow with appropriate bioinformatics analysis tools, in order to search for sequences with biological characteristics belonging to the selected protein family. To illustrate the approach, the workflow was applied to the LARGE-like protein family.
Microarray technology has become one of the most important technologies for unveiling gene expression profiles, thus fostering the development of new bioinformatics methods and tools. In the chapter by Lowery et al., a thorough review of microarray technology is provided, with special focus on mRNA and microRNA profiling of breast cancer.

MicroRNAs are a class of small RNAs, approximately 22 nucleotides in length, that regulate eukaryotic gene expression at the post-transcriptional level. Santos and Soares present several tools and computational pipelines for miRNA identification, discovery and expression from sequencing data.

Currently, mass spectrometry-based methods represent very important and flexible tools for studying the dynamic features of proteins and their complexes. Such [...]

... clustered. The sample calculations and steps involved in the UPGMA clustering algorithm, using the distance matrix shown in Fig. 4, are given below.

Iteration 1: OTU A is minimally equidistant from OTUs B and C. We randomly select OTUs A and B to form one composite OTU (AB); A and B are clustered together. Compute the new distances of OTUs C, D and E from the composite OTU (AB) ...

... d(i,j) - (ri + rj). See Fig. 8 for Md.

Fig. 8. The modified distance matrix Md and clustering for iteration 1 of N-J.

As can be seen from Md in Fig. 8, OTUs A and C are minimally distant. We select OTUs A and C to form one composite OTU (AC); A and C are clustered together.

Iteration 2: Compute the new distances of OTUs B, D and E from the composite OTU (AC). Distances ...
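The UPGMA iteration excerpted above (pick the closest pair of OTUs, merge them into a composite OTU, then recompute distances to the remaining OTUs) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `upgma`, the four OTUs and their distance values are hypothetical and are not the matrix of Fig. 4, and ties are broken by first occurrence rather than at random as in the excerpt.

```python
from itertools import combinations


def upgma(names, dist):
    """Return the UPGMA merge order as (cluster_x, cluster_y, distance) tuples.

    `dist[a][b]` must hold the distance for every pair with `a` listed
    before `b` in `names`. Clusters are tuples of OTU names.
    """
    clusters = [(name,) for name in names]
    # Pairwise distances between clusters, keyed by the unordered pair.
    d = {frozenset(((a,), (b,))): dist[a][b] for a, b in combinations(names, 2)}
    merges = []
    while len(clusters) > 1:
        # Closest pair of clusters; ties go to the first pair encountered
        # (assumption: the excerpt breaks ties randomly instead).
        x, y = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((x, y))]
        merged = x + y
        for z in clusters:
            if z == x or z == y:
                continue
            # Size-weighted mean: equals the average over all member pairs.
            d[frozenset((merged, z))] = (
                len(x) * d[frozenset((x, z))] + len(y) * d[frozenset((y, z))]
            ) / len(merged)
        clusters = [c for c in clusters if c != x and c != y] + [merged]
        merges.append((x, y, height))
    return merges


# Hypothetical distance matrix (upper triangle only), not taken from the book.
example_names = ["A", "B", "C", "D"]
example_dist = {
    "A": {"B": 2, "C": 6, "D": 10},
    "B": {"C": 6, "D": 10},
    "C": {"D": 10},
}
for x, y, height in upgma(example_names, example_dist):
    print(x, y, height)
```

The sketch reports the distance at which each pair of clusters is joined; in a UPGMA tree the corresponding node is conventionally placed at half that distance.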
... generated using a dataset, and each tree conveys a story of evolution. The two main types of information inherent in any phylogenetic tree are the topology (branching pattern) and the branch lengths. Before getting into the actual process of molecular phylogeny analysis (MPA), it will be helpful to get familiar with the concepts and terminologies frequently ...

... spectrometry and the related computational methods for studying the three-dimensional structure of proteins. Rodrigues and Kluskens review synthetic biology approaches for the development of alternatives for cancer diagnosis and drug development, providing several application examples and pointing out challenging directions of research. Biological sequence alignment is an important and widely used task in bioinformatics. ...

... experimental studies and computational analysis to predict the trans-acting factors and transcriptional regulatory networks for mouse embryonic retinal development. The chapter by Roberts shows how advances in bioinformatics can be applied to the development of improved therapeutic strategies. The chapter describes how functional genomics experimentation and bioinformatics tools could be applied to the design ...

... incorporating the biological, biochemical and evolutionary considerations. These mathematical models are used to compute genetic distances between sequences. The use of an appropriate model of evolution and of statistical tests helps us to infer maximum evolutionary information from sequence data. Thus, the selection of the right model of sequence evolution becomes ...
... optimum phylogenetic tree, which explains the evolutionary pattern of the OTUs under study. The exhaustive search method examines theoretically all possible tree topologies for a chosen number of species and derives the best tree topology using a set of certain criteria. Table 3 shows the possible number of rooted and unrooted trees for n number of species/OTUs ...

... Compilation and curation of homologous sequences

The compilation of nucleic acid or protein sequences, appropriate to undertake validation of the hypothesis using MPA, from the available sequence resources is the next step in MPA. At this stage, it is necessary to collate the dataset consisting of homologous sequences with the appropriate coverage of OTUs and ...

... initially and distantly related sequences are progressively added to the alignment of aligned sequences. Thus, the gaps inserted are always retained. A suitable scoring function (sum-of-pairs, consensus, consistency-based, etc.) is employed to derive the optimum MSA (Nicholas et al., 2002; Batzoglou, 2005). Most of the MSA packages use Needleman and Wunsch ...

... written by Lima and Carloni, the authors report the use of bioinformatics tools, by means of molecular docking and molecular simulation procedures, to predict and explore the structural determinants of acrylamide and its derivative in complex with all of their known cellular target proteins in humans and mice.

Professor Heitor Silvério Lopes
Bioinformatics Laboratory, Federal University of Technology ...


