Concepts in Bioinformatics and Genomics

Jamil Momand and Alison McCurdy, with contributions by Silvia Heubach and Nancy Warter-Perez

Publication Date - July 2016

ISBN: 9780199936991

504 pages
Paperback
8-1/2 x 11 inches

In Stock

Retail Price to Students: $174.99

Expertly balances biology, mathematics, and programming

Description

Concepts in Bioinformatics and Genomics takes a conceptual approach to its subject, balancing biology, mathematics, and programming while highlighting relevant real-world applications and providing students with the tools to compute and analyze biological data. It presents many thought-provoking exercises that will stretch students' imaginations and give them a deeper understanding of the molecular biology, basic probability, software programs, and program-coding methodology underpinning this exciting field.

Features

  • Balanced presentation. Biology, mathematics, and computer science are presented in a balanced way that highlights connections between the three areas and makes the material accessible to all students, no matter what their backgrounds are.
  • Flexible organization. Chapters are designed as stand-alone units, allowing instructors to introduce biology, mathematics, and programming topics in order of preference.
  • Overview of molecular biology. Chapter 1 provides the essential biology concepts and vocabulary needed for understanding bioinformatics.
  • Mathematics chapters. Chapters 11 and 12, contributed by Silvia Heubach (California State University, Los Angeles), introduce basic probability leading up to the concept of the Expect value (E-value) and its use in sequence alignment programs, and include discussions of hidden Markov models.
  • Basic programming. Two concluding chapters contributed by Nancy Warter-Perez (California State University, Los Angeles) introduce students to programming exercises directly related to bioinformatics problems, including hands-on work with Python, a popular and commonly used programming language in this field.
  • Case study of TP53. Discussions of TP53, the p53 tumor suppressor gene, are woven throughout the text to provide students with a consistent case study and give them insights into this clinically relevant gene.
  • Rich pedagogy. The book includes in-text glossary terms, a comprehensive index, extensive footnotes, thought-provoking exercises, "Scientist Spotlight" boxes featuring biographies of pioneers in bioinformatics, and much more.

About the Author(s)

Jamil Momand is Professor of Biochemistry at California State University, Los Angeles. He received the Cal State LA Outstanding Professor Award for the 2014-2015 academic year.

Alison McCurdy is Professor of Chemistry at California State University, Los Angeles. She was the recipient of the 2009 California State University Distinguished Woman Award.

Reviews

"First and foremost, this text has well-balanced coverage of the subject, in both breadth and depth, between biology and computing. The overall presentation is pedagogically well crafted with a lot of concrete examples and graphics. I am sure that this book will positively impact the teaching of bioinformatics."--Li Liao, University of Delaware

"After using this text students will be able to grasp difficult areas of computational biology and gain an appreciation for and excitement about the potential that bioinformatics offers."--Erich Baker, Baylor University

Table of Contents

    Preface
    About the Author


    Chapter 1: Review of Molecular Biology

    Learning outcomes

    1.1 Genes and DNA

    1.2 RNA-the intermediary

    1.3 Amino acids-the building blocks of proteins

    1.4 Levels of protein structure

    1.5 The genetic code

    1.6 Relative sizes of matter

    1.7 DNA alterations

    1.8 A case study: sickle cell anemia
    · What are the symptoms of sickle cell anemia?
    · Sickle cell anemia is the first disease linked to a specific mutation

    1.9 Introduction to p53

    Summary
    Exercises
    References

    Box 1-1. A Closer Look: A rare inherited cancer is caused by mutated Tp53


    Chapter 2: Information organization and sequence databases

    Learning outcomes

    2.1 Introduction

    2.2 Public databases

    2.3 The header

    2.4 The feature keys
    · The CDS feature key and gene structure
    · The gene feature key and FASTA format
    · Thought Question 2-1

    2.5 Limitations of GenBank

    2.6 Reference Sequence (RefSeq)
    · Alternative splicing

    2.7 Primary and secondary databases
    · The UniProt Knowledge Base (UniProtKB) database

    Summary

    Exercises
    Answers to Thought Questions
    References

    Box 2-1. Scientist Spotlight: Walter Goad, GenBank Founder
    Box 2-2. A Closer Look: GenBank is Critical to the Discovery of the MDM2 Oncoprotein-an Inhibitor of p53


    Chapter 3: Molecular Evolution

    Learning outcomes

    3.1 Introduction

    3.2 Conserved regions in proteins

    3.3 Molecular Evolution
    · Transformation of normal cells to cancer cells
    · Are mutations inherited?
    · Natural selection
    · Mechanisms of mutation

    3.4 Ancestral genes and protein evolution

    3.5 Modular proteins and protein evolution

    Summary
    Exercises
    References

    Box 3-1. Scientist Spotlight: Barbara McClintock


    Chapter 4: Substitution matrices

    Learning outcomes

    4.1 Introduction

    4.2 The identity substitution matrix

    4.3 An amino acid substitution system based on natural selection

    4.4 Development of the matrix of “accepted” amino acid substitutions
    · Thought Question 4-1

    4.5 Relative mutability calculations

    4.6 Development of the PAM1 mutation probability matrix

    4.7 Determination of the relative frequencies of amino acids

    4.8 Conversion of the PAM1 mutation probability matrix to the PAM1 log-odds substitution matrix

    4.9 Conversion of the PAM1 mutational probability matrix to other PAM

    4.10 Practical uses for PAM substitution matrices

    4.11 The BLOSUM substitution matrix
    · Thought Question 4-2

    4.12 The physico-chemical properties of amino acids correlate to values in matrices

    4.13 Practical usage

    Summary
    Exercises
    Answers to Thought Questions
    References

    Box 4-1. Scientist Spotlight: Margaret Belle (Oakley) Dayhoff


    Chapter 5: Pairwise sequence alignment

    Learning outcomes

    5.1 Introduction

    5.2 Sliding window
    · Dot plots
    · The Dotter program

    5.3 The Needleman-Wunsch global alignment program
    · Initialization and matrix fill
    · Traceback
    · Gap penalties

    5.4 Modified Needleman-Wunsch global alignment (N-Wmod) program with linear gap penalty
    · N-Wmod initialization
    · N-Wmod matrix fill
    · N-Wmod traceback

    5.5 Ends-free global alignment

    5.6 Local alignment algorithm with linear gap penalty

    Summary
    Exercises
    References

    Box 5-1. Scientist Spotlight: Christian Wunsch


    Chapter 6: Basic Local Alignment Search Tool and Multiple Sequence Alignment

    Learning outcomes

    6.1 Introduction

    6.2 The BLAST program
    · Four phases in the BLAST program
    · How does BLAST account for gaps?
    · How is a hit deemed to be statistically significant?
    · Thought Question 6-1
    · Why is the BLAST program faster than the Smith-Waterman program?
    · Low complexity regions and masking
    · Usefulness of BLAST
    · Psi-BLAST
    · Thought Question 6-2

    6.3 Multiple Sequence Alignment (MSA)
    · CLUSTALW

    Summary
    Exercises
    Answers to Thought Questions
    References

    Box 6-1. Scientist Spotlight: David Lipman, NCBI Director


    Chapter 7: Protein structure prediction

    Learning outcomes

    7.1 Introduction

    7.2 Experimental methods of structure determination
    · X-ray crystallography
    · NMR spectroscopy

    7.3 Information deposited into the Protein Data Bank

    7.4 Molecular viewers
    · Thought question 7-1

    7.5 Protein folding
    · Christian Anfinsen's protein unfolding and refolding experiment
    · Local minimum energy states
    · Energy Landscape theory

    7.6 Protein structure prediction methods
    · Prediction method 1: computational methods
    · Combining computational methods and knowledge-based systems
    · Calculation of accuracy of structure predictions
    · Prediction method 2: statistical and knowledge-based methods
    · Prediction method 3: neural networks
    · Prediction method 4: homology modeling
    · Prediction method 5: Threading
    Summary
    Exercises
    Answers to Thought Questions
    References

    Box 7-1. A Closer Look: p53 co-crystallized with DNA reveals insights into cancer


    Chapter 8: Phylogenetics

    Learning outcomes

    8.1 Introduction

    8.2 Phylogeny and phylogenetics
    · Molecular clocks
    · Phylogenetic tree nomenclature
    · How to tell if sequences in two lineages are undergoing sequence substitution at nearly equal rates?
    · DNA, RNA and protein-based trees

    8.3 Two classes of tree-generation methods
    · Unweighted pair group method with arithmetic mean (UPGMA)
    · Thought question 8-1
    · Thought question 8-2
    · Thought question 8-3
    · Thought question 8-4
    · Bootstrap analysis
    · Other substitution rate models-Kimura two-parameter model and Gamma distance model
    · Neighbor-Joining method

    8.4 Application of phylogenetics to studies of the origin of modern humans

    8.5 Phylogenetic Tree of Life

    8.6 The Tp53 gene family members in different species

    Summary
    Exercises
    Answers to Thought Questions
    References

    Box 8-1. A Closer Look: What do we know about Neanderthal and Denisovan?
    Box 8-2. Scientist Spotlight: Svante Pääbo


    Chapter 9: Genomics

    Learning outcomes

    9.1 Introduction

    9.2 DNA sequencing-dideoxy method
    · Dideoxy nucleotides
    · The step-by-step procedure of DNA sequencing
    · Electrophoresis
    · Thought question 9-1

    9.3 Polymerase chain reaction (PCR)

    9.4 DNA sequencing-next generation (next-gen) sequencing technologies
    · Common themes in next-gen sequencing technologies
    · Ion semiconductor sequencing
    · Nanopore-based sequencing

    9.5 The PhiX174 bacteriophage genome

    9.6 The genome of Haemophilus influenzae Rd. and the whole genome shotgun sequencing approach
    · The whole genome shotgun approach
    · Thought question 9-2
    · The Haemophilus influenzae Rd. genome

    9.7 Genome assembly and annotation
    · Contig N50 and scaffold N50
    · Bacterial genome annotation systems

    9.8 Genome comparisons
    · Synteny Dotplot
    · Comparison of E. coli Substrain DH10B to E. coli Substrain MG1655

    9.10 The human genome
    · General characteristics of the human genome
    · Thought question 9-3
    · Detailed analysis of the human genome landscape

    9.11 The region of the human genome that encompasses the Tp53 gene
    · General comments on the region encoding the Tp53 gene
    · Tracks that display information about the Tp53 region of the genome

    9.12 The haplotype map
    · What is a haplotype?
    · Haplotypes can be specified by markers derived from SNPs, indels and CNVs
    · Tag SNPs
    · Thought question 9-4
    · How did haplotypes originate?
    · The HapMap database

    9.13 Practical application of Tag SNP, SNP and mutation analyses

    9.14 What is the smallest genome?

    Summary
    Exercises
    Answers to Thought Questions
    References

    Box 9-1. Scientist Spotlight: J. Craig Venter
    Box 9-2. A Closer Look: DNA Fingerprinting (DNA Profiling)


    Chapter 10: Transcript and protein expression analysis

    Learning outcomes

    10.1 Introduction

    10.2 Basic principles of gene expression

    10.3 Measurement of transcript levels
    · Thought question 10-1

    10.4 The transcriptome and microarrays
    · Stages of a microarray experiment
    · Heatmaps
    · Thought question 10-2
    · Cluster analysis
    · Thought question 10-3
    · Practical applications of microarray data
    · Considerations to take in the interpretation of microarray data
    · Protein levels can be controlled by regulation of degradation rate

    10.5 RNA-seq (RNA sequencing)
    · Advantages of RNA-seq
    · Overview of RNA-seq steps
    · Bridge amplification
    · Analysis of an experiment using RNA-seq

    10.6 Proteome
    · Separation of proteins and quantification of their steady-state levels-two-dimensional (2D) gel electrophoresis
    · Identification of proteins-liquid chromatography-mass spectroscopy (LC-MS)
    · Advantages and challenges of current proteome analysis techniques

    10.7 Regulation of p53-controlled genes

    Summary
    Exercises
    Answers to Thought Questions
    References

    Box 10-1. Scientist Spotlight: Patrick O. Brown



    Chapter 11: Basic probability

    Learning outcomes

    11.1 Introduction

    11.2 The basics of probability
    · Definitions and basic rules
    · Counting methods when order matters
    · Counting methods when order does not matter
    · Independence
    · Dependence
    · Thought Question 11-1
    · Bayesian inference
    · Thought Question 11-2

    11.3 Random variables
    · Discrete random variables
    · Thought Question 11-3
    · Thought Question 11-4
    · Continuous random variables

    Summary
    Exercises
    Answers to Thought Questions
    References


    Chapter 12: Advanced probability for bioinformatics applications

    Learning outcomes

    12.1 Introduction

    12.2 Extreme value distribution

    12.3 Significance of alignments

    12.4 Stochastic processes
    · Markov chains
    · Thought Question 12-1
    · Hidden Markov models
    · Poisson process and Jukes-Cantor Model

    Summary
    Exercises
    Answers to Thought Questions
    References

    Box 12-1. Scientist Spotlight: Michael Waterman


    Chapter 13: Programming basics and applications to bioinformatics

    Learning outcomes

    13.1 Introduction

    13.2 Developers and users work together to make new discoveries.

    13.3 Why Python?

    13.4 Getting started with Python

    13.5 Data flow: representing and manipulating data
    · Variable names
    · Data types and operators

    13.6 Putting it together-a simple program to lookup the hydrophobicity of an amino acid

    13.7 Decision making
    · Operations for decision making
    · If-tests
    · Conditional expressions
    · Loops
    · Thought Question 13-1
    · Thought Question 13-2
    · Thought Question 13-3

    13.8 Input and output

    13.9 Program design: developing Kyte-Doolittle's hydropathy sliding window tool
    · Step 1: Understand the problem
    · Steps 2 through 4: Develop and refine algorithm
    · Step 5: Code in target language (Python)
    · Steps 6 and 7: Program verification (testing and debugging)
    · Thought Question 13-4

    13.10 Hierarchical design: functions and modules
    · Python functions
    · Thought Question 13-5
    · Python modules and packages

    Summary
    Exercises
    Answers to Thought Questions
    References

    Box 13-1. Scientist Spotlight: Russell F. Doolittle


    Chapter 14: Developing a bioinformatics tool

    Learning outcomes

    14.1 Introduction

    14.2 Analysis of an existing tool: EMBOSS water local alignment tool
    · Thought question

    14.3 Overview of SPA: A simple pairwise alignment tool

    14.4 Algorithms

    14.5 Algorithms for SPA
    · Input sequences
    · Create substitution matrix
    · Input gap penalties
    · Suite of pairwise sequence alignment algorithms
    · Output alignment

    14.6 Algorithm complexity

    14.7 Extensions to simple pairwise alignment tool

    Summary
    Exercises
    Project
    Answers to Thought Questions
    References

    Box 14-1. Scientist Spotlight: Richard Karp

    Glossary
    Index
