PYTHON FOR
BIOINFORMATICS
SECOND EDITION
CHAPMAN & HALL/CRC
Mathematical and Computational Biology Series
Series Editors
N. F. Britton
Department of Mathematical Sciences
University of Bath
Xihong Lin
Department of Biostatistics
Harvard University
Nicola Mulder
University of Cape Town
South Africa
Mona Singh
Department of Computer Science
Princeton University
Anna Tramontano
Department of Physics
University of Rome La Sapienza
Proposals for the series should be submitted to one of the series editors above or directly to:
CRC Press, Taylor & Francis Group
3 Park Square, Milton Park
Abingdon, Oxfordshire OX14 4RN
UK
Published Titles
An Introduction to Systems Biology: Design Principles of Biological Circuits
    Uri Alon
Glycome Informatics: Methods and Applications
    Kiyoko F. Aoki-Kinoshita
Computational Systems Biology of Cancer
    Emmanuel Barillot, Laurence Calzone, Philippe Hupé, Jean-Philippe Vert, and Andrei Zinovyev
Python for Bioinformatics, Second Edition
    Sebastian Bassi
Quantitative Biology: From Molecular to Cellular Systems
    Sebastian Bassi
Methods in Medical Informatics: Fundamentals of Healthcare Programming in Perl, Python, and Ruby
    Jules J. Berman
Chromatin: Structure, Dynamics, Regulation
    Ralf Blossey
Computational Biology: A Statistical Mechanics Perspective
    Ralf Blossey
Game-Theoretical Models in Biology
    Mark Broom and Jan Rychtář
Computational and Visualization Techniques for Structural Bioinformatics Using Chimera
    Forbes J. Burkowski
Structural Bioinformatics: An Algorithmic Approach
    Forbes J. Burkowski
Spatial Ecology
    Stephen Cantrell, Chris Cosner, and Shigui Ruan
Cell Mechanics: From Single Scale-Based Models to Multiscale Modeling
    Arnaud Chauvière, Luigi Preziosi, and Claude Verdier
Bayesian Phylogenetics: Methods, Algorithms, and Applications
    Ming-Hui Chen, Lynn Kuo, and Paul O. Lewis
Statistical Methods for QTL Mapping
    Zehua Chen
An Introduction to Physical Oncology: How Mechanistic Mathematical Modeling Can Improve Cancer Therapy Outcomes
    Vittorio Cristini, Eugene J. Koay, and Zhihui Wang
Normal Mode Analysis: Theory and Applications to Biological and Chemical Systems
    Qiang Cui and Ivet Bahar
Kinetic Modelling in Systems Biology
    Oleg Demin and Igor Goryanin
Data Analysis Tools for DNA Microarrays
    Sorin Drăghici
Statistics and Data Analysis for Microarrays Using R and Bioconductor, Second Edition
    Sorin Drăghici
Computational Neuroscience: A Comprehensive Approach
    Jianfeng Feng
Biological Sequence Analysis Using the SeqAn C++ Library
    Andreas Gogol-Döring and Knut Reinert
Gene Expression Studies Using Affymetrix Microarrays
    Hinrich Göhlmann and Willem Talloen
Handbook of Hidden Markov Models in Bioinformatics
    Martin Gollery
Meta-analysis and Combining Information in Genetics and Genomics
    Rudy Guerra and Darlene R. Goldstein
Differential Equations and Mathematical Biology, Second Edition
    D.S. Jones, M.J. Plank, and B.D. Sleeman
Knowledge Discovery in Proteomics
    Igor Jurisica and Dennis Wigle
Introduction to Proteins: Structure, Function, and Motion
    Amit Kessel and Nir Ben-Tal
RNA-seq Data Analysis: A Practical Approach
    Eija Korpelainen, Jarno Tuimala, Panu Somervuo, Mikael Huss, and Garry Wong
Introduction to Mathematical Oncology
    Yang Kuang, John D. Nagy, and Steffen E. Eikenberry
Biological Computation
    Ehud Lamm and Ron Unger
Optimal Control Applied to Biological Models
    Suzanne Lenhart and John T. Workman
Clustering in Bioinformatics and Drug Discovery
    John D. MacCuish and Norah E. MacCuish
Spatiotemporal Patterns in Ecology and Epidemiology: Theory, Models, and Simulation
    Horst Malchow, Sergei V. Petrovskii, and Ezio Venturino
Stochastic Dynamics for Systems Biology
    Christian Mazza and Michel Benaïm
Statistical Modeling and Machine Learning for Molecular Biology
    Alan M. Moses
Engineering Genetic Circuits
    Chris J. Myers
Pattern Discovery in Bioinformatics: Theory & Algorithms
    Laxmi Parida
Exactly Solvable Models of Biological Invasion
    Sergei V. Petrovskii and Bai-Lian Li
Computational Hydrodynamics of Capsules and Biological Cells
    C. Pozrikidis
Modeling and Simulation of Capsules and Biological Cells
    C. Pozrikidis
Cancer Modelling and Simulation
    Luigi Preziosi
Introduction to Bio-Ontologies
    Peter N. Robinson and Sebastian Bauer
Dynamics of Biological Systems
    Michael Small
Genome Annotation
    Jung Soh, Paul M.K. Gordon, and Christoph W. Sensen
Niche Modeling: Predictions from Statistical Distributions
    David Stockwell
Algorithms for Next-Generation Sequencing
    Wing-Kin Sung
Algorithms in Bioinformatics: A Practical Introduction
    Wing-Kin Sung
Introduction to Bioinformatics
    Anna Tramontano
The Ten Most Wanted Solutions in Protein Bioinformatics
    Anna Tramontano
Combinatorial Pattern Matching Algorithms in Computational Biology Using Perl and R
    Gabriel Valiente
Managing Your Biological Data with Python
    Allegra Via, Kristian Rother, and Anna Tramontano
Cancer Systems Biology
    Edwin Wang
Stochastic Modelling for Systems Biology, Second Edition
    Darren J. Wilkinson
Big Data Analysis for Bioinformatics and Biomedical Discoveries
    Shui Qing Ye
Bioinformatics: A Practical Approach
    Shui Qing Ye
Introduction to Computational Proteomics
    Golan Yona
PYTHON FOR
BIOINFORMATICS
SECOND EDITION
SEBASTIAN BASSI
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the
accuracy of the text or exercises in this book. This book’s use or discussion of MATLAB® software or related products
does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular
use of the MATLAB® software.
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been
made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity
of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright
holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized
in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying,
microfilming, and recording, or in any information storage or retrieval system, without written permission from the
publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Contents
Acknowledgments
Section I Programming
Chapter 1 Introduction
1.1 WHO SHOULD READ THIS BOOK
1.1.1 What the Reader Should Already Know
1.2 USING THIS BOOK
1.2.1 Typographical Conventions
1.2.2 Python Versions
1.2.3 Code Style
1.2.4 Get the Most from This Book without Reading It All
1.2.5 Online Resources Related to This Book
1.3 WHY LEARN TO PROGRAM?
1.4 BASIC PROGRAMMING CONCEPTS
1.4.1 What Is a Program?
1.5 WHY PYTHON?
1.5.1 Main Features of Python
1.5.2 Comparing Python with Other Languages
1.5.3 How Is It Used?
1.5.4 Who Uses Python?
1.5.5 Flavors of Python
1.5.6 Special Python Distributions
1.6 ADDITIONAL RESOURCES
Section IV Appendices
Index
List of Figures
3.1 Intersection.
3.2 Union.
3.3 Difference.
3.4 Symmetric difference.
3.5 Case 1.
3.6 Case 2.
22.1 Product of Listing 22.2, using the demo dataset (NODBDEMO).
Preface to the First Edition
This book is a result of the experience accumulated during several years of working
for an agricultural biotechnology company. As a genomic database curator, I gave
support to staff scientists with a broad range of bioinformatics needs. Some of them
just wanted to automate the same procedure they were already doing by hand, while
others would come to me with biological problems to ask if there were bioinformatics
solutions. Most cases had one thing in common: Programming knowledge was
necessary for finding a solution to the problem. The main purpose of this book is to
help those scientists who want to solve their biological problems by helping them
to understand the basics of programming. To this end, I have attempted to avoid
taking for granted any programming-related concepts. The chosen language for this
task is Python.
Python is an easy-to-learn computer language that is gaining traction among
scientists. This is likely because it is easy to use, yet powerful enough to accomplish
most programming goals. With Python the reader can start doing real programming
very quickly. Journals such as Computing in Science and Engineering, Briefings
in Bioinformatics, and PLOS Computational Biology have published introductory
articles about Python. Scientists are using Python for molecular visualization,
genomic annotation, data manipulation, and countless other applications.
In the particular case of the life sciences, the development of Python has been
very important; the best example is the Biopython package. For this reason, Section
II is devoted to Biopython. That said, I don’t claim that Biopython is the solution to
every biology problem in the world. Sometimes a simple custom-made solution may
better fit the problem at hand. There are other packages, such as BioNEB and CoreBio,
that the reader may want to try.
The book begins with the very basics: Section I (“Programming”) teaches
the reader the principles of programming. From the very beginning, I place a special
emphasis on practice, since I believe that programming is something that is best
learned by doing. That is why there are code fragments spread throughout the book. The
reader is expected to experiment with them and attempt to internalize them. There
are also a few comparisons with other languages; they are included only when
doing so sheds light on the current topic. I believe that most language comparisons do
more harm than good when teaching a new language. They introduce information
that is incomprehensible and irrelevant for most readers.
In an attempt to keep the interest of the reader, most examples are in some way
related to biology. Nevertheless, these examples can be followed even if the reader
doesn’t have any specific knowledge in that field.
To reinforce the practical nature of this book, and also to use as reference
Preface to the Second Edition
The first edition of Python for Bioinformatics was written in 2008 and published
in 2009. Even after eight years, the lessons in this book are still valuable. This is
quite an accomplishment in a field that evolves at such a fast pace. In spite of its
usefulness, the book is showing its age and would greatly benefit from a second
edition.
The predominant Python version is 3.6, although Python 2.7 is still in use in
production systems. Since there are incompatibilities between these versions, a lot of
effort was made to make all code in the book Python 3 compatible.
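As a brief illustration of the kind of incompatibility mentioned here, the following sketch (written for this passage, not a listing from the book) touches three well-known differences between Python 2.7 and Python 3: division of integers, the print statement, and the separation of text from bytes. The function name gc_fraction and the sample sequence are hypothetical.

```python
# A minimal sketch, assuming Python 3. The comments note how
# Python 2.7 would behave differently on the same lines.

def gc_fraction(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    gc_count = sequence.count("G") + sequence.count("C")
    # Python 3: "/" is true division, so 3 / 8 gives 0.375.
    # Python 2.7: "/" on two integers floors the result, so 3 / 8 gives 0.
    return gc_count / len(sequence)

# "ATGCATGA" has two G and one C among eight bases.
fraction = gc_fraction("ATGCATGA")

# Python 3: print is a function. The Python 2 statement form
# "print fraction" is a SyntaxError in Python 3.
print(fraction)

# Python 3: bytes and str are distinct types, so raw data must be
# decoded explicitly before it can be compared with text.
# Python 2.7: b"ATGC" and "ATGC" are the same type and compare equal.
raw = b"ATGC"
text = raw.decode("ascii")
print(raw == text, text == "ATGC")
```

Code written with these points in mind (explicit decoding, print called as a function, no reliance on floor division by "/") tends to be the easiest to keep Python 3 compatible.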
Not only has the software changed in these past eight years, but enterprise attitude
and support toward Open Source Software in general and Python in particular
have changed dramatically. There are also new computing paradigms that can’t be
ignored, such as collaborative development and cloud computing.
In the original book, Chapter 14 was called “Collaborative Development: Version
Control” and was based on Bazaar, a tool that follows the distributed development
workflow still in use today but is not what most developers use now. By far, most
software development is now done with Git on GitHub. This
chapter was rewritten to focus on current practices.
Web development is another area that has changed significantly. Although this is
not a book about web development, the chapter “Web Applications” now reflects
current usage of long-running processes and frameworks instead of CGI/WSGI and
middleware-based applications. Frameworks were discussed as a side note in this
chapter, but now the chapter is based around a framework (Bottle) and leaves the
old method as a historical footnote.
In databases, NoSQL has gained a lot of traction: from being a bullet point in
the first edition, it now has its own section using MongoDB, and a Python recipe
was changed to use this NoSQL database.
Graphical libraries have improved since 2009, and there are high-quality competing
graphics libraries available for Python. There is a whole chapter devoted to
Bokeh, a free interactive visualization library.
Another change that is reflected in this book is the usage of Anaconda and
Jupyter Notebooks (with all code in a cloud notebook provided by Microsoft
Azure[1]).
[1] See https://notebooks.azure.com/py4bio/libraries/py3.us
CHAPTER IX
TO THE CRIMEA—ILLNESS
(May–August 1855)
For myself, I have done my duty. I have identified my fate with that of
the heroic dead.—Florence Nightingale (private notes, 1855).
II
Miss Nightingale, on this and her later visits to the Crimea, saw
and heard of many deeds of heroism which she loved to tell. “I
remember,” she wrote, “a sergeant, who was on picket, the rest of
the picket killed, and himself battered about the head, stumbled
back to camp, and on his way picked up a wounded man, and
brought him in on his shoulders to the lines, where he fell down
insensible. When, after many hours, he recovered his senses, I
believe after trepanning, his first words were to ask after his
comrade, ‘Is he alive?’ ‘Comrade, indeed! yes, he's alive, it is the
General.’ At that moment the General, though badly wounded,
appeared at the bedside. ‘Oh, General, it's you, is it, I brought in,
I'm so glad. I didn't know your honour, but if I'd known it was you,
I'd have saved you all the same.’ This is the true soldier's spirit.”[174]
III
IV
Miss Nightingale looks to her reward from this country in having a fresh
field for her labours, and means of extending the good that she has
already begun. A compliment cannot be paid dearer to her heart than in
giving her work to do.—Sidney Herbert.
II
Then the musicians took up the Popular Heroine, and both now,
and after her return from the Crimea, sentimental songs, set to
music, were inscribed to her: “Angels with Sweet Approving Smiles,”
“The Shadow on the Pillow,” “The Soldier's Widow,” “The Woman's
Smile,” “The Soldier's Cheer”—this latter “played by the band of the
97th Regiment,”—“Die Soldaten Lebewohl,” “The Star of the East,”
and so forth. The stationers followed in the wake of the printers, and
brought out note-paper with a picture of Florence Nightingale as the
water-mark, or with lithographed views of “Lea Hurst, her home.”
Portraits of her were eagerly sought; and as the family were
unwilling to supply them, likenesses had to be invented to adorn
sentimental prints. Life-boats and emigrant-ships were christened
The Florence Nightingale. Children, streets, valses, and race-horses
were named after her. “The Forest Plate Handicap was won by Miss
Nightingale, beating Barbarity and nine others.” Tradesmen printed
portraits and short lives of her on their paper bags. At Fairs there
were “Grand Exhibitions of Miss Florence Nightingale administering
to the Sick and Wounded.” China figures, with no recognizable
likeness to her, but inscribed “Florence Nightingale,” were put on
sale. The public would not be denied. “Yes, indeed,” wrote Lady
Verney to her sister, “the people love you with a sort of passionate
tenderness that goes to my heart.”
Miss Nightingale did not relish all this. They had sent her various
supplies for the sick, and also a packet of “Lives,” “Portraits,” and the
like to Scutari. “My effigies and praises,” she wrote in reply, “were
less welcome. I do not affect indifference to real sympathy, but I
have felt painfully, the more painfully since I have had time to hear
of it, the éclat which has been given to this adventure. The small still
beginning, the simple hardship, the silent and gradual struggle
upwards, these are the climate in which an enterprise really thrives
and grows. Time has not altered our Saviour's lesson on that point,
which has been learnt successively by all reformers from their own
experience. The vanity and frivolity which the éclat thrown upon this
affair has called forth has done us unmitigated harm, and has
brought mischief on (perhaps) one of the most promising enterprises
that ever set sail from England. Our own old party which began its
work in hardship, toil, struggle, and obscurity has done better than
any other.”
III