Python for Scientists
Second Edition
John M. Stewart
Department of Applied Mathematics & Theoretical Physics
University of Cambridge
www.cambridge.org
Information on this title: www.cambridge.org/9781316641231
DOI: 10.1017/9781108120241
© John M. Stewart 2014, 2017
1 Introduction 1
1.1 Scientific Software 1
1.2 The Plan of This Book 4
1.3 Can Python Compete with Compiled Languages? 8
1.4 Limitations of This Book 9
1.5 Installing Python and Add-ons 9
4 NumPy 55
4.1 One-Dimensional Arrays 57
4.1.1 Ab initio constructors 57
4.1.2 Look-alike constructors 58
4.1.3 Arithmetical operations on vectors 59
4.1.4 Ufuncs 60
4.1.5 Logical operations on vectors 62
4.2 Two-Dimensional Arrays 65
4.2.1 Broadcasting 65
4.2.2 Ab initio constructors 66
4.2.3 Look-alike constructors 68
4.2.4 Operations on arrays and ufuncs 69
4.3 Higher-Dimensional Arrays 69
4.4 Domestic Input and Output 69
4.4.1 Discursive output and input 70
4.4.2 NumPy text output and input 71
4.4.3 NumPy binary output and input 72
4.5 Foreign Input and Output 73
4.5.1 Small amounts of data 73
4.5.2 Large amounts of data 73
4.6 Miscellaneous Ufuncs 74
4.6.1 Maxima and minima 74
4.6.2 Sums and products 75
4.6.3 Simple statistics 75
4.7 Polynomials 75
4.7.1 Converting data to coefficients 76
4.7.2 Converting coefficients to data 76
4.7.3 Manipulating polynomials in coefficient form 76
4.8 Linear Algebra 76
4.8.1 Basic operations on matrices 76
4.8.2 More specialized operations on matrices 78
4.8.3 Solving linear systems of equations 79
4.9 More NumPy and Beyond 79
4.9.1 SciPy 80
4.9.2 SciKits 81
5 Two-Dimensional Graphics 82
5.1 Introduction 82
5.2 Getting Started: Simple Figures 83
5.2.1 Front-ends 83
5.2.2 Back-ends 83
5.2.3 A simple figure 84
5.2.4 Interactive controls 86
5.3 Object-Oriented Matplotlib 87
5.4 Cartesian Plots 88
5.4.1 The Matplotlib plot function 88
5.4.2 Curve styles 89
5.4.3 Marker styles 90
5.4.4 Axes, grid, labels and title 90
5.4.5 A not-so-simple example: partial sums of Fourier series 91
5.5 Polar Plots 93
5.6 Error Bars 94
5.7 Text and Annotations 95
5.8 Displaying Mathematical Formulae 96
5.8.1 Non-LaTeX users 96
5.8.2 LaTeX users 97
5.8.3 Alternatives for LaTeX users 98
5.9 Contour Plots 98
5.10 Compound Figures 101
5.10.1 Multiple figures 101
5.10.2 Multiple plots 102
5.11 Mandelbrot Sets: A Worked Example 104
References 250
Index 253
Preface to the Second Edition
The motivation for writing this book, and the acknowledgements of the many who have
assisted in its production, are included in the topics of the Preface to the first edition,
which is reprinted after this one. Here I also need to adjoin thanks to the many readers
who provided constructive criticisms, most of which have been incorporated in this
revision. The purpose here is to explain why a second edition is needed. Superficially it
might appear that very little has changed, apart from a new Chapter 7 which discusses
SymPy, Python’s own computer algebra system.
There is, however, a fundamental change, which permeates most of the latest version
of this book. When the first edition was prepared, the reliable way to use the enhanced
interpreter IPython was via the traditional “terminal mode”. Preparations were under
way for an enhanced “notebook mode”, which looked then rather like the Mathemat-
ica notebook concept, except that it appeared within one’s default web browser.1 That
project has now morphed into the Jupyter notebook. The notebook allows one to con-
struct and distribute documents containing computer code (over forty languages are
supported), equations, explanatory text, figures and visualizations. Since this is also
perhaps the easiest software application for a beginner to develop Python experience,
much of the book has been rewritten for the notebook user. In particular there is now
a lightning course on how to use the notebook in Appendix A, and Chapter 2 has been
extensively rewritten to demonstrate its properties. All of the material in the book now
reflects, where appropriate, its use. For example, it allows SymPy to produce algebraic
expressions whose format is unsurpassed by other computer algebra systems.
This change also affects the areas of interactive graphics and visual animations. Their
demands are such that the standard Python two-dimensional graphics package Mat-
plotlib is having difficulty in producing platform-independent results. Indeed, because
of “improved” software upgrades, the code suggested for immediate on-screen anima-
tions in the first edition no longer works. However, the notebook concept has a subtle
solution to resolve this impasse. Recall that the notebook window is your browser win-
dow, which uses modern HTML graphics. The consequent benefits are introduced in
Chapter 6.
As a final enhancement, all but the most trivial code snippets listed in this book are
now available in electronic form, as a notebook of course, but the website includes
Preface to the First Edition
I have used computers as an aid to scientific research for over 40 years. During that
time, hardware has become cheap, fast and powerful. However, software relevant to the
working scientist has become progressively more complicated. My favourite textbooks
on Fortran90 and C++ run to 1200 and 1600 pages respectively. And then we need doc-
umentation on mathematics libraries and graphics packages. A newcomer going down
this route is going to have to invest significant amounts of time and energy in order to
write useful programmes. This has led to the emergence of “scientific packages” such
as Matlab® or Mathematica® which avoid the complications of compiled languages,
separate mathematics libraries and graphics packages. I have used them and found them
very convenient for executing the tasks envisaged by their developers. However, I also
found them very difficult to extend beyond these boundaries, and so I looked for alter-
native approaches.
Some years ago, a computer science colleague suggested that I should take a look at
Python. At that time, it was clear that Python had great potential but a very flaky imple-
mentation. It was, however, free and open-source, and was attracting what has turned
out to be a very effective army of developers. More recently, their efforts have coordi-
nated to produce a formidable package consisting of a small core language surrounded
by a wealth of add-on libraries or modules. A select group of these can and do replicate
the facilities of the conventional scientific packages. More importantly an informed, in-
telligent user of Python and its modules can carry out major projects usually entrusted
to dedicated programmers using Fortran, C etc. There is a marginal loss of execution
speed, but this is more than compensated for by the vastly telescoped development time.
The purpose of this book is to explain to working scientists the utility of this relatively
unknown resource.
Most scientists will have some computer familiarity and programming awareness,
although not necessarily with Python, and I shall take advantage of this. Therefore,
unlike many books which set out to “teach” a language, this one is not just a brisk trot
through the reference manuals. Python has many powerful but unfamiliar facets, and
these need more explanation than the familiar ones. In particular, if you encounter in
this text a reference to the “beginner” or the “unwary”, it signifies a point which is not
made clear in the documentation, and has caught out this author at least once.
The first seven chapters, plus Appendix A, cover almost everything the working sci-
entist needs to know in order to get started in using Python effectively. My editor and
some referees suggested that I should devote the second half of the book to problems in
a particular field. This would have led to a series of books, “Python for Biochemists”,
“Python for Crystallographers”, . . . , all with a common first half. Instead I have cho-
sen to cover just three topics, which, however, should be far more widely applicable in
many different fields. Chapter 8 covers four radically different types of ordinary differ-
ential equations and shows how to use the various relevant black boxes, which are often
Python wrappers around tried and trusted Fortran codes. The next chapter while os-
tensibly about pseudospectral approaches to evolutionary partial differential equations,
actually covers a topic of great utility to many scientists, namely how to reuse legacy
code, usually written in Fortran77, within Python at Fortran-like speeds, without under-
standing Fortran. The final chapter about solving very large linear systems via multigrid
is also a case history in how to use object-oriented programming meaningfully in a sci-
entific context. If readers look carefully and critically at these later chapters, they should
gain the practical expertise to handle problems in their own field.
Acknowledgments are due to the many Python developers who have produced and
documented a very useful tool, and also to the very many who have published code
snippets on the web, a great aid to the tyro, such as this author. Many of my colleagues
have offered valuable advice. Des Higham generously consented to my borrowing his
ideas for the last quarter of Chapter 8. I am especially grateful to Oliver Rinne who read
carefully and critically an early draft. At Cambridge University Press, my Production
Editor, Jessica Murphy and my Copy Editor, Anne Rix have exhibited their customary
expertise. Last but not least I thank the Department of Applied Mathematics and Theo-
retical Physics, Cambridge for continuing to offer me office space after my retirement,
which has facilitated the production of this book.
Writing a serious book is not a trivial task and so I am rather more than deeply
grateful for the near-infinite patience of Mary, the “Python-widow”, which made this
book possible!
1 Introduction
The title of this book is “Python for Scientists”, but what does that mean? The dictio-
nary defines “Python” as either (a) a non-venomous snake from Asia or Saharan Africa
or (b) a computer scripting language, and it is the second option which is intended here.
(What exactly this second definition means will be explained later.) By “scientist”, I
mean anyone who uses quantitative models either to obtain conclusions by processing
pre-collected experimental data or to model potentially observable results from a more
abstract theory, and who asks “what if?”. What if I analyse the data in a different way?
What if I change the model? Thus the term also includes economists, engineers and
mathematicians among others, as well as the usual concept of scientists. Given the vol-
ume of potential data or the complexity (non-linearity) of many theoretical models, the
use of computers to answer these questions is fast becoming mandatory.
Advances in computer hardware mean that immense amounts of data or ever more
complex models can be processed at increasingly rapid speeds. These advances also
mean reduced costs so that today virtually every scientist has access to a “personal
computer”, either a desktop work station or a laptop, and the distinction between the
two is narrowing quickly. It might seem to be a given that suitable software will also be
available so that the “what if” questions can be answered readily. However, this turns
out not always to be the case. A quick pragmatic reason is that, while there is a huge
market for hardware improvements, scientists form a very small fraction of it and so
there is little financial incentive to improve scientific software. But for scientists, this
issue is important and we need to examine it in more detail.
Before we discuss what is available, it is important to note that all computer software
comes in one of two types: proprietary and open-source. The first is supplied by a com-
mercial firm. Such organizations have both to pay wages and taxes and to provide a re-
turn for their shareholders. Therefore, they have to charge real money for their products,
and, in order to protect their assets from their competitors, they do not tell the customer
how their software works. Thus, the end-users have little chance of being able to adapt
or optimize the product for their own use. Since wages and taxes are recurrent expendi-
tures, the company needs to issue frequent charged-for updates and improvements (the
Danegeld effect). Open-source software is available for free or at nominal cost (media,
algebraic and analytic processes, and to integrate all of them with their numerical and
graphical properties. A disadvantage of all of these packages is the quirky syntax and
limited expressive ability of their command languages. Unlike the compiled languages,
it is often extremely difficult to program a process which was not envisaged by the
package authors.
The best of the proprietary packages are very easy to use with extensive on-line help
and coherent documentation, which has not yet been matched by all of the open-source
alternatives. However, a major downside of the commercial packages is the extremely
high prices charged for their licences. Most of them offer a cut down “student version”
at reduced price (but usable only while the student is in full-time education) so as to
encourage familiarity with the package. This largesse is paid for by other users.
Let us summarize the position. On the one hand, we have the traditional compiled
languages for numerics which are very general, very fast, very difficult to learn and do
not interact readily with graphical or algebraic processes. On the other, we have standard
scientific packages which are good at integrating numerics, algebra and graphics, but are
slow and limited in scope.
What properties should an ideal scientific package have? A short list might contain:
1. a mature programming language which is both easy to understand and which has
extensive expressive ability,
2. integration of algebraic, numerical and graphical functions,
3. the ability to generate numerical algorithms running with speeds within an order of
magnitude of the fastest of those generated by compiled languages,
4. a user interface with adequate on-line help, and decent documentation,
5. an extensive range of textbooks from which the curious reader can develop greater
understanding of the concepts,
6. open-source software, freely available,
7. implementation on all standard platforms, e.g., Linux, Mac OS X, Unix, Windows,
8. a concise package, and so implementable on even modest hardware.
The bad news is that no single “scientific package” quite satisfies all of these criteria.
Consider, e.g., the requirement of algebraic capability. There are two mature open-
source packages, wx-Maxima and Reduce, with significant algebraic capabilities worthy
of consideration, but Reduce fails requirement 4 and both fail criteria 3 and 5. They are,
however, extremely powerful tools in the hands of experienced users. Python, via the
add-on SymPy, see Chapter 7, almost achieves a high standard of algebraic capability.
SageMath fulfils all but the last of the criteria listed above. It is completely based on
Python and its add-ons, and also includes wx-Maxima. For further details see Chapter
7. Thus a rational strategy is to first master Python. If its, admirably few, weaknesses are
crucial for your work, then investigate SageMath. The vast majority of scientists will
find plenty of utility in Python.
In 1991, Guido van Rossum created Python as an open-source platform-independent
general purpose programming language. It is basically a very simple language sur-
rounded by an enormous library of add-on modules, including complete access to the
underlying operating system. This means that it can manage and manipulate programs
built from other complete (even compiled) packages, i.e., it is a scripting language. This
versatility has ensured both its adoption by power users such as Google, and a real army
of developers. It means also that it can be a very powerful tool for the scientist. Of
course, there are other scripting languages, e.g., Java™ and Perl®, but none has the
versatility or user-base to meet criteria 3–5 above.
Ten years ago it would not have been possible to recommend Python for scientific
work. The size of the army of developers meant that there were several mutually incom-
patible add-on packages for numerical and scientific applications. Fortunately, reason
has prevailed and there is now a single numerical add-on package, NumPy, and a single
scientific one, SciPy, around which the developers have united. When the first edition
of this book was written SymPy, the Python approach to algebraic manipulation, was
still in a phase of rapid development, and so it was not included. While SymPy has yet
to achieve the capabilities of wx-Maxima and Reduce, it now handles many algebraic
tasks reliably.
The purpose of this intentionally short book is to show how easy it is for the working
scientist to implement and test non-trivial mathematical algorithms using Python. We
have quite deliberately preferred brevity and simplicity to encyclopaedic coverage in
order to get the inquisitive reader up and running as soon as possible. We aim to leave
the reader with a well-founded framework to handle many basic, and not so basic, tasks.
Obviously, most readers will need to dig further into techniques for their particular
research needs. But after reading this book, they should have a sound basis for this.
This chapter and Appendix A discuss how to set up a scientific Python environment.
While the original Python interpreter was pretty basic, its replacement IPython is so
easy to use, powerful and versatile that Chapter 2 is devoted to it, adopting a hands-on
approach.
We now describe the subsequent chapters. As each new feature is described, we try
to illustrate it first by essentially trivial examples and, where appropriate, by more ex-
tended problems. This author cannot know the mathematical sophistication of potential
readers, but in later chapters we shall presume some familiarity with basic calculus,
e.g., the Taylor series in one dimension. However, for these extended problems we shall
sketch the background needed to understand them, and suitable references for further
reading will be given.
Chapter 3 gives a brief but reasonably comprehensive survey of those aspects of the
core Python language likely to be of most interest to scientists. Python is an object-
oriented language, which lends itself naturally to object-oriented programming (OOP),
which may well be unfamiliar to most scientists. We shall adopt an extremely light
touch to this topic. We should perhaps point out that the container objects introduced
in Section 3.5 do not all have precise analogues in, say, C or Fortran. Again the brief
introduction to Python classes in Section 3.9 may be unfamiliar to users of those two
families of languages. The chapter concludes with two implementations of the sieve
These first chapters cover the basic tools that Python provides to enhance the scien-
tist’s computer experience. How should we proceed further?
A notable omission is that, apart from a brief discussion in Section 4.5, the vast
subject of data analysis will not be covered. There are three main reasons for this.
(a) Recently an add-on module called Pandas has appeared. This uses NumPy and Mat-
plotlib to tackle precisely this issue. It comes with comprehensive documentation,
which is described in Section 4.5.
(b) One of the authors of Pandas has written a book, McKinney (2012), which reviews
IPython, NumPy and Matplotlib and goes on to treat Pandas applications in great
detail.
(c) I do not work in this area, and so would simply have to paraphrase the sources
above.
Instead, I have chosen to concentrate on the modelling activities of scientists. One
approach would be to target problems in bioinformatics or cosmology or crystallogra-
phy or engineering or epidemiology or financial mathematics or . . . etc. Indeed, a whole
series of books with a common first half could be produced called “Python for Bioin-
formatics” etc. A less profligate and potentially more useful approach would be to write
a second half applicable to all of these fields, and many more. I am relying here on the
unity of mathematics. Problems in one field when reduced to a core dimensionless form
often look like a similarly reduced problem from another field.
This property can be illustrated by the following example. In population dynamics,
we might study a single species whose population N(T ) depends on time T . Given a
plentiful food supply, we might expect exponential growth, dN/dT = kN(T ), where the
growth constant k has dimension 1/time. However, there are usually constraints limiting
such growth. A simple model to include these is the “logistic equation”
dN/dT (T) = kN(T) (N0 − N(T))    (1.1)
which allows for a stable constant population N(T ) = N0 . The biological background to
this equation is discussed in many textbooks, e.g., Murray (2002).
In (homogeneous spherically symmetric) cosmology, the density parameter Ω de-
pends on the scale factor a via
dΩ/da = ((1 + 3w)/a) Ω(1 − Ω),    (1.2)
where w is usually taken to be a constant.
Now mathematical biology and cosmology do not have a great deal in common, but
it is easy to see that (1.1) and (1.2) represent the same equation. Suppose we scale the
independent variable T in (1.1) by t = kN0 T , which renders the new time coordinate
t dimensionless. Similarly, we introduce the dimensionless variable x = N/N0 so that
(1.1) becomes the logistic equation
dx/dt = x(1 − x).    (1.3)
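To see the reduction explicitly (a line of working of my own, not spelled out in the text): with N = N0 x and t = kN0 T, the chain rule gives
dN/dT = N0 (dx/dt)(dt/dT) = kN0^2 dx/dt,   while   kN(T)(N0 − N(T)) = kN0 x (N0 − N0 x) = kN0^2 x(1 − x),
and cancelling the common factor kN0^2 yields (1.3).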
In a general relativistic theory, there is no reason to prefer any one time coordinate to
any other. Thus we may choose a new time coordinate t via a = e^(t/(1+3w)), and then,
on setting x = Ω, we see that (1.2) also reduces to (1.3). Thus the same equations
can arise in a number of different fields.3 In Chapters 8–10, we have, for brevity and
simplicity, used minimal equations such as (1.3). If the minimal form for your problem
looks something like the one being treated in a code snippet, you can of course hack the
snippet to handle the original long form for your problem.
Chapter 8 looks at four types of problems involving ordinary differential equations.
We start with a very brief introduction to techniques for solving initial value problems
and then look at a number of examples, including two classic non-linear problems, the
van der Pol oscillator and the Lorenz equations. Next we survey two-point boundary
value problems and examine both a linear Sturm–Liouville eigenvalue problem and an
exercise in continuation for the non-linear Bratu problem. Problems involving delay dif-
ferential equations arise frequently in control theory and in mathematical biology, e.g.,
the logistic and Mackey–Glass equations, and a discussion of their numerical solution
is given in the next section. Finally in this chapter we look briefly at stochastic calcu-
lus and stochastic ordinary differential equations. In particular, we consider a simple
example closely linked to the Black–Scholes equation of financial mathematics.
There are two other major Python topics relevant to scientists that I would like to
introduce here. The first is the incorporation of code written in other languages. There
are two aspects of this: (a) the reuse of pre-existing legacy code, usually written in
Fortran, (b) if one’s code is being slowed down seriously by a few Python functions, as
revealed by the profiler, how do we recode the offending functions in Fortran or C? The
second topic is how can a scientific user make worthwhile use of the object-oriented
programming (OOP) features of Python?
Chapter 9 addresses the first topic via an extended example. We look first at how
pseudospectral methods can be used to attack a large number of evolution problems
governed by partial differential equations, either initial value or initial-boundary value
problems. For the sake of brevity, we look only at problems with one time and one
spatial dimension. Here, as we explain, problems with periodic spatial dependence can
be handled very efficiently using Fourier methods, but for problems which are more
general, the use of Chebyshev transforms is desirable. However, in this case there is
no satisfactory Python black box available. It turns out that the necessary tools have
already been written in legacy Fortran77 code. These are listed in Appendix B, and we
show how, with an absolutely minimal knowledge of Fortran77, we can construct ex-
tremely fast Python functions to accomplish the required tasks. Our approach relies on
the NumPy f2py tool which is included in all of the recommended Python distributions.
If you are interested in possibly reusing pre-existing legacy code, it is worthwhile study-
ing this chapter even if the specific example treated there is not the task that you have
in mind. See also Section 1.3 for other uses for f2py.
One of the most useful features of object-oriented programming (OOP) from the
point of view of the scientist is the concept of classes. Classes exist in C++ (but not
3 This example was chosen as a pedagogic example. If the initial value x(0) = x0 is specified, then the exact
solution is x(t) = x0/[x0 + (1 − x0)e^(−t)]. In the current context, x0 ≥ 0. If x0 > 0, then all solutions tend
monotonically towards the constant solution x = 1 as t increases. See also Section 8.5.3.
C) and Fortran90 and later (but not Fortran77). However, both implementations are
complicated and so are usually shunned by novice programmers. In contrast, Python’s
implementation is much simpler and more user-friendly, at the cost of omitting some of
the more arcane features of other language implementations. We give a very brief intro-
duction to the syntax in Section 3.9. However, in Chapter 10 we present a much more
realistic example: the use of multigrid to solve elliptic partial differential equations in
an arbitrary number of dimensions, although for brevity the example code is for two di-
mensions. Multigrid is by now a classical problem which is best defined recursively, and
we devote a few pages to describing it, at least in outline. The pre-existing legacy code
is quite complicated because the authors needed to simulate recursion in languages,
e.g., Fortran77, which do not support recursion. Of course, we could implement this
code using the f2py tool outlined in Chapter 9. Instead, we have chosen to use Python
classes and recursion to construct a simple clear multigrid code. As a concrete example,
we use the sample problem from the corresponding chapter in Press et al. (2007) so
that the inquisitive reader can compare the non-recursive and OOP approaches. If you
have no particular interest in multigrid, but do have problems involving linked math-
ematical structures, and such problems arise often in, e.g., bioinformatics, chemistry,
epidemiology and solid-state physics among others, then you should certainly peruse
this final chapter to see how, if you can state your problems in a reasonably precise
mathematical form, it is easy to construct Python code to solve them.
1.3 Can Python Compete with Compiled Languages?
The most common criticism of Python and the scientific software packages is that they
are far too slow, in comparison with compiled code, when handling complicated realistic
problems. The speed-hungry reader might like to look at a recent study4 of a straight-
forward “number-crunching” problem treated by various methods. Although the figures
given in the final section refer to one particular problem treated on a single processor,
they do give a “ball park” impression of performance. As a benchmark, they use the
speed of a fully compiled C++ program which solves the problem. A Python solution
using the technique of Chapter 3, i.e., core Python, is about 700 times slower. Once
you use the floating-point module NumPy and the techniques described in Chapter 4
the code is only about ten times slower, and the Matlab performance is estimated to be
similar. However, as the study indicates, there are a number of ways to speed up Python
to about 80% of the C++ performance. Some of these are very rewarding exercises in
computer science.
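To give a flavour of where the NumPy speed-up comes from, here is a minimal sketch of my own (not taken from the study cited above): the same sum of squares computed with an explicit Python loop and with a vectorized NumPy operation. In IPython the %timeit magic, introduced in Chapter 2, can be used to compare the two.

import numpy as np

def py_sum_squares(n):
    """ Pure Python loop: the interpreter executes every iteration. """
    total = 0.0
    for i in range(n):
        total += i*i
    return total

def np_sum_squares(n):
    """ Vectorized NumPy version: the loop runs in compiled code. """
    a = np.arange(n, dtype=float)
    return np.sum(a*a)

# In IPython, e.g.,
#   %timeit py_sum_squares(100000)
#   %timeit np_sum_squares(100000)
# the NumPy version is typically faster by an order of magnitude or more.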
One in particular, though, is extremely useful for scientists: the f2py tool. This is
discussed in detail in Chapter 9, where we show how we can reuse legacy Fortran code.
It can also be used to access standard Fortran libraries, e.g., the NAG libraries.5 Yet
another use is to speed up NumPy code and so improve performance! To see how this
works, suppose we have developed a program such as those outlined in the later sections
4 See http://wiki.scipy.org/PerformancePython.
5 See, e.g., http://www.nag.co.uk/doc/TechRep/pdf/TR1_08.pdf.
of the book, which uses a large number of functions, each of which carries out a simple
task. The program works correctly, but is unacceptably slow. Note that getting detailed
timing data for Python code is straightforward. Python includes a “profiler” which can
be run on the working program. This outputs a detailed list of the functions ordered by
the time spent executing them. It is very easy to use, and this is described in Section 2.5.
Usually, there are one or two functions which take a very long time to execute simple
algorithms.
This is where f2py comes into its own. Because the functions are simple, even begin-
ners can soon create equivalent code in, say, Fortran77 or Ansi C. Also, because what
we are coding is simple, there is no need for the elaborate (and laborious to learn) fea-
tures of, say, Fortran95 or C++. Next we encapsulate the code in Python functions using
the f2py tool, and slot them into the Python program. With a little experience, we can
achieve speeds comparable to that of a program written fully in, say, Fortran95.
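To make the f2py workflow concrete, here is a deliberately tiny sketch; the file name ssq.f, the module name fortssq and the subroutine itself are my own inventions, not examples from the book. The Cf2py comment lines are standard f2py directives telling the wrapper which arguments are inputs, which are outputs and which can be hidden.

# Contents of the (hypothetical) fixed-form Fortran77 file ssq.f:
#
#       subroutine ssq(x, n, s)
# Cf2py intent(in) x
# Cf2py intent(hide) n
# Cf2py intent(out) s
#       integer n, i
#       double precision x(n), s
#       s = 0.0d0
#       do 10 i = 1, n
#          s = s + x(i)*x(i)
#  10   continue
#       end
#
# Build the extension module once from the command line (f2py ships with NumPy):
#   f2py -c ssq.f -m fortssq
#
# Then use it from Python like any other module:
import numpy as np
import fortssq                    # the module built by f2py above

x = np.linspace(0.0, 1.0, 10001)
print fortssq.ssq(x)              # n is hidden, s is returned (Python 2 print)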
1.4 Limitations of This Book
A comprehensive treatment of Python and its various branches would occupy several
large volumes and would be out of date before it reached the bookshops. This book
is intended to offer the reader a starting point which is sufficient to be able to use the
fundamental add-on packages. Once the reader has a little experience with what Python
can do, it is time to explore further those areas which interest the reader.
I am conscious of the fact that I have not even mentioned vitally important concepts,
e.g., finite-volume methods for hyperbolic problems,6 parallel programming and real-
time graphics to name but a few areas in which Python is very useful. There is a very
large army of Python developers working at the frontiers of research, and their endeav-
ours are readily accessed via the internet. Please think of this little book as a transport
facility towards the front line.
pursue the pundits’ policy. Please save your energy, sanity etc., and read Appendix A,
which I have quite deliberately targeted at novices, for the obvious reason!
Admittedly, there is an amount, albeit slight and low-level, of hassle involved here.
So what’s the payoff? Well, if you follow the routes suggested in Appendix A, you
should end up with a system which works seamlessly. While it is true that the original
Python interpreter was not terribly user-friendly, which caused all of the established IDE
purveyors to offer a “Python mode”, the need which they purported to supply has been
overtaken by the enhanced interpreter IPython. Indeed, in its latest versions IPython
hopes to surpass the facilities offered by Matlab, Mathematica and the Python-related
features of commercial IDEs. In particular, it allows you to use your favourite editor,
not theirs, and to tailor its commands to your needs, as explained in Appendix A and
Chapter 2.
2 Getting Started with IPython
This sounds like software produced by Apple®, but it is in fact a Python interpreter on
steroids. It has been designed and written by scientists with the aim of offering very fast
exploration and construction of code with minimal typing effort, and offering appro-
priate, even maximal, on-screen help when required. Documentation and much more is
available on the website.1 This chapter is a brief introduction to the essentials of using
IPython. A more extended discursive treatment can be found in, e.g., Rossant (2015).
In this chapter we shall concentrate on notebook and terminal modes, and we assume
that the reader has set up the environments as described in Sections A.2 and A.3. Before
we get to realistic examples, I must ask for the impatient reader’s forbearance. Tab com-
pletion, Section 2.1, is an unusual but effective method for minimizing key-strokes, and
the introspection feature, Section 2.2, shows how to generate relevant inline information
quickly, without pausing to consult the manual.
2.1 Tab Completion
While using the IPython interpreter, tab completion is always present. This means that,
whenever we start typing a Python-related name on a line or in a cell, we can pause
and press the tab key, to see a list of names valid in this context, which agree with the
characters already typed.
As an example, suppose we need to type import matplotlib.2 Typing i and then the tab
key reveals 15 possible completions. By inspection, only one of them has second letter m, so
that im followed by tab will complete to import. Augmenting this to import m and pressing
tab shows 30 possibilities, and by inspection we see that typing import matp followed by
tab completes the desired line.
That example was somewhat contrived. Here is a more compulsive reason for using
tab completion. When developing code, we tend, lazily, to use short names for vari-
ables, functions etc. (In early versions of Fortran, we were indeed restricted to six or
eight characters, but nowadays the length can be arbitrary.) Short names are not always
meaningful ones, and the danger is that if we revisit the code in six months, the intent of
the code may no longer be self-evident. By using meaningful names of whatever length
is needed, we can avoid this trap. Because of tab completion, the full long name only
need be typed once.
2.2 Introspection
IPython has the ability to inspect just about any Python construct, including itself, and
to report whatever information its developers have chosen to make available. This fa-
cility is called introspection. It is accessed by the single character ?. The easiest way to
understand it is to use it, so you are recommended to fire up the interpreter.
Which mode, terminal or notebook, should one use? Beginners should start with ter-
minal mode, as described in Section A.3, thus removing one level of complexity. This
persists throughout this chapter until Section 2.5, where the code snippets are extremely
short. If you choose to use terminal mode, type ipython followed by a ret on the com-
mand line. IPython will respond with quite a lengthy header, followed by an input line
labelled In [1]:. Now that you are in IPython, you can try out introspection by typing
? (followed by ret) on the input line. (Note that within IPython terminal mode the ret
actually executes the command in the current line.) IPython responds by issuing in pager
mode3 a summary of all of the facilities available. If you exit this, then the command
quickref (hint: use tab completion) gives a more concise version. Very careful study
of both documents is highly recommended.
IPython notebook users need to use slightly different commands. After invoking the
notebook, see Section A.2.2 for details, they will be confronted with an unnumbered
single-line blank cell. Now they can try out introspection by typing ? on the input line.
Typing ret at this point merely adds a new line to the cell. In order to execute the
command(s) in the cell one needs to press either shift+ret (which also creates a new cell
below the current one) or ctl+ret, which just executes the command(s). The output, the
lengthy facilities summary, appears in a scrollable window at the bottom of the screen.
It can be killed by clicking on the x-button at the top right of the window. The command
quickref (hint: use tab completion) followed by ctl-ret gives a more concise version.
Again very careful study of both documents is highly recommended.
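As a concrete illustration (my own, not the book's), define a trivial function and then query it; the same commands work in either mode:

def square(x):
    """ Return the square of x. """
    return x*x

square?     # displays the signature and the docstring
square??    # displays the source code as well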
However, scientists are impatient folk, and the purpose of this chapter is to get them
up and running with the most useful features. Therefore, we need to type in some Python
code, which newcomers will have to take on trust until they have mastered Chapters 3
and 4. Again, depending on whether you are operating in notebook or console mode,
the procedures differ slightly.
Notebook users should type each boxed line or boxed code snippet into a cell. They
can then execute the code via ctl+ret or shift+ret. Readers using terminal mode
3 This is based on the Unix less facility. To use it effectively you need know only four commands:
should type in, line by line, boxed lines or code snippets and enter each line separately
using the ret key.
For example, please type in
a=3
b=2.7
c=12 + 5j
s='Hello World!'
L=[a, b, c, s, 77.77]
4 Note that in the code snippet there is no multiplication sign (*) between 5 and j.
5 For the sake of brevity, we shall not distinguish between a line in terminal mode and a one-line cell in
notebook mode.
2.3 History
If you look at the output from the code in the previous section, you will see that IPython
employs a history mechanism which is very similar to that in Mathematica notebooks.
Input lines are labelled In[1], In[2], . . . , and if input In[n] produces any output, it
is labelled Out[n]. As a convenience, the past three input lines/cells are available as
_i, _ii and _iii, and the corresponding output lines/cells are available as _, __ and
___. In practice though, you can insert the content of a previous input line/cell into the
current one by navigating using ↑ (or ctl-p) and ↓ (or ctl-n), and this is perhaps the
most common usage. Unusually, but conveniently, history persists in terminal mode. If
you close down IPython (using exit) and then restart it, the history of the previous
session is still available via the arrow keys. There are many more elaborate things you
can do with the history mechanism, try the command history?.
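A short made-up session shows the mechanism at work:

In [1]: 2+3
Out[1]: 5
In [2]: _ + 10       # _ refers to the most recent output, Out[1]
Out[2]: 15
In [3]: _1 * _2      # _1 and _2 are alternative names for Out[1] and Out[2]
Out[3]: 75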
The IPython interpreter expects to receive valid Python commands. However, it is very
convenient to be able to type commands which control either the behaviour of IPython
or that of the underlying operating system. Such commands, which coexist with Python
ones, are called magic commands. A very long very detailed description can be found
by typing %magic in the interpreter, and a compact list of available commands is given
by typing %lsmagic. (Do not forget tab completion!) Note that there are two types of
magic, line magic, prefixed by %, and cell magic, prefixed by %%. The latter is relevant
only to notebook mode. You can get very helpful documentation on each command by
using introspection.
Let us start by considering system commands. A harmless example is pwd, which
comes from the Unix operating system where it just prints the name of the current
directory (print working directory) and exits. There are usually three ways to achieve
this in the IPython window. You should try out the following commands.
!pwd
Nothing in Python starts with a “!”, and IPython interprets this as the Unix shell com-
mand pwd, and produces the ASCII answer.
%pwd
Nothing in Python starts with a “%”, and IPython treats this as a line magic command,
interpreting it as the shell command. The u indicates that the string is encoded in Uni-
code, which enables a rich variety of outputs. Unicode is mentioned briefly in Section
A.3.1.
pwd
Here is a subtle, but useful, feature. For line magic commands, the % prefix is not al-
ways necessary, see the discussion of %automagic below. The interpreter sees no other
meaning for pwd and so treats it as %pwd.
pwd='hoho'
pwd
Because the identifier pwd is now assigned to a string object, it masks the magic command, and the line above simply echoes the string.
del pwd
pwd
Now that pwd has no other assignment, it works as a line magic command.
%automagic?
If %automagic is on, as it is by default, the single per cent sign (%), which starts all line
magic commands, can be omitted. This is a great convenience, provided one remembers
that one is dealing with a magic command.
Magic cell commands start with two mandatory per cent signs. They operate on entire
cells, and can be extremely useful, see, e.g., the next section.
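For example (a sketch of my own), the cell magic %%timeit times the entire contents of a cell:

%%timeit
total = 0.0
for i in range(100000):
    total += i*i

and the corresponding line magic times a single expression:

%timeit sum(range(100000))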
2.5 IPython in Action: An Extended Example
For the rest of this chapter we present the first part of an extended example, in order to
show the effectiveness of magic commands. The second part is the example in Section
3.9, where we consider how to implement arbitrary precision real arithmetic via frac-
tions. There is an issue with fractions, e.g., 3/7 and 24/56 are usually regarded as the
same number. Thus there is a problem here, to determine the “highest common factor”
of two integers a and b, or, as mathematicians are wont to say, their “greatest common
divisor” (GCD), which can be used to produce a canonical form for the fraction a/b.
(By inspection of factors, the GCD of 24 and 56 is 8, which implies 24/56 = 3/7 and no
further reduction of the latter is possible.) Inspection of factors is not easily automated,
and a little research, e.g., on the web, reveals Euclid’s algorithm. To express this con-
cisely, we need a piece of jargon. Suppose a and b are integers. Consider long division
of a by b. The remainder is called a mod b, e.g., 13 mod 5 = 3, and 5 mod 13 = 5. Now
denote the GCD of a and b by gcd(a, b). Euclid’s algorithm is most easily described
recursively via
gcd(a, 0) = a,   gcd(a, b) = gcd(b, a mod b)   (b ≠ 0).    (2.1)
Try evaluating by hand gcd(56, 24) according to this recipe. It’s very fast! It can be
shown that the most laborious case arises when a and b are consecutive Fibonacci num-
bers, so they would be useful for a test case. The Fibonacci numbers Fn are defined
recursively via
F0 = 0,   F1 = 1,   Fn = Fn−1 + Fn−2,   n ≥ 2.    (2.2)
The sequence begins 0, 1, 1, 2, 3, 5, 8, . . ..
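For readers who want to check their hand calculation, the recipe unwinds as
gcd(56, 24) = gcd(24, 56 mod 24) = gcd(24, 8) = gcd(8, 24 mod 8) = gcd(8, 0) = 8,
confirming the reduction 24/56 = 3/7 quoted above.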
How do we implement both Euclid’s algorithm and Fibonacci numbers efficiently
and speedily in Python? We start with the Fibonacci task because it looks to be more
straightforward.
In order to get started with mastering IPython, the novice reader is asked to take
on trust the next two code snippets. A partial explanation is offered here, but all of
the features will be explained more thoroughly in Chapter 3. However, an important
point needs to be made here. Every programming language needs subsidiary blocks of
code, e.g., the contents of functions or do-loops. Most delineate them by some form of
bracketing, but Python relies exclusively on indentation. All blocks of code must have
the same indentation. The need for a sub-block is usually indicated by a colon (:). A
sub-block is further indented (four spaces is the conventional choice), and the end of
the sub-block is indicated by the cancellation of this extra indentation. IPython and all
Python-aware editors will handle this automatically. Three examples are given below.
The novice reader should try to understand, perhaps roughly, what is going on, before
we move on to consider possible workflows to execute the snippets.
1 # File: fib.py Fibonacci numbers
2
3 """ Module fib: Fibonacci numbers.
4     Contains one function, fib(n).
5 """
6
7 def fib(n):
8     """ Returns n'th Fibonacci number. """
9     a,b=0,1
10     for i in range(n):
11         a,b=b,a+b
12     return a
13
14 ####################################################
15 if __name__ == "__main__":
16     for i in range(1001):
17         print "fib(",i,") = ",fib(i)
The details of Python syntax are explained in Chapter 3. For the time being, note that
lines starting with a hash (#), e.g., lines 1 and 14, denote comments. Also lines 3–5
define a docstring, whose purpose will be explained shortly. Lines 7–12 define a Python
function. Note the point made above that every colon (:) demands an indentation. Line 7
is the function declaration. Line 8 is the function docstring, again soon to be explained.
Line 9 introduces identifiers a and b, which are local to this function, and refer initially
to the values 0 and 1 respectively. Next examine line 11, ignoring for the moment its in-
dentation. Here a is set to refer to the value that b originally referred to. Simultaneously,
b is set to refer to the sum of values originally referred to by a and b. Clearly, lines 9
and 11 replicate the calculations implicit in (2.2). Now line 10 introduces a for-loop or
do-loop, explained in Section 3.7.1, which extends over line 11. Here range(n) gener-
ates a dummy list with n elements, [0, 1, . . . , n − 1], and so line 11 is executed precisely
n times. Finally, line 12 exits the function with the return value set to that referred to
finally by a.
Naturally, we need to provide a test suite to demonstrate that this function behaves as
intended. Line 14 is simply a comment. Line 15 will be explained soon. (When typing it,
note that there are four pairs of underscores.) Because it is an if statement terminated by
a colon, all subsequent lines need to be indented. We have already seen the idea behind
line 16. We repeat line 17 precisely 1001 times with i = 0, 1, 2, . . . , 1000. It prints a
string with four characters, the value of i, another string with four characters, and the
value of fib(i).
We now present two possible workflows for creating and using this snippet.
Now run the cell again. This writes the cell content to the file fib.py in the current
directory, or overwrites that file if it already exists.
Once the program has been verified, we can ask: how fast is it? Run the program again
but with the enhanced magic command run -t fib and IPython will produce timing
data. On my machine, the “User time” is 0.05 s, but the “Wall time” is 0.6 s. Clearly,
the discrepancy reflects the very large number of characters printed to the screen. To
verify this, modify the snippet as follows. Comment out the print statement in line 17 by
inserting a hash (#) character at the start of the line. Add a new line 18: fib(i), being
careful to get the indentation correct. (This evaluates the function, but does nothing with
the value.) Now run the program again. On my machine it takes 0.03 s, showing that
fib(i) is fast, but printing is not. (Don’t forget to comment out line 18, and uncomment
line 17!)
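For reference, the temporarily modified test suite (lines 15–18 of the listing, before reverting) would read something like:

15 if __name__ == "__main__":
16     for i in range(1001):
17 #        print "fib(",i,") = ",fib(i)
18         fib(i)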
We still need to explain the docstrings in lines 3–5 and 8, and the weird line 15. Close
down IPython (in terminal mode, use exit) and then reopen a fresh version. Type the
single line import fib, which reflects the core of the filename. The tail .py is not
needed. We have imported an object fib. What is it? Introspection suggests the com-
mand fib?, and IPython’s response is to print the docstring from lines 3–5 of the snip-
pet. This suggests that we find out more about the function fib.fib, so try fib.fib?,
and we are returned the docstring from line 8. The purpose of docstrings, which are
messages enclosed in pairs of triple double-quotes, is to offer online documentation to
other users and, just as importantly, you in a few days’ time! However, introspection has
a further trick up its sleeve. Try fib.fib?? and you will receive a listing of the source
code for this function!
You should have noticed that import fib did not list the first 1001 Fibonacci num-
bers. Had we instead, in a separate session, issued the command run fib, they would
have been printed! Line 15 of the snippet detects whether the file fib.py is being im-
ported or run, and responds accordingly without or with the test suite. How it does this
is explained in Section 3.4.
Now we return to our original task, which was to implement the gcd function implicit
in equation (2.1). Once we recognize that (i) Python has no problem with recursion, and
(ii) a mod b is implemented as a%b, then a minimal thought solution suggests itself, as
in the following snippet. (Ignore for the time being lines 14–18.)
7 def gcdr(a,b):
8     """ Euclidean algorithm, recursive vers., returns GCD. """
9     if b==0:
10         return a
11     else:
12         return gcdr(b,a%b)
13
14 def gcd(a,b):
15     """ Euclidean algorithm, non-recursive vers., returns GCD. """
16     while b:
17         a,b=b,a%b
18     return a
19
20 ##########################################################
21 if __name__ == "__main__":
22     import fib
23
24     for i in range(963):
25         print i, ' ', gcd(fib.fib(i),fib.fib(i+1))
The only real novelty in this snippet is the import fib statement in line 22, and we
have already discussed its effect above. The number of times the loop in lines 24 and 25
is executed is crucial. As printed, this snippet should run in a fraction of a second. Now
change the parameter 963 in line 24 to 964, save the file, and apply run gcd again. You
should find that the output appears to be in an infinite loop, but be patient. Eventually,
the process will terminate with an error statement that the maximum recursion depth
has been exceeded. While Python allows recursion, there is a limit on the number of
self-calls that can be made.
This limitation may or may not be a problem for you. But it is worth a few moments
thought to decide whether we could implement Euclid’s algorithm (2.1) without using
recursion. I offer one possible solution in the function gcd implemented in lines 14–18
of the snippet. Lines 16 and 17 define a while loop, note the colon terminating line 16.
Between while and the colon, Python expects an expression which evaluates to one of
the Boolean values True or False. As long as True is found, the loop executes line 17,
and then retests the expression. If the test produces False, then the loop terminates
and control passes to the next statement, line 18. In the expected context, b will always
be an integer, so how can an integer take a Boolean value? The answer is remarkably
simple. The integer value zero is always coerced to False, but all non-zero values
coerce to True. Thus the loop terminates when b becomes zero, and then the function
returns the value a. This is the first clause of (2.1). The transformation in line 17 is the
second clause, so this function implements the algorithm. It is shorter than the recursive
function, can be called an arbitrary number of times and, as we shall see, runs faster.
So was the expenditure of thought worthwhile? Using the run command, we can
obtain revealing statistics. First, edit the snippet to make 963 loops with the gcdr func-
tion, and save it. Now invoke run -t gcd to obtain the time spent. On my machine, the
“User time” is 0.31 s. Yours will vary, but it is relative timings that matter. The “Wall
time” reflects the display overhead and is not relevant here. Next invoke run -p gcd,
which invokes the Python profiler. Although you would need to read the documenta-
tion to understand every facet of the resulting display, a little scientific intuition can be
very useful. This shows that there were 963 direct calls (as expected) of the function
gcdr, within a total of 464,167 actual calls. The actual time spent within this func-
tion was 0.237 s. Next there were 1926 calls (as expected) of the function fib, and the
time expended was 0.084 s. Note that these timings cannot be compared with the older
run -t gcd ones, because the newer ones include the profiler overhead, which is sig-
nificant here. However, we can conclude that about 74% of the time was spent in the
function gcdr.
Next we need to repeat the exercise for the gcd function. Amend line 25 of the snippet
to replace gcdr by gcd and resave the file. Now run -t gcd to get a “User time” of
0.20 s. The other command run -p gcd reveals that the 1926 calls of function fib
took 0.090 s. However, the function gcd was called only 963 times (as expected), which
occupied 0.087 s. Thus gcd occupied about 49% of the time taken. Very approximately,
these relative timings factor out the profiler overhead. Now 74% of the 0.31 s timing
for the recursive version is 0.23 s, while 49% of the 0.20 s time for the non-recursive
version is 0.098 s. Thus the expenditure of thought has produced a shortened code which
runs in 43% of the time of the “thoughtless code”!
There are two points which need to be gleaned from this example.
1. The IPython magic command run or %run is the Python workhorse. You do need,
via introspection, to study its docstring. Look also at the variants %run -t and
%run -p. It is also worthwhile introspecting %timeit at this stage.
2. You will see much in the literature about methods for “speeding up” Python. These
are often very clever pieces of software engineering. But none are as effective as
human ingenuity!
3 A Short Python Tutorial
Although Python is a small language, it is a very rich one. It is very tempting, when
writing a textbook, to spell out all of the ramifications, concept by concept. The obvious
example is the introductory tutorial from the originator of Python, Guido van Rossum.
This is available in electronic form as the tutorial in your Python documentation or
on-line1 or as hard copy (van Rossum and Drake Jr. (2011)). It is relatively terse at
150 printed pages, and does not mention NumPy. My favourite textbook, Lutz (2013),
runs to over 1500 pages, a marathon learning curve, and mentions NumPy only in pass-
ing. It is excellent at explaining the features in detail, but is too expansive for a first
course in Python. A similar criticism can be applied to two books with a more scientific
orientation, Langtangen (2009) and Langtangen (2014), both around 800 pages, with a
significant overlap between them. I recommend these various books among many others
for reference, but not for learning the language.
Very few people would learn a foreign language by first mastering a grammar text-
book and then memorizing a dictionary. Most start with a few rudiments of grammar
and a tiny vocabulary. Then by practice they gradually extend their range of constructs
and working vocabulary. This allows them to comprehend and speak the language very
quickly, and it is the approach to learning Python that is being adopted here. The dis-
advantage is that the grammar and vocabulary are diffused throughout the learning pro-
cess, but this is ameliorated by the existence of textbooks, such as those cited in the first
paragraph.
Although the narrative can simply be read, it is extremely helpful to have the IPython
terminal to hand, so that you can try out the code samples. For longer code snippets,
e.g., those in Sections 3.9 and 3.11, it is advisable to use either notebook mode, or
terminal mode together with an editor, so that you can save the code. Your choices are
described in Sections A.2 and A.3 of Appendix A. After trying out the code snippets,
you are strongly encouraged to try out your own experiments in the interpreter.
Every programming language includes blocks of code, which consist of one or more
lines of code forming a syntactic whole. Python uses rather fewer parentheses () and
braces {} than other languages, and instead uses indentation as a tool for formatting
1 It is available at http://docs.python.org/2/tutorial.
blocks. After any line ending in a colon, :, a block is required, and it is differentiated
from the surrounding code by being consistently indented. Although the amount is not
specified, the unofficial standard is four spaces. IPython and any Python-aware text
editor will do this automatically. To revert to the original indentation level, use the
ret key to enter a totally empty line. Removing braces improves readability, but the
disadvantage is that each line in a block must have the same indentation as the one
before, or a syntax error will occur.
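As a minimal sketch (the condition and the print calls are arbitrary):

x = 3
if x > 0:                   # the colon demands an indented block
    print("x is positive")  # four spaces of indentation
    x = x - 1               # every line of the block is indented equally
print("finished")           # back at the outer level, so outside the block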
Python allows two forms of comments. A hash symbol, #, indicates that the rest of
the current line is a comment, or more precisely a “tweet”. A “documentation string” or
docstring can run over many lines and include any printable character. It is delimited by
a pair of triple quotes.
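For example, a comment and a docstring might look like this; the wording is of course arbitrary:

x = 3.14          # a comment: the rest of the line is ignored
"""This is a docstring.
It can extend over several lines
and may contain any printable character."""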
For completeness, we note that we may place several statements on the same line, pro-
vided we separate them with semicolons, but we should think about readability. Long
statements can be broken up with the continuation symbol ‘\’. More usefully, if a state-
ment includes a pair of brackets, (), we can split the line at any point between them
without the need for the continuation symbol. Here are simple examples.
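First several statements on one line, then the two ways of splitting a long statement (the values are arbitrary):

a = 1; b = 2; c = 3          # several statements on one line
total = a + b + \
        c                    # explicit continuation with the backslash
total = (a + b +
         c)                  # implicit continuation inside the brackets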
3.2 Objects and Identifiers
Python deals exclusively with objects and identifiers. An object may be thought of as a
region of computer memory containing both some data and information associated with
those data. For a simple object, this information consists of its type and its identity,2 i.e.,
the location in memory, which is of course machine dependent. The identity is therefore
of no interest for most users. They need a machine-independent method for accessing
objects. This is provided by an identifier, a label which can be attached to objects. It is
made up of one or more characters. The first must be a letter or underscore, and any sub-
sequent characters must be digits, letters or underscores. Identifiers are case-sensitive:
x and X are different identifiers. (Identifiers which have leading and/or trailing under-
scores have specialized uses, and should be avoided by the beginner.) We must avoid
using predefined words, e.g., list, and should always try to use meaningful identifiers.
However, the choice among, say, xnew, x_new and xNew is a matter of taste. Consider
2 An unfortunate choice of name, not to be confused with the about-to-be-defined identifiers.
Figure 3.1 A schematic representation of assignments in Python. After the first command
p=3.14, the float object 3.14 is created and identifier p is assigned to it. Here the object is
depicted by its identity, a large number, its address in the memory of my computer (highly
machine-dependent) where the data are stored, and the type. The second command q=p assigns
identifier q to the same object. The third command p='pi' assigns p to a new “string” object,
leaving q pointing to the original float object.
the following code, which makes most sense if typed in, line-by-line, in the terminal
window.
1 p=3.14
2 p
3 q=p
4 p='pi'
5 p
6 q
Note that we never declared the type of the object referred to by the identifier p. We
would have had to declare p to be of type “double” in C and “real*8” in Fortran. This is
no accident or oversight. A fundamental feature of Python is that the type belongs to
the object, not to the identifier.3
Next, in line 3, we set q=p. The right-hand side is replaced by whatever object p
pointed to, and q is a new identifier which points to this object, see Figure 3.1. No
equality of identifiers q and p is implied here! Notice that, in line 4, we reassign the
identifier p to a “string” object. However, the original float object is still pointed to by
the identifier q, see Figure 3.1, and this is confirmed by the output of lines 5 and 6. Suppose we were to reassign the identifier q. Then, unless in the interim another identifier had been assigned to the same object, the original “float” object would have no identifier assigned to it and so becomes inaccessible to the programmer. Python will detect this automatically
and silently free up the computer memory, a process known as garbage collection.
3 The curious can find the type of an object with identifier p with the command type(p) and its identity
with id(p).
Because of its importance in what follows we emphasize the point that the basic
building block in Python is the assignment operation, which despite appearances has
nothing to do with equality. In pseudocode,
<identifier>=<object>
which will appear over and over again. As we have already stated, the type of
an object “belongs” to the object and not to any identifier assigned to it. Henceforth we
shall try to be less pedantic!
Since we have introduced a “float”, albeit informally, we turn next to a simple class
of objects.
3.3 Numbers
Python contains three simple types of number objects, and we introduce a fourth, not so
simple, one.
3.3.1 Integers
Python refers to integers as ints. Early versions supported integers only in the range
[−2³¹, 2³¹ − 1], but in recent versions the range is considerably larger and is now limited
only by the availability of memory.
The usual operations of addition (+), subtraction (−) and multiplication (∗) are of
course available. There is a slight problem with division, for though p and q may be
integers, p/q need not. We may assume without loss of generality that q > 0, in which
case there exist unique integers m and n with
p = mq + n, where 0 ≤ n < q.
Then integer division in Python is defined by p//q, which returns m. The remainder n
is available as p%q. Exponentiation p^q is also available as p**q, and can produce a real
number if q < 0.
In mixed-mode arithmetic, e.g., adding an int to a float, the int is automatically widened to be a float. The same applies for division if one operand is
an int and the other is a float. However, if both are ints, e.g., ±1/5, what is the result?
Earlier versions (< 3.0) of Python adopt integer division, 1/5=0 and -1/5=-1, while versions ≥ 3.0 use real division, 1/5=0.2 and -1/5=-0.2. This is a potential pitfall
which is easily avoided. Either use integer division // or widen one of the operands to
ensure an unambiguous result.
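For instance, with arbitrarily chosen operands and assuming a Python 3 interpreter, one might see the following at the prompt:

p, q = 7, 3
p // q        # 2, the quotient m in p = m*q + n
p % q         # 1, the remainder n
p ** q        # 343
2 ** -1       # 0.5, a real result since the exponent is negative
1 / 5         # 0.2 under Python 3; use 1//5 to force integer division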
Python has a useful feature inherited from its C roots. Suppose we wish to increment
the float referred to by a by two. Clearly the code
temp=a+2
a=temp
will work. However, it is faster and more efficient to use the single instruction
a+=2
Core Python does not include the standard mathematical functions, e.g., sqrt or atan2; they are supplied by the math module. To see what the module contains, try
import math
dir(math) # or math.<TAB> in IPython
To find out more about the individual objects, we can either consult the written docu-
mentation or use the built-in help, e.g., in IPython
math.atan2? # or help(math.atan2)
If one is already familiar with the contents, then a quick-and-dirty fix is to replace the import command above by
from math import *
anywhere in the code before invoking the functions. Then the function mentioned above
is available as atan2(y,x) rather than math.atan2(y,x), which at first sight looks
appealing. However, there is another module, cmath, which includes many standard
mathematical functions for complex numbers. Now suppose we repeat the quick-and-
dirty fix.
from cmath import *
Then what does atan2(y,x) refer to? It is unambiguous to widen a real to a complex
number, but not in the other direction! Note that, unlike C, the import command can
occur anywhere in the program before its contents are needed, so chaos is waiting,
imperturbably, to wreck your calculation! Of course Python knows about this, and the
recommended workflow is described in Section 3.4.
However, for the comparison operators “x>y”, “x>=y”, “x<y” and “x<=y”, widening
takes place if necessary. Unusually, but conveniently, chaining of comparison operators
is allowed, e.g., “0<=x<1<y>z” is equivalent to
(0<=x) and (x<1) and (1<y) and (y>z)
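A quick check at the interpreter, with arbitrary values:

x, y, z = 0.5, 2, 1
0 <= x < 1 < y > z        # True: every individual comparison holds
0 <= x < 1 < z            # False, because 1 < z fails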
While Python is running, it needs to keep a list of those identifiers which have been
assigned to objects. This list is called a namespace, and as a Python object it too has
an identifier. For example, while working in the interpreter, the namespace has the un-
memorable name __main__.
One of the strengths of Python is its ability to include files of objects, functions etc.,
written either by you or by someone else. To enable this inclusion, suppose you have
created a file containing objects, e.g., obj1, obj2 that you want to reuse. The file should
be saved as, e.g., foo.py, where the .py ending is mandatory. (Note that with most text
editors you need this ending for the editor to realize that it is dealing with Python code.)
This file is then called a module. The module’s identifier is foo, i.e., the filename minus
the ending.
This module can be imported into subsequent sessions via
import foo
(When the module is first imported, it is compiled into bytecode and written back to
storage as a file foo.pyc. On subsequent imports, the interpreter loads this precompiled
bytecode unless the modification date of foo.py is more recent, in which case a new
version of the file foo.pyc is generated automatically.)
One effect of this import is to make the namespace of the module available as foo.
Then the objects from foo are available with, e.g., identifiers foo.obj1 and foo.obj2.
If you are absolutely sure that obj1 and obj2 will not clash with identifiers in the current namespace, you can import them via
from foo import *
which imports everything from the module foo’s namespace. If an identifier obj1 al-
ready existed, it will be overwritten by this import process, which usually
means that the object becomes inaccessible. For example, suppose we had an identifier
gamma referring to a float. Then
from math import *
overwrites this and gamma now refers to the (real) gamma-function. A subsequent
from cmath import *
overwrites gamma with the (complex) gamma-function! Note too that import statements
can appear anywhere in Python code, and so chaos is lurking if we use this option.
Except for quick, exploratory work in the interpreter, it is far better to modify the
import statements as, e.g.,
import math as re
import cmath as co
so that in the example above gamma, re.gamma and co.gamma are all available.
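For example, the aliased imports keep the real and complex versions of a function distinct; sqrt and exp are chosen here merely as illustrations of functions common to both modules:

import math as re
import cmath as co

re.sqrt(2.0)       # 1.4142135623730951, the real square root
co.sqrt(-1)        # 1j, the complex square root
re.exp(1.0)        # 2.718281828459045
co.exp(1j)         # (0.5403023058681398+0.8414709848078965j)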
We now have sufficient background to explain the mysterious code line
if __name__ == "__main__":
which occurred in both of the snippets in Section 2.5. The first instance occurred in a
file fib.py. Now if we import this module into the interpreter, its name is fib and not
__main__ and so the lines after this code line will be ignored. However, when devel-
oping the functions in the module, it is normal to make the module available directly,
usually via the %run command. Then, as explained at the start of this section, the con-
tents are read into the __main__ namespace. Then the if condition of the code line is
satisfied and the subsequent lines will be executed. In practice, this is incredibly con-
venient. While developing a suite of objects, e.g., functions, we can keep the ancillary
test functions nearby. In production mode via import, these ancillary functions are ef-
fectively “commented out”.
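A sketch of how such a module might be organized follows; the function and its test are hypothetical stand-ins, not the actual code of Section 2.5:

# file fib.py
def fib(n):
    """Return a list containing the first n Fibonacci numbers."""
    result = []
    a, b = 0, 1
    for _ in range(n):
        result.append(a)
        a, b = b, a + b
    return result

if __name__ == "__main__":
    # Reached when the file is executed directly, e.g., via %run fib.py,
    # but skipped when the file is imported as a module.
    print(fib(10))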
3.5 Container Objects
The usefulness of computers is based in large part on their ability to carry out repeti-
tive tasks very quickly. Most programming languages therefore provide container ob-
jects, often called arrays, which can store large numbers of objects of the same type,
and retrieve them via an indexing mechanism. Mathematical vectors would correspond
to one-dimensional arrays, matrices to two-dimensional arrays etc. It may come as a
surprise to find that the Python core language has no array concept. Instead, it has con-
tainer objects which are much more general, lists, tuples, strings and dictionaries. It
will soon become clear that we can simulate an array object via a list, and this is how
numerical work in Python used to be done. Because of the generality of lists, such sim-
ulations took a great deal longer than equivalent constructions in Fortran or C, and this
gave Python a deservedly poor reputation for its slowness in numerical work. Devel-
opers produced various schemes to alleviate this, and they have now standardized on
the NumPy add-on module to be described in Chapter 4. Arrays in NumPy have much
of the versatility of Python lists, but are implemented behind the scenes as arrays in
C, significantly reducing, but not quite eliminating, the speed penalty. However, in this
section we describe the core container objects in sufficient detail for much scientific
work. They excel in the “administrative, bookkeeping chores” where Fortran and C are
at their weakest. Number-crunching numerical arrays are deferred to the next chapter,
but the reader particularly interested in numerics will need to understand the content of
this section, because the ideas developed here carry forward into the next chapter.
3.5.1 Lists
Consider typing the code snippet into the IPython terminal.
1 [1,4.0,'a']
2 u=[1,4.0,'a']
3 v=[3.14,2.78,u,42]
4 v
5 len(v)
6 len? # or help(len)
7 v*2
8 v+u
9 v.append('foo')
10 v
Line 1 is our first instance of a Python list, an ordered sequence of Python objects sepa-
rated by commas and surrounded by square brackets. It is itself a Python object, and can
be assigned to a Python identifier, as in line 2. Unlike arrays, there is no requirement that
the elements of a list be all of the same type. In lines 3 and 4, we see that in creating the
list an identifier is replaced by the object it refers to, e.g., one list can be an element in
another. The beginner should consult Figure 3.1 again. It is the object, not the identifier,
which matters. In line 5, we invoke an extremely useful Python function len() which
returns the length of the list, here 4. (Python functions will be discussed in Section 3.8.
In the meantime, we can find what len does by typing the line len? in IPython.) We
can replicate lists by constructions like line 7, and concatenate lists as in line 8. We can
append items to the ends of lists as in line 9. Here v.append() is another useful func-
tion, available only for lists. You should try v.append? or help(v.append) to see a
description of it. Incidentally, list. followed by tab completion or help(list) will
give a catalogue of functions intrinsic to lists. They are the analogue of c.conjugate()
in Section 3.3.4.
If start is zero, it may be omitted, e.g., u[ :-1] is a copy of u with the last element
omitted. The same applies at the other end, u[1: ] is a copy with the first element
omitted and u[:] is a copy of u. Here, we assume that the slice occurs on the right-hand
side of an assignation. The more general form of slicing is su = u[start:end:step]. Then su contains the elements u[start], u[start+step], u[start+2*step], ..., for as long as the index is less than end. Figure 3.2 illustrates the possibilities for a list u of length 8.
Figure 3.2 Indices and slicing for a list u of length 8. The middle line shows the contents of u
and the two sets of indices by which the elements can be addressed. The top line shows the
contents of a slice of length 4 with conventional ordering. The bottom line shows another slice
with reversed ordering.
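For instance, taking an arbitrary list of length 8, slices of the kind shown in the figure can be built as follows:

u = [0, 1, 4, 9, 16, 25, 36, 49]
u[2:6]         # [4, 9, 16, 25], a slice of length 4 in conventional order
u[6:2:-1]      # [36, 25, 16, 9], a slice with reversed ordering
u[::2]         # [0, 4, 16, 36], every second element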
1 a=4
2 b=a
3 b='foo'
4 a
5 b
6 u=[0,1,4,9,16]
7 v=u
8 v[2]='foo'
9 v
10 u
The first five lines should be comprehensible: a is assigned to the object 4; so is b. Then
b is assigned to the object 'foo', and this does not change a. In line 6, u is assigned to
a list object and so is v in line 7. Because lists are mutable, we may change the element with index 2 of the list object in line 8. Line 9 shows the effect. But u is pointing to the same
object (see Figure 3.1) and it too shows the change in line 10. While the logic is clear,
this may not be what was intended, for u was never changed explicitly.
It is important to remember the assertion made above: a slice of a list is always a
new object, even if the dimensions of the slice and the original list agree. Therefore,
compare lines 6–10 of the code snippet above with
1 u=[0,1,4,9,16]
2 v=u[ : ]
3 v[2]='foo'
4 v
5 u
Now line 2 makes a slice object, which is a copy4 of the object defined in line 1.
Changes to the v-list do not alter the u-list and vice versa.
Lists are very versatile objects, and there exist many Python functions which can
generate them. We shall discuss list generation at many points in the rest of this book.
3.5.5 Tuples
The next container to be discussed is the tuple. Syntactically it differs from a list only by
using () instead of [] as delimiters, and indexing and slicing work as for lists. However,
there is a fundamental difference. We cannot change the values of its elements: a tuple is
immutable. At first sight, the tuple would appear to be entirely redundant. Why not use a
list instead? The rigidity of a tuple however has an advantage. We can use a tuple where
a scalar quantity is expected, and in many cases we can drop the brackets () when there
is no ambiguity, and indeed this is the commonest way of utilizing tuples. Consider the
snippet below, where we have written a tuple assignation in two different ways.
(a,b,c,d)=(4,5.0,1.5+2j,'a')
a,b,c,d = 4,5.0,1.5+2j,'a'
The second line shows how we can make multiple scalar assignments with a single
assignation operator. This becomes extremely useful in the common case where we need
to swap two objects, or equivalently two identifiers, say a and L1. The conventional way
to do this is
temp=a
a=L1
L1=temp
4 For the sake of completeness, we should note that this is a shallow copy. If u contains an element which is
mutable, e.g., another list w, the corresponding element of v still accesses the original w. To guard against
this, we need a deep copy to obtain a distinct but exact copy of both u and its current contents. For further
details inspect the copy module. In other words, try import copy followed by the line copy?
This would work in any language, assuming temp, a and L1 all refer to the same type.
However,
a,L1 = L1,a
does the same job in Python, is clearer, more concise and works for arbitrary types.
Another use, perhaps the most important one for tuples, is the ability to pass a variable
number of arguments to a function, as discussed in Section 3.8.4. Finally, we note a
feature of the notation which often confuses the beginner. We sometimes need a tuple
with only one element, say foo. The construction (foo) strips the parentheses and
leaves just the element. The correct tuple construction is (foo,).
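A short illustration with an arbitrary value:

t1 = (42)       # not a tuple: the parentheses are stripped and t1 is the int 42
t2 = (42,)      # a tuple with a single element
type(t1), type(t2), len(t2)     # (<class 'int'>, <class 'tuple'>, 1)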
3.5.6 Strings
Although we have already seen strings in passing, we note that Python regards them as
immutable container objects for alphanumeric characters. There is no comma separator
between items. The delimiters can be either single quotes or double quotes, but not a
mixture. The unused delimiter can occur within the string, e.g.
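For instance, with arbitrary text:

s1 = "It's a string"           # a single quote inside double-quote delimiters
s2 = 'She said "hello"'        # and the other way round

Strings can also be built from, and converted back to, other Python objects: the function str() returns a string representation of its argument, and eval() evaluates a string as a Python expression, as the following snippet shows.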
L = [1,2,3,5,8,13]
ls = str(L)
ls
eval(ls) == L
Strings will turn out to be very useful for the input of data, and, most importantly,
producing formatted output from the print function (see Sections 3.8.6 and 3.8.7).
3.5.7 Dictionaries
As we have seen, a list object is an ordered collection of objects. A dictionary object
is an unordered collection. Instead of accessing the elements by virtue of their position,
we have to assign a keyword, an immutable object, usually a string, which identifies
the element. Thus a dictionary is a collection of pairs of objects, where the first item in
the pair is a key to the second. A key–object pair is written as key:object. We fetch
items via keys rather than position. The dictionary delimiters are the braces { }. Here is
a simple example that illustrates the basics.
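The following sketch uses arbitrary keys and values:

d = {'Fred': 42, 'pi': 3.14, 7: 'seven'}   # three key:object pairs
d['pi']                 # 3.14, fetched by key rather than by position
d['e'] = 2.72           # add a new key:object pair
len(d)                  # 4
'Fred' in d             # True: membership tests look at the keys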