Introduction to Python for
Econometrics, Statistics and Data Analysis
3rd Edition, 1st Revision
Kevin Sheppard
University of Oxford
• Verified that all code and examples work correctly against 2019 versions of modules. The notable
packages and their versions are:
• Python 2.7 support has been officially dropped, although most examples continue to work with 2.7.
Do not use Python 2.7 in 2019 for numerical code.
• Fixed direct download of FRED data due to API changes, thanks to Jesper Termansen.
• Thanks to Bill Tubbs for a detailed read and multiple typo reports.
• Tested all code on Python 3.6. Code has been tested against the current set of modules installed by
conda as of February 2018. The notable packages and their versions are:
– NumPy: 1.13
– Pandas: 0.22
Notes to the 3rd Edition
This edition includes the following changes from the second edition (August 2014):
• Python 3.5 is the default version of Python instead of 2.7. Python 3.5 (or newer) is well supported by
the Python packages required to analyze data and perform statistical analysis, and brings some new
useful features, such as a new operator for matrix multiplication (@).
• Removed distinction between integers and longs in built-in data types chapter. This distinction is
only relevant for Python 2.7.
• dot has been removed from most examples and replaced with @ to produce more readable code.
• Split Cython and Numba into separate chapters to highlight the improved capabilities of Numba.
• Verified all code working on current versions of core libraries using Python 3.5.
• pandas
• New chapter introducing statsmodels, a package that facilitates statistical analysis of data. statsmod-
els includes regression analysis, Generalized Linear Models (GLM) and time-series analysis using
ARIMA models.
Changes since the Second Edition
• Added diagnostic tools and a simple method to use external code in the Cython section.
• Added examples of joblib and IPython’s cluster to the chapter on running code in parallel.
• New chapter introducing object-oriented programming as a method to provide structure and orga-
nization to related code.
• Added seaborn to the recommended package list, and have included it by default in the graphics
chapter.
• Based on experience teaching Python to economics students, the recommended installation has
been simplified by removing the suggestion to use virtual environments. The discussion of virtual
environments has been moved to the appendix.
• Changed the Anaconda install to use both create and install, which shows how to install additional
packages.
This edition includes the following changes from the first edition (March 2012):
• The preferred installation method is now Continuum Analytics’ Anaconda. Anaconda is a complete
scientific stack and is available for all major platforms.
• New chapter on pandas. pandas provides a simple but powerful tool to manage data and perform
preliminary analysis. It also greatly simplifies importing and exporting data.
• Numba provides just-in-time compilation for numeric Python code which often produces large per-
formance gains when pure NumPy solutions are not available (e.g. looping code).
• Numerous typos
1 Introduction
1.1 Background
1.2 Conventions
1.3 Important Components of the Python Scientific Stack
1.4 Setup
1.5 Using Python
1.6 Exercises
1.A Additional Installation Issues
5 Basic Math
5.1 Operators
5.2 Broadcasting
5.3 Addition (+) and Subtraction (-)
5.4 Multiplication (*)
5.5 Matrix Multiplication (@)
5.6 Array and Matrix Division (/)
5.7 Exponentiation (**)
5.8 Parentheses
5.9 Transpose
5.10 Operator Precedence
5.11 Exercises
7 Special Arrays
7.1 Exercises
15 Graphics
15.1 seaborn
15.2 2D Plotting
15.3 Advanced 2D Plotting
15.4 3D Plotting
15.5 General Plotting Functions
15.6 Exporting Plots
15.7 Exercises
16 pandas
16.1 Data Structures
16.2 Statistical Functions
16.3 Time-series Data
16.4 Importing and Exporting Data
16.5 Graphics
16.6 Examples
30 Examples
30.1 Estimating the Parameters of a GARCH Model
30.2 Estimating the Risk Premia using Fama-MacBeth Regressions
30.3 Estimating the Risk Premia using GMM
30.4 Outputting LaTeX
Introduction
1.1 Background
These notes are designed for someone new to statistical computing wishing to develop a set of skills nec-
essary to perform original research using Python. They should also be useful for students, researchers or
practitioners who require a versatile platform for econometrics, statistics or general numerical analysis
(e.g. numeric solutions to economic models or model simulation).
Python is a popular general-purpose programming language that is well suited to a wide range of
problems.1 Recent developments have extended Python’s range of applicability to econometrics, statistics, and
general numerical analysis. Python – with the right set of add-ons – is comparable to domain-specific
languages such as R, MATLAB or Julia. If you are wondering whether you should bother with Python (or
another language), an incomplete list of considerations includes:
You might want to consider R if:
• You want to apply statistical methods. The statistics library of R is second to none, and R is clearly
at the forefront of new statistical algorithm development – meaning you are most likely to find that
new(ish) procedure in R.
• Free is important.
You might want to consider MATLAB if:
• Documentation and organization of modules are more important than the breadth of algorithms
available.
• Performance is an important concern. MATLAB has optimizations, such as Just-in-Time (JIT) com-
pilation of loops, which is not automatically available in most other packages.
You might want to consider Julia if:
• You don’t mind learning enough Python to interface with Python packages. The Julia ecosystem is
in its infancy and a bridge to Python is used to provide important missing features.
• You like living on the bleeding edge and aren’t worried about code breaking across new versions of
Julia.
Having read the reasons to choose another package, you may wonder why you should consider Python.
• You need a language which can act as an end-to-end solution that allows access to web-based services,
database servers, data management and processing and statistical computation. Python can even be
used to write server-side apps such as a dynamic website (see e.g. http://stackoverflow.com), apps for
desktop-class operating systems with graphical user interfaces, or apps for tablets and phones (iOS
and Android).
• Data handling and manipulation – especially cleaning and reformatting – is an important concern.
Python is substantially more capable at data set construction than either R or MATLAB.
• Free is an important consideration – Python can be freely deployed, even to 100s of servers on a
cloud-based cluster (e.g. Amazon Web Services, Google Compute or Azure).
1.2 Conventions
2 Python performance can be made arbitrarily close to C using a variety of methods, including Numba (pure Python), Cython
(C/Python creole language) or directly calling C code. Moreover, recent advances have substantially closed the gap with respect
to other Just-in-Time compiled languages such as MATLAB.
2. When a code block contains >>>, this indicates that the command is running in an interactive IPython
session. Output will often appear after the console command, and will not be preceded by a
command indicator.
>>> x = 1.0
>>> x + 2
3.0
If the code block does not contain the console session indicator, the code contained in the block is
intended to be executed in a standalone Python file.
import numpy as np
x = np.array([1,2,3,4])
y = np.sum(x)
print(x)
print(y)
1.3 Important Components of the Python Scientific Stack
1.3.1 Python
Python 3.5 (or later) is required. This provides the core Python interpreter. Most of the examples should
work with the latest version of Python 2.7 as well.
1.3.2 NumPy
NumPy provides a set of array and matrix data types which are essential for statistics, econometrics and
data analysis.
1.3.3 SciPy
SciPy contains a large number of routines needed for analysis of data. The most important include a wide
range of random number generators, linear algebra routines, and optimizers. SciPy depends on NumPy.
1.3.4 Jupyter and IPython
IPython provides an interactive Python environment which enhances productivity when developing code
or performing interactive data analysis. Jupyter provides a generic set of infrastructure that enables IPython
to be run in a variety of settings including an improved console (QtConsole) or in an interactive web-
browser based notebook.
1.3.5 matplotlib and seaborn
matplotlib provides a plotting environment for 2D plots, with limited support for 3D plotting. seaborn is
a Python package that improves the default appearance of matplotlib plots without any additional code.
1.3.6 pandas
1.3.7 statsmodels
statsmodels is pandas-aware and provides models used in the statistical analysis of data including linear
regression, Generalized Linear Models (GLMs), and time-series models (e.g., ARIMA).
1.3.8 Performance Modules
A number of modules are available to help with performance. These include Cython and Numba. Cython
is a Python module which facilitates using a simple Python-derived creole to write functions that can be
compiled to native (C code) Python extensions. Numba uses a method of just-in-time compilation to
translate a subset of Python to native code using the Low-Level Virtual Machine (LLVM).
1.4 Setup
The recommended method to install the Python scientific stack is to use Continuum Analytics’ Anaconda.
Appendix 1.A.3 describes a more complex installation procedure with instructions for directly installing
Python and the required modules when it is not possible to install Anaconda.
Windows
Installation on Windows requires downloading the installer and running it. Anaconda comes in both Python
2.7 and 3.x flavors, and the latest Python 3.x is the preferred choice. These instructions use ANACONDA
to indicate the Anaconda installation directory (e.g. the default is C:\Anaconda). Once the setup has
completed, open a command prompt (cmd.exe) and run
cd ANACONDA\Scripts
conda update conda
conda update anaconda
conda install html5lib seaborn
which will first ensure that Anaconda is up-to-date. conda install can be used later to install other pack-
ages that may be of interest. Note that if Anaconda is installed into a directory other than the default, the
full path should not contain Unicode characters or spaces.
Notes
• Install for all users, which requires admin privileges. If these are not available, then choose the “Just
for me” option, but be aware of installing on a path that contains non-ASCII characters which can
cause issues.
• Add Anaconda to the System PATH - This is important to ensure that Anaconda commands can be
run from the command prompt.
• Register Anaconda as the system Python unless you have a specific reason not to (unlikely).
If Anaconda is not added to the system path, it is necessary to add the ANACONDA and ANACONDA\Scripts
directories to the PATH using
set PATH=ANACONDA;ANACONDA\Scripts;%PATH%
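Whether the PATH change took effect can also be checked from Python itself. The following is a minimal sketch using only the standard library; the helper name on_path is hypothetical and is not part of Anaconda or of these notes.

```python
import os
import shutil


def on_path(directory, path_value=None):
    """Return True if directory is one of the entries of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Normalize case and separators so that equivalent spellings compare equal
    target = os.path.normcase(os.path.normpath(directory))
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(os.path.normcase(os.path.normpath(entry)) == target for entry in entries)


# shutil.which returns the full path of an executable found on PATH, or None
print("conda found at:", shutil.which("conda"))
```

If shutil.which reports None, the Anaconda directories are most likely missing from PATH in the current session.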
Linux and OS X
where x.y.z will depend on the version being installed and ISA will be either x86 or more likely x86_64.
Anaconda comes in both Python 2.7 and 3.x flavors, and the latest Python 3.x is the preferred choice. The
OS X installer is available either as a GUI installer (pkg format) or as a bash installer which is installed
in an identical manner to the Linux installation. It is strongly recommended that ANACONDA/bin is
prepended to the path. This can be performed on a session-by-session basis by entering
export PATH=ANACONDA/bin:$PATH
On Linux this change can be made permanent by entering this line in .bashrc which is a hidden file located
in ~/. On OS X, this line can be added to .bash_profile which is located in the home directory (~/).3
After installation completes, execute
conda update conda
conda update anaconda
conda install html5lib seaborn
3 Use the appropriate settings file if using a different shell (e.g. .zshrc for zsh).
which will first ensure that Anaconda is up-to-date and then to install the Intel Math Kernel library-linked
modules, which provide substantial performance improvements – this package requires a license which
is free to academic users and low cost to others. If acquiring a license is not possible, omit this line.
conda install can be used later to install other packages that may be of interest.
Notes
All instructions for OS X and Linux assume that ANACONDA/bin has been added to the path. If this is not
the case, it is necessary to run
cd ANACONDA
cd bin
1.5 Using Python
Python can be programmed using an interactive session using IPython or by directly executing Python
scripts – text files that end with the extension .py – using the Python interpreter.
Most of this introduction focuses on interactive programming, which has some distinct advantages when
learning a language. The standard Python interactive console is very basic and does not support useful
features such as tab completion. IPython, and especially the QtConsole version of IPython, transforms
the console into a highly productive environment which supports a number of useful features:
• Tab completion - After entering 1 or more characters, pressing the tab button will bring up a list of
functions, packages, and variables which match the typed text. If the list of matches is large, pressing
tab again allows the arrow keys to be used to browse and select a completion.
• “Magic” functions which make tasks such as navigating the local file system (using %cd ~/directory/
or just cd ~/directory/ assuming that %automagic is on) or running other Python programs (using run
program.py) simple. Entering %magic inside an IPython session will produce a detailed description
of the available functions. Alternatively, %lsmagic produces a succinct list of available magic
commands. The most useful magic functions are
– cd - change directory
• Integrated help - When using the QtConsole, calling a function provides a view of the top of the help
function. For example, entering mean( will produce a view of the top 20 lines of its help text.
• Inline figures - Both the QtConsole and the notebook can also display figures inline which produces a
tidy, self-contained environment. This can be enabled by entering %matplotlib inline in an IPython
session.
• The special variable _ contains the last result in the console, and so the most recent result can be
saved to a new variable using the syntax x = _.
OS X and Linux
This single line launcher can be saved as filename.command where filename is a meaningful name (e.g.
IPython-Terminal) to create a launcher on OS X by entering the command
chmod 755 /FULL/PATH/TO/filename.command
and then using the command as the Command in the dialog that appears.
Windows (Anaconda)
To run IPython open cmd and enter IPython in the start menu. Starting IPython using the QtConsole is
similar and is simply called QtConsole in the start menu. Launching IPython from the start menu should
create a window similar to that in figure 1.1.
Next, run
in the terminal or command prompt to generate a file named jupyter_qtconsole_config.py. This file contains
settings that are useful for customizing the QtConsole window. A few recommended modifications are
These commands assume that the Bitstream Vera fonts have been locally installed, which are available
from http://ftp.gnome.org/pub/GNOME/sources/ttf-bitstream-vera/1.10/. Opening QtConsole should
create a window similar to that in figure 1.2 (although the appearance might differ) if you did not use the
recommended configuration.
Help is available in IPython sessions using help(function). Some functions (and modules) have very long
help files. When using IPython, these can be paged using the command ?function or function? so that the
text can be scrolled using page up and down and q to quit. ??function or function?? can be used to type
the entire function including both the docstring and the code.
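Outside of IPython, roughly the same information can be retrieved with the standard inspect module. This is a sketch rather than a description of how IPython implements ? and ??; statistics.mean is used only because it is a pure-Python function that ships with Python.

```python
import inspect
import statistics

# Similar to function? : the docstring only
doc = inspect.getdoc(statistics.mean)
print(doc.splitlines()[0])

# Similar to function?? : the source code, which includes the docstring
source = inspect.getsource(statistics.mean)
print(len(source.splitlines()), "lines of source")
```

inspect.getsource only works for functions written in Python; functions implemented in compiled code have no Python source to show.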
While interactive programming is useful for learning a language or quickly developing some simple code,
complex projects require the use of complete programs. Programs can be run either using the IPython
magic word %run program.py or by directly launching the Python program using the standard interpreter
using python program.py. The advantage of using the IPython environment is that the variables used in
the program can be inspected after the program run has completed. Directly calling Python will run the
program and then terminate, and so it is necessary to output any important results to a file so that they
can be viewed later.4
4 Programs can also be run in the standard Python interpreter using the command:
exec(compile(open('filename.py').read(),'filename.py','exec'))
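When IPython is not available, one way to keep a program's variables for inspection is the standard library's runpy module, which runs a file and returns its global namespace. The sketch below is loosely analogous to %run, not identical to it; the file name program.py and its contents are made up for illustration.

```python
import os
import runpy
import tempfile

# A small stand-in for a real program
script = "x = [1.0, 2.0, 3.0]\ntotal = sum(x)\n"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "program.py")
    with open(path, "w") as handle:
        handle.write(script)
    # run_path executes the file and returns its globals as a dictionary
    namespace = runpy.run_path(path)

# Variables defined by the program remain available after it has finished
print(namespace["x"])
print(namespace["total"])
```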
Once you have saved this file, open the console, navigate to the directory where you saved the file and enter
python firstprogram.py. Finally, run the program in IPython by first launching IPython, then using
%cd to change to the location of the program, and finally executing the program using %run firstprogram.py.
When writing Python code, only a small set of core functions and variable types are available in the in-
terpreter. The standard method to access additional variable types or functions is to use imports, which
explicitly allow access to specific packages or functions. While it is best practice to only import required
functions or packages, there are many functions in multiple packages that are commonly encountered
in these notes. Pylab is a collection of common NumPy, SciPy and Matplotlib functions that can be eas-
ily imported using a single command in an IPython session, %pylab. This is nearly equivalent to calling
from pylab import *, since it also sets the backend that is used to draw plots. The backend can be manu-
ally set using %pylab backend where backend is one of the available backends (e.g., qt5 or inline). Similarly
%matplotlib backend can be used to set just the backend without importing all of the modules and
functions that come with %pylab.
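The import styles discussed above can be illustrated with the standard math module, which is always available. This is only a sketch of the syntax; the same patterns apply to the scientific packages (e.g. import numpy as np).

```python
import math                  # import a whole module; names stay behind the prefix
from math import sqrt, pi    # import only the names that are needed
import math as m             # import a module under a shorter alias

print(math.sqrt(16.0))       # access through the module name
print(sqrt(16.0))            # access the imported name directly
print(m.floor(pi))           # access through the alias
```

Explicit imports make it clear where each function comes from, which is the main reason they are preferred over importing everything with *.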
Most chapters assume that %pylab has been called so that functions provided by NumPy can be called
without explicitly importing them.
Figure 1.3: A successful test that matplotlib, IPython, NumPy and SciPy were all correctly installed.
To make sure that you have successfully installed the required components, run IPython using the shortcut
or by running ipython or jupyter qtconsole in a terminal window. Enter the following commands,
one at a time (the meaning of the commands will be covered later in these notes).
>>> %pylab qt5
>>> x = randn(100,100)
>>> y = mean(x,0)
>>> import seaborn
>>> plot(y)
>>> import scipy as sp
If everything was successfully installed, you should see something similar to figure 1.3.
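A complementary, non-graphical check is to ask Python which packages it can find. The sketch below assumes Python 3.8 or later (for importlib.metadata); the function name installed_versions and the package list are illustrative choices, not part of any package.

```python
import importlib.util
from importlib import metadata

PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "seaborn", "statsmodels"]


def installed_versions(names):
    """Map each name to its version string, or None if it cannot be imported."""
    report = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            report[name] = None          # not importable from this interpreter
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "unknown"     # importable, but no version metadata
    return report


for name, version in installed_versions(PACKAGES).items():
    print(name, version if version is not None else "not installed")
```

A package reported as not installed can usually be added with conda install.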
A Jupyter notebook is a simple and useful method to share code with others. Notebooks allow for a fluid
synthesis of formatted text, typeset mathematics (using LaTeX via MathJax) and Python. The primary method
for using notebooks is through a web interface, which allows creation, deletion, export and interactive
editing of notebooks.
Figure 1.4: The default IPython Notebook screen showing two notebooks.
To launch the jupyter notebook server, open a command prompt or terminal and enter
jupyter notebook
This command will start the server and open the default browser which should be a modern version of
Chrome (preferable), Chromium or Firefox. If the default browser is Safari, Internet Explorer or Edge, the
URL can be copied and pasted into Chrome. The first screen that appears will look similar to figure 1.4,
except that the list of notebooks will be empty. Clicking on New Notebook will create a new notebook,
which, after a bit of typing, can be transformed to resemble figure 1.5. Notebooks can be imported by
dragging and dropping and exported from the menu inside a notebook.
As you progress in Python and begin writing more sophisticated programs, you will find that using an In-
tegrated Development Environment (IDE) will increase your productivity. Most contain productivity en-
hancements such as built-in consoles, code completion (or IntelliSense, for completing function names)
and integrated debugging. Discussion of IDEs is beyond the scope of these notes, although Spyder is a
reasonable choice (free, cross-platform). Aptana Studio is another free alternative. My preferred IDE is
PyCharm, which has a community edition that is free for use (the professional edition is low cost for aca-
demics).
Figure 1.5: An IPython notebook showing formatted markdown, LaTeX math and cells containing code.
Spyder
Spyder is an IDE specialized for use in scientific applications of Python rather than for general purpose
application development. This is both an advantage and a disadvantage when compared to a full featured
IDE such as PyCharm, Python Tools for Visual Studio (PTVS), PyDev or Aptana Studio. The main advantage
is that many powerful but complex features are not integrated into Spyder, and so the learning curve is
much shallower. The disadvantage is similar - in more complex projects, or if developing something that is
not straight scientific Python, Spyder is less capable. However, netting these two, Spyder is almost certainly
the IDE to use when starting Python, and it is always relatively simple to migrate to a sophisticated IDE if
needed.
Spyder is started by entering spyder in the terminal or command prompt. A window similar to that in
figure 1.6 should appear. The main components are the editor (1), the object inspector (2), which dynam-
ically will show help for functions that are used in the editor, and the console (3). By default, Spyder opens
a standard Python console, although it also supports using the more powerful IPython console. The object
inspector window, by default, is grouped with a variable explorer, which shows the variables that are in
memory and the file explorer, which can be used to navigate the file system. The console is grouped with
an IPython console window (needs to be activated first using the Interpreters menu along the top edge),
and the history log which contains a list of commands executed. The buttons along the top edge facilitate
saving code, running code and debugging.
1.6 Exercises
1. Install Python.
5. Explore tab completion in IPython by entering a<TAB> to see the list of functions which start with
a and are loaded by pylab. Next try i<TAB>, which will produce a list longer than the screen – press
ESC to exit the pager.
All
Whitespace sensitivity
Python is whitespace sensitive and so indentation, either spaces or tabs, affects how Python interprets
files. The configuration files, e.g. ipython_config.py, are plain Python files and so are sensitive to whitespace.
Introducing white space before the start of a configuration option will produce an error, so ensure there
is no whitespace before active lines of a configuration.
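The effect of stray leading whitespace can be reproduced without touching a real configuration file. The sketch below compiles two small strings; the option names c and d are placeholders rather than actual IPython settings.

```python
def compiles(source):
    """Return True if source is accepted by Python, False on an indentation error."""
    try:
        compile(source, "<config>", "exec")
    except IndentationError:
        return False
    return True


good = "c = 1\nd = 2\n"
bad = "c = 1\n    d = 2\n"   # unexpected leading whitespace on the second line

print(compiles(good))
print(compiles(bad))
```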
Windows
Spaces in path
Unicode in path
Python does not always work well when a path contains Unicode characters, which might occur in a user
name. While this isn’t an issue for installing Python or Anaconda, it is an issue for IPython which looks in
c:\user\username\.ipython for configuration files. The solution is to define the HOME variable before launch-
ing IPython to a path that has only ASCII characters.
mkdir c:\anaconda\ipython_config
set HOME=c:\anaconda\ipython_config
c:\Anaconda\Scripts\activate econometrics
ipython profile create econometrics
ipython --profile=econometrics
The set HOME=c:\anaconda\ipython_config can point to any path with directories containing only ASCII
characters, and can also be added to any batch file to achieve the same effect.
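Whether a candidate path contains only ASCII characters can be tested before it is used for HOME. This is a small sketch; the function name is_ascii_path and the example user name are hypothetical.

```python
def is_ascii_path(path):
    """Return True if every character in path is plain ASCII."""
    try:
        path.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


print(is_ascii_path(r"c:\anaconda\ipython_config"))
# A made-up user name containing a non-ASCII character (u with umlaut)
print(is_ascii_path("c:\\users\\J\u00fcrgen\\.ipython"))
```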
OS X
If the user account used is running as root, then Anaconda may install to /anaconda and not ~/anaconda by
default. Best practice is not to run as root, although in principle this is not a problem, and /anaconda can
be used in place of ~/anaconda in any of the instructions.
1.A.2 register_python.py
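import sys
from winreg import HKEY_LOCAL_MACHINE, CreateKey, OpenKey

# tweak as necessary
# "major.minor", e.g. 3.6; sys.version[:3] would truncate versions such as 3.10
version = "%d.%d" % sys.version_info[:2]
installpath = sys.prefix
# Assumption: regpath is not defined in this excerpt; the conventional
# per-version Python registry location is used here
regpath = "SOFTWARE\\Python\\PythonCore\\%s\\" % version


def RegisterPy():
    try:
        reg = OpenKey(HKEY_LOCAL_MACHINE, regpath)
    except EnvironmentError:
        try:
            reg = CreateKey(HKEY_LOCAL_MACHINE, regpath)
        except Exception as e:
            print("*** Unable to register: %s" % e)
            return


if __name__ == "__main__":
    RegisterPy()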
The simplest method to install the Python scientific stack is to directly use Continuum Analytics’
Anaconda. These instructions describe alternative installation options using virtual environments, which
allow alternative configurations to simultaneously co-exist on a single system. The primary advantage of a
virtual environment is that it allows package versions to be frozen so that upgrading a module
or all of Anaconda does not upgrade the packages in a particular virtual environment.
Windows
Installation on Windows requires downloading the installer and running it. These instructions use
ANACONDA to indicate the Anaconda installation directory (e.g. the default is C:\Anaconda). Once the setup
has completed, open a command prompt (cmd.exe) and run
cd ANACONDA
conda update conda
conda update anaconda
conda create -n econometrics qtconsole notebook matplotlib numpy pandas scipy spyder statsmodels
conda install -n econometrics cython lxml nose numba numexpr pytables sphinx xlrd xlwt html5lib seaborn
which will first ensure that Anaconda is up-to-date and then create a virtual environment named econo-
metrics. Using a virtual environment is a best practice and is important since component updates can
lead to errors in otherwise working programs due to backward incompatible changes in a module. The
long list of modules in the conda create command includes the core modules. conda install contains the
remaining packages and is shown as an example of how to add packages to an existing virtual environment
after it has been created. It is also possible to install all available Anaconda packages using the command
conda create -n econometrics anaconda.
The econometrics environment must be activated before use. This is accomplished by running
ANACONDA\Scripts\activate.bat econometrics
from the command prompt, which prepends [econometrics] to the prompt as an indication that the virtual
environment is active. Activate the econometrics environment and then run
cd c:\
ipython
which will open an IPython session using the newly created virtual environment.
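A quick way to confirm that the session is using the intended environment is to ask the interpreter where it lives. The following is a minimal sketch; the helper name interpreter_location is hypothetical, and the paths it reports depend entirely on where Anaconda was installed.

```python
import sys


def interpreter_location():
    """Return the paths that identify the running Python installation."""
    return {
        'executable': sys.executable,  # full path of the python binary in use
        'prefix': sys.prefix,          # root folder of the active installation
    }


info = interpreter_location()
print(info['executable'])
print(info['prefix'])
```

When the econometrics environment is active, both paths would be expected to point inside that environment's folder rather than the base Anaconda directory.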
Virtual environments can also be created with specific versions of packages by pinning. For example,
to create a virtual environment named python2 using Python 2.7 and NumPy 1.10, run
conda create -n python2 python=2.7 numpy=1.10 scipy pandas
which will install the requested versions of Python and NumPy as well as the latest versions of SciPy
and pandas that are compatible with the pinned versions.
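After activating an environment built with pinned versions, the versions actually in use can be checked from Python. The sketch below is illustrative only; module_version is a hypothetical helper, and it assumes the module exposes a __version__ attribute (NumPy, SciPy and pandas do).

```python
import sys


def module_version(name):
    """Return a module's __version__ string, or None if it is not installed."""
    try:
        module = __import__(name)
    except ImportError:
        return None
    return getattr(module, '__version__', None)


# Major.minor version of the running interpreter, e.g. '2.7' in a python2 environment
python_version = '%d.%d' % sys.version_info[:2]
print('Python', python_version)

for name in ('numpy', 'scipy', 'pandas'):
    print(name, module_version(name))
```

A result of None for a package simply means it is not installed in the active environment.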
Linux and OS X
In the installer file name, x.y.z will depend on the version being installed and ISA will be either x86
or, more likely, x86_64. The OS X installer is available either as a GUI installer (pkg format) or as a
bash installer, which is installed in an identical manner to the Linux installation. After installation
completes, change to the folder where Anaconda was installed (written here as ANACONDA, default
~/anaconda) and execute
cd ANACONDA
cd bin
./conda update conda
./conda update anaconda
./conda create -n econometrics qtconsole notebook matplotlib numpy pandas scipy spyder statsmodels
./conda install -n econometrics cython lxml nose numba numexpr pytables sphinx xlrd xlwt html5lib seaborn
which will first ensure that Anaconda is up-to-date and then create a virtual environment named
econometrics with the required packages. conda create creates the environment and conda install installs
additional packages to the existing environment. conda install can be used later to install other
packages that may be of interest. To activate the newly created environment, run
source ANACONDA/bin/activate econometrics
Python comes in a number of flavors which may be suitable for econometrics, statistics and numerical
analysis. This chapter explains why 3.5, the latest release of Python 3, was chosen for these notes and
highlights some of the available alternatives.
Python 2.7 is the final version of the Python 2.x line – all future development work will focus on Python 3.
The reasons for using 3.x (especially 3.5+) are:
• Virtually all modules needed to perform data analysis and econometrics are tested using Python 3.5
(in addition to other versions).
• Python 3 has introduced some nice language changes that help for numerical computing, such as
using a default division that will produce a floating point number when dividing two integers. Old
versions of Python would produce 0 when evaluating 1/2.
• A new operator has been introduced that will simplify writing matrix-intensive code (@, as in x @ y).
While it was once the case that Python 2.7 was a better choice, there are now clear reasons to prefer 3.5.
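The @ operator is dispatched to a method named __matmul__, which NumPy arrays implement as matrix multiplication. The sketch below uses a small hypothetical Matrix class in place of a NumPy array so that it runs without any third-party packages; it is not how NumPy is implemented.

```python
class Matrix:
    """Minimal dense matrix stored as a list of rows (illustrative only)."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Entry (i, j) of the product is the inner product of
        # row i of self with column j of other.
        columns = list(zip(*other.rows))
        return Matrix([[sum(a * b for a, b in zip(row, col)) for col in columns]
                       for row in self.rows])


x = Matrix([[1, 2], [3, 4]])
y = Matrix([[0, 1], [1, 0]])
product = (x @ y).rows
print(product)
```

With NumPy arrays the same expression, x @ y, computes the matrix product without any helper class.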
Intel’s MKL and AMD’s GPUOpen libraries provide optimized linear algebra routines. The functions in
these libraries execute faster than those in basic linear algebra libraries and are, by default,
multithreaded so that many linear algebra operations will automatically make use of all of the
processors on your system. Most standard builds of NumPy do not include these, and so it is important to
use a Python distribution built with an appropriate linear algebra library (especially if computing
inverses or eigenvalues of large matrices). The three primary methods to access NumPy built with the
Intel MKL are:
• Use the pre-built NumPy binaries made available by Christoph Gohlke for Windows.
• Follow instructions for building NumPy on Linux with MKL (which is free on Linux).
There are no pre-built libraries using AMD’s GPUOpen, and so it is necessary to build NumPy from scratch
if using an AMD processor (or buy an Intel system, which is an easier solution).
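One way to see which linear algebra library a particular NumPy build uses is to print its build configuration. This is a minimal sketch; numpy_build_report is a hypothetical helper, and the layout of the printed report varies between NumPy versions.

```python
def numpy_build_report():
    """Print NumPy's build configuration; return its version, or None if absent."""
    try:
        import numpy as np
    except ImportError:
        return None
    # show_config lists the BLAS/LAPACK libraries NumPy was built against
    np.show_config()
    return np.__version__


numpy_version = numpy_build_report()
```

If MKL was used, its name would normally appear among the libraries listed in the report.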
Some other variants of the recommended version of Python are worth mentioning.
2.3.1 Enthought Canopy
Enthought Canopy is an alternative to Anaconda. It is available for Windows, Linux and OS X. Canopy
is regularly updated and is currently freely available in its basic version. The full version is also freely
available to academic users. Canopy is built using MKL, and so matrix algebra performance is very fast.
2.3.2 IronPython
IronPython is a variant which runs on the Common Language Runtime (CLR, aka Windows .NET). The
core modules – NumPy and SciPy – are available for IronPython, and so it is a viable alternative for
numerical computing, especially if you are already familiar with C# or if interoperation with .NET
components is important. Other libraries, for example matplotlib (plotting), are not available, and so
there are some important limitations.
2.3.3 Jython
Jython is a variant which runs on the Java Runtime Environment (JRE). NumPy is not available in Jython,
which severely limits Jython’s usefulness for numeric work. While the limitation is important, one
advantage of Python over other languages is that it is possible to run (mostly unaltered) Python code
on a JVM and to call other Java libraries.
2.3.4 PyPy
PyPy is a new implementation of Python which uses just-in-time compilation and LLVM to accelerate
code, especially loops (which are common in numerical computing). It may be anywhere between 2 and 500
times faster than standard Python. Unfortunately, at the time of writing, the core library, NumPy, is
only partially implemented, and so it is clearly not ready for use. Current plans are to have a version
ready in the near future, and if so, PyPy may become a viable choice for numerical computing.
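When code may be run under more than one of these variants, the standard library can report which implementation is executing it. A minimal sketch:

```python
import platform

# Name of the running implementation, e.g. 'CPython', 'IronPython', 'Jython' or 'PyPy'
implementation = platform.python_implementation()
# Language version as a string of the form 'major.minor.patchlevel'
language_version = platform.python_version()

print('%s %s' % (implementation, language_version))
```

A check of this kind can be used to skip code that relies on modules that are unavailable in a given variant.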
2.A Relevant Differences between Python 2.7 and 3
Most differences between Python 2.7 and 3 are not important for using Python in econometrics, statistics
and numerical analysis. If one wishes to use Python 2.7, it is important to understand these differences.
Note that these differences are important in stand-alone Python programs.
2.A.1 print
print is a function used to display text in the console when running programs. In Python 2.7, print is a
keyword which behaves differently from other functions. In Python 3, print behaves like most functions.
The standard use in Python 2.7 is
print 'String to Print'
while in Python 3 the standard use is
print('String to Print')
which resembles calling a standard function. Python 2.7 contains a version of the Python 3 print, which
can be used in any program by including
from __future__ import print_function
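Because print is an ordinary function in Python 3, it accepts keyword arguments that control the separator, the line ending and the destination. The sketch below writes to an in-memory buffer (an arbitrary choice for illustration) so that the result can be inspected.

```python
import io

buffer = io.StringIO()

# sep is placed between the arguments, end is appended after the last one,
# and file redirects the output away from the console
print('alpha', 'beta', sep=', ', end='\n', file=buffer)

printed = buffer.getvalue()
print(repr(printed))
```

Omitting file sends the text to the console, which is the usual case.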
2.A.2 division
Python 3 changes the way integers are divided. In Python 2.7, the ratio of two integers was always an
integer, and so a fractional result was rounded down to the nearest integer. For example, in Python 2.7,
9/5 is 1. Python 3 gracefully converts the result to a floating point number, and so in Python 3, 9/5 is
1.8. When working with numerical data, automatically converting ratios avoids some rare errors. Python
2.7 can use the Python 3 behavior by including
from __future__ import division
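The two kinds of division can be compared directly in Python 3, where / always returns a floating point number and // performs floor division (the behavior of / on two integers in Python 2.7). A short sketch:

```python
true_ratio = 9 / 5         # true division returns a float
floor_ratio = 9 // 5       # floor division returns an integer
negative_floor = -9 // 5   # floor division rounds towards negative infinity
truncated = int(-9 / 5)    # int() truncates towards zero instead

print(true_ratio, floor_ratio, negative_floor, truncated)
```

The last two values differ, which is worth keeping in mind when porting integer arithmetic that involves negative numbers.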
It is often useful to generate a sequence of numbers for use when iterating over data. In Python 2.7, best
practice is to use the built-in function xrange to do this. In Python 3, this function has been renamed
range.
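In Python 3, range produces its values on demand rather than building a list, so even a very long range uses little memory. A minimal sketch:

```python
# Iterate over the integers 1, 2, ..., 100 without storing them
total = 0
for i in range(1, 101):
    total += i

# A range can be converted to a list when the values are needed all at once
first_five = list(range(5))

# The length is known without generating the elements
long_range = range(10 ** 9)
n_elements = len(long_range)

print(total, first_five, n_elements)
```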
Unicode is an industry standard for consistently encoding text. The computer alphabet was originally
limited to 128 characters, which is insufficient to contain the vast array of characters in all written
languages. Unicode expands the possible space to allow for more than 1,000,000 characters. Python 3
treats all strings as unicode, unlike Python 2.7, where characters are a single byte and unicode strings
require the special syntax u'unicode string' or unicode('unicode string'). In practice this is unlikely
to impact most numeric code written in Python except possibly when reading or writing data. If working
in a language where characters outside of the standard but limited 128 character ASCII set are commonly
encountered, it may be useful to use
from __future__ import unicode_literals
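The distinction between characters and bytes can be seen in Python 3 by encoding a string that contains a non-ASCII character. The example word and the choice of UTF-8 are arbitrary and only for illustration.

```python
text = 'caf\u00e9'               # four characters; the last one is outside ASCII

n_characters = len(text)         # a str counts characters
encoded = text.encode('utf-8')   # bytes, as written to a file or sent over a network
n_bytes = len(encoded)           # the accented letter occupies two bytes in UTF-8
decoded = encoded.decode('utf-8')

print(n_characters, n_bytes, decoded == text)
```

When reading or writing data files, the encoding therefore determines how many bytes each character occupies.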