SCIENTIFIC INFERENCE

Providing the knowledge and practical experience to begin analysing scientific data, this book is ideal for physical sciences students wishing to improve their data handling skills.
The book focuses on explaining and developing the practice and understanding
of basic statistical analysis, concentrating on a few core ideas, such as the visual
display of information, modelling using the likelihood function, and simulating
random data.
Key concepts are developed through a combination of graphical explanations,
worked examples, example computer code and case studies using real data. Students will develop an understanding of the ideas behind statistical methods and
gain experience in applying them in practice. Further resources are available at
www.cambridge.org/9781107607590, including data files for the case studies so
students can practise analysing data, and exercises to test students’ understanding.

Simon Vaughan is a Reader in the Department of Physics and Astronomy, University of Leicester, where he has developed and runs a highly regarded course for final year physics students on the subject of statistics and data analysis.
SCIENTIFIC INFERENCE
Learning from data

SIMON VAUGHAN
University of Leicester
University Printing House, Cambridge CB2 8BS, United Kingdom

Cambridge University Press is a part of the University of Cambridge.


It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781107607590
© S. Vaughan 2013
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2013
Printed in the United Kingdom by TJ International Ltd, Padstow, Cornwall
A catalogue record for this publication is available from the British Library
Library of Congress Cataloguing in Publication data
Vaughan, Simon, 1976– author.
Scientific inference : learning from data / Simon Vaughan.
pages cm
Includes bibliographical references and index.
ISBN 978-1-107-02482-3 (hardback) – ISBN 978-1-107-60759-0 (paperback)
1. Mathematical statistics – Textbooks. I. Title.
QA276.V34 2013
519.5 – dc23 2013021427
ISBN 978-1-107-02482-3 Hardback
ISBN 978-1-107-60759-0 Paperback
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication,
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
For my family
Contents

For the student page x
For the instructor xii
1 Science and statistical data analysis 1
1.1 Scientific method 1
1.2 Inference 3
1.3 Scientific inference 6
1.4 Data analysis in a nutshell 7
1.5 Random samples 8
1.6 Know your data 10
1.7 Language 11
1.8 Statistical computing using R 12
1.9 How to use this book 12
2 Statistical summaries of data 14
2.1 Plotting data 14
2.2 Plotting univariate data 16
2.3 Centre of data: sample mean, median and mode 18
2.4 Dispersion in data: variance and standard deviation 21
2.5 Min, max, quantiles and the five-number summary 24
2.6 Error bars, standard errors and precision 25
2.7 Plots of bivariate data 28
2.8 The sample correlation coefficient 36
2.9 Plotting multivariate data 38
2.10 Good practice in statistical graphics 43
2.11 Chapter summary 44
3 Simple statistical inferences 46
3.1 Inference about the mean of a sample 46


3.2 Difference in means from two samples 49


3.3 Straight line fits 51
3.4 Linear regression in practice 56
3.5 Residuals: what lies beneath 58
3.6 Case study: regression of Reynolds’ data 59
3.7 Chapter summary 63
4 Probability theory 64
4.1 Experiments, outcomes and events 64
4.2 Probability 69
4.3 The rules of the probability calculus 72
4.4 Random variables 82
4.5 The visual perception of randomness 89
4.6 The meaning of ‘probability’ and ‘random’ 89
4.7 Chapter summary 92
5 Random variables 94
5.1 Properties of random variables 94
5.2 Discrete random variables 100
5.3 Continuous random variables 110
5.4 Change of variables 116
5.5 Approximate variance relations (or the propagation
of errors) 120
5.6 Chapter summary 122
6 Estimation and maximum likelihood 124
6.1 Models 124
6.2 Case study: Rutherford & Geiger data 125
6.3 Maximum likelihood estimation 129
6.4 Weighted least squares 133
6.5 Case study: pion scattering data 139
6.6 Chapter summary 140
7 Significance tests and confidence intervals 142
7.1 A thought experiment 142
7.2 Significance testing and test statistics 143
7.3 Pearson’s χ² test 146
7.4 Fixed-level tests and decisions 153
7.5 Interpreting test results 156
7.6 Confidence intervals on MLEs 159
7.7 Chapter summary 166

8 Monte Carlo methods 169


8.1 Generating pseudo-random numbers 169
8.2 Estimating sampling distributions by Monte Carlo 175
8.3 Computing confidence by bootstrap 181
8.4 The power of Monte Carlo 183
8.5 Further reading 184
8.6 Chapter summary 184
Appendix A Getting started with statistical computation 185
A.1 What is R? 185
A.2 A first R session 185
A.3 Entering data 187
A.4 Quitting R 188
A.5 More mathematics 188
A.6 Writing your own R scripts 189
A.7 Producing graphics in R 190
A.8 Saving graphics in R 192
A.9 Good practice with R 193
Appendix B Data case studies 195
B.1 Michelson’s speed of light data 195
B.2 Rutherford–Geiger radioactive decay 196
B.3 A study of fluid flow 198
B.4 The HR diagram 199
B.5 A particle physics experiment 202
B.6 Atmospheric conditions in New York City 205
Appendix C Combinations and permutations 207
C.1 Permutations 207
C.2 Combinations 208
C.3 Probability of combinations 209
Appendix D More on confidence intervals 210
Appendix E Glossary 214
Appendix F Notation 219
References 221
Index 223
For the student

Science is not about certainty, it is about dealing rigorously with uncertainty. The
tools for this are statistical. Statistics and data analysis are therefore an essential
part of the scientific method and modern scientific practice, yet most students of
physical science get little explicit training in statistical practice beyond basic error
handling. The aim of this book is to provide the student with both the knowledge and
the practical experience to begin analysing new scientific data, to allow progress
to more advanced methods and to gain a more statistically literate approach to
interpreting the constant flow of data provided by modern life.
More specifically, if you work through the book you should be able to accomplish
the following.
- Explain aspects of the scientific method, types of logical reasoning and data analysis, and be able to critically analyse statistical and scientific arguments.
- Calculate and interpret common quantitative and graphical statistical summaries.
- Use and interpret the results of common statistical tests for difference and association, and straight line fitting.
- Use the calculus of probability to manipulate basic probability functions.
- Apply and interpret model fitting, using e.g. least squares, maximum likelihood.
- Evaluate and interpret confidence intervals and significance tests.

Students have asked me whether this is a book about statistics or data analysis or
statistical computing. My answer is that they are so closely connected it is difficult
to untangle them, and so this book covers areas of all three.
The skills and arguments discussed in the book are highly transferable: statistical
presentations of data are used throughout science, business, medicine, politics and
the news media. An awareness of the basic methods involved will better enable you
to use and critically analyse such presentations – this is sometimes called statistical
literacy.


In order to understand the book, you need to be familiar with the mathematical
methods usually taught in the first year of a physics, engineering or chemistry
degree (differential and integral calculus, basic matrix algebra), but this book is
designed so that the probability and statistics content is entirely self-contained.
For the instructor

This book was written because I could not find a suitable textbook to use as the
basis of an undergraduate course on scientific inference, statistics and data analysis.
Although there are good books on different aspects of introductory statistics, those
intended for physicists seem to target a post-graduate audience and cover either
too much material or too much detail for an undergraduate-level first course. By
contrast, the ‘Intro to stats’ books aimed at a broader audience (e.g. biologists,
social scientists, medics) tend to cover topics that are not so directly applicable
for physical scientists. And the books aimed at mathematics students are usually
written in a style that is inaccessible to most physics students, or in a recipe-book
style (aimed at science students) that provides ready-made solutions to common
problems but develops little understanding along the way.
This book is different. It focuses on explaining and developing the practice and
understanding of basic statistical analysis, concentrating on a few core ideas that
underpin statistical and data analysis, such as the visual display of information,
modelling using the likelihood function, and simulating random data. Key concepts are developed using several approaches: verbal exposition in the main text,
graphical explanations, case studies drawn from some of history’s great physics
experiments, and example computer code to perform the necessary calculations.1
The result is that, after following all these approaches, the student should both
understand the ideas behind statistical methods and have experience in applying
them in practice.
The book is intended for use as a textbook for an introductory course on data
analysis and statistics (with a bias towards students in physics) or as self-study
companion for professionals and graduate students. The book assumes familiarity
with calculus and linear algebra, but no previous exposure to probability or statistics

is assumed. It is suitable for a wide range of undergraduate and postgraduate science students.

1 These are based on R, a freely available software package for data analysis and statistics and used in many statistics textbooks.
The book has been designed with several special features to improve its value
and effectiveness with students:
- several complete data analysis case studies using real data from some of history’s great experiments
- ‘example boxes’ – approximately 20 boxes throughout the text that give specific, worked examples for concepts as they are discussed
- ‘computer practice boxes’ – approximately 90 boxes throughout the text that give working R code to perform the calculations discussed in the text or produce the plots shown
- graphical explanations of important concepts
- appendices that provide technical details supplementary to the main text
- a well-populated glossary of terms and list of notational conventions.

The emphasis on a few core ideas and their practical applications means that
some subjects usually covered in introductory statistics texts are given little or
no treatment here. Rigorous mathematical proofs are not covered – the interested
reader can easily consult any good reference work on probability theory or mathematical statistics to check these. In addition, we do not cover some topics of
‘classical’ statistics that are dealt with in other introductory works. These topics
include
- more advanced distribution functions (beta, gamma, multinomial, ...)
- ANOVA and the generalised linear model
- characteristic functions and the theory of moments
- decision and information theories
- non-parametric tests
- experimental design
- time series analysis
- multivariate analysis (principal components, clustering, ...)
- survival analysis
- spatial data analysis.
Upon completion of this book the student should be in a much better position to
understand any of these topics from any number of more advanced or comprehensive texts.
Perhaps the ‘elephant in the room’ question is: what about Bayesian methods?
Unfortunately, owing to practical limitations there was not room to include full
chapters developing Bayesian methods. I hope I have designed the book in such a
way that it is not wholly frequentist or Bayesian. The emphasis on model fitting
using the likelihood function (Chapter 6) could be seen as the first step towards a
Bayesian analysis (i.e. implicitly using flat priors and working towards the posterior
mode). Fortunately, there are many good books on Bayesian data analysis that can
then be used to develop Bayesian ideas explicitly. I would recommend Gelman et al.
(2003) generally and Sivia and Skilling (2006) or Gregory (2005) for physicists in
particular. Albert (2007) also gives a nice ‘learn as you compute’ introduction to
Bayesian methods using R.
1
Science and statistical data analysis

It is remarkable that a science which began with the consideration of games of chance should have become the most important object of human knowledge.
Pierre-Simon Laplace (1812)
Théorie Analytique des Probabilités

Why should a scientist bother with statistics? Because science is about dealing
rigorously with uncertainty, and the tools to accomplish this are statistical. Statistics
and data analysis are an indispensable part of modern science.
In scientific work we look for relationships between phenomena, and try to
uncover the underlying patterns or laws. But science is not just an ‘armchair’ activity where we can make progress by pure thought. Our ideas about the workings of the world must somehow be connected to what actually goes on in the world. Scientists perform experiments and make observations to look for new connections, test ideas, estimate quantities or identify qualities of phenomena. However,
experimental data are never perfect. Statistical data analysis is the set of tools that
helps scientists handle the limitations and uncertainties that always come with data.
The purpose of statistical data analysis is insight, not just numbers. (That’s why
the book is called Scientific Inference and not something more like Statistics for
Physics.)

1.1 Scientific method


Broadly speaking, science is the investigation of the physical world and its phenom-
ena by experimentation. There are different schools of thought about the philosophy
of science and the scientific method, but there are some elements that almost every-
one agrees are components of the scientific method.


Figure 1.1 A cartoon of a simplified model of the scientific method.

Hypothesis A hypothesis or model is an explanation of a phenomenon in terms of others (usually written in terms of relations or equations), or the suggestion of a connection between phenomena.
Prediction A useful hypothesis will allow predictions to be made about the
outcome of experiments or observations.
Observation The collection of experimental data in order to investigate a
phenomenon.
Inference A comparison between predictions and observations that allows us
to learn about the hypothesis or model.
What distinguishes science from other disciplines is the insistence that ideas be
tested against what actually happens in Nature. In particular, hypotheses must
make predictions that can be tested against observations. Observations that match
closely the predictions of a hypothesis are considered as evidence in support of
the hypothesis, but observations that differ significantly from the predictions count
as evidence against the hypothesis. If a hypothesis makes no predictions about
possible observations, how can we learn about it through observation?
Figure 1.1 gives a summary of a simplified scientific method. Models and
hypotheses1 can be used to make predictions about what we can observe.

1 The terms ‘hypothesis’, ‘model’ and ‘theory’ have slightly different meanings but are often used interchangeably in casual discussions. A theory is usually a reasonably comprehensive, abstract framework (of definitions, assumptions and relations or equations) for describing generally a set of phenomena, that has been tested and has found at least some degree of acceptance. Examples of scientific theories are classical mechanics, thermodynamics, germ theory, kinetic theory of gases, plate tectonics etc. A model is usually more specific. It might be the application of a theory to a particular situation, e.g. a classical mechanics model of the orbit of Jupiter. Some authors go on to distinguish hypotheses as models, and their parameters, which may be speculative, as they are used in statistical inference. For now we have no need to distinguish between models and hypotheses.

Hypotheses may come from some more general theory, or may be more ad hoc, based on intuition or guesswork about the way some phenomenon might work. Experiments or observations of the phenomenon can be made, and the results compared with the predictions of the hypothesis. This comparison allows one to test the model and/or estimate any unknown parameters. Any mismatch between data and model predictions, or other unpredicted findings in the data, may suggest ways to revise or change the model. This process of learning about hypotheses from data is scientific inference. One may enter the cycle at any point: by proposing a model, making predictions from an existing model, collecting data on some phenomenon or using data to test a model or estimate some of its parameters. In many areas of modern science, the different aspects have become so specialised that few, if any, researchers practise all of these activities (from theory to experiment and back), but all scientists need an appreciation of the other steps in order to understand the ‘big picture’. This book focuses on the induction/inference part of the chain.

1.2 Inference

The process of drawing conclusions based on what is already known is called inference. There are two types of reasoning process used in inference: deductive and non-deductive.

1.2.1 Deductive reasoning (from general to specific)

The first kind of reasoning is deductive reasoning. This starts with premises and follows the rules of logic to arrive at conclusions. The conclusions are therefore true as long as the premises are true. Philosophers say the premises entail the conclusion. Mathematics is based on deductive reasoning: we start from axioms, follow the rules of logic and arrive at theorems. (Theorems should be distinguished from theories – the former are the product of deductive reasoning; the latter are not.) For example, the two propositions ‘A is true implies B is true’ and ‘A is true’ together imply ‘B is true’. This type of argument is a simple deduction known as a syllogism, which comprises a major premise and a minor premise; together they imply a conclusion:

Major premise : A ⇒ B (read: A is true implies B is true)
Minor premise : A (read: A is true)
Conclusion : B (read: B is true).

Deductive reasoning leads to conclusions, or theorems, that are inescapable given the axioms. One can then use the axioms and theorems together to deduce more

theorems, and so on. A theorem2 is something like ‘A ⇒ B’, which simply says
that the truth value of A is transferred to B, but it does not, in and of itself, assert
that A or B are true. If we happen to know that A is indeed true, the theorem tells
us that B must also be true. The box gives a simple proof that there is no largest
prime number, a purely deductive argument that leads to an ineluctable conclusion.

Box 1.1
Deduction example – proof of no largest prime number
- Suppose there is a largest prime number; call this pN, the Nth prime.
- Make a list of each and every prime number: p1 = 2, p2 = 3, p3 = 5, until pN.
- Now form a new number q from the product of the N primes in the list, and add one:

  q = 1 + ∏_{i=1}^{N} p_i = 1 + (p1 × p2 × p3 × · · · × pN)    (1.1)

  which is either prime or it is not.
- This new number q is larger than every prime in the list, but it is not divisible by any prime in the list – it always leaves a remainder of one.
- This means q is prime since it has no prime factors (the fundamental theorem of arithmetic says that any integer larger than 1 has a unique prime factorisation).
- But this is a contradiction. We have found a prime number q that is larger than every number in our list, in contradiction with our definition of pN. Therefore our original assumption – that there is a largest prime, pN – must be false.
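The construction in Box 1.1 is easy to check numerically for small N: the number q built from the first N primes always leaves a remainder of one on division by each of them. A minimal sketch (shown in Python here, though the book’s computer practice boxes use R; the function names are invented for this illustration):

```python
def first_primes(n):
    """Return the first n prime numbers by simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def euclid_number(n):
    """Form q = 1 + (p1 * p2 * ... * pn), the number used in Box 1.1."""
    q = 1
    for p in first_primes(n):
        q *= p
    return q + 1

for n in range(1, 6):
    q = euclid_number(n)
    # q leaves remainder 1 when divided by every prime in the list,
    # so none of the listed primes can be a factor of q.
    assert all(q % p == 1 for p in first_primes(n))
```

For example, with N = 4 the listed primes are 2, 3, 5, 7 and q = 211, which is divisible by none of them.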

Deduction involves reasoning from the general to the specific. If a general principle is true, we can conclude that any particular cases satisfying the general principle are true. For example:

Major premise : All monkeys like bananas
Minor premise : Zippy is a monkey
Conclusion : Zippy likes bananas.

The conclusion is unavoidable given the premises. (This type of argument is given
the technical name modus ponens by philosophers of logic.) If some theory is true
we can predict that its consequences must also be true. This applies to probabilistic
as well as deterministic theories. Later on we consider flipping coins, rolling dice,
and other random events. Although we cannot precisely predict the outcome of individual events (they are random!), we can derive frequencies for the various outcomes in repeated events.

2 It is worth noting here that the logical implication used above, e.g. B ⇒ A, does not mean that A can be derived from B, but only that if B is true then A must also be true, or that the propositions ‘B is true’ and ‘B and A are both true’ must have the same truth value (both true, or both false).
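This last point – that random events have stable long-run frequencies even though individual outcomes are unpredictable – can be seen in a short simulation. A minimal sketch (in Python here, though the book’s examples use R; `heads_frequency` is a name invented for this illustration):

```python
import random

random.seed(42)  # fix the seed so the run is repeatable

def heads_frequency(n_flips):
    """Flip a fair coin n_flips times and return the fraction of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# No single flip can be predicted, but the frequency of heads
# settles down near 1/2 as the number of flips grows.
for n in (10, 1000, 100000):
    print(n, heads_frequency(n))
```

The fraction of heads fluctuates noticeably for 10 flips but sits close to 0.5 for 100 000 flips, which is the sense in which a probabilistic theory makes testable predictions.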

1.2.2 Inductive reasoning (from specific to general)


Inductive reasoning is a type of non-deductive reasoning. Induction is often said to
describe arguments from special cases to general ones, or from effects to causes.
For example, if we observe that the Sun has risen every day for many days, we can
inductively reason that it will continue to do so. We cannot directly deduce that the
Sun will rise tomorrow (there is no logical contradiction implied if it does not).
The basic point about the limited power of our inferences about the real world
(i.e. our inductive reasoning) was made most forcefully by the Scottish philosopher
David Hume (1711–1776), and is now known as the problem of induction. The
philosopher and mathematician Bertrand Russell furnished us with a memorable
example in his book The Problems of Philosophy (Russell, 1997, ch. 4):
imagine a chicken that gets fed by the farmer every day and so, quite understandably,
imagines that this will always be the case . . . until the farmer wrings its neck! The chicken
never expected that to happen; how could it? – given it had no experience of such an event
and the uniformity of its previous experience had been so great as to lead it to assume the
pattern it had always observed (chicken gets fed every day) was universally true. But the
chicken was wrong.3

You can see that inductive reasoning does not have the same power as deductive
reasoning: a conclusion arrived at by deductive reasoning is necessarily true if the
premises are true, whereas a conclusion arrived at by inductive reasoning is not
necessarily true; it is based on incomplete information. We cannot deduce (prove)
that the Sun will rise tomorrow, but nevertheless we do have confidence that it
will. We might say that deductive reasoning concerns statements that are either
true or false, whereas inductive reasoning concerns statements whose truth value
is unknown, about which we are better to speak in terms of ‘degree of belief’ or
‘confidence’. Let’s see an example:
Major premise : All monkeys we have studied like grapes
Minor premise : Zippy is a monkey
Conclusion : Zippy likes grapes.
The conclusion is not unavoidable; other conclusions are allowed. There is no
logical contradiction in concluding
Conclusion : Zippy does not like grapes.

3 By permission of Oxford University Press.



But the premises do give us some information. It seems plausible, even probable,
that Zippy likes grapes.

1.2.3 Abductive reasoning (inference to the best explanation)


There is another kind of non-deductive inference, called abduction, or inference to
the best explanation. For our purposes, it does not matter whether abduction is a
particular type of induction, or another kind of non-deductive inference alongside
induction. Let’s go straight to an example:

Premise : Nelly likes bananas
Premise : The banana left near to Nelly has been eaten
Conclusion : Nelly ate the banana.

Again the conclusion is not unavoidable; other conclusions are valid. Perhaps
someone else ate the banana. But the original conclusion seems to be in some sense
the simplest of those allowed. This kind of reasoning, from observed data to an
explanation, is used all the time in science.
Induction and abduction are closely related. When we make an inductive inference from the limited observed data (‘the monkeys in our sample like grapes’) to unobserved data (‘Zippy likes grapes’) it is as if we implicitly passed through a theory (‘all monkeys like grapes’) and then deduced the conclusion from this.

1.3 Scientific inference


Scientific work employs all the above forms of reasoning. We use deductive reasoning to go from general theories to specific predictions about the data we could observe, and non-deductive reasoning to go from our limited data to general conclusions about unobserved cases or theories.
Imagine A is the theory of classical mechanics and B is the predicted path of a
rocket deduced from the theory and the details of the launch. Now, we make some
observations and find the rocket did indeed follow the predicted path B (as well
as we can determine). Can we conclude that A is true? We may infer A, but not
deductively. Other conclusions are possible. In fact, the observational confirmation
of one prediction (or even a thousand) does not prove the theory in the same sense
as a deductive proof. A different theory may make indistinguishable predictions in
all of the cases considered to date, but differ in its predictions for other (e.g. future)
observations.
Experimental and observational science is all about inductive reasoning, going
from a finite number of observations or results to a general conclusion about
unobserved cases (induction), or a theory that explains them (abduction). In recent years, there has been a lot of interest in showing that inductive reasoning can be
formalised in a manner similar to deductive reasoning, so long as one allows for
the uncertainty in the data and therefore in the conclusions (Jeffreys, 1961; Jaynes,
2003).
You might still have reservations about the need for statistical reasoning. After
all, the great experimental physicist Ernest Rutherford is supposed to have said

If your experiment needs statistics, you ought to have done a better experiment!4

Rutherford probably didn’t say this, or didn’t mean for it to be taken at face value.
Nevertheless, statistician Bradley Efron, about a hundred years later, contrasted this
simplistic view with the challenges of modern science (Efron, 2005):

Rutherford lived in a rich man’s world of scientific experimentation, where nature generously provided boatloads of data, enough for the law of large numbers to squelch any noise. Nature has gotten more tight-fisted with modern physicists. They are asking harder questions, ones where the data is thin on the ground, and where efficient inference becomes a necessity. In short, they have started playing in our ball park.

But it is not just scientists who use (or should use) statistical data analysis. Any
time you have to draw conclusions from data you will make use of these skills.
This is true for particle physics as well as journalism, and whether the data form
part of your research or come from a medical test you were given, you need to be
able to understand and interpret them properly, making inferences using methods
built on the same basic principles.

1.4 Data analysis in a nutshell


The analysis of data5 can be broken into different modes that are employed either
individually or in combination; the outcome of one mode of analysis may inform
the application of other modes.

Data reduction This is the process of converting raw data into something more
useful or meaningful to the experimenter: for example, converting the voltage
changes in a particle detector (e.g. a proportional counter) into the records of
the times and energies of individual particle detections. In turn, these may be
further reduced into an energy spectrum for a specific type of particle.

4 The earliest reference to this phrase I can find is Bailey (1967, ch. 2, p. 23).
5 ‘Data’ is the plural of ‘datum’ and means ‘items of information’, although it has now become acceptable to use
‘data’ as a singular mass noun rather like ‘information’.

Exploratory data analysis (EDA) is an approach to data analysis that uses
quantitative and graphical methods in an attempt to reveal new and interesting
patterns in the data. One does not test a particular hypothesis, but instead
‘plays around with the data’, searching for patterns suggestive of new
hypotheses.
Inferential data analysis Sometimes known as ‘confirmational data analysis’.
We can divide this into two main tasks: model checking and parameter
estimation. The former is the process of choosing which of a set of models
provides the most convincing explanation of the data; the latter is the process
of estimating values of a model’s unknown parameters.

Exploratory data analysis is all about summarising the data in ways that might
provide clues about their nature, and inferential data analysis is about making
reasonable and justified inferences based on the data and some set of hypotheses.

1.5 Random samples


Our data about the real world are almost always incomplete, affected by random
errors, or both. Let’s say we wanted to find the answer to some important question:
does the UK population prefer red or green sweets? We could survey the entire
population and in principle get a complete answer, but this would normally be
impractical. So we settle for a subset of the population, and assume this is
representative of the population at large. The results from the subset of people we
actually survey form a sample, and this is drawn from some population (of all the
responses from the entire population). The sample is just one of the many possible
samples that could be obtained from the same population.
But what we’re interested in is the population, so we need to use what we know
about the sample to infer something about the population. A small sample is easy
to collect, but smaller samples are also more susceptible to random fluctuations
(think of surveying just one person and extrapolating his/her answer to the entire
population); a larger sample is less prone to such fluctuations but is also harder to
collect. We also need to be sure to sample randomly and in an unbiased fashion – if
we only sample younger people, or people in certain counties, these may not reflect
the wider population. We need ways to quantify the properties of the sample, and
also to quantify what we can learn about the population. This is statistics.
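The effect of sample size on these random fluctuations is easy to demonstrate by simulation. The following sketch (base R only; the population here is artificial, not survey data) draws samples of two different sizes from a known population and compares their means:

```r
# Simulate a known population, then draw random samples from it
# (an illustrative sketch; the population parameters are invented)
set.seed(42)
population <- rnorm(100000, mean = 50, sd = 10)

mean(sample(population, 5))     # small sample: mean fluctuates a lot
mean(sample(population, 5000))  # large sample: mean close to 50
```

Re-running the last two lines a few times shows the small-sample mean wandering much more than the large-sample one.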
You may be left thinking: what’s this got to do with experiments in the physical
sciences? We often don’t have a simple population from which we pull a random
sample. Each time we perform some measurement (or series of measurements) we
are collecting a sample of possible data. We can think of our sample as being drawn
from a population, a hypothetical population of all the possible data that could be

Figure 1.2 Illustration of the distinct concepts of accuracy and precision as applied
to the positions of ‘shot’ on a target.

produced from our measurement(s). The differences between samples are due to
randomness in the experiment or measurement processes.

1.5.1 Errors and uncertainty


The type of randomness described above is usually called random error (or mea-
surement error) by physicists (the term error is used differently by statisticians6 ).
Here, error does not mean a mistake as in the usual sense. To most scientists the
‘measurement error’ is an estimate of the repeatability of a measurement. If we take
some data and use them to infer the speed of sound through air, what is the error
on our measurement? If we repeat the entire experiment – under almost identical
conditions – chances are the next measurements will be slightly different, by some
unpredictable amount. As will further repeats. The ‘random error’ is a quantitative
indication of how close repeated results will be. Data with small errors are said to
have high precision – if we repeat the measurement the next value is likely to be
very close to the previous value(s).
In addition to random errors, there is another type of error called systematic
error. A systematic error is a bias in a measurement that leads to the values being
systematically either too low or too high, and may arise from the selection of
the sample under study or the calibration of the instrument used. Data with small
systematic error are said to be accurate; if only we could reduce the random error
we could get a result extremely close to the ‘true’ value. Figure 1.2 illustrates
the difference between precision and accuracy. The experimenter usually works
to reduce the impact of both random and systematic errors (by ‘beating down the

6 To a statistician, ‘error’ is a technical term for the discrepancy between what is observed and what is expected.

errors’) in the design and execution of the experiment, but the reality is that such
errors can never be completely eliminated.
It is important to distinguish between accuracy and precision. These two con-
cepts are illustrated in Figure 1.2. Precise data are narrowly spread, whereas accu-
rate data have values that fall (on average) around the true value. Precision is an
indicator of variation within the data and accuracy is a measure of variation between
the data and some ‘true’ value. These apply to direct measurements of simple
quantities and also to more complicated estimates of derived quantities (Chapters 6
and 7).

1.6 Know your data


There are several types of data you may be confronted with. The main types are as
follows.
Categorical data take on values that are not numerical but can be placed in
distinct categories. For example, records of gender (male, female) and particle
type (electron, pion, muon, proton etc.) are categorical data.
Ordinal data have values that can be ranked (put in order) or have a rating
scale attached, but the differences between the ranks cannot be compared. An
example is the Likert-type scale that you see on many surveys: 1, strongly
disagree; 2, disagree; 3, neutral; 4, agree; 5, strongly agree. These have a
definite order, but the difference between options 1 and 2 might not be the
same as between options 3 and 4.
Discrete data have numerical values that are distinct and separate (e.g. 1, 2,
3, . . . ). Examples from physics might be the number of planets around stars,
or the number of particles detected in a certain time interval.
Continuous data may take on any value within a finite or infinite interval. You
can count, order and measure continuous data: for example, the energy of
an accelerated particle, temperature of a star, ocean depth, magnetic field
strength etc.
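These four types map naturally onto R’s basic data structures. A brief sketch (the values are invented for illustration):

```r
# The four data types above, as represented in R (illustrative values)
particle <- factor(c("electron", "muon", "pion"))   # categorical
survey   <- ordered(c(2, 5, 3), levels = 1:5)       # ordinal (Likert-type)
counts   <- c(3L, 0L, 7L)                           # discrete (integers)
energy   <- c(1.27, 0.56, 3.91)                     # continuous (reals)
```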
Furthermore, data may have many dimensions.

Univariate data concern only one variable (e.g. the temperature of each star in
a sample).
Bivariate data concern two variables (e.g. the temperatures and luminosities of
stars in a sample). Each data point contains two values, like the coordinates
of a point on a plane.
Multivariate data concern several variables (e.g. temperature, luminosity, dis-
tance etc. of stars). Each data point is a point in an N-dimensional space, or
an N-dimensional vector.

As mentioned previously, there are two main roles that variables play.
Explanatory variables (sometimes known as independent variables) are
manipulated or chosen by the experimenter/observer in order to examine
change in other variables.
Response variables (sometimes known as dependent variables) are observed in
order to examine how they change as a function of the explanatory variables.
For example, if we recorded the voltage across a circuit element as we drive it with
different AC frequencies, the frequency would be the explanatory variable, and
the response variable would be the voltage. Usually the error in the explanatory
variable is far smaller than, and can be neglected by comparison with, the error on
the response variables.
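As a sketch of the AC-circuit example (the numbers are invented for illustration), the explanatory variable goes on the horizontal axis and the response on the vertical:

```r
# Explanatory variable: drive frequency, chosen by the experimenter
freq  <- c(10, 20, 50, 100, 200)          # Hz
# Response variable: measured voltage (hypothetical values)
volts <- c(0.98, 0.95, 0.80, 0.55, 0.31)  # V
plot(freq, volts, xlab = "Frequency (Hz)", ylab = "Voltage (V)")
```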

1.7 Language
The technical language used by statisticians can be quite different from that com-
monly used by scientists, and this language barrier is one of the reasons that science
students (and professional researchers!) have such a hard time with statistics books
and papers. Even within disciplines there are disagreements over the meaning and
uses of particular terms.
For example, physicists often say they measure or even determine the value of
some physical quantity. A statistician might call this estimation. Physicists tend
to use words like error and uncertainty interchangeably and rather imprecisely.
In these cases, where conventional statistical language or notation offers a more
precise definition, we shall use it. This is a deliberate choice. By using terminology
and notation more like that of a formal statistics course, and less like that of an
undergraduate laboratory manual, we hope to give the readers more scope for using
and developing their knowledge and skills. It should be easier to understand more
advanced texts on aspects of data analysis or statistics, and understand analyses
from other fields (e.g. biology, medicine).
This means that we do not explicitly make use of the definitions set out in the
Guide to the Expression of Uncertainty in Measurement (GUM, 2008). The doc-
ument (now with revisions and several supplements) is intended to establish an
industrial standard for the expression of uncertainty. Its recommendations included
categorising uncertainty into ‘type A’ (estimated based on statistical treatment of
a sample of data) and ‘type B’ (evaluated by other means), using ‘standard
uncertainty’ for the standard deviation of an estimator, and ‘coverage factor’ for a
multiplier on the ‘combined standard uncertainty’, and so on. These recommendations may
be valuable within some fields such as metrology, but they are not standard in most
physics laboratories (research or teaching) as of 2013, and are unlikely to be taken

up by the broader community of researchers using and researching statistics and
data analysis.

1.8 Statistical computing using R


You will need to be able to use a computer to do statistical data analysis on all
but the smallest datasets. It is still possible to understand the ideas and methods of
statistical data analysis in purely theoretical terms, without learning how to perform
the analysis using a computer. The purpose of this book is to help you not only
understand and interpret simple statistical analyses, but also perform analyses on
data, and that means using a computer.
Throughout this book we give examples of statistical computing using the R
environment (see Appendix A). R is an environment for statistical computation
and data analysis. It is really a programming language with an integrated suite of
software for manipulating data, producing plots and performing calculations, and
has a very wide range of powerful statistical tools ‘built in’. Using R it is relatively
simple to perform statistical calculations accurately – this means you can spend
less time worrying about the computational details, and more time thinking about
the data and the statistical concepts. Appendix A provides a gentle introduction
and a walkthrough of R.
Throughout the text are shaded boxes (R.boxes) containing the R code to carry
out or demonstrate the procedures discussed in the accompanying text. Lines of
R are written with typewriter font; these are meant to be typed at the R
command line. As you progress through the book, working through the examples
of R code, you will acquire the skills necessary to complete the data analysis
case studies (and hopefully more besides). Of course, R is just one of the options
you have for carrying out statistical computing. If your preferences lie elsewhere
you should still be able to gain from the book by skipping past the R.boxes, or
translating their contents into your favourite computing language.

1.9 How to use this book


This book is intended to provide a reasonably self-contained introduction to design-
ing, performing and presenting statistical analyses of experimental data. Several
devices are used to encourage you, the reader, to engage with the material rather
than just read it. When a new term is used for the first time it usually appears in
italics and is then defined, and to aid your memory there is a glossary of statistical
terms towards the back of the book, along with a crib sheet for the mathematical
notation. Dotted throughout the notes are two types of text box: white boxes contain
examples or applications of ideas discussed in the text; shaded boxes (‘R.boxes’)

contain examples using the R computing environment for you to work through
yourself. We rely heavily on examples to illustrate the main ideas, and these are
based on real data. The datasets are discussed in Appendix B.
In outline, the rest of the book is organised as follows.
• Chapter 2 discusses numerical and graphical summaries of data, and the basics
of exploratory data analysis.
• Chapter 3 introduces some of the basic recipes of statistical analyses, such as
looking for a difference in means, or estimating the gradient of a straight-line
relationship.
• Chapter 4 introduces the concept of probability, starting with discrete, random
events. We then discuss the rules of the probability calculus and develop the
theory of random variables.
• Chapter 5 extends the discussion of probability to some of the most
frequently encountered distributions (and also mentions, in passing, the central
limit theorem).
• Chapter 6 discusses the fitting of simple models to data and the estimation of
model parameters.
• Chapter 7 considers the uncertainty on the parameter estimates, and model testing
(i.e. comparing predictions of hypotheses to data).
• Chapter 8 discusses Monte Carlo methods: computer simulations of random
experiments that can be used to solve difficult statistical problems.
• Appendix A describes how to get started in the computer environment R used in
the examples throughout the text.
• Appendix B introduces the data case studies used throughout the text.
• Appendix C provides a refresher on combinations and permutations.
• Appendix D discusses the construction of confidence intervals (extending the
discussion from Chapter 7).
• A glossary can be found on p. 217.
• A list of the notation can be found on p. 224.
2 Statistical summaries of data

The greatest value of a picture is when it forces us to notice what we
never expected to see.
John Tukey (1977),
statistician and pioneer of exploratory data analysis

How should you summarise a dataset? This is what descriptive statistics and
statistical graphics are for. A statistic is just a number computed from a data
sample. Descriptive statistics provide a means for summarising the properties of
a sample of data (many numbers or values) so that the most important results
can be communicated effectively (using few numbers). Numerical and graphical
methods, including descriptive statistics, are used in exploratory data analysis
(EDA) to simplify the uninteresting and reveal the exceptional or unexpected in
data.

2.1 Plotting data


One of the basic principles of good data analysis is: always plot the data. The
brain–eye system is incredibly good at recognising patterns, identifying outliers
and seeing the structure in data. Visualisation is an important part of data analysis,
and when confronted with a new dataset the first step in the analysis should be to
plot the data. There is a wide array of different types of statistical plot useful in data
analysis, and it is important to use a plot type appropriate to the data type. Graphics
are usually produced for screen or paper and so are inherently two dimensional,
even if the data are not.
The variables can often be classified as explanatory or response. We are usually
interested in understanding the behaviour of the response variable as a function of
the explanatory variable, where the explanatory variable is usually controlled by


the experimenter. Different plots are suitable depending on the number and type of
the response variable.

• Data with one variable (univariate)
– If the data are continuous, we can make a histogram showing how the data are
distributed. A smooth density curve is an alternative to a histogram.
– If the data are discrete or categorical, we could produce a bar chart,
similar to a histogram but with gaps between the bars to indicate their
discreteness.
– If the data are a time series (a series of points taken at distinct times), we can
make a time series plot by marking them as points on the x–y plane with y the
data and x the time corresponding to each data point.
– If the data are fractions of a whole, we may use compositional plots such as the
pie chart; however, these are rarely used in scientific and statistical graphics
(it is usually more efficient to present the proportions in a table or a bar
chart).
• Data with two variables (bivariate)
– If both variables are continuous, we may use a scatter plot where the data are
plotted as points on the x–y plane.
– There are many ways of augmenting a standard scatter plot, such as joining the
points with lines (if the order is important or if it improves clarity), overlaying
a smoothed curve or theoretical prediction curve and including error bars to
indicate the precisions of the measurements.
– If the explanatory variable is discrete (or binned continuous), we may choose
from a dotchart, boxplot, stripchart or others.
• Data with many variables (multivariate)
– A matrix of several scatter plots, each showing a different pair of variables,
may be used to illustrate the dependence of each variable upon each of the
others.
– A coplot shows several scatter plots of the same two variables, where the data
in each panel of the plot differ by the values of a third variable.
– With three continuous variables we can make a projection of the three-
dimensional equivalent of the scatter plot.
– Another variation on the three-dimensional scatter plot is the bubble plot, which
uses differently sized symbols to represent a third variable.
– If we have one response variable and two explanatory variables, we can make
an image using either greyscale, colours or contours to indicate the values of
the response variable over the explanatory dimensions, or we can construct a
projection of the surface, e.g. z = f (x, y).
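Most of the plot types above are one-liners in R. A sketch using the built-in trees data frame (three continuous variables) as a stand-in dataset:

```r
# A few of the plot types listed above, using R's built-in 'trees' data
hist(trees$Height)                 # univariate: histogram
plot(trees$Girth, trees$Volume)    # bivariate: scatter plot
pairs(trees)                       # multivariate: scatter-plot matrix
coplot(Volume ~ Girth | Height, data = trees)  # coplot, conditioned on a third variable
```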

Figure 2.1 Histogram of the 100 Michelson speed-of-light data points.

2.2 Plotting univariate data


Michelson’s data – see Appendix B, section B.1 – record 100 experimental values
from his speed-of-light experiment. For compactness the tabulated data have had
the leading three digits removed (i.e. 299 000 km s−1 subtracted). How should we
plot these data? One option is an index plot, which plots points on the x–y plane at
coordinates (1, y1 ), (2, y2 ) and so on, one point for each data value yi . The order
of the points is simply the order they occur in the table, which may (or may not)
be the order they were obtained. Such a plot would make it much easier to see the
‘centre’ and ‘spread’ of the sample, compared with a table of raw numbers. But
there are more revealing ways to view the data.
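In R an index plot is a one-line job, since plot() given a single vector uses the index as the x coordinate:

```r
# Index plot of the 100 Michelson values
# (assumes the 'morley' data frame supplied with R)
plot(morley$Speed, xlab = "Index", ylab = "Speed - 299,000 (km/s)")
```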

2.2.1 Histogram
One way to simplify univariate data is to produce a histogram. A histogram is
a diagram that uses rectangles to represent frequency, where the area of each
rectangle is proportional to the frequency in its bin. To produce a histogram one must
first choose the locations of the bins into which the data are to be divided, then one
simply counts the number of data points that fall within each bin. See Figure 2.1
(and R.box 2.1).
A histogram contains less information than the original data – we know how
many data points fell within a particular bin (e.g. the 700–800 bin in Figure 2.1),
but we have lost the information about which points and their exact values. What
we have lost in information we hope to gain in clarity; looking at the histogram it
is clear how the data are distributed, where the ‘central’ value is and how the data
points are spread around it.

R.Box 2.1
Histograms
The R command to produce and plot a histogram is hist(). The following shows
how to produce a basic histogram from Michelson’s data (see Appendix B,
section B.1):
hist(morley$Speed)

We can specify (roughly) how many histogram bins to use by using the breaks
argument, and we can also alter the colour of the histogram and the labels as follows:

hist(morley$Speed, breaks=25, col="darkgray",
     main="", xlab="speed - 299,000 (km/s)")

This hist() command is quite flexible. See the help pages for more information
(type ?hist).

2.2.2 Bar chart


The bar chart is a relative of the histogram. Frequencies are indicated by the
lengths of bars, which should be of equal width. Bar charts are used for discrete
or categorical data, and a histogram is used for continuous data; neighbouring
histogram bins touch each other, bar chart bars do not. For example, measurements
of the speed of light are (in principle) continuous since the measured value can
take any real number over some range, and so a histogram may be used. But if we
were to plot data from a poll of support for different political parties, we should
use a bar chart, since the data are categorical (different parties).
Figure 2.2 shows a bar chart for the data recorded by Rutherford and Geiger (see
Appendix B, section B.2). The data record the number of intervals during which
there were zero scintillations, one scintillation, two scintillations, up to 14 (there
were no intervals with 15 or more scintillations). The data are discrete – the number
of scintillations per interval, shown along the horizontal axis, must be an integer –
and so a bar chart is appropriate.

R.Box 2.2
A simple bar chart
There are two simple ways to produce bar charts using R. Let’s illustrate this using the
Rutherford and Geiger data (see Appendix B, section B.2):

plot(rate, freq, type="h")

plot(rate, freq, type="h", bty="n",
     xlab="Rate (counts/interval)",
     ylab="Frequency", lwd=5)

Figure 2.2 Bar chart showing the Rutherford and Geiger (1910) data of the
frequency of alpha particle decays. The data comprise recordings of scintillations
in 7.5 s intervals, over 2608 intervals, and this plot shows the frequency
distribution of scintillations per interval.

The first line produces a very basic plot using the type="h" argument. The second
line produces an improved plot with user-defined axis labels, thicker lines/bars and no
box enclosing the data area. An alternative is to use the specialised command
barplot().

barplot(freq, names.arg=rate, space=0.5,
        xlab="Rate (cts/interval)",
        ylab="Frequency")

Here the argument space=0.5 determines the sizes of the gaps between the bars, and
names.arg defines the labels for the x-axis. If the data were categorical, we could
produce a bar chart by setting the names.arg argument to the list of categories.

2.3 Centre of data: sample mean, median and mode


Probably the first conclusion we might draw from looking at Michelson’s data is
that the measured values lie close to 299 800 km s−1 . What we have just done
is make a numerical summary of the data – if we needed to communicate the
most important aspects of this dataset to a colleague in the smallest amount of
information, a sensible place to start would be with a summary like this, which
gives some idea of the ‘centre’ of the data. But instead of making a quick informal
Figure 2.3 Illustration of the mean as the balance point of a set of weights. The
data are the first 20 of the Michelson data points.

guess of the centre we could instead calculate and quote the mean of the sample,
defined by

\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \qquad (2.1)

where xi (i = 1, 2, . . . , n) are the individual data points in the sample and n is the
size of the sample. If x are our data, then x̄ is the conventional symbol for the
sample mean. The sample mean is just the sum of all the data points, divided by
the number of data points. Strictly, this is the arithmetic mean. The mean of the
first 20 Michelson data values is 909 km s−1 :
\bar{x} = \frac{1}{20} (850 + 740 + 900 + 1070 + 930 + 850 + \ldots + 960) = 909.
One way to view the mean is as the balancing point of the data stretched out
along a line. If we have n equal weights and place them along a line at locations
corresponding to each data point, the mean is the one location along the line where
the weights balance, as illustrated in Figure 2.3.
The mean is not the only way to characterise the centre of a sample. The sample
median is the middle point of the data. If the size of the sample, n, is odd, the
median is the middle value, i.e. the (n + 1)/2th largest value. If n is even, the
median is the mean of the middle two values (the n/2th and n/2 + 1th ordered
values). The median has the sometimes desirable property that it is not so easily
swayed by a few extreme points. A single outlying point in a dataset could have a
dramatic effect on the sample mean, but for moderately large n one outlier will have
little effect on the median. The median of the first 20 light speed measurements is
940 km s−1 , which is not so different from the mean – take a look at Figure 2.1 and
notice that the histogram is quite symmetrical about the mean.
The last measure of the centre we shall discuss is the sample mode, which is
simply the value that occurs most frequently. If the variable is continuous, with no
repeating values, the peak of a histogram is taken to be the mode. Often there is
more than one mode; in the case of the 100 speed of light values, there are two
values that occur most frequently (810 and 880 km s−1 occur 10 times each). Once

we bin the Michelson data into a histogram it becomes clear that the distribution
has a single mode around 800–850 km s−1 (see Figure 2.1).
Now we have three measures of centrality, but the one that is used the most is
the mean, often just called the average. If we have some theoretical distribution of
data spread over some range, we may calculate the mean, median and mode using
methods discussed in Chapter 5.

Figure 2.4 Illustration of the locations of the mean, median and mode for an
asymmetric distribution, p(x), where x is some random variable.
Figure 2.4 illustrates how the three different measures differ for some theoretical
distribution. The mean is like the centre of gravity of the distribution (if we imagine
it to be a distribution of mass density along a line); the median is simply the 50%
point, i.e. the point that divides the curve into halves with equal areas (equal
mass) on each side; the mode is the peak of the distribution (the densest point).
If the distribution is symmetrical about some point, the mean and median will be
the same, and if it is symmetrical about a single peak then the mode will also
be the same, but in general the three measures differ.

R.Box 2.3
Mean, median and mode in R
We can use R to calculate means and medians quite easily using the appropriately
named mean() and median() commands. The variable morley$Speed contains
the 100 speed values of Michelson. To calculate the mean and median, and add on the
offset (299 000 km s−1 ), type
mean(morley$Speed) + 299000
median(morley$Speed) + 299000

The modal value is not quite as easy to calculate as the mean or median since there is
no built-in function for this. One simple way to find the mode is to view a histogram
of the data and select the value corresponding to the peak.
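For discrete or heavily repeated values, one informal alternative (a sketch, not a built-in function) is to tabulate the values and pick out the most frequent:

```r
# Find the most frequent value(s) in Michelson's data
# (assumes the 'morley' data frame supplied with R)
counts <- table(morley$Speed)
modes  <- as.numeric(names(counts)[counts == max(counts)])
modes   # 810 and 880, which the text notes occur 10 times each
```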

Box 2.1
Different averages
Imagine a room containing 100 working adults randomly selected from the
population. Then Bill Gates walks into the room. What happens to the mean wealth of
the people in the room? What about the median or mode? These different measures of
‘centre’ react very differently to an extreme outlier (such as Bill Gates). What will
happen to the average height (mean, median and mode) of the people in the room if
the world’s tallest man walks in?
What is the average number of legs for an adult human? The mode and the median
are surely two, but the mean number of legs is slightly less than two!

2.4 Dispersion in data: variance and standard deviation


The sample mean is a very simple and useful single-number summary of a sample,
and it gives us an idea of the typical location of the data. If we required slightly more
information about the sample a good place to start would be with some measure of
the spread of the data around this central location: the dispersion around the mean.
We could start by calculating the mean of the deviations between each data value
and the sample mean. But this is useless as it always equals zero. Take another look
at the definition for the sample mean (equation 2.1) and notice how the sample
mean is the one value that ensures the (data – mean) deviations sum to zero (recall
the balance of Figure 2.3):
\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x}) = \frac{1}{n} \sum_{i=1}^{n} x_i - \frac{1}{n} \sum_{i=1}^{n} \bar{x} = \bar{x} - \frac{n}{n}\bar{x} = \bar{x} - \bar{x} = 0. \qquad (2.2)
The negative deviations exactly cancel the positive deviations.
If instead we square the deviations, then all the elements of the sum are positive
(or zero), so the average of the squared deviation seems like a more useful measure
of the spread in a sample. The sample variance is defined as
s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2. \qquad (2.3)
This is almost the mean of the squared deviations. But notice that we have divided
by n − 1 rather than n: the story behind this is sketched out in the box. Table 2.1
illustrates explicitly the steps involved in calculating the variance using the first 20
values from the Michelson dataset: first we compute the sample mean, then subtract
this from the data, and compute the sum of the squared data − mean deviations.
Of course, in real data analysis this calculation is always performed by computer.

Table 2.1 Illustration of the computation of variance using the first n = 20 data
values from Michelson’s speed of light data. Here xi are the data values, and the
sample mean is their sum divided by n: x̄ = 18 180/20 = 909 km s−1 . The
xi − x̄ are the deviations, which always sum to zero. The squared deviations are
positive (or zero) valued and sum to a non-negative number. The sum of squared
deviations divided by n − 1 gives the sample variance:
s2 = 209 180/19 = 11 009.47 km2 s−2.

i                        1        2      3        4      5   ···     20      sum

Data xi (km s−1)       850      740    900     1070    930   ···    960   18 180
xi − x̄ (km s−1)        −59     −169     −9      161     21   ···     51        0
(xi − x̄)2 (km2 s−2)   3481   28 561     81   25 921    441   ···   2601  209 180
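The whole of Table 2.1 can be reproduced in a few lines of R (using the built-in morley data as before):

```r
# Reproduce the Table 2.1 computation for the first 20 Michelson values
x    <- morley$Speed[1:20]
xbar <- mean(x)              # 909
dev  <- x - xbar             # the deviations
sum(dev)                     # 0 (to rounding): deviations always cancel
sum(dev^2)                   # 209180: sum of squared deviations
sum(dev^2) / (20 - 1)        # 11009.47: the sample variance, same as var(x)
```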

The sample variance is always non-negative (i.e. either zero or positive), and
will not have the same units as the data. If the xi are in units of kg, the sample mean
will have the same units (kg) but the sample variance will be in units of kg2. The
standard deviation is the positive square root of the sample variance, i.e. s = √(s2),
and has the same units as the data xi. Standard deviation is a measure of the typical
deviation of the data points from the sample mean. Sometimes this is called the
RMS: the root mean square (of the data after subtracting the mean).

Box 2.2
Why 1/(n − 1) in the sample variance?
The sample variance is normalised by a factor 1/(n − 1), where a factor 1/n might
seem more natural if we want the mean of the squared deviations. As discussed above,
the sum of the deviations (x − x̄) is always zero. Given the sample mean, the last
deviation can be found once we know the other n − 1 deviations, and so when we
average the squared deviations we divide by the number of independent elements, i.e.
n − 1. This is known as Bessel's correction.
Using 1/(n − 1) makes the resulting estimate unbiased. Bias is the difference
between an average statistic and the true value that it is supposed to estimate, and an
unbiased statistic gives the right result when given a sufficient amount of data (i.e. in
the limit of large n). For more details of the bias in the variance, see section 5.2.2 of
Barlow (1989), or any good book on mathematical statistics.

The variance, or standard deviation, gives us a measure of the spread of the data
in the sample. If we had two samples, one with s² = 1.0 and one with s² = 1.7,
we would know that the typical deviation (from the mean) is about 30% larger
in the second sample (recall that s = √s², and √1.7 ≈ 1.30).

R.Box 2.4
Variance and standard deviation
R has functions to calculate variances and standard deviations. For example, in order
to calculate the mean, variance and standard deviation of the numbers 1, 2, . . . , 50:

x <- 1:50
mean(x)
var(x)
sd(x)

Likewise to calculate the variance of the entire Michelson sample

Speed <- morley$Speed


var(Speed)

The first line defines a new array in order to save us having to use the prefix
morley$. . . every time we wish to access these data.

R.Box 2.5
Calculating with subarrays
If we want to calculate the variance for each of Michelson’s five ‘experiments’ (each
one is a block of 20 consecutive values) individually, we could use

mask <- morley$Expt == 2
mask
Speed[mask]
var(Speed[mask])

Note the use of the double equals sign (==) in testing for equality. The first line forms
an array mask, the same size as the Speed array, with values that are TRUE where the
condition is met (i.e. Expt == 2) and FALSE elsewhere. The second line prints this
mask. The third line forms a subarray from Speed by taking only those elements that
occur where mask is TRUE. The fourth line shows how to compute the variance of
this subset of the original data.
We can repeat this process using a loop as follows:
for (i in 1:5) {
print(var(Speed[morley$Expt==i]))
}

This looks quite complicated, so let's unpack it. The first part for (i in 1:5)
{...} defines a loop. The second part (inside the curly brackets) defines what is to
happen each time around the loop. The loop runs once for each of i = 1, 2, 3, 4, 5,

and each time round it prints the variance of the sample of data with the corresponding
experiment number i. The following may help illustrate the way loops are written
in R:

for (i in 1:10) { print(i) }

2.5 Min, max, quantiles and the five-number summary


A simple two-point indicator of the spread of a data sample is the pair (minimum,
maximum). Other measures of a sample commonly used in descriptive statistics
are quantiles. The α quantile is simply the data point below which a fraction α of
the data occur. The 0.25 quantile is then the value for which 25% of the
data points are lower. The 0.5 quantile is the median. Some quantiles have special
names, for example the 0.25, 0.5 and 0.75 quantiles are called the first, second
and third quartiles, respectively. The median is the second quartile. The difference
between the 0.75 and 0.25 quantiles is called the interquartile range (IQR). (Note
that the first and third quartiles can be obtained by splitting the data about the
median, and then finding the medians of the lower and upper halves.)
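R also has functions for quantiles directly: quantile(), median() and IQR(). (Note that quantile()'s default interpolation rule may differ slightly from the split-about-the-median recipe above.) For example:

```r
x <- 0:100
quantile(x, probs = c(0.25, 0.5, 0.75))  # first, second and third quartiles
median(x)                                # the 0.5 quantile: 50
IQR(x)                                   # 0.75 quantile minus 0.25 quantile: 50
```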
John Tukey (see Tukey, 1977) suggested a simple and compact five-number
summary of a univariate dataset, now known as the Tukey five-number summary.
This comprises the minimum, first quartile, median (second quartile), third quartile
and maximum values of a sample. From these five numbers, one can get a reasonable
impression of the way the data are distributed: the centre of the sample (median),
the way the central 50% of the data are spread around the median (IQR) and the
most extreme (lowest, highest) values in the sample.

R.Box 2.6
Tukey’s five-number summary
There are two functions in R to calculate variations on Tukey’s five-number summary.
The first is

fivenum(0:100)
fivenum(Speed)

Here the reported values for the first, second (median) and third quartiles are given as
the closest actual data values. There is a variation on this:

summary(0:100)
summary(Speed)

The two methods differ slightly in how the quartiles are calculated. Note that the
summary() command calculates the mean for free.

2.6 Error bars, standard errors and precision


From the above, we now have some numerical and graphical ways to summarise
data, and in particular its centre and spread. However, we still have not made any
attempt to quantify how precise these summaries might be. There are 100 values
in the Michelson dataset, divided into five experiments, each of 20 measurements.
For each of the experiments, we can calculate a mean and variance for the
20 measurements. From these, we may calculate the standard error on the sample
mean. Here it is:

SEx̄ = √(sx²/n)   (2.4)
which is the square root of the ratio of the sample variance, sx² (equation 2.3), to
the size of the sample, n. We shall not be concerned with where this formula comes
from until later chapters. For now, we consider it a useful, simple, approximate
formula for the uncertainty on the sample mean, x̄.
What is the meaning of the standard error? Imagine repeating an experiment n
times and, to get the ‘best’ result, taking the sample mean of the measurements,
x̄1 . We could repeat the whole set of n experiments and calculate another sample
mean, x̄2 , and so on. If we do this many times, we have a sample of mean values,
x̄j , each of which is an independent estimate of the population (‘true’) mean, μ.
The standard error is an estimate of the standard deviation of the sample means
from the expected (population) mean value. In other words, we expect the sample
means to be about one standard error from the population mean. Thus the standard
error gives us an idea of the precision of the sample mean. You can see that as
n increases, the standard error decreases; one would expect the precision of the
mean to improve as more data are acquired. In statistics the word precision (see
section 1.6) is sometimes used for the reciprocal of the variance of the data. The
precision of the mean x̄ is 1/SEx̄².
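To make this concrete, one can compute the standard error and the corresponding precision for the full Michelson sample (using the built-in morley data frame, as in the R boxes):

```r
# Standard error of the mean (equation 2.4) and its precision
x <- morley$Speed                # all 100 of Michelson's measurements
se <- sqrt(var(x) / length(x))   # standard error of the sample mean
prec <- 1 / se^2                 # precision: reciprocal of the squared SE
se
```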
Let’s look at the sample means and standard errors for the Michelson data divided
into five ‘experiments’. Figure 2.5 shows the sample means and their standard
errors. The standard errors are illustrated by error bars, which run from x̄ − SEx̄
to x̄ + SEx̄ . This figure summarises each of the five experiments in terms of two
numbers each, the mean and a measure of its precision, and the five experiments
can easily be compared with each other and the modern, accepted value.

R.Box 2.7
Standard errors in R
There is no single command to compute the standard error in R, but one may make
use of the var() function to make the calculation simple. For example, to compute
the mean, variance and standard error of the Michelson data

[Figure 2.5: sample means with error bars; vertical axis Speed − 299 000 (km s−1), horizontal axis Experiment 1–5.]

Figure 2.5 The sample means for each of the five ‘experiments’ of Michelson,
each comprising 20 measurements. The standard errors for each mean are indicated
by the error bars. Notice the sidebars at the end of each error bar. These help define
the ends of each error bar, but may clutter the graphic when there are a lot of data to
present. The dotted line shows the modern value for the speed of light in air. From
this graphic, one can start to make inferences about Michelson’s measurements.

x <- morley$Speed
mean(x)
var(x)
sqrt( var(x) / length(x) )

where the length(x) function returns the number of data points.

R.Box 2.8
Standard errors by group, part 1
It is possible to calculate a statistic (e.g. mean or variance) for each of the five
experiments in an efficient manner by first re-organising the data into a matrix. Once
this is done we can make use of some powerful matrix tools in R. In the following
example, the speed data are converted to a matrix with 20 rows (and therefore five
columns, one for each ‘experiment’) called speed.
speed <- matrix(morley$Speed, nrow=20)
speed
[,1] [,2] [,3] [,4] [,5]
[1,] 850 960 880 890 890
[2,] 740 940 880 810 840
[3,] 900 960 880 810 780

[4,] 1070 940 860 820 810
[5,] 930 880 720 800 760
[6,] 850 800 720 770 810
... ... ... ... ... ...

It is important to check that the matrix is arranged in the right way. Here we see all the
data from the first experiment in the first column – compare with the output of

morley$Speed[morley$Expt == 1]

R.Box 2.9
Standard errors by group, part 2
With the Michelson data arranged in a matrix, we can use the apply() command to
apply any function, e.g. mean() or var(), to every row or column of the matrix. For
example, to calculate the mean and variance of the data in each column, and then store
the results in new data objects, we can use
speed.mean <- apply(speed, 2, mean)
speed.var <- apply(speed, 2, var)
speed.var

The command apply(speed, 2, var) takes the matrix called speed and applies
the function var() to each of its columns to calculate the variance. You could also
use mean, sd, sum, or any other valid R command. The second argument (i.e. 2)
specifies that columns should be analysed. If instead we used 1, we would get the variance
over each row. This approach, applying the same function over rows or columns of an
array, is usually faster (on large datasets) and more elegant than using loops.
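For the particularly common cases of column and row means, base R also provides the dedicated functions colMeans() and rowMeans(); a quick cross-check shows they agree with the apply() approach:

```r
# colMeans() as a built-in equivalent of apply(speed, 2, mean)
speed <- matrix(morley$Speed, nrow = 20)
speed.mean <- apply(speed, 2, mean)
all.equal(colMeans(speed), speed.mean)  # TRUE
```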

R.Box 2.10
Standard errors by group, part 3
Finally, the standard errors for the five ‘experiments’ are just the square roots of these
variances divided by the number of data points in each experiment. We find the
number of data points in each column using the command apply() to apply the
length() function (we know the answer is 20).

speed.n <- apply(speed, 2, length)
se <- sqrt(speed.var / speed.n)
se
data.frame(speed.mean, speed.var, speed.n, se)

Remember that R is case sensitive, so se is not the same object as SE. The last line
uses the four new vectors (of the means, variance, lengths and standard errors) as

columns of a new object, a data frame (similar to a matrix but the columns may be
formed from different types of data).

R.Box 2.11
Plotting error bars
There are several ways to add error bars to a graphic in R. One way is using the
segments() command to draw a series of line segments between x − error and
x + error. If we have sample means with standard errors (as in the previous box), we
may plot them as follows:
Expt <- 1:length(speed.mean)
plot(Expt, speed.mean, ylim=c(780,950), pch=16,
bty="l", xlab="Experiment",
ylab="Speed - 299,000 (km/s)")
segments(Expt, speed.mean-se, Expt, speed.mean+se)

where the second line plots the data and the third line adds the error bars. The
segments command takes as its input segments(x0,y0,x1,y1) and draws lines
between coordinates (x0,y0) and (x1,y1). A variation on this is to use the arrows
command to give each error bar a sidebar (as in Figure 2.5):

arrows(Expt, speed.mean-se, Expt, speed.mean+se,
       code=3, angle=90, length=0.1)

where the first four arguments give the coordinates of the endpoints (as for the
segments() command), and the last three define two-sided arrows (code=3 means
draw an arrow head at both ends of the arrow), with flat arrow heads (angle=90) and
the extent of the arrow heads (length=0.1).

It is common in physical science to expect error bars accompanying data whenever
appropriate; they immediately allow the viewer to gauge the precision of the
estimate or measurement. What use is an estimate without any measure of how
reliable it is?

2.7 Plots of bivariate data


2.7.1 Scatter plot
So far we have considered only data that are records of the values of a single
variable, such as Michelson's speed of light measurements. However, a great deal of
data analysis concerns data with more than one variable, often one or more response
variables, observed or measured at different values of one or more explanatory
variables.

R.Box 2.12
Scatter plots in R
The R command plot() will produce a basic scatter plot from two (equal length)
arrays of numbers. The Hipparcos data shown in Figure 2.6 are described in
Appendix B (section B.4). Using the reduced data file hip_clean.txt we can
produce a simple plot

hip <- read.table("hip_clean.txt", header=TRUE)

This creates a data array called hip that contains the contents of the file: 14 columns
and 5740 rows of data. A simple scatter plot may be produced using

plot(hip$BV, hip$V)

However, with a little more effort we can do much better than this.

The simplest way to visualise data with two continuous variables is a scatter plot,
where each data point (pair of numbers) is treated as a coordinate and is marked
with a symbol on the x–y plane. Scatter diagrams are used to reveal relationships
between pairs of variables, and are among the most widely used diagrams in all
of science. They can be enormously powerful; indeed, some of the most important
diagrams and relations in science were discovered by examination of scatter plots.
Figure 2.6 shows one such example from astronomy. This is a Hertzsprung–
Russell diagram (sometimes known as a colour–magnitude diagram) and shows the
luminosity against colour index for a sample of nearby stars. Each point represents
a star, the horizontal position of the points represents the B − V colour index
(a simple measure of the colour of the star, which depends on its temperature),
and the vertical position represents the absolute magnitude (an upside-down and
logarithmic measure of the luminosity). When these two variables are used to
construct a scatter diagram for a sample of stars, it is clear there is a great deal of
structure in the data, patterns that would not be at all obvious by examination of a
table of numbers, or of graphical examination of either variable separately.

R.Box 2.13
Basic scatter plot design
The following command shows how to produce a better scatter plot:

plot(hip$BV, hip$V.abs, pch=1, cex=0.5, bty="n",
     ylim=c(16, -3), xlim=c(-0.3, 2.0),
     ylab="V.abs (mag)", xlab="B-V (mag)")

[Figure 2.6: scatter plot of V (mag), 0–15 with axis inverted, against B − V (mag), 0.0–2.0.]

Figure 2.6 Example of a scatter plot showing data on 5740 stars using data from
the Hipparcos astronomy satellite. Plotted is the V -band (green) absolute (distance
corrected) magnitude against the B − V colour index (difference between B and
V -band magnitudes, a blue–green colour). Each point represents a star: brighter
(smaller magnitude) stars are at the top, bluer stars are on the left. The plot clearly
reveals structure in the data: most stars fall in the band from top left to bottom
right, with a small island in the top right. This type of diagram is of fundamental
importance in stellar astrophysics. For comparison we also show the histograms
of each of the two variables (V and B − V ) separately. The structure in the data
is only apparent when looking at the two variables together using a scatter plot.

Here we have plotted Vabs, the absolute magnitude stored in the V.abs column (not
the apparent magnitude in the V column), against B − V . The pch=1 argument
selects a plot symbol (1 is a hollow circle); cex=0.5 makes the symbols smaller than
default. A small, hollow symbol was chosen here to reduce the clutter from the large
number of points to be plotted.
The option ylim=c(16, -3) sets the range of the vertical axis to run from 16 at
the bottom to −3 at the top. The xlim argument is used to control the horizontal axis
span. The arguments xlab and ylab are for setting the axis labels, and finally
bty="n" defines what type of box to enclose the plot in ("n" means no box).
For more information on the arguments that can be changed within the plot()
command, try ?plot and ?par.

How does one decide which observable to plot on the horizontal axis, and
which on the vertical axis? In an experiment one usually studies the response of
some variable(s) to changes in experimenter-controlled explanatory variables, in
which case the explanatory variable is plotted along the horizontal axis and the
response variable plotted along the vertical axis. However, it is often the case that
neither variable is obviously an explanatory variable. For example, if we recorded

Some people, said her mother, may laugh at my putting by the six-
pences and the penny’s every week, but I am sure we have a great
deal of comfort from it; and, it matters not who laughs, so long as we
are certain that we are doing right. I do not think that I should hoard
up a great many shillings and guineas as if I could get them, for they
are only desirable to make use of; but I know it to be my duty to do
the best I can with my little, and, while I do that, you may be sure I
shall not go to the rag-shop.
Ruin within the rag-shop stands,
And all who dare to enter,
With tattered bargains in their hands,
Repent so rash a venture.
TALE VI.

Polly Brown went one day to carry her grandmam a little
broth, for the poor can do good to others as well as the rich. Her
mother desired her to go carefully, not to stay by the way, and to
come strait back: she said she would. As she was going, she met
Sukey Playful and Dolly Careless: where are you going? said they.
To take my grandmam some broth. Come, said they, set it down a
little while, and have a run with us. O no, said she, I cannot now, my
mammy desired I would make haste; beside, the broth will be cold.
When a little girl knows what is right, she ought to listen to no
persuasions to do wrong. They told her, her mammy would never
know anything about it: that they were going to buy a half-penny
worth of apples, and would give her one if she would go with them.
Come, said they, you may set down the jug in this snug place, and
we shall soon be back again. At last she consented; but she had no
comfort as she went, nor when she had got her apple; for she
thought, if the jug should be thrown down, what should she do. They
made haste, but when they came back to the place, a dog had
thrown down the jug, and spilt all the broth. Polly began to cry most
terribly, and scolded Sukey and Dolly for persuading her to go, when
she might have recollected that it was her own fault for not minding
her duty. They were a good deal frightened: however, they said,
never mind it, as the jug is not broke, you can go home and tell your
mammy you took the broth, and, perhaps, she will never know any
thing about it. Polly dried her eyes, took up the jug, and went home;
but she was very uneasy, and felt that she did not like her play-
fellows half so well as she had done before, for they had now taught
her to do wrong. When she got home, well, said her mother, how
does your grandmam do, my dear, and how did she like the broth; for
I dare say she was hungry enough, poor soul, and would eat them
directly? Polly said, she was much as usual, and liked them very
well. All the day she was very dull, and found she could not work
with half so much pleasure as she used to do. At night, when she
went to bed, she was very uncomfortable indeed; she had been
taught always to tell the truth, as the only way to be happy herself, or
of any use to others. She now felt that she had deceived her mother,
and therefore did not deserve to be trusted by her. Thus she
continued very uneasy all the week: On Saturday night, when her
mother had done all her work, and washed the young children and
put them to bed, Polly, said she, I think I will just step and see how
your grandmother does: you, my dear, will take care of the house;
and mend a hole in your father’s stocking for to-morrow. You begin to
be a great help to me now, and I thank God that I have one child to
depend upon for a little comfort and assistance: be sure to take care
against the fire and candle, I shall soon be back again. She then
went out, but Polly’s heart was ready to break: she had always,
before, deserved her mother’s praise, and it was the next comfort
she had to the satisfaction of her own mind. But now she had
deceived her; she was miserable; she was going to be found out;
and she could no more expect to be trusted. The grandmother was
very glad to see her daughter, and began to enquire after all the
children, and particularly Polly, who, she said, was now a notable
little maid, and would soon, she hoped, be a great comfort to them
all. But child, said she, I am afraid you have raised no broth lately, for
you used to be so good as to send me some, and it is now many a
long day since I have had any. Mother, said she, you forget, we
made broth on Monday, and Polly brought you some then. Well, said
she, I believe my memory fails me, but I thought it had been longer.
Here is my neighbour Green, who brings in her wheel sometimes,
she has sat with me a good deal this week, it may be that she can
tell. Monday, Monday, let me see, said Betty Green; no, neighbour, I
am sure Polly brought none on Monday, for that was the day we
made some at our house, and I brought you a little of mine. Well,
said Polly’s mother, I do not know how it could be, but I will enquire
when I get home. I must now wish you a good night, for my husband
will want his supper. You have a shift here over the line that wants
mending I see: Polly is now very ready with her needle, they have
taught her so well in the charity-school. I am sure she will be glad to