CAMBRIDGE SERIES IN STATISTICAL AND PROBABILISTIC
MATHEMATICS
Editorial Board:
Already published
1. Bootstrap Methods and Their Application, A.C. Davison and D.V. Hinkley
2. Markov Chains, J. Norris
3. Asymptotic Statistics, A.W. van der Vaart
4. Wavelet Methods for Time Series Analysis, D.B. Percival and A.T. Walden
5. Bayesian Methods, T. Leonard and J.S.J. Hsu
6. Empirical Processes in M-Estimation, S. van de Geer
7. Numerical Methods of Statistics, J. Monahan
8. A User's Guide to Measure-Theoretic Probability, D. Pollard
9. The Estimation and Tracking of Frequency, B.G. Quinn and E.J. Hannan
Data Analysis and Graphics
Using R - an Example-based Approach
John Maindonald
Centre for Bioinformation Science, John Curtin School of Medical Research
and Mathematical Sciences Institute, Australian National University
and
John Braun
Department of Statistical and Actuarial Science University of Western Ontario
CAMBRIDGE
UNIVERSITY PRESS
PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge, United Kingdom
CAMBRIDGE UNIVERSITY PRESS
The Edinburgh Building, Cambridge CB2 2RU, UK
40 West 20th Street, New York, NY 10011-4211, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain
Dock House, The Waterfront, Cape Town 8001, South Africa
Reprinted 2004
A catalogue record for this book is available from the British Library
[D. A. Freedman]
Contents
Preface
1 A Brief Introduction to R
1.1 A Short R Session
1.1.1 R must be installed!
1.1.2 Using the console (or command line) window
1.1.3 Reading data from a file
1.1.4 Entry of data at the command line
1.1.5 Online help
1.1.6 Quitting R
1.2 The Uses of R
1.3 The R Language
1.3.1 R objects
1.3.2 Retaining objects between sessions
1.4 Vectors in R
1.4.1 Concatenation - joining vector objects
1.4.2 Subsets of vectors
1.4.3 Patterned data
1.4.4 Missing values
1.4.5 Factors
1.5 Data Frames
1.5.1 Variable names
1.5.2 Applying a function to the columns of a data frame
1.5.3* Data frames and matrices
1.5.4 Identification of rows that include missing values
1.6 R Packages
1.6.1 Data sets that accompany R packages
1.7* Looping
1.8 R Graphics
1.8.1 The function plot() and allied functions
1.8.2 Identification and location on the figure region
1.8.3 Plotting mathematical symbols
3.5 Recap
3.6 Further Reading
3.7 Exercises
4 An Introduction to Formal Inference
4.1 Standard Errors
4.1.1 Population parameters and sample statistics
4.1.2 Assessing accuracy - the standard error
4.1.3 Standard errors for differences of means
4.1.4* The standard error of the median
4.1.5* Resampling to estimate standard errors: bootstrapping
4.2 Calculations Involving Standard Errors: the t-Distribution
4.3 Confidence Intervals and Hypothesis Tests
4.3.1 One- and two-sample intervals and tests for means
4.3.2 Confidence intervals and tests for proportions
4.3.3 Confidence intervals for the correlation
4.4 Contingency Tables
4.4.1 Rare and endangered plant species
4.4.2 Additional notes
4.5 One-Way Unstructured Comparisons
4.5.1 Displaying means for the one-way layout
4.5.2 Multiple comparisons
4.5.3 Data with a two-way structure
4.5.4 Presentation issues
4.6 Response Curves
4.7 Data with a Nested Variation Structure
4.7.1 Degrees of freedom considerations
4.7.2 General multi-way analysis of variance designs
4.8* Resampling Methods for Tests and Confidence Intervals
4.8.1 The one-sample permutation test
4.8.2 The two-sample permutation test
4.8.3 Bootstrap estimates of confidence intervals
4.9 Further Comments on Formal Inference
4.9.1 Confidence intervals versus hypothesis tests
4.9.2 If there is strong prior information, use it!
4.10 Recap
4.11 Further Reading
4.12 Exercises
5 Regression with a Single Predictor
5.1 Fitting a Line to Data
5.1.1 Lawn roller example
5.1.2 Calculating fitted values and residuals
5.1.3 Residual plots
5.1.4 The analysis of variance table
5.2 Outliers, Influence and Robust Regression
References
Index of R Symbols and Functions
Index of Terms
Index of Names
Preface
This book is an exposition of statistical methodology that focuses on ideas and concepts,
and makes extensive use of graphical presentation. It avoids, as much as possible, the use
of mathematical symbolism. It is particularly aimed at scientists who wish to do
statistical analyses on their own data, preferably with reference as necessary to professional
statistical advice. It is intended to complement more mathematically oriented accounts of
statistical methodology. It may be used to give students with a more specialist statistical
interest exposure to practical data analysis.
The authors can claim, between them, 40 years of experience in working with researchers
from many different backgrounds. Initial drafts of the monograph were constructed from
notes that the first author prepared for courses for researchers, first of all at the University of
Newcastle (Australia) over 1996-1997, and greatly developed and extended in the course
of work in the Statistical Consulting Unit at The Australian National University over
1998-2001. We are grateful to those who have discussed their research with us, brought us
their data for analysis, and allowed us to use it in the examples that appear in the present
monograph. At least these data will not, as often happens once data have become the basis
for a published paper, gather dust in a long-forgotten folder!
We have covered a range of topics that we consider important for many different areas
of statistical application. This diversity of sources of examples has benefits, even for those
whose interests are in one specific application area. Ideas and applications that are useful in
one area often find use elsewhere, even to the extent of stimulating new lines of investigation.
We hope that our book will stimulate such cross-fertilization. As is inevitable in a book that
has this broad focus, there will be specific areas - perhaps epidemiology, or psychology, or
sociology, or ecology - that will regret the omission of some methodologies that they find
important.
We use the R system for the computations. The R system implements a dialect of the
influential S language that is the basis for the commercial S-PLUS system. It follows
S in its close linkage between data analysis and graphics. Its development is the result
of a co-operative international effort, bringing together an impressive array of statistical
computing expertise. It has quickly gained a wide following, among professionals and
non-professionals alike. At the time of writing, R users are restricted, for the most part, to a
command line interface. Various forms of graphical user interface will become available in
due course.
The R system has an extensive library of packages that offer state-of-the-art abilities.
Many of the analyses that they offer were not, 10 years ago, available in any of the standard
packages. What did data analysts do before we had such packages? Basically, they adapted
more simplistic (but not necessarily simpler) analyses as best they could. Those whose
skills were unequal to the task did unsatisfactory analyses. Those with more adequate skills
carried out analyses that, even if not elegant and insightful by current standards, were often
adequate. Tools such as are available in R have reduced the need for the adaptations that
were formerly necessary. We can often do analyses that better reflect the underlying science.
There have been challenging and exciting changes from the methodology that was typically
encountered in statistics courses 10 or 15 years ago.
The best any analysis can do is to highlight the information in the data. No amount of
statistical or computing technology can be a substitute for good design of data collection,
for understanding the context in which data are to be interpreted, or for skill in the use of
statistical analysis methodology. Statistical software systems are one of several components
of effective data analysis.
The questions that statistical analysis is designed to answer can often be stated simply.
This may encourage the layperson to believe that the answers are similarly simple.
Often, they are not. Be prepared for unexpected subtleties. Effective statistical analysis
requires appropriate skills, beyond those gained from taking one or two undergraduate
courses in statistics. There is no good substitute for professional training in modern tools
for data analysis, and experience in using those tools with a wide range of data sets. No
one should be embarrassed that they have difficulty with analyses that involve ideas that
professional statisticians may take 7 or 8 years of professional training and experience to
master.
Acknowledgements
Many different people have helped us with this project. Winfried Theis (University of
Dortmund, Germany) and Detlef Steuer (University of the Federal Armed Forces, Hamburg,
Germany) helped with technical aspects of working with LaTeX, with setting up a cvs server
to manage the LaTeX files, and with helpful comments. Lynne Billard (University of Georgia,
USA), Murray Jorgensen (University of Waikato, NZ) and Berwin Turlach (University of
Western Australia) gave valuable help in the identification of errors and text that required
clarification. Susan Wilson (Australian National University) gave welcome encouragement.
Duncan Murdoch (University of Western Ontario) helped set up the DAAG package. Thanks
also to Cath Lawrence (Australian National University) for her Python program that allowed
us to extract the R code, as and when required, from our LaTeX files. The failings that remain
are, naturally, our responsibility.
There are a large number of people who have helped with the providing of data sets.
We give a list, following the list of references for the data near the end of the book. We
apologize if there is anyone that we have inadvertently failed to acknowledge. Finally,
thanks to David Tranah of Cambridge University Press, for his encouragement and help in
bringing the writing of this monograph to fruition.
References
Gigerenzer, G., Swijtink, Z., Porter, T., Daston, L., Beatty, J. & Krüger, L. 1989. The Empire
of Chance. Cambridge University Press.
SAS Institute Inc. 1996. JMP Start Statistics. Duxbury Press, Belmont, CA.
These (and all other) references also appear in the consolidated list of references near the
end of the book.
Conventions
Text that is R code, or output from R, is printed in a verbatim text style. For example,
in Chapter 1 we will enter data into an R object that we call austpop. We will use the
plot() function to plot these data. The names of R packages, including our own DAAG
package, are printed in italics.
Starred exercises and sections identify more technical items that can be skipped at a first
reading.
Solutions to exercises
Solutions to selected exercises are available from the website
http://www.maths.anu.edu.au/~johnm/r-book.html
See also www.cambridge.org/0521813360
A Chapter by Chapter Summary
The fitted model determines fitted or predicted values of the signal. The residuals (which
estimate the noise component) are what remain after subtracting the fitted values from the
observed values of the signal.
The normal distribution is widely used as a model for the noise component.
Haphazardly chosen samples should be distinguished from random samples. Inference
from haphazardly chosen samples is inevitably hazardous. Self-selected samples are
particularly unsatisfactory.
The line or curve for the regression of a response variable y on a predictor x is different
from the line or curve for the regression of x on y. Be aware that the inferred relationship
is conditional on the values of the predictor variable.
The model matrix, together with estimated coefficients, allows for calculation of predicted
or fitted values and residuals.
Following the calculations, it is good practice to assess the fitted model using standard
forms of graphical diagnostics.
Simple alternatives to straight line regression using the data in their raw form are
transforming x and/or y,
using polynomial regression,
fitting a smooth curve.
For size and shape data the allometric model is a good starting point. This model assumes
that regression relationships among the logarithms of the size variables are linear.
Multiple lines are fitted as an interaction between the variable and a factor with as many
levels as there are different lines.
Scatterplot smoothing, and smoothing terms in multiple linear models, can also be handled
within the linear model framework.
Both principal components analysis, and discriminant analysis, allow the calculation of
scores, which are values of the principal components or discriminant functions, calculated
observation by observation. The scores may themselves be used as variables in, e.g., a
regression analysis.
This first chapter is intended to introduce readers to the basics of R. It should provide an
adequate basis for running the calculations that are described in later chapters.
In later chapters, the R commands that handle the calculations are, mostly, confined to
footnotes. Sections are included at the ends of several of the chapters that give further
information on the relevant features in R. Most of the R commands will run without change
in S-PLUS.
The first element is labeled [1] even when, as here, there is just one element! The >
indicates that R is ready for another command.
In a sense this chapter, and much of the rest of the book, is a discussion of what is
possible by typing in statements at the command line. Practice in the evaluation of arithmetic
expressions will help develop the needed conceptual and keyboard skills. Here are simple
examples:
Anything that follows a # on the command line is taken as a comment and ignored by R.
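The following is a minimal sketch of the kind of arithmetic that can be typed at the command line. The particular expressions, and the name interest, are illustrative assumptions rather than the examples printed in the book.

```r
# Each line is an expression that R evaluates and prints.
# Anything after a # is a comment and is ignored.
2 + 2                # addition
sqrt(10)             # square root of 10
2 * 3 * 4 * 5        # a product
pi                   # a built-in constant

# Interest on 1000 invested at 7.5% per year, compounded over 5 years
# (hypothetical figures, chosen only to show ^ and parentheses)
interest <- 1000 * (1 + 0.075)^5 - 1000
interest
```

Storing a result under a name, as with interest, makes it available to later commands in the same session.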
There is also a continuation prompt that appears when, following a carriage return, the
command is still not complete. By default, the continuation prompt is + (in this book we
will omit both the prompt (>) and the continuation prompt (+), whenever command line
statements are given separately from output).
This reads in the data, and stores them in the data frame ACTpop. The <- is a left angle
bracket (<) followed by a minus sign (-). It means "the values on the right are assigned to
the name on the left". Note the use of header=TRUE to ensure that R uses the first line to
get header information for the columns.
Type in ACTpop at the command line prompt, and the data will be displayed almost
as they appear in Table 1.1 (the only difference is the introduction of row labels in the
R output). The object ACTpop is an example of a data frame. Data frames are the usual
way for organizing data sets in R. More information about data frames can be found in
Section 1.5.
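Reading a file with read.table() and header=TRUE might look like the sketch below. To keep it self-contained, it first writes a small temporary file. The file name and the three rows of values are placeholders, not the contents of Table 1.1.

```r
# Create a small whitespace-delimited text file whose first line
# holds the column names (placeholder values, for illustration only)
tmp <- tempfile(fileext = ".txt")
writeLines(c("Year ACT",
             "1917 3",
             "1927 8",
             "1937 11"), tmp)

# header = TRUE tells R to take the column names from the first line
ACTpop <- read.table(tmp, header = TRUE)

ACTpop          # typing the name displays the data, with row labels added
names(ACTpop)   # the column names taken from the header line
class(ACTpop)   # "data.frame"
```

Without header=TRUE the first line would be read as data, and R would supply default column names.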
Case is significant for names of R objects or commands. Thus, ACTPOP is different from
ACTpop. (For file names on Microsoft Windows systems, the Windows conventions apply,
and case does not distinguish file names. On Unix systems letters that have a different case
are treated as different.)
We now plot the ACT population between 1917 and 1997 by using
The option pch=16 sets the plotting character to solid black dots. Figure 1.1 shows the
graph.
We can make various modifications to this basic plot. We can specify more informative
axis labels, change the sizes of the text and of the plotting symbol, add a title, and so on.
More information is given in Section 1.8.
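A plot of this kind, together with some of the modifications just mentioned, can be sketched as follows. The population values are invented placeholders, and the axis labels and title are assumptions; only pch=16 comes from the text.

```r
# Hypothetical data frame standing in for ACTpop (values are made up)
ACTpop <- data.frame(Year = seq(1917, 1997, by = 10),
                     ACT  = c(5, 10, 15, 20, 40, 100, 200, 260, 320))

# Basic plot: pch = 16 gives solid dots
plot(ACT ~ Year, data = ACTpop, pch = 16)

# The same plot with more informative labels, larger symbols and a title
plot(ACT ~ Year, data = ACTpop, pch = 16, cex = 1.5,
     xlab = "Year", ylab = "Population (thousands)",
     main = "Illustrative population growth")
```

The formula ACT ~ Year puts Year on the x-axis; plot(ACTpop$Year, ACTpop$ACT, pch=16) is an equivalent call.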
elasticband <- data.frame(stretch=c(46,54,48,50,44,42,52),
                          distance=c(148,182,173,166,109,141,166))
Different R implementations offer different choices of modes of access into the help pages
(thus Microsoft Windows systems offer a choice between a form of help that displays the
relevant help file, html help, and compiled html help. The choice between these different
modes of access is made at startup. See help(Rprofile) for details).
Two functions that can be highly useful in searching for functions that perform a desired
task are apropos() and help.search(). We can best explain their use by giving
specific examples. Thus try
apropos("sort") # Try, also, apropos("sor")
# This lists all functions whose names include the
# character string "sort".
help.search("sort") # Note that the argument is "sort"
# This lists all functions that have the word 'sort' as
# an alias or in their title.
Note also example(). This initiates the running of examples, if available, of the use
of the function specified by the function argument. For example:
example(image)
# for a 2 by 2 layout of the last 4 plots, precede with
# par(mfrow=c(2,2))
# to prompt for each new graph, precede with par(ask=T)
Much can be learned from experimenting with R functions. It may be helpful to create
a simple artificial data set with which to experiment. Another possibility is to work with a
subset of the data set to which the function will, finally, be applied. For extensive
experimentation it is best to create a new workspace where one can work with copies of any user
data sets and functions.
Among the abilities that are documented in the help pages, there will be some that bring
pleasant and unexpected surprises. There may be insightful and helpful examples. There
are often references to related functions. In most cases there are technical references that
give the relevant theory. While the help pages are not intended to be an encyclopedia on
statistical methodology, they contain much helpful commentary on the methods whose
implementation they document. It can help enormously, before launching into use of an R
function, to check the relevant help page!
1.1.6 Quitting R
One exits or quits R by using the q() function:
q()
There will be a message asking whether to save the workspace image. Clicking Yes
(the safe option) will save all the objects that remain in the working directory - any that
were there at the start of the session and any that were added (and not removed) during the
session.
Depending on the implementation, alternatives may be to click on the File menu and then
on Exit, or to click on the x in the top right hand corner of the R window. (Under Linux,
clicking on x exits from the program, but without asking whether to save the workspace
image.)
Note: In order to quit from the R session we had to type q(). This is because q is a
function. Typing q on its own, without the parentheses, displays the text of the function on
the screen. Try it!
As a first example, consider the data frame cars that is in the base package. This has
two columns (variables), with the names speed and dist. Typing in summary(cars)
gives summary information on these variables.
Thus, we can immediately see that the range of speeds (first column) is from 4 mph to
25 mph, and that the range of distances (second column) is from 2 feet to 120 feet.
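The ranges quoted above can be checked directly, as in this short sketch. Note that in current R releases the cars data frame is supplied by the datasets package, which is loaded at startup. The names speed_range and dist_range are introduced here for illustration.

```r
data(cars)        # cars ships with R (datasets package in current releases)

summary(cars)     # minimum, quartiles, mean and maximum for each column

# Extract the ranges explicitly
speed_range <- range(cars$speed)   # 4 to 25 (mph)
dist_range  <- range(cars$dist)    # 2 to 120 (feet)
speed_range
dist_range
```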
1.3.1 R objects
All R entities, including functions and data structures, exist as objects that can be operated
on as data. Type in ls() (or objects()) to see the names of all objects in the workspace.
One can restrict the names to those with a particular pattern, e.g., starting with the letter "p".
(Type in help(ls) and help(grep) for more details. The pattern-matching conventions
are those used for grep(), which is modeled on the Unix grep command. For example,
ls(pattern="p") lists all object names that include the letter "p". To get all object
names that start with the letter "p", specify ls(pattern="^p").) As noted earlier,
typing the name of an object causes the printing of its contents.
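The difference between the two patterns can be seen in a small sketch. So that the result does not depend on whatever is already in the workspace, the objects are created in a separate environment. The environment demo_env and the object names are hypothetical.

```r
# A scratch environment holding three objects (names are illustrative)
demo_env <- new.env()
assign("pop",   c(3, 8, 11), envir = demo_env)
assign("price", 9.5,         envir = demo_env)
assign("temp",  20,          envir = demo_env)

# Names that include the letter "p" anywhere
has_p <- ls(demo_env, pattern = "p")
has_p        # "pop" "price" "temp"

# Names that start with "p": ^ anchors the match at the beginning
starts_p <- ls(demo_env, pattern = "^p")
starts_p     # "pop" "price"
```

Called without its first argument, as in ls(pattern="^p"), the function searches the workspace.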
It is often possible and desirable to operate on objects - vectors, arrays, lists and so on - as
a whole. This largely avoids the need for explicit loops, leading to clearer code. Section 1.2
gave an example.
where the names of objects that are to be removed should appear in place of <obj1>,
<obj2>, .... For example, to remove the objects celsius and fahrenheit from the
workspace image before quitting, type
rm(celsius, fahrenheit)
q()
In general, we have left it to the reader to determine which objects should be removed once
calculations are complete.
1.4 Vectors in R
Vectors may have mode "logical", "numeric", "character" or "list". Examples of vectors
are
> c(T, F, F, F, T, T, F)
[1] TRUE FALSE FALSE FALSE TRUE TRUE FALSE
The first two vectors above are numeric, the third is logical, and the fourth is a character
vector. Note the use of the global variables F ( =FALSE) and T ( =TRUE) as a convenient
shorthand when logical values are entered.
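As an illustration of these modes, the sketch below builds two numeric vectors, a logical vector and a character vector, and checks each with mode(). The values and object names are assumptions chosen for the example.

```r
num1 <- c(2, 3, 5, 2, 7, 1)                # numeric
num2 <- 3:10                               # numeric (a sequence of integers)
logi <- c(TRUE, FALSE, FALSE, TRUE)        # logical
chr  <- c("Canberra", "Sydney", "Darwin")  # character

mode(num1)   # "numeric"
mode(num2)   # "numeric"
mode(logi)   # "logical"
mode(chr)    # "character"

# T and F are ordinary variables that can be reassigned, whereas
# TRUE and FALSE are reserved words, so the long forms are safer in scripts
identical(c(T, F), c(TRUE, FALSE))
```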
2. Use negative subscripts to omit the elements in nominated subscript positions (take
care not to mix positive and negative subscripts):
3. Specify a vector of logical values. The elements that are extracted are those for which
the logical value is TRUE. Thus, suppose we want to extract values of x that are greater
than 10.
> x > 10
[1] FALSE TRUE FALSE TRUE TRUE
> x[x > 10]
[1] 11 15 12
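The three ways of subsetting can be tried together. The vector x below is an assumption: its values were chosen to be consistent with the output shown above, since the definition of x is not given in this passage.

```r
x <- c(3, 11, 8, 15, 12)   # assumed values, consistent with the output above

# Positive subscripts pick out the nominated positions
pos <- x[c(2, 4)]          # 11 15

# Negative subscripts omit the nominated positions
neg <- x[-c(2, 3)]         # 3 15 12

# A logical vector keeps the elements where the value is TRUE
big <- x[x > 10]           # 11 15 12

pos
neg
big
```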
y <- c(1, NA, 3, 0, NA)
Note that any arithmetic operation or relation that involves NA generates an NA. Specifically,
be warned that y[y==NA] <- 0 leaves y unchanged. The reason is that all elements of
y==NA evaluate to NA. This does not identify an element of y, and there is no assignment.
To replace all NAs by 0, use the function is.na(), thus
y[is.na(y)] <- 0
The functions mean(), median(), range(), and a number of other functions, take the
argument na.rm=TRUE; i.e. remove NAs, then proceed with the calculation. By default,
these and related functions return NA when there are NAs. By default, the table() function
ignores NAs.
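The behavior of NA described above can be verified with a short sketch. The object names z, mean_default and mean_narm are introduced here for illustration.

```r
y <- c(1, NA, 3, 0, NA)

y == NA             # every element is NA, so nothing is identified
y[y == NA] <- 0     # no element is selected; y is left unchanged
y

is.na(y)            # FALSE TRUE FALSE FALSE TRUE

mean_default <- mean(y)                # NA, because y contains NAs
mean_narm    <- mean(y, na.rm = TRUE)  # mean of 1, 3 and 0

# Replace the NAs by 0 in a copy, leaving y itself intact
z <- y
z[is.na(z)] <- 0
z

table(y)            # counts of 0, 1 and 3 only; the NAs are not tabulated
```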
1.4.5 Factors
A factor is stored internally as a numeric vector with values 1, 2, 3, ..., k. The value k is
the number of levels. The levels are character strings.
Consider a survey that has data on 691 females and 692 males. If the first 691 are females
and the next 692 males, we can create a vector of strings that holds the values thus:
gender <- c(rep("female",691), rep("male",692))
We can change this vector to a factor by entering
gender <- factor(gender)
Internally, the factor gender is stored as 691 1s, followed by 692 2s. It has stored with it
a table that holds the information
1 female
2 male
In most contexts that seem to demand a character string, the 1 is translated into female
and the 2 into male. The values female and male are the levels of the factor. By default,
the levels are chosen to be in sorted order for the data type from which the factor was
formed, so that female precedes male. Hence:
> levels(gender)
[1] "female" "male"
Note that if gender had been an ordinary character vector, the outcome of the above
levels command would have been NULL. The order of the factor levels determines the
order of appearance of the levels in graphs and tables that use this information. To cause
male to come before female, use
gender <- factor(gender, levels=c("male", "female"))
This syntax is available both when the factor is first created, and later to change the order
in an existing factor. Take care that the level names are correctly spelled. For example,
specifying "Male" in place of "male" in the levels argument will cause all values
that were "male" to be coded as missing.
One advantage of factors is that the memory required for storage is less than for the
corresponding character vector when there are multiple values for each factor level, and the
levels are long character strings.
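The points made in this subsection can be checked with the sketch below, which follows the gender example. The names gender2 and bad are introduced here for illustration.

```r
# Character vector, then conversion to a factor
gender <- c(rep("female", 691), rep("male", 692))
gender <- factor(gender)

levels(gender)               # "female" "male" (sorted order by default)
table(as.integer(gender))    # 691 values stored as 1, 692 stored as 2

# Reorder the levels so that male comes first
gender2 <- factor(gender, levels = c("male", "female"))
levels(gender2)              # "male" "female"

# A misspelled level name: values that match no level become NA
bad <- factor(gender, levels = c("Male", "female"))
sum(is.na(bad))              # 692, i.e. all the "male" values
```

Reordering the levels changes the internal codes but not the labels attached to each observation.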
In each case, one can view the contents of the object type by entering type at the command
line, thus:
> type
[1] C L M Sm Sp V
Levels: C L M Sm Sp V
> attach(Cars93.summary)
# R can now access the columns of Cars93.summary directly
> abbrev
[1] C L M Sm Sp V
Levels: C L M Sm Sp V
> detach("Cars93.summary")
# Not strictly necessary, but tidiness is a good habit!
# In R, detach(Cars93.summary) is an acceptable alternative
Detaching data frames that are no longer in use reduces the risk of a clash of variable names,
e.g., two different attached data frames that have a column with the name abbrev, or an
abbrev both in the workspace and in an attached data frame.
In Windows versions, use of edit() allows access to a spreadsheet-like display of a
data frame or of a vector. Users can then directly manipulate individual entries or perform
data entry operations as with a spreadsheet. For example,
To close the spreadsheet, click on the File menu and then on Close.
data(airquality)
names(airquality)
[1] "Ozone" "Solar.R" "Wind" "Temp" "Month" "Day"
The names ( ) function serves a second purpose. To change the name of the abbrev
variable (the fourth column) in the Cars93.summary data frame to code, type
names(Cars93.summary)[4] <- "code"
The function na.omit() omits any rows that contain missing values. For example
newpossum <- na.omit(possum) # Has three fewer rows than possum
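The possum data require the DAAG package. The same ideas can be tried on the airquality data frame that ships with R, as in this sketch. The object names aq, clean and n_dropped, and the new column name "Temperature", are assumptions made for the example.

```r
data(airquality)

# names() both reports and changes column names; work on a copy
aq <- airquality
names(aq)                       # "Ozone" "Solar.R" "Wind" "Temp" "Month" "Day"
names(aq)[4] <- "Temperature"   # rename the fourth column
names(aq)

# na.omit() drops every row that has a missing value in any column
clean <- na.omit(airquality)
n_dropped <- nrow(airquality) - nrow(clean)
n_dropped                       # number of rows that contained at least one NA
```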
1.6 R Packages
The recommended R distribution includes a number of packages in its library. Note in
particular base, eda, ts (time series), and MASS. We will make frequent use both of the
MASS package and of our own DAAG package. DAAG, and other packages that are not
included with the default distribution, can be readily downloaded and installed.
Installed packages, unless loaded automatically, must then be loaded prior to use. The
base package is automatically loaded at the beginning of the session. To load any other
installed package, use the library() function. For example,
library(MASS) # Loads the MASS package
Replace "base" by the name of any other installed package, as required (type in
library() to get the names of the installed packages).
In order to bring any of these data frames into the workspace, the user must specifically
request it. (Ensure that the relevant package is loaded.) For example, to access the data set
airquality from the base package, type in
data(airquality) # Load airquality into the workspace
Such objects should be removed (rm(airquality)) when they are not for the time being
required. They can be loaded again as occasion demands.
1.7* Looping
A simple example of a for loop is
> for (i in 1:5) print(i)
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
Here is a possible way to estimate population growth rates for each of the Australian states
and territories:
data(austpop) # population figures for all
# Australian states
growth.rates <- numeric(8) # numeric(8) creates a numeric
# vector with 8 elements, all set
# equal to 0
for (j in seq(2,9)) {
  growth.rates[j-1] <- (austpop[9, j]-austpop[1, j]) /
    austpop[1, j] }
growth.rates <- data.frame(growth.rates)
row.names(growth.rates) <- names(austpop[c(-1,-10)])
# We have used row.names() to name the rows of the data frame
The result is
NSW
Vic
Qld
SA
WA
Tas
NT
ACT
Note that in contrast to the example in Subsection 1.5.2, we now have an inline function,
i.e. one that is defined on the fly and does not have or need a name. The effect is to assign
the columns of the data frame austpop[, -c(1,10)], in turn, to the function argument
x. With x replaced by each column in turn, the function returns (x[9]-x[1])/x[1].
In R there is often a better alternative, perhaps using one of the built-in functions, to
writing an explicit loop. Loops can incur severe computational overhead.
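The contrast between an explicit loop and an inline function passed to sapply() can be sketched on a small made-up data frame. The data frame pop, its columns A and B and their values are hypothetical stand-ins for austpop, which requires the DAAG package.

```r
# Hypothetical population figures for two regions at three dates
pop <- data.frame(year = c(1917, 1957, 1997),
                  A    = c(100, 150, 200),
                  B    = c(50, 60, 75))
n <- nrow(pop)

# Explicit loop over the data columns (columns 2 and 3)
rates_loop <- numeric(2)
for (j in 2:3) {
  rates_loop[j - 1] <- (pop[n, j] - pop[1, j]) / pop[1, j]
}
names(rates_loop) <- names(pop)[-1]

# The same calculation with an inline (anonymous) function:
# sapply() passes each column in turn as the argument x
rates_apply <- sapply(pop[, -1], function(x) (x[length(x)] - x[1]) / x[1])

rates_loop
rates_apply
```

The sapply() version carries the column names through automatically and needs no pre-allocated result vector.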
1.8 R Graphics
The functions plot(), points(), lines(), text(), mtext(), axis(),
identify(), etc. form a suite that plot graphs and add features to the graph. To see
some of the possibilities that R offers, enter
plot(y ~ x)
Readers might show the second of these graphs to their friends, asking them to identify the
pattern! By holding with the left mouse button on the lower border until a double sided arrow
appears and dragging upwards, the vertical dimension of the graph sheet can be shortened.
If sufficiently shortened, the pattern becomes obvious. The eye has difficulty in detecting
patterns of change where the angle of slope is close to the horizontal or close to the vertical.
Then try this:
attach(austpop)
plot(year, ACT, type="l") # Join the points ("l" = "line")
detach(austpop)
increases the text and plot symbol size 25% above the default. Adding mex=1.25 makes
room in the margin to accommodate the increased text size.
Figure 1.2: Brain weight versus body weight for five primates (points labeled Human, Chimp, Gorilla, Rhesus monkey and Potar monkey).
It is good practice to store the existing settings, so that they can be restored later. For this,
specify, e.g.,
oldpar <- par(cex=1.25, mex=1.25) # Use par(oldpar) to restore
# earlier settings
The size of the axis annotation can be controlled, independently of the setting of cex, by
specifying a value for cex.axis. Similarly, cex.lab may be used to control the
size of the axis labels.
Type in help(par) to get a list of all the parameter settings that are available with
par().
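Saving and restoring graphics settings can be sketched as follows. The plotted values and the particular cex.axis and cex.lab settings are arbitrary choices for illustration. When run non-interactively, R sends the plot to a file device rather than to the screen.

```r
# par() returns the previous values of the settings that it changes
oldpar <- par(cex = 1.25, mex = 1.25)

# Axis annotation and axis labels can be sized independently of cex
plot(1:10, (1:10)^2, pch = 16,
     cex.axis = 0.8,              # size of the tick-mark annotation
     cex.lab  = 1.2,              # size of the axis labels
     xlab = "x", ylab = "x squared")

par(oldpar)      # restore the earlier settings

par("cex")       # back to the earlier value (1 on a fresh device)
par("mex")
```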
Figure 1.3: Each of 17 panelists compared two milk samples for sweetness. One of the samples had
one unit of additive, while the other had four units of additive.
The resulting graph would be adequate for identifying points, but it is not a presentation
quality graph. We now note the changes that are needed to get Figure 1.2. In Figure 1.2 we
use the xlab (x-axis) and ylab (y-axis) parameters to specify meaningful axis titles. We
move the labeling to one side of the points by including appropriate horizontal and vertical
offsets. We multiply chw <- par()$cxy[1] by 0.1 to get a horizontal offset that is
one tenth of a character width, and similarly for chh <- par()$cxy[2] in a vertical
direction. We use pch=16 to make the plot character a heavy black dot. This helps make
the points stand out against the labeling.
Here is the R code for Figure 1.2:
attach(primates)
plot(x=Bodywt, y=Brainwt, pch=16, xlab="Body weight (kg)",
ylab="Brain weight (g)",xlim=c(0,300),ylim=c(0,1500))
chw <- par()$cxy[l]
chh <- par()$cxy[2]
text(x=Bodywt+chw, y=Brainwt+c(-.1,0,0,.1,0)*chh,
     labels=row.names(primates), adj=0)
detach(primates)
Where xlim and/or ylim is not set explicitly, the range of data values determines the
limits. In any case, the axis is by default extended by 4% relative to those limits.
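The 4% extension can be seen by inspecting par("usr"), which holds the extremes of the plotting region in user co-ordinates. The sketch below assumes the default axis style (xaxs="r", yaxs="r"); the two plotted points are arbitrary.

```r
# Two arbitrary points, with the same axis limits as in the Figure 1.2 code
plot(c(10, 200), c(100, 1300),
     xlim = c(0, 300), ylim = c(0, 1500))

# usr = c(x1, x2, y1, y2): the limits after extension
usr <- par("usr")
usr

# 4% of the x range (300) is 12, and 4% of the y range (1500) is 60,
# so the region runs from -12 to 312 and from -60 to 1560
```

Setting xaxs="i" or yaxs="i" in the plot() call suppresses the extension for that axis.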
Rug plots
The function rug() adds vertical bars, showing the distribution of data values, along one
or both of the x- and y-axes of an existing plot. Figure 1.3 has rugs on both the x- and
y-axes. Data were from a tasting session where each of 17 panelists assessed the sweetness
of each of two milk samples, one with four units of additive, and the other with one unit of
Points are in the eight distinct colors of the default palette, one of which is "white". These
are recycled as necessary.
The default palette is a small selection from the built-in colors. The function colors()
returns the 657 names of the built-in colors, some of them aliases for the same color. The
following repeats the plots above, but now using the first 100 of the 657 built-in colors.
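A minimal sketch of plotting with the first 100 built-in colors. The layout (a single row of 100 points) is an assumption made for illustration, not the original listing. The contents of the default palette depend on the R version, so inspect palette() rather than relying on a fixed list.

```r
pdf(NULL)                                 # graphics device that writes no file
first100 <- colors()[1:100]               # names of the first 100 built-in colors
plot(1:100, rep(1, 100), col = first100,  # one point per color
     pch = 16, cex = 2, yaxt = "n", xlab = "Index into colors()", ylab = "")
dev.off()
n_builtin <- length(colors())             # total number of built-in color names
```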
In either case, the user positions the cursor at the location for which co-ordinates are required,
and clicks the left mouse button. Depending on the platform, the identification or labeling
of points may be terminated by pointing outside of the graphics area and clicking, or by
clicking with a button other than the first. The process will anyway terminate after some
default number n of points, which the user can set. (For identify() the default setting
is the number of data points, while for locator() the default is 500.)
1. A Brief Introduction to R
Figure 1.5: Total length of possums versus age, for each combination of population (the Australian
state of Victoria or other) and sex (female or male). Further details of these data are in Subsection 2.1.1.
attach(Animals)
plot(body, brain)
plot(sqrt(body), sqrt(brain))
plot((body)^0.1, (brain)^0.1)
plot(log(body), log(brain))
detach("Animals")
par(mfrow=c(1,1))   # Restore to 1 figure per page
> library(lattice)
> data(possum)                       # DAAG must be loaded
> table(possum$Pop, possum$sex)      # Graph reflects layout of this
                                     # table
         f  m
  Vic   24 22
  other 19 39
> xyplot(totlngth ~ age | sex*Pop, data=possum)
Note that, as we saw in Subsection 1.5.4, there are missing values for age in rows 44 and
46 that xyplot() has silently omitted. The factors that determine the layout of the panels,
i.e., sex and Pop in Figure 1.5, are known as conditioning variables.
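The role of conditioning variables can be tried without the possum data. The sketch below assumes that the lattice package is installed and uses a made-up data frame (toy, with columns len, age, sex and Pop, all hypothetical) that includes two missing ages.

```r
library(lattice)
set.seed(1)
toy <- data.frame(
  age = rep(1:5, times = 4),
  sex = factor(rep(c("f", "m"), each = 5, times = 2)),
  Pop = factor(rep(c("Vic", "other"), each = 10))
)
toy$len <- 80 + 2 * toy$age + rnorm(20)   # made-up lengths
toy$age[c(3, 7)] <- NA                    # rows with missing age are not plotted
p <- xyplot(len ~ age | sex * Pop, data = toy)
panel_layout <- dim(p)                    # number of levels of each conditioning variable
pdf(NULL)                                 # graphics device that writes no file
print(p)                                  # lattice plots are drawn when printed
dev.off()
```

One panel is drawn for each combination of the levels of sex and Pop, matching the layout of table(toy$Pop, toy$sex).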
There will be further discussion of the lattice package in Subsection 2.1.5. It has functions
that offer a similar layout for many different types of plot. To see further examples of the
use of xyplot(), and of some of the other lattice functions, type in
example(xyplot)
It is the relative sizes of these parameters that matter for screen display or for incorporation into Word and similar programs.
Once pasted (from the clipboard) or imported into Word, graphs can be enlarged or shrunk by pointing at one corner, holding
down the left mouse button, and pulling.
1.9 Additional Points on the Use of R in This Book
different classes of object. For example, one can give a data frame as the argument to
plot. Try