HANDBOOK OF STATISTICAL ANALYSIS
AND DATA MINING APPLICATIONS
SECOND EDITION
AUTHORS
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying, recording, or any information storage and retrieval system, without
permission in writing from the publisher. Details on how to seek permission, further information about the
Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center
and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other
than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our
understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using
any information, methods, compounds, or experiments described herein. In using such information or methods
they should be mindful of their own safety and the safety of others, including parties for whom they have a
professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability
for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or
from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
ISBN 978-0-12-416632-5
LIST OF TUTORIALS ON THE ELSEVIER COMPANION WEB PAGE

Note: This list includes all the extra tutorials published with the 1st edition of this handbook (2009). These can be considered “enrichment” tutorials for readers of this 2nd edition. Since the 1st edition of the handbook will not be available after the release of the 2nd edition, these extra tutorials are carried over in their original format/versions of software, as they are still very useful in learning and understanding data mining and predictive analytics, and many readers will want to take advantage of them.

List of Extra Enrichment Tutorials that are only on the ELSEVIER COMPANION web page, with data sets as appropriate, for downloading and use by readers of this 2nd edition of the handbook:

1. TUTORIAL “O” - Boston Housing Using Regression Trees [Field: Demographics]
2. TUTORIAL “P” - Cancer Gene [Field: Medical Informatics & Bioinformatics]
3. TUTORIAL “Q” - Clustering of Shoppers [Field: CRM - Clustering Techniques]
4. TUTORIAL “R” - Credit Risk using Discriminant Analysis [Field: Financial - Banking]
5. TUTORIAL “S” - Data Preparation and Transformation [Field: Data Analysis]
6. TUTORIAL “T” - Model Deployment on New Data [Field: Deployment of Predictive Models]
7. TUTORIAL “V” - Heart Disease Visual Data Mining Methods [Field: Medical Informatics]
8. TUTORIAL “W” - Diabetes Control in Patients [Field: Medical Informatics]
9. TUTORIAL “X” - Independent Component Analysis [Field: Separating Competing Signals]
10. TUTORIAL “Y” - NTSB Aircraft Accidents Reports [Field: Engineering - Air Travel - Text Mining]
11. TUTORIAL “Z” - Obesity Control in Children [Field: Preventive Health Care]
12. TUTORIAL “AA” - Random Forests Example [Field: Statistics - Data Mining]
13. TUTORIAL “BB” - Response Optimization [Field: Data Mining - Response Optimization]
14. TUTORIAL “CC” - Diagnostic Tooling and Data Mining: Semiconductor Industry [Field: Industry - Quality Control]
15. TUTORIAL “DD” - Titanic - Survivors of Ship Sinking [Field: Sociology]
16. TUTORIAL “EE” - Census Data Analysis [Field: Demography - Census]
17. TUTORIAL “FF” - Linear & Logistic Regression - Ozone Data [Field: Environment]
18. TUTORIAL “GG” - R-Language Integration - DISEASE SURVIVAL ANALYSIS Case Study [Field: Survival Analysis - Medical Informatics]
19. TUTORIAL “HH” - Social Networks Among Community Organizations [Field: Social Networks - Sociology & Medical Informatics]
20. TUTORIAL “II” - Nairobi, Kenya Baboon Project: Social Networking
Foreword 1 for 1st Edition

This book will help the novice user become familiar with data mining. Basically, data mining is doing data analysis (or statistics) on data sets (often large) that have been obtained from potentially many sources. As such, the miner may not have control of the input data but must rely on the sources that gathered the data. Consequently, there are problems that every data miner must be aware of as he or she begins (or completes) a mining operation. I strongly resonated with the material on “The Top 10 Data Mining Mistakes,” which gives a worthwhile checklist:
• Ensure you have a response variable and predictor variables, and that they are correctly measured.
• Beware of overfitting. With scads of variables, it is easy with most statistical programs to fit incredibly complex models, but they cannot be reproduced. It is good to save part of the sample to use to test the model. Various methods are offered in this book.
• Don't use only one method. Using only linear regression can be a problem. Try dichotomizing the response or categorizing it to remove nonlinearities in the response variable. Often, there are clusters of values at zero, which messes up any normality assumption. This, of course, loses information, so you may want to categorize a continuous response variable and use an alternative to regression. Similarly, predictor variables may need to be treated as factors rather than linear predictors. A classic example is using marital status or race as a linear predictor when there is no order.
• Asking the wrong question: when looking for a rare phenomenon, it may be helpful to identify the most common pattern. These may lead to complex analyses, as in item 3, but they may also be conceptually simple. Again, you may need to take care that you don't overfit the data.
• Don't become enamored with the data. There may be a substantial history from earlier data or from domain experts that can help with the modeling.
• Be wary of using an outcome variable (or one highly correlated with the outcome variable) and becoming excited about the result. The predictors should be “proper” predictors in the sense that they (a) are measured prior to the outcome and (b) are not a function of the outcome.
• Do not discard outliers without solid justification. That an observation is out of line with others is insufficient reason to ignore it. You must check the circumstances that led to the value. In any event, it is useful to conduct the analysis with the observation(s) included and excluded to determine the sensitivity of the results to the outlier.
• Extrapolating is a fine way to go broke; the best example is the stock market. Stick within your data, and if you must go outside, add plenty of caveats. Better still, restrain the impulse to extrapolate. Beware that pictures are often far too simple and we can be misled. Political campaigns oversimplify complex problems (“my opponent wants to raise taxes”; “my
opponent will take us to war”) when the realities may imply we have some infrastructure needs that can be handled only with new funding or we have been attacked by some bad guys.
Be wary of your data sources. If you are combining several sets of data, they need to meet a few standards:
• The definitions of variables that are being merged should be identical. Often, they are close but not exact (especially in meta-analysis, where clinical studies may have somewhat different definitions due to different medical institutions or laboratories).
• Be careful about missing values. Often, when multiple data sets are merged, missing values can be induced: one variable isn't present in another data set, or what you thought was a unique variable name was slightly different in the two sets, so you end up with two variables that both have a lot of missing values.
• How you handle missing values can be crucial. In one example, I used complete cases and lost half of my sample; all variables had at least 85% completeness, but when put together, the sample lost half of the data. The residual sum of squares from a stepwise regression was about 8. When I included more variables using mean replacement, almost the same set of predictor variables surfaced, but the residual sum of squares was 20. I then used multiple imputation and found approximately the same set of predictors but had a residual sum of squares (median of 20 imputations) of 25. I find that mean replacement is rather optimistic but surely better than relying on only complete cases. Using stepwise regression, I find it useful to replicate it with a bootstrap or with multiple imputations. However, with large data sets, this approach may be expensive computationally.
To conclude, there is a wealth of material in this handbook that will repay study.

Peter A. Lachenbruch
Oregon State University, Corvallis, OR, United States
American Statistical Association, Alexandria, VA, United States
Johns Hopkins University, Baltimore, MD, United States
UCLA, Los Angeles, CA, United States
University of Iowa, Iowa City, IA, United States
University of North Carolina, Chapel Hill, NC, United States
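The foreword's remark that variables with at least 85% completeness can still cost half the sample can be illustrated with a small simulation. The sketch below is not the author's analysis: the data are synthetic, the helper names (`make_data`, `complete_cases`, `mean_impute`) are hypothetical, and it assumes that values go missing independently in each of four variables. Under that assumption the expected share of complete cases is 0.85^4, or about 0.52. The last lines also show that mean replacement shrinks a column's variance, which may be one reason it can look optimistic.

```python
import random
import statistics


def make_data(n_rows, n_vars, completeness, seed=0):
    """Simulate numeric rows; each value is independently missing (None)
    with probability 1 - completeness. Independence is an assumption."""
    rng = random.Random(seed)
    return [
        [rng.gauss(50, 10) if rng.random() < completeness else None
         for _ in range(n_vars)]
        for _ in range(n_rows)
    ]


def complete_cases(rows):
    """Keep only the rows that have no missing value in any column."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_impute(rows):
    """Replace each missing value with the mean of the observed values
    in the same column (simple mean replacement)."""
    n_vars = len(rows[0])
    col_means = [
        statistics.fmean(row[j] for row in rows if row[j] is not None)
        for j in range(n_vars)
    ]
    return [
        [v if v is not None else col_means[j] for j, v in enumerate(row)]
        for row in rows
    ]


rows = make_data(n_rows=2000, n_vars=4, completeness=0.85)
kept = complete_cases(rows)
imputed = mean_impute(rows)

print(f"expected complete fraction: {0.85 ** 4:.3f}")
print(f"observed complete fraction: {len(kept) / len(rows):.3f}")
print(f"rows kept by mean replacement: {len(imputed)}")

# Mean replacement keeps the column mean but shrinks its spread.
observed_col = [row[0] for row in rows if row[0] is not None]
imputed_col = [row[0] for row in imputed]
print(f"variance, observed values only: {statistics.pvariance(observed_col):.1f}")
print(f"variance, after mean replacement: {statistics.pvariance(imputed_col):.1f}")
```

Multiple imputation, which the foreword also mentions, is deliberately left out of this sketch; it needs a model for the missing values and is better served by a dedicated statistics package.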
Foreword 2 for 1st Edition
However, the book is best read a few chapters at a time while actively doing the data mining rather than read cover to cover (a daunting task for a book this size). Practitioners will appreciate tutorials that match their business objectives and choose to ignore other tutorials. They may choose to read sections on a particular algorithm to increase insight into that algorithm and then decide to add a second algorithm after the first is mastered. For those new to a particular software tool highlighted in the tutorials section, the step-by-step approach will operate much like a user's manual. Many chapters stand well on their own, such as the excellent “History of Statistics and Data Mining” chapter and chapters 16, 17, and 18. These are broadly applicable and should be read by even the most experienced data miners.
The Handbook of Statistical Analysis and Data Mining Applications is an exceptional book that should be on every data miner's bookshelf or, better yet, found lying open next to their computer.

Dean Abbott
Abbott Analytics, San Diego, CA, United States
Preface
turn the key in the ignition, step on the gas and the brake at the right times, and turn the wheel to change direction in a safe manner, and voilà, you are an expert user of the very complex technology under the hood. The other half of the story is the instruction manual and the driver's education course that help you to learn how to drive.
This book provides the instruction manual and a series of tutorials to train you how to do data mining in many subject areas. We provide both the right tools and the right intuitive explanations (rather than formal mathematical definitions) of the data mining process and algorithms, which will enable even beginner data miners to grasp the basic concepts necessary to understand what they are doing. In addition, we provide many tutorials in many different industries and businesses (using many of the most common data mining tools) to show how to do it.

OVERALL ORGANIZATION OF THIS BOOK

We have divided the chapters in this book into four parts to guide you through the aspects of predictive analytics. Part I covers the history and process of predictive analytics. Part II discusses the algorithms and methods used. Part III is a group of tutorials, which serve in principle as Rome served: as the central governing influence. Part IV presents some advanced topics. The central theme of this book is the education and training of beginning data mining practitioners, not the rigorous academic preparation of algorithm scientists. Hence, we located the tutorials in the middle of the book in Part III, flanked by topical chapters in Parts I, II, and IV.
This approach is “a mile wide and an inch deep” by design, but there is a lot packed into that inch. There is enough here to stimulate you to take deeper dives into theory, and there is enough here to permit you to construct “smart enough” business operations with a relatively small amount of the right information. James Taylor developed this concept for automating operational decision-making in the area of enterprise decision management (Raden and Taylor, 2007). Taylor recognized that companies need decision-making systems that are automated enough to keep up with the volume and time-critical nature of modern business operations. These decisions should be deliberate, precise, and consistent across the enterprise; smart enough to serve immediate needs appropriately; and agile enough to adapt to new opportunities and challenges in the company. The same concept can be applied to nonoperational systems for customer relationship management (CRM) and marketing support. Even though a CRM model for cross-sell may not be optimal, it may enable several times the response rate in product sales following a marketing campaign. Models like this are “smart enough” to drive companies to the next level of sales. When models like this are proliferated throughout the enterprise to lift all sales to the next level, more refined models can be developed to do even better. This enterprise-wide “lift” in intelligent operations can drive a company through evolutionary rather than revolutionary changes to reach long-term goals. Companies can leverage “smart enough” decision systems to do likewise in their pursuit of optimal profitability in their business.
Clearly, the use of this book and these tools will not make you experts in data mining. Nor will the explanations in the book permit you to understand the complexity of the theory behind the algorithms and methodologies so necessary for the academic student. But we will conduct you through a relatively thin slice across the wide practice of data mining in many industries and disciplines. We can show you how to create powerful
predictive models in your own organization in a relatively short period of time. In addition, this book can function as a springboard to launch you into higher-level studies of the theory behind the practice of data mining. If we can accomplish those goals, we will have succeeded in taking a significant step in bringing the practice of data mining into the mainstream of business analysis.
The three coauthors could not have done this book completely by themselves, and we wish to thank the following individuals, with the disclaimer that we apologize if, by our neglect, we have left out of this “thank-you list” anyone who contributed.
Foremost, we would like to thank acquisitions editor (name to use?) and others (names). Bob Nisbet would like to honor and thank his wife, Jean Nisbet, PhD, who blasted him off in his technical career by retyping his PhD dissertation five times (before word processing) and assumed much of the family's burdens during the writing of this book. Bob also thanks Dr. Daniel B. Botkin, the famous global ecologist, for introducing him to the world of modeling and exposing him to the distinction between viewing the world as machine and viewing it as organism. And thanks are due to Ken Reed, PhD, for inducting Bob into the practice of data mining.
Coauthor Gary Miner wishes to thank his wife, Linda A. Winters-Miner, PhD, who has been working with Gary on similar books over the past 30 years and wrote several of the tutorials included in this book, using real-world data. Gary also wishes to thank the following people from his office who helped in various ways, including Angela Waner, Jon Hillis, Greg Sergeant, and Dr. Thomas Hill, who gave permission to use and also edited a group of the tutorials that had been written over the years by some of the people listed as guest authors in this book. Dr. Dave Dimas, of the University of California, Irvine, has also been very helpful in providing suggestions for enhancements for this second edition. THANK YOU DAVE!!!
Without all the help of the people mentioned here and maybe many others we failed to specifically mention, this book would never have been completed. Thanks to you all!

Bob Nisbet
Gary Miner
Ken Yale

Reference
Raden, N., Taylor, J., 2007. Smart Enough Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions. Prentice Hall, NJ, ISBN: 9780132713061.
Introduction
Often, data analysts are asked, “What are statistical analysis and data mining?” In this book, we will define what data mining is from a procedural standpoint. But most people have a hard time relating what we tell them to the things they know and understand. Before moving on into the book, we would like to provide a little background for data mining that everyone can relate to. The Preface describes the many changes in activities related to data mining since the first edition of this book was published in 2009. Now, it is time to dig deeper and discuss the differences between statistical analysis and data mining (aka predictive analytics).
Statistical analysis and data mining are two methods for simulating the unconscious operations that occur in the human brain to provide a rationale for decision-making and actions. Statistical analysis is a very directed rationale that is based on norms. We all think and make decisions on the basis of norms. For example, we consider (unconsciously) what the norm is for dress in a certain situation. Also, we consider the acceptable range of variation in dress styles in our culture. Based on these two concepts, the norm and the variation around that norm, we render judgments like “that man is inappropriately dressed.” Using similar concepts of mean and standard deviation, statistical analysis proceeds in a very logical way to make very similar judgments (in principle). On the other hand, data mining learns case by case and does not use means or standard deviations. Data mining algorithms build patterns, clarifying the pattern as each case is submitted for processing. These are two very different ways of arriving at the same conclusion, a decision. We will introduce some basic analytic history and theory in Chapters 1 and 2.
The basic process of analytic modeling is presented in Chapter 3. But it may be difficult for you to relate what is happening in the process without some sort of tie to the real world that you know and enjoy. In many ways, the decisions served by analytic modeling are similar to those we make every day. These decisions are based partly on patterns of action formed by experience and partly on intuition.

PATTERNS OF ACTION

A pattern of action can be viewed in terms of the activities of a hurdler on a race track. The runner must start successfully and run to the first hurdle. He must decide very quickly how high to jump to clear the hurdle. He must decide when and in what sequence to move his legs to clear the hurdle with minimum effort and without knocking it down. Then, he must run a specified distance to the next hurdle and do it all over again several times, until he crosses the finish line. Analytic modeling is a lot like that.
The training of the hurdler's “model” of action to run the race happens in a series of operations:
• Run slowly at first.
• Practice takeoff from different positions to clear the hurdle.
• Practice different ways to move the legs.
• Determine the best ways to do each activity.
• Practice the best ways for each activity over and over again.
This practice trains the sensory and motor neurons to function together most efficiently. Individual neurons in the brain are “trained” in practice by adjusting signal strengths and firing thresholds of the motor nerve cells. The performance of a successful hurdler follows the “model” of these activities and the process of coordinating them to run the race. Creation of an analytic “model” of a business process to predict a desired outcome follows a very similar path to the training regimen of a hurdler. We will explore this subject further in Chapter 3 and apply it to develop a data mining process that expresses the basic activities and tasks performed in creating an analytic model.

HUMAN INTUITION

In humans, the right side of the brain is the center for visual and esthetic sensibilities. The left side of the brain is the center for quantitative and time-regulated sensibilities. Human intuition is a blend of both sensibilities. This blend is facilitated by the neural connections between the right side of the brain and the left side. In women, the number of neural connections between the right and left sides of the brain is 20% greater (on average) than in men. This higher connectivity of women's brains enables them to exercise intuitive thinking to a greater extent than men. Intuition “builds” a model of reality from both quantitative building blocks and visual sensibilities (and memories).

PUTTING IT ALL TOGETHER

Biological taxonomy students claim (in jest) that there are two kinds of people in taxonomy: those who divide things up into two classes (for dichotomous keys) and those who don't. Along with this joke is a similar recognition from the outside that taxonomists are divided also into two classes: the “lumpers” (who combine several species into one) and the “splitters” (who divide one species into many). These distinctions point to a larger dichotomy in the way people think. In ecology, there used to be two schools of thought: autoecologists (chemistry, physics, and mathematics explain all) and the synecologists (organism relationships in their environment explain all). It wasn't until the 1970s that these two schools of thought learned that both perspectives were needed to understand the complexities in ecosystems (but more about that later). In business, there are the “big picture” people versus “detail” people. Some people learn by following an intuitive pathway from general to specific (deduction). Often, we call them “big picture” people. Other people learn by following an intuitive pathway from specific to general (induction). Often, we call them “detail” people. Similar distinctions are reflected in many aspects of our society. In Chapter 1, we will explore this distinction in greater depth with regard to the development of statistical and data mining theory through time.
Many of our human activities involve finding patterns in the data input to our sensory systems. An example is the mental pattern that we develop by sitting in a chair in the middle of a shopping mall and making some judgment about patterns among its clientele. In one mall, people of many ages and races may intermingle. You might conclude from this pattern that this mall is located in an ethnically diverse area. In another mall, you might see a very different pattern. In one mall in Toronto, a great many of the stores had Chinese titles and script on the windows. One observer noticed that he was the only non-Asian seen for a half hour. This led to the conclusion that the mall catered to the Chinese community and was owned
(probably) by a Chinese company or person. Statistical methods employed in testing this “hypothesis” would include the following:
• Performing a survey of customers to gain empirical data on race, age, length of time in the United States, etc.
• Calculating means (averages) and standard deviations (an expression of the average variability of all the customers around the mean).
• Using the mean and standard deviation for all observations to calculate a metric (e.g., Student's t-value) to compare with standard tables.
• If the metric exceeds the standard table value, this attribute (e.g., race) is present in the data at a higher rate than expected at random.
More advanced statistical techniques can accept data from multiple attributes and process them in combination to produce a metric (e.g., average squared error), which reflects how well a subset of attributes (selected by the processing method) predicts the desired outcome. This process “builds” an analytic equation, using standard statistical methods. This analytic “model” is based on averages across the range of variation of the input attribute data. This approach to finding the pattern in the data is basically a deductive, top-down process (general to specific). The general part is the statistical model employed for the analysis (i.e., normal parametric model). This approach to model building is very “Platonic.” In Chapter 1, we will explore the distinctions between Aristotelian and Platonic approaches for understanding truth in the world around us.

Part I: Introduction and overview of data mining processes.
Both statistical analysis and data mining algorithms operate on patterns: statistical analysis uses a predefined pattern (i.e., the parametric model) and compares some measure of the observations to standard metrics of the model. We will discuss this approach in more detail in Chapter 1. Data mining doesn't start with a model; it builds a model with the data. Thus, statistical analysis uses a model to characterize a pattern in the data; data mining uses the pattern in the data to build a model. The statistical approach uses deductive reasoning, following an Aristotelian approach to truth. From the “model” accepted in the beginning (based on the mathematical distributions assumed), outcomes are deduced. On the other hand, data mining methods discover patterns in data inductively, rather than deductively, following a more Platonic approach to truth. We will unpack this distinction to a much greater extent in Chapter 1.
Which is the best way to do it? The answer is: it depends. It depends on the data. Some data sets can be analyzed better with statistical analysis techniques, and other data sets can be analyzed better with data mining techniques. How do you know which approach to use for a given data set? Much ink has been devoted to paper to try to answer that question. We will not add to that effort. Rather, we will provide a guide to general analytic theory (Chapter 2) and broad analytic procedures (Chapter 3) that can be used with techniques for either approach. For the sake of simplicity, we will refer to the joint body of techniques as analytics. In Chapter 4, we introduce some of the many data preparation procedures for analytics.
Chapter 5 presents various methods for selecting candidate predictor variables to be used in a statistical or machine-learning model (differences between statistical and machine-learning methods of model building are discussed in Chapter 1). Chapter 6 introduces accessory tools and some advanced features of many data mining tools.

Part II: Basic and advanced algorithms, and their application to common problems.
Chapters 7 and 8 discuss various basic and advanced algorithms used in data mining modeling applications. Chapters 9 and 10
discuss the two general types of models, classification and prediction. Chapter 11 presents some methods for evaluating and refining analytic models. Chapters 12–15 describe how data mining methods are applied to four common applications. Part III contains a group of tutorials that show how to apply various data mining tools to solve common problems. Part IV discusses various issues of model complexity, ethical use, and advanced processes. Chapter 16 describes the paradox of complexity. Chapter 17 introduces the principle of “good-enough” models. Chapter 18 presents a list of data preparation activities in the form of a cookbook, along with some caveats of using data mining (predictive analytics) methods. Chapter 19 introduces one of the newest development areas, deep learning. Some practitioners think that many data mining analyses will move in the direction of using deep learning algorithms. Chapters 20 and 21 present various issues of significance, “luck,” and ethics in data mining applications. The book ends with Chapter 22, which gives an overview of the IBM Watson technology, which IBM is trying to leverage to solve many analytic problems. It is likely that even these new processing strategies are not the end of the line in data mining development. Chapter 1 ends with the statement that we will discover increasingly novel and clever ways to mimic the most powerful pattern recognition engine in the universe, the human brain.
One step further in the future could be to drive the hardware supporting data mining to the level of portable devices like phones and medical data loggers, even to smaller applications in nanotechnology. Powerful biological quantum computers the size of pin heads (and smaller) may be the next wave of technological development to drive data mining advances. Rather than the sky, the atom is the limit.

Bob Nisbet
September, 2017
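The statistical steps listed in this Introduction (compute a mean and a standard deviation, form a metric such as Student's t-value, compare it with a standard table) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a procedure taken from the book: the survey values and the hypothesized mean of 35 are invented, the helper names `one_sample_t` and `exceeds_table` are hypothetical, and the critical value 2.262 is the usual two-sided 5% table entry for 9 degrees of freedom.

```python
import math
import statistics


def one_sample_t(values, hypothesized_mean):
    """Student's t statistic for testing whether the sample mean
    differs from a hypothesized population mean."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return (mean - hypothesized_mean) / (sd / math.sqrt(n))


def exceeds_table(t_value, critical_value):
    """Two-sided comparison of the statistic with a table value."""
    return abs(t_value) > critical_value


# Hypothetical survey answers (customer ages); not data from the book.
ages = [34, 41, 29, 38, 45, 52, 36, 40, 33, 47]

# Two-sided 5% critical value for n - 1 = 9 degrees of freedom,
# copied from a standard t table.
T_CRIT_DF9 = 2.262

t_value = one_sample_t(ages, hypothesized_mean=35.0)
print(f"mean = {statistics.fmean(ages):.1f}, sd = {statistics.stdev(ages):.2f}")
print(f"t = {t_value:.3f}")
print(f"exceeds table value: {exceeds_table(t_value, T_CRIT_DF9)}")
```

The comparison is the deductive step the text describes: the model (here, a normal population with mean 35) is fixed first, and the data are then checked against what that model predicts.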
Frontispiece
that combines art and science–intuition and expertise in collecting and understanding data in order to make accurate models that realistically predict the future that lead to informed strategic decisions thus allowing correct actions ensuring success, before it is too late...today, numeracy is as essential as literacy. As John Elder likes to say: ‘Go data mining!’ It really does save enormous time and money. For those with the patience and faith to get through the early stages of business understanding and data transformation, the cascade of results can be extremely rewarding.”
Gary Miner, September, 2017
Biographies of the Primary
Authors of This Book
BOB NISBET, PHD
resulting in the first major book on the genetics of Alzheimer’s disease. In the mid-1990s,
Dr. Miner turned his data analysis interests to the business world, joining the team at
StatSoft and deciding to specialize in data mining. He started developing what eventually
became the Handbook of Statistical Analysis and Data Mining Applications (coauthored with
Dr. Robert A. Nisbet and Dr. John Elder), which received the 2009 American Publishers Award
for Professional and Scholarly Excellence (PROSE). Their follow-up collaboration, Practical
Text Mining and Statistical Analysis for Non-structured Text Data Applications, also received a
PROSE award in February 2013. Gary was also the coauthor of Practical Predictive Analytics
and Decisioning Systems for Medicine (Academic Press, 2015). Overall, Dr. Miner’s career has
focused on medicine and health issues and the use of data analytics (statistics and predictive
analytics) in analyzing medical data to decipher fact from fiction.
Gary has also served as a merit reviewer for the Patient-Centered Outcomes Research Institute
(PCORI), which awards grants for predictive analytics research into the comparative
effectiveness and heterogeneous treatment effects of medical interventions, including drugs, among
different genetic groups of patients; additionally, he teaches online classes in “introduction to
predictive analytics,” “text analytics,” “risk analytics,” and “healthcare predictive analytics”
for the University of California, Irvine. Recently, until his “official retirement” 18 months ago,
he spent most of his time in his primary role as senior analyst/health-care applications
specialist for Dell | Information Management Group, Dell Software (through Dell’s acquisition
of StatSoft (www.StatSoft.com) in April 2014). Currently, Gary is working on two new short
popular books on “health-care solutions for the United States” and “patient-doctor genomics
stories.”
1
The Background for Data Mining Practice
PREAMBLE
You must be interested in learning how to practice data mining; otherwise, you would
not be reading this book. We know that there are many books available that will give a good
introduction to the process of data mining. Most books on data mining focus on the features
and functions of various data mining tools or algorithms. Some books do focus on the
challenges of performing data mining tasks. This book is designed to give you an introduction
to the practice of data mining in the real world of business.
One of the first things considered in building a business data mining capability in a
company is the selection of the data mining tool. It is difficult to penetrate the hype erected
around the description of these tools by the vendors. The fact is that even the most mediocre
of data mining tools can create models that are at least 90% as good as the best tools. A 90%
solution performed with a relatively cheap tool might be more cost-effective in your
organization than a more expensive tool. How do you choose your data mining tool? Few reviews
are available. The best listing of tools by popularity is maintained and updated yearly by
http://KDNuggets.com. Some detailed reviews available in the literature go beyond just a
discussion of the features and functions of the tools (see Nisbet, 2006, Parts 1–3). The interest
in an unbiased and detailed comparison is great. We are told that the “most downloaded document
in data mining” is the comprehensive but decade-old tool review by Elder and Abbott (1998).
The other considerations in building a business's data mining capability are forming the
data mining team, building the data mining platform, and forming a foundation of good data
mining practice. This book will not discuss the building of the data mining platform. This
subject is discussed in many other books, some in great detail. A good overview of how to
build a data mining platform is presented in Data Mining: Concepts and Techniques (Han
and Kamber, 2006). The primary focus of this book is to present a practical approach to
building cost-effective data mining models aimed at increasing company profitability, using
tutorials and demo versions of common data mining tools.
Just as important as these considerations in practice is the background against which they
must be performed. We must not imagine that the background doesn't matter… it does matter,
Handbook of Statistical Analysis and Data Mining Applications 3 Copyright © 2018 Elsevier Inc. All rights reserved.
https://doi.org/10.1016/B978-0-12-416632-5.00001-3
whether or not we recognize it initially. The reason it matters is that the capabilities of
statistical and data mining methodology were not developed in a vacuum. Analytic methodology
was developed in the context of prevailing statistical and analytic theory. But the major driver
in this development was a very pressing need to provide a simple and repeatable analysis
methodology in medical science. From this beginning developed modern statistical analysis
and data mining. To understand the strengths and limitations of this body of methodology
and use it effectively, we must understand the strengths and limitations of the statistical
theory from which it developed. This theory was developed by scientists and mathematicians
who brought together previous thinking and combined it with original thinking to bring
structure to it. But this thinking was not one-sided or unidirectional; there arose several views
on how to solve analytic problems. To understand how to approach the solving of an analytic
problem, we must understand the different ways different people tend to think. This history
of statistical theory behind the development of various statistical techniques bears strongly
on the ability of the technique to serve the tasks of a data mining project.
Analysis of patterns in data is not new. The concepts of average and grouping can be dated
back to about 1000 BC in China (Daobin, 1999). In ancient China and Greece, statistics were
gathered to help heads of state govern their countries in fiscal and military matters. These
official activities point to the likelihood that the words “statistic” and “state” evolved from the
same root. In the sixteenth and seventeenth centuries, games of chance were popular among
the wealthy, prompting many questions about probability to be addressed to famous
mathematicians (Fermat, Leibniz, etc.). These questions led to much research in mathematics and
statistics during the ensuing years.
Two branches of statistical analysis developed in the 18th century, Bayesian and classical
statistics (Fig. 1.1). To treat both fairly in the context of history, both will be considered in
the first generation of statistical analysis. Thomas Bayes was an 18th-century theologian and
philosopher. At that time, it was primarily the theologians and philosophers who had enough
time to speculate on mathematical topics. Bayes believed that the probability of an event's
occurrence in the future is equal to the probability of its past occurrence divided by the
probability of all competing events. Analysis proceeds based on the concept of conditional
probability: the probability of an event occurring given that another event has already occurred (past
events). Bayesian analysis begins with the quantification of the investigator's existing state of
knowledge, beliefs, and assumptions about past events. These subjective priors are combined
with observed data in a current experiment quantified probabilistically through an objective
function of some sort.
Bayes’ theorem is stated mathematically as the following equation:
$$
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \tag{1.1}
$$
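where A is the event in question and B is the observed event on which A is conditioned
(for P(B) ≠ 0).
• P(A|B) is the conditional probability of A given that B has occurred, and P(B|A) is the
conditional probability of B given that A has occurred.
• P(A) and P(B) are the probabilities of observing A and B independently of each other.
P(B) represents the combined probability of all competing events.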
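As a minimal sketch of Eq. (1.1), the Python snippet below computes a posterior probability for a hypothetical screening test. The function name `posterior` and all of the numbers (prevalence, sensitivity, false-positive rate) are illustrative assumptions, not values from this book. P(B) is obtained by summing over the two competing events, A and not-A.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from Bayes' theorem, Eq. (1.1).

    P(B) is expanded over the two competing events A and not-A:
    P(B) = P(B|A) P(A) + P(B|not A) P(not A).
    """
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    if p_b == 0:
        raise ValueError("P(B) must be nonzero")
    return p_b_given_a * prior_a / p_b


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 10% false-positive rate.
p = posterior(prior_a=0.01, p_b_given_a=0.95, p_b_given_not_a=0.10)
print(f"P(A|B) = {p:.4f}")
```

With these assumed inputs the posterior is only about 9%, because the low prior probability of A dominates the calculation even when the test is fairly accurate.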
Modern Statistics: A Duality?
Interest in probability picked up early among biologists following Mendel in the latter part of
the 19th century. Sir Francis Galton, founder of the School of Eugenics in England, and his
successor Karl Pearson developed the concepts of regression and correlation for analyzing genetic
data. Later, Pearson and colleagues extended their work to the social sciences. While the
development of probability theory flowed out of the work of Galton and Pearson, early predictive
methods followed Bayes' approach. A major concern, however, was that Bayesian approaches
to inference testing could lead to widely different conclusions by different medical
investigators if they used different sets of prior probabilities. The prior probabilities included
in the calculations of Bayes' rule were selected subjectively and are therefore referred to as
subjective priors.
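The concern about subjective priors can be made concrete with a small sketch. Assuming the same hypothetical evidence for everyone (P(B|A) = 0.9 and P(B|not A) = 0.2, both invented for illustration), three investigators who start from different priors arrive at quite different posteriors. The labels and prior values are likewise assumptions.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) by Bayes' rule, with P(B) summed over A and not-A."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / p_b


# Hypothetical priors chosen by three different investigators.
priors = {"skeptic": 0.05, "neutral": 0.50, "believer": 0.90}

# Same data for everyone; only the subjective prior differs.
posteriors = {name: posterior(p, 0.9, 0.2) for name, p in priors.items()}

for name, value in posteriors.items():
    print(f"{name:>8}: prior={priors[name]:.2f} -> posterior={value:.3f}")
```

Under these assumptions the skeptic still considers A unlikely after seeing the data, while the other two consider it likely, which is the kind of disagreement described above.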
This situation greatly bothered Ronald Fisher, who followed Pearson as director of the center for
eugenics at University College London. In response, Fisher developed a system for inference
testing in medical studies based on his concept of standard deviation. The classical statistical
approach of Fisher (which flowed out of the mathematical works of Gauss and Laplace) considered
that the joint probability, rather than the conditional probability of the Bayesians, was the
appropriate basis for analysis. The joint probability function expresses the probability that X
takes the specific value x and Y takes the value y, as a function of x and y jointly. There are no
subjective priors in this calculation; it accepts only those data that can be measured at the same
time during an experiment. This means that only data captured in a given experiment could be used
to predict an outcome. Fisher's goal in developing his system of statistical inference was to
provide medical investigators with a common set of tools for comparing studies of the effects of
different treatments by different investigators. But in order to make his system work even with
large samples, Fisher had to make a number of assumptions to define his “parametric model.”
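To make the distinction between joint and conditional probability concrete, the sketch below builds a small joint probability table for two discrete variables X and Y and reads off a joint probability, a marginal, and the conditional derived from them. The table values and category names are invented for illustration.

```python
# Hypothetical joint distribution P(X = x, Y = y); the entries sum to 1.
joint = {
    ("treated", "recovered"): 0.30,
    ("treated", "not_recovered"): 0.20,
    ("control", "recovered"): 0.15,
    ("control", "not_recovered"): 0.35,
}


def marginal_x(x):
    """P(X = x): sum the joint probabilities over all values of Y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def conditional_y_given_x(y, x):
    """P(Y = y | X = x) = P(X = x, Y = y) / P(X = x)."""
    return joint[(x, y)] / marginal_x(x)


p_joint = joint[("treated", "recovered")]
p_cond = conditional_y_given_x("recovered", "treated")
print(f"joint P(treated, recovered)      = {p_joint:.2f}")
print(f"conditional P(recovered|treated) = {p_cond:.2f}")
```

Every quantity here comes from the observed table itself; no prior belief enters the calculation.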
Assumptions of the Parametric Model
1. Data fits a known distribution (e.g., normal, logistic, and Poisson)
Fisher's early work was based on calculation of the parameter, standard deviation, which
assumes that data are distributed in a normal distribution. The normal distribution is
bell-shaped, with the mean (average) at the top of the bell, with “tails” falling off evenly
at the sides.
The standard deviation of a variable is the square root of the sum of the squared deviations
of all values from the mean, divided by the total count of the data points (n) minus 1, as
shown in Eq. (1.2).
$$
S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \tag{1.2}
$$
where x is the value of one data point, $\bar{x}$ is the mean, and n is the total number of
data points. The subtraction of 1 from the total number expresses (to some extent) the
increased uncertainty of the result due to grouping (summing the squared deviations).
Subsequent developments used modified parameters based on the logistic and
Poisson distributions (logistic regression and Poisson regression). The assumption that
data follow a particular known distribution is necessary in order to draw upon the
characteristics of the distribution function for making inferences. All of these parametric
methods run the gauntlet of dangers related to force-fitting data from the real world into
a mathematical construct that may not fit (Fig. 1.2).
2. Factor independency
In parametric predictive systems, the variable to be predicted (Y) is considered as a
function of predictor variables (Xs) that are assumed to have independent effects on Y. That
is, the effect on Y of each X-variable is not dependent on the effects on Y of any other
X-variable. This situation could be created in the laboratory by allowing only one factor (e.g.,
a treatment) to vary while keeping all other factors constant (e.g., temperature, moisture,
and light). But in the real world, such laboratory control is not possible. As a result, it must
be possible to consider situations in which some factors do affect other factors, that is, have
a joint effect on Y. This problem is called collinearity. When it occurs between more than
2 factors, it is termed multicollinearity. The multicollinearity problem led statisticians to
include an interaction term in the relationship that supposedly represented the combined
effects. Use of this interaction term functioned as a magnificent kluge, and the reality of its
effects was seldom analyzed. Later development included a number of interaction terms,
one for each interaction the investigator thought might be present.
FIG. 1.2 The normal curve showing that 95% (100 − 2.5 − 2.5) of the data are found between −2 and +2 standard
deviations from the mean.
3. Linear additivity
Not only must the X-variables be independent in the parametric model, but also their
effects on Y must be cumulative and linear. That means that the effect of each factor is
added to or subtracted from the combined effects of all X-variables on Y. But what if the
relationship between Y and predictors (X-variables) is not additive, but multiplicative
or divisive? This is the case in modeling forests. General effects of light, moisture, and
nutrients must be multiplied together (not added) to relate to tree growth. Such functions
can only be expressed by exponential equations that usually generate very nonlinear
relationships. Assuming linear additivity for these relationships in the natural world
would cause large errors in the predicted values for tree growth (Botkin, 1993). The same is
often true when linear additive models are applied to business data systems.
4. Constant variance (homoscedasticity)
The variance throughout the range of each variable is assumed to be constant. This
means that if you divided the range of a variable into bins, the variance across all records
for bin #1 is similar to the variance for all the other bins of that variable. If the variance
throughout the range of a variable differs significantly from constancy, it is said to be
heteroscedastic. The error in the predicted value caused by the combined heteroscedasticity
among all variables can be quite significant.
5. Variables must be numerical and continuous
This assumption means that data must be numeric (or it must be transformable to a number
before analysis) and the number must be part of a distribution that is inherently continuous.
Integer values are not continuous; they are discrete (e.g., there are no integers between 1 and
2; hence, 1.3 is not an integer). Classical parametric statistical methods are not valid for use
with discrete data, because the probability distributions for continuous and discrete data are
different. Still, scientists and business analysts have used them anyway. Problems will be
greatest where the approximation to continuous data is not close.
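Two of the assumptions above lend themselves to quick numerical checks. The sketch below, which uses only the Python standard library, implements Eq. (1.2) directly and then compares the variance of residuals within equal-width bins of a predictor as a rough look at constant variance. The function names, the bin count, and the synthetic data are illustrative assumptions; this is an informal check, not a formal statistical test.

```python
import math
import random
import statistics


def sample_std(values):
    """Sample standard deviation, Eq. (1.2): sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def binned_variances(xs, ys, n_bins=4):
    """Variance of ys within equal-width bins of xs (bins with < 2 points are skipped)."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        idx = min(int((x - lo) / width), n_bins - 1)  # keep x == hi in the last bin
        bins[idx].append(y)
    return [statistics.variance(b) for b in bins if len(b) >= 2]


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print("sample std:", sample_std(data))

# Synthetic heteroscedastic residuals: the noise scale grows with x.
rng = random.Random(0)
xs = [i / 10 for i in range(1, 401)]
residuals = [rng.gauss(0, 0.5 * x) for x in xs]
print("variance per bin:", binned_variances(xs, residuals))
```

If the per-bin variances differ widely, as they do for this deliberately heteroscedastic synthetic data, the constant-variance assumption is doubtful for that variable.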
CHAPTER X.
MAJOR BYNG’S SUGGESTION.
Major Byng, a wiry, dried-up little officer, with remarkably thin legs
and sporting proclivities, was reclining in a long chair, in the
verandah of the Napier Hotel, Poonah, smoking his after-breakfast
“Trichy,” and running his eye over the “Asian” pocket-book.
“Hullo, Byng, old man!” cried a loud cheerful voice, and looking up,
his amaze was depicted in the countenance he turned upon
Clarence Waring.
“Waring! Why—I thought,” putting down his book and sitting erect.
“Thought I had gone home—sold out and was stone broke. But
here I am, you see, on my legs again.”
“Delighted to hear it,” with a swift glance at Waring’s well-to-do air
and expensive-looking clothes. “Sit down, my dear boy,” he cried
cordially, “sit down and have a cheroot, and tell me all about yourself
and what has brought you back again to the land of regrets? Is it tea,
coffee, or gold?”
“Gold, in one sense. I am companion to a young millionaire, or
rather to the nephew of a man who has so much money—and no
children—that he does not know what to do.”
“And who is the young man? Does he know what to do?”
“His name is Jervis—his rich uncle is married to my sister; we are
connections, you see, and when he expressed a desire to explore
the gorgeous East, my sister naturally suggested me for the post of
guide, philosopher, and friend.”
Here Major Byng gave a short sharp laugh, like a bark.
“We landed in Bombay ten days ago, and are going to tour about
and see the world.”
“What is the programme?”
“My programme is as follows: Poonah races, Secunderabad races,
Madras races, a big game shoot in Travancore, expense no object,
elephants, beaters, club-cook, coolies with letters, and ice for the
champagne. Then I shall run him about in the train a bit, and show
him Delhi, Agra, Jeypore; after that we will put in the end of the cold
weather in Calcutta. I have lots of pals there, and from Calcutta we
will go to the hills, to Shirani. I shall be glad to see the old club again
—many a fleeting hour have I spent there!”
“That same club had a shocking bad name for gambling and bear
fighting,” said Major Byng significantly.
“I believe it had, now you mention it; but you may be sure that it
has reformed—like myself.”
“And this young fellow—what is he like?”
“Quiet, gentlemanly, easy-going, easily pleased, thinks every one
a good sort,” and Waring laughed derisively; “abhors all fuss or
show, never bets, never gets up in the morning with a head, no
expensive tastes.”
“In fact, his tastes are miserably beneath his opportunities! What a
pity it is that the millionaire is not your uncle!”
“Yes, instead of merely brother-in-law, and brothers-in-law are
notoriously unfeeling. However, I have adopted mine as my own
blood relation, for the present. I boss the show. Come and dine with
me to-night, and tell me all the ‘gup,’ and give me the straight tip for
the Arab purse.”
“All right. Is this young Jervis a sportsman?”
“He is a first-class man on a horse, and he plays polo, but he does
not go in for racing—more’s the pity!”
“Plays polo, does he? By Jove!” and an eager light shone in the
major’s little greenish eyes. “I’ve a couple of ponies for sale——”
“He does not want them now, whatever he may do later in Calcutta
or in the hills. I shall be looking out for three or four for myself, good
sound ones, mind you, Byng, up to weight. I’ve put on flesh, you see,
but I dare say my anxious responsibilities will wear me down a bit.
Jervis does not weigh more than ten stone, and, talk of the devil,
here he comes.”
Major Byng turned his head quickly, as at this moment Waring’s
travelling companion, a slight, active-looking young man, entered the
compound, closely pursued by a swarm of hawkers, and their
accompanying train of coolies, bearing on their heads the inevitable
Poonah figures, hand-screens, pottery, beetle-work, silks, silver, and
jewellery.
“I say, Waring,” he called out as he approached, “just look at me!
One would think I was a queen bee. If this goes on, you will have to
consign me to a lunatic asylum, if there is such a place out here.”
“Mark, let me introduce you to my old friend, Major Byng.”
Major Byng bent forward in his chair—to stand up was too great
an exertion even to greet a possible purchaser of polo ponies—
smiled affably, and said—
“You are only just out, I understand. How do you like India?”
“So far, I loathe it,” sitting down as he spoke, removing his topee,
and wiping his forehead. “Ever since I landed, I have lived in a state
of torment.”
“Ah, the mosquitoes!” exclaimed Major Byng, sympathetically; “you
will get used to them. They always make for new arrivals and fresh
blood.”
“No, no; but human mosquitoes! Touts, hawkers, beggars,
jewellers, horse-dealers. They all set upon me from the moment I
arrived. Ever since then, my life is a burthen to me. It was pretty bad
on board ship. Some of our fellow-travellers seemed to think I was a
great celebrity, instead of the common or ordinary passenger; they
loaded me with civil speeches, and the day we got into Bombay I
was nearly buried alive in invitations, people were so sorry to part
with me!”
“Here is a nice young cynic for you!” exclaimed Captain Waring,
complacently. “He is not yet accustomed to the fierce light that beats
upon a good-looking young bachelor, heir to thirty thousand a year
——”
“Why not make it a hundred thousand at once, while you are about
it?” interrupted the other impatiently. “How could they tell I was heir
to any one? I’m sure I am a most everyday-looking individual. My
uncle’s income is not ticketed on my back!”
“It was in one sense,” exclaimed Waring, with a chuckle.
“It was only with the common, vulgar class that I was so
immensely popular.”
“My dear fellow, you are much too humble minded. You were
popular with every one.”
“No, by no means; I could have hugged the supercilious old dame
who asked me with a drawl if I was in any way related to Pollitt’s
patent fowl food? I was delighted to answer with effusion, ‘Nephew,
ma’am.’ She despised me from the very bottom of her soul, and
made no foolish effort to conceal her feelings.”
“Ah! She had no daughters,” rejoined Waring, with a scornful
laugh. “The valet told all about you. He had nothing on earth to do,
but magnify his master and consequently exalt himself. Your value is
reflected in your gentleman’s gentleman, and he had no mock
modesty, and priced you at a cool million! By the way, I saw him
driving off just now in the best hotel landau, with his feet on the
opposite cushions, and a cigarette in his mouth. He is a magnificent
advertisement.”
They were now the centre of a vast mob of hawkers, who formed a
squatting circle, and the verandah was fully stocked. The jewellers
had already untied their nice little tin boxes from their white calico
wrappers, and their contents were displayed on the usual enticing
squares of red saloo.
“Waring Sahib!” screamed an ancient vendor with but one eye.
“Last time, three four years ago, I see you at Charleville Hotel,
Mussouri, I sell your honour one very nice diamond bangle for one
pretty lady——”
“Well, Crackett, I’m not such a fool now. I want a neat pearl pin for
myself.” He proceeded to deliberately select one from a case, and
then added with a grin, “That time, I paying for lady; this time,
gentleman,” pointing to Jervis, “paying for me.”
“I can’t stand it,” cried Jervis, jumping to his feet. “Here is the man
with the chestnut Arab and the spotted cob with pink legs, that has
been persecuting me for two days; and here comes the boy with the
stuffed peacock who has stalked me all morning; and—I see the girl
in the thunder and lightning waistcoat. I know she is going to ask me
to ride with her,” and he snatched up his topee and fled.
Major Byng noticed Jervis at the table d’hôte that evening. He had
been cleverly “cut off” from Waring, and was the prey of two over-
dressed, noisy young women. Mrs. Pollitt was mistaken, second-rate
people did come to India.
“I’ll tell you what, Waring!” he said to that gentleman, who was in
his most jovial, genial humour, “that young fellow is most shamefully
mobbed. His valet has given him away. If you don’t look out, he will
slip his heel ropes and bolt home. Pray observe his expression! Just
look at those two women, especially at the one who is measuring the
size of her waist with her serviette, for his information. He will go
back by the next steamer; it is written on his forehead!”
“No, he won’t do that,” rejoined Clarence, with lazy confidence.
“He has a most particular reason for staying out here for a while; but
I grant you that he is not enjoying himself, and does not appear to
appreciate seeing the world—and it is not a bad old world if you
know the right way to take it. Now, if I were in his shoes,” glancing
expressively across the table, “I’d fool that young woman to the very
top of her bent!”
In the billiard-room, when Mark joined them, Major Byng said—
“I saw your dismal plight at dinner, and pitied you. If you want to
lead a quiet life, and will take an old soldier’s advice, I would say, get
rid of the valet, send him home with half your luggage. Then start
from a fresh place, where no one knows you, with a good
Mussulman bearer, who is a complete stranger to your affairs. Let
Clarence here be paymaster—he can talk the language, and looks
wealthy and important—he won’t mind bearing the brunt, or being
taken for a rich man if the trouble breaks out again, and you can live
in peace and gang your ain gait.”
The Major’s advice was subsequently acted upon,—with most
excellent results. The cousins meanwhile attended the Poonah
races, where Clarence met some old acquaintances.
One of them privately remarked to Major Byng—
“Waring seems to have nine lives, like a cat, and looks most
festive and prosperous. I saw him doing a capital ready-money
business with the ‘Bookies’ just now—and he is a good customer to
the Para Mutual. It is a little startling to see him in the character of
mentor. I only hope he won’t get into many scrapes!”
“Oh, Telemachus has his head screwed on pretty tight, and he will
look after Waring—the pupil will take care of the teacher. He is a real
good sort, that boy. I wonder if his people know how old Clarence
used to race, and carry on and gamble at the lotteries, and generally
play the devil when he was out here?”
“Not they!” emphatically.
“He owes me one hundred rupees this three years, but he is such
a tremendous Bahadur now, that I am ashamed to remind him of
such a trifling sum. I sincerely hope that he has turned over a new
leaf and is a reformed character. What do you say, Crompton?”
“I say ‘Amen,’ with all my heart,” was the prompt response.
Mark Jervis had gone straight to the agents, Bostock & Bell’s, the
day he had landed in Bombay, and asked for his father’s address.
He only obtained it with difficulty and after considerable delay. The
head of the firm, in a private interview, earnestly entreated him to
keep the secret, otherwise they would get into trouble, as Major
Jervis was a peculiar man and most mysterious about his affairs,
which were now entirely managed by a Mr. Cardozo. Major Jervis
had not corresponded with them personally, for years. He then
scribbled something on a card, which he handed to the new arrival,
who eagerly read, “Mr. Jones, Hawal-Ghât, via Shirani, N.W.P.” The
major’s son despatched a letter with this superscription by the very
next post.
CHAPTER XI.
A RESERVED LADY.
A hot moonless night towards the end of March, and the up-mail
from Bombay to Calcutta has come to a standstill. The glare from the
furnace and the carriage lamps lights up the ghostly looking
telegraph-posts, the dusty cactus hedge, and illuminates a small
portion of the surrounding jungle. Anxiously gazing eyes see no sign
of a station, or even of a signalman’s hut, within the immediate glare
—and beyond it there looms a rocky, barren tract, chiefly swallowed
up in inscrutable darkness.
There is a babel of men’s voices, shrill and emotional, and not
emanating from European throats, a running of many feet, and
above all is heard the snorting of the engine and the dismal shrieks
of the steam whistle.
“What does it all mean?” inquired a silvery treble, and a fluffy head
leant out of a first-class ladies’ compartment.
“Nothing to be alarmed about,” responded a pleasant tenor voice
from the permanent way. “There has been a collision between two
goods trains about a mile ahead, and the line is blocked.”
“Any one killed?” she drawled.
“Only a couple of niggers,” rejoined the pleasant voice, in a
cheerful key.
“Dear me!” exclaimed the lady with sudden animation; “why,
Captain Waring, surely it cannot be you!”
“Pray why not?” now climbing up on the foot-board. “And do I
behold Mrs. Bellett?” as the head and shoulders of a good-looking
man appeared at the window, and looked into the carriage, which
contained a mountain of luggage, two ladies, a monkey, and a small
green parrot.
“Where have you dropped from?” she inquired. “I thought you had
left India for ever and ever. What has brought you back?”
“The remembrance of happier days,” he answered, with a
sentimental air, “and a P. and O. steamer.”
“But you have left the service, surely?”
“Yes, three years ago; it was too much of a grind at home.
Formerly I was in India on duty, now I am out here for pleasure. No
bother about over-staying my leave—no fear of brass hats.”
“Meanwhile, is there any fear of our being run into by another
train?” inquired the second lady nervously, a lady who sat at the
opposite side of the compartment with her head muffled up in a pink
shawl.
“Not the smallest; we are perfectly safe.”
“Captain Waring, this is my sister, Mrs. Coote,” explained Mrs.
Bellett. “And now perhaps you can tell us where we are, and what is
to become of us?”
“As to where you are, you are about three miles from Okara
Junction; as to what will happen to you, I am afraid that you will have
to walk there under my escort—if I may be permitted that honour.”
“Walk three miles!” she repeated shrilly. “Why, I have not done
such a thing for years, and I have on thin shoes. Could we not go on
the engine?”
“Yes, if the engine could fly over nearly a hundred luggage
waggons. It is a fine starlight night; we will get a lamp, and can keep
along the line. They have sent for a break-down gang, and we shall
catch another train at Okara. We will only have about an hour or two
to wait.”
“Well, I suppose we must make the best of it!” said Mrs. Coote,
“like others,” as numbers of natives flocked past, chattering volubly,
and carrying their bedding and bundles.
“I wish we could get supper at Okara,” said her sister. “I am sure
we shall want it after our tramp; but I know we need not build on
anything better than a goat chop, and the day before yesterday’s
curry. However, I have a tea-basket.”
“I can go one better,” said Captain Waring. “I have a tiffin-basket,
well supplied with ice, champagne, cold tongue, potted grouse—
cake—fruit——”
“You are making me quite ravenous,” cried Mrs. Bellett. “But how
are you to get all these delicacies to Okara?”
“By a coolie, I hope. If the worst comes to the worst, I will carry
them on my head, sooner than leave them behind. However, rupees
work wonders, and I expect I shall get hold of as many as will carry
the basket, and also your baggage; I suppose fifty will do?” and with
a grin, he climbed down out of sight.
“What a stroke of luck, Nettie!” exclaimed Mrs. Bellett. “He used to
be such a friend of mine at Mussouri, and imagine coming across
him in this way! He seems to be rolling in money; he must have
come in for a fortune, for he used to be frightfully hard up. I’m so
glad to meet him.”
“Yes, it’s all very fine for you, who are dressed,” rejoined the other
in a peevish voice; “but just look at me in an old tea-jacket, with my
hair in curling-pins!”
“Oh, you were all right! I’m certain he never noticed you!” was the
sisterly reply. “Let us be quick and put up our things. I wish to
goodness the ayah was here,” and she began to bustle about, and
strap up wraps and pillows, and collect books and fans.
Every one in the train seemed to be in a state of activity, preparing
for departure, and presently many parties on foot, with lanterns,
might be seen streaming along the line. Captain Waring promptly
returned with a dozen coolies, and soon Mrs. Bellett’s carriage was
empty. She and her sister were assisted by Captain Waring and a
young man—presumably his companion. Ere descending, Mrs.
Bellett, who had a pretty foot, paused on the step to exhibit the
thinness of her shoes, and demanded, as she put out her Louis-
Quatorze sole, “how she was to walk three miles in that, along a
rough road?”
The two ladies were nevertheless in the highest spirits, and
appeared to enjoy the novelty of the adventure. Ere the quartette
had gone twenty yards, the guard came shouting after them—
“Beg pardon, sir,” to Captain Waring, “but there is a lady quite
alone in my charge. I can’t take her on; I must stay and see to the
baggage, and remain here. And would you look after her?”
“Where is she?” demanded Waring, irritably.
“Last carriage but one—reserved ladies, first-class.”
“I say, Mark,” turning to his friend, “if she is a reserved lady, you
are all right. He is awfully shy, this young fellow,” he explained to his
other companions, with a loud laugh. “I don’t mind betting that she is
old—and you know you are fond of old women—so just run back like
a good chap. You see, I have Mrs. Bellett and her sister—you won’t
be five minutes behind us, bring on the reserved lady as fast as you
can.”
The other made no audible reply, but obediently turned about, and
went slowly past the rows of empty carriages until he came nearly to
the end of the train. Here he discovered a solitary white figure
standing above him in the open door of a compartment, and a girlish
voice called down into the dark—
“Is that you, guard?”
“No,” was the answer; “but the guard has sent me to ask if I can
help you in any way.”
A momentary pause, and then there came a rather doubtful
“Thank you.”
“Your lamp has gone out, I see, but I can easily strike a match and
get your things together. There is a block on the line, and you will
have to get down and walk on to the next station.”
“Really? Has there been an accident? I could not make out what
the people were saying.”
“It is not of much consequence—two goods trains disputing the
right of way; but we shall have to walk to Okara to catch the
Cawnpore mail.”
“Is it far?”
“About three miles, I believe.”
“Oh, that is not much! I have not many things—only a dressing-
bag, a rug, and a parasol.”
“All right; if you will pass them down, I will carry them.”
“But surely there is a porter,” expostulated the lady, “and I need
not trouble you.”
“I don’t suppose there is what you call a porter nearer than
Brindisi, and all the coolies are taking out the luggage. Allow me to
help you.”
In another second the young lady, who was both light and active,
stood beside him on the line. She was English; she was tall; and she
wore a hideously shaped country-made topee—that was all that he
could make out in the dim light.
“Now, shall we start?” he asked briskly, taking her bag, rug, and
parasol.
“Please let me have the bag,” she entreated. “I—I—that is to say, I
would rather keep it myself. All my money is in it.”
“And I may be a highwayman for what you know,” he returned,
with a laugh. “I give you my word of honour that, if you will allow me
to carry it, I will not rob you.”
“I did not mean that,” she stammered.
“Then what did you mean? At any rate I mean to keep it. The other
passengers are on ahead—I suppose you are quite alone?”
“Almost. There is a servant in the train who is supposed to look
after me, but I am looking after him, and seeing that he is not left
behind at the different junctions. We cannot understand one word we
exchange, so he grins and gesticulates, and I nod and point; but it all
comes to nothing, or worse than nothing. I wanted some tea this
morning, and he brought me whisky and soda.”
“And have you no one to rely on but this intelligent attendant?”
“No. The people I came out with changed at Khandala, and left me
in charge of the guard, and in a through carriage to Allahabad; and of
course we never expected this.”
“So you have just come out from home?” he observed, as they
walked along at a good pace.
“Yes; arrived yesterday morning in the Arcadia.”
“Then this is the first time you have actually set foot on Indian soil,
for trains and gharries do not count?”
“It is. Are there”—looking nervously at the wild expanse on either
hand—“any tigers about, do you think?”
“No, I sincerely hope not, as I have no weapon but your parasol.
Joking apart, you are perfectly safe. This”—with a wave of the
aforesaid parasol—“is not their style of hunting-ground.”
“And what is their style, as you call it?”
“Oh, lots of high grass and jungle, in a cattle country.”
“Have you shot many tigers?”
“Two last month. My friend and I had rather good sport down in
Travancore.”
“I suppose you live out here?”
“No, I have only been about six months in the country.”
“I wish I had been six months in India.”
“May I ask why?”
“Certainly you may. Because I would be going home in six months
more.”
“And you only landed forty-eight hours ago! Surely you are not
tired of it already. I thought all young ladies liked India. Mind where
you are going! It is very dark here. Will you take hold of my arm?”
“No, thank you,” rather stiffly.
“Then my hand? You really had better, or you will come a most
awful cropper, and trip over the sleepers.”
“Here is an extraordinary adventure!” said Honor to herself. “What
would Jessie and Fairy say, if they could see me now, walking along
in the dark through a wild desolate country, hand-in-hand with an
absolutely strange young man, whose face I have never even seen?”
A short distance ahead were groups of chattering natives—women
with red dresses and brass lotahs, which caught the light of their
hand-lanterns (a lantern is to a native what an umbrella is to a
Briton); turbaned, long-legged men, who carried bundles, lamps, and
sticks. The line was bordered on either hand by thick hedges of
greyish cactus; here and there glimmered a white flower; here and
there an ancient bush showed bare distorted roots, like the ribs of
some defunct animal. Beyond stretched a dim mysterious landscape,
which looked weird and ghostly by the light of a few pale stars. The
night was still and oppressively warm.
“You will be met at Allahabad, I suppose?” observed Honor’s
unknown escort, after a considerable silence.
“Yes—by my aunt.”
“You must be looking forward to seeing her again?”
“Again! I have never seen her as yet.” She paused, and then
continued, “We are three girls at home, and my aunt and uncle
wished to have one of us on a visit, and I came.”
“Not very willingly, it would seem,” with a short laugh.
“No; I held out as long as I could. I am—or rather was—the useful
one at home.”
“And did your aunt and uncle stipulate for the most useful niece?”
“By no means—they—they, to tell you the truth, they asked for the
pretty one, and I am not the beauty of the family.”
“No? Am I to take your word for that, or are you merely fishing?”
“I assure you that I am not. I am afraid my aunt will be
disappointed; but it was unavoidable. My eldest sister writes, and
could not well give up what she calls her literary customers. My next
sister is—is—not strong, and so they sent me—a dernier ressort.”
She was speaking quite frankly to this stranger, and felt rather
ashamed of her garrulity; but he had a pleasant voice, he was the
first friendly soul she had come across since she had left home, and
she was desperately home-sick. A long solitary railway journey had
only increased her complaint, and she was ready to talk of home to
any one—would probably have talked of it to the chuprassi,—if he
could have understood her!
Her escort had been an unscrupulous, selfish little woman, whose
nurse, having proved a bad sailor, literally saddled her good-natured,
inexperienced charge with the care of two unruly children, and this in
a manner that excited considerable indignation among her fellow
passengers.
“Why should you call yourself a dernier ressort?” inquired her
companion, after a pause, during which they continued to stumble
along, she holding timidly by the young man’s arm.
“Because I am; and I told them at home with my very last breath
that I was not a bit suited for coming out here, and mixing with
strangers—nothing but strangers—and going perpetually into what is
called ‘smart’ society, and beginning a perfectly novel kind of life. I
shall get into no end of scrapes.”
“May I ask your reason for this dismal prophecy?”
“Surely you can guess! Because I cannot hold my tongue. I blurt
out the first thing that comes into my head. If I think a thing wrong, or
odd, I must say so; I cannot help it, I am incurable. People at home
are used to me, and don’t mind. Also, I have a frightful and wholly
unconscious habit of selecting the most uncomfortable topics, and
an extremely bad memory for the names and faces of people with
whom I have but a slight acquaintance; so you see that I am not
likely to be a social success!”
“Let us hope that you take a gloomy view of yourself. For instance,
what is your idea of an uncomfortable topic?”
“If I am talking to a person with a cast in the eye, I am positively
certain ere long to find myself conversing volubly about squints; or, if
my partner wears a wig, I am bound to bring wigs on the tapis. I
believe I am possessed by some mischievous imp, who enjoys my
subsequent torture.”
“Pray how do you know that I have not a squint, or a wig, or both?
A wig would not be half a bad thing in this hot climate; to take off
your hair as you do your hat would often be a great relief! Ah, here
we are coming to the scene of the collision at last,” and presently
they passed by a long row of waggons, and then two huge engines,
one across the line, the other reared up against it; an immense
bonfire burnt on the bank, and threw the great black monsters into
strong outline. Further on they came to a gate and level crossing.
The gate of the keeper’s hut stood wide open, and on the threshold
a grey-haired old woman sat with her head between her knees,
sobbing; within were moans, as if wrung from a sufferer in acute
anguish. Honor’s unknown companion suddenly halted, and
exclaimed impulsively—
“I’m afraid some one has been badly hurt; if you don’t mind, I’ll just
go and see.”
Almost ere she had nodded a quick affirmative, he had vaulted
over the gate, and left her.
CHAPTER XII.
TWO GOOD SAMARITANS.
In all her life, the youngest Miss Gordon had never felt so utterly
solitary or forsaken as now, when she stood alone on the line of the
Great Indian Peninsular Railway. Before her the party of natives, with
their twinkling lanterns, were gradually reaching vanishing point;
behind her was a long, still procession of trucks and waggons, that
looked like some dreadful black monster waiting for its prey; on
either hand stretched the greyish unknown mysterious landscape,
from which strange unfamiliar sounds, in the shape of croakings and
cries, were audible. Oh! when would her nameless companion
return? She glanced anxiously towards the hut, it was beyond the
gate, and down a steep bank, away from the road; animated figures
seemed to pass to and fro against the lighted open door. Ah! here
came one of them, her escort, who had in point of fact been only
absent five minutes, and not, as she imagined, half an hour.
“It is a stoker who has been cut about the head and badly
scalded,” he explained breathlessly. “They are waiting for an
apothecary from Okara, and meanwhile they are trying a native herb
and a charm. They don’t seem to do the poor chap much good. I
think I might be able to do something better for him, though I have no
experience, beyond seeing accidents at football and out hunting; but
I cannot leave you here like this, and yet I cannot well ask you inside
the hut, the heat is like a furnace—and—altogether—it—it would be
too much for you, but if you would not mind waiting outside just for a
few minutes, I’d get you something to sit on.”
“Thank you, but I would rather go in—I have attended an
ambulance class—‘first aid,’ you know, and perhaps I may be of
some little use; there is sticking-plaster, eau-de-Cologne, and a pair
of scissors in my bag.”
“Well, mind; you must brace up your nerves,” he answered, as he
pushed open the gate, and led her down the crumbling sandy incline.
The heat within the hut was almost suffocating; as the girl,
following her guide, entered, every eye was instantly fixed upon her
in wide surprise.
By the light of a small earthen lamp, which smoked horribly, she
distinguished the figure of a man crouching on the edge of a
charpoy; he was breathing in hard hoarse gasps, and bleeding from
a great gash above his eye.
A Eurasian, in a checked cotton suit, stood by, talking incessantly
—but doing nothing else. There were also present, besides the old
woman—a veritable shrivelled-up hag—two native men, possibly the
“bhai-bands,” or chums of the sufferer; in a corner, a large black
pariah sat watching everything, with a pair of unwinking yellow eyes;
and on another charpoy lay a still figure, covered with a sheet. A few
earthen chatties, a mat, a huka, and some gaudy English prints—for
the most part nailed upside down—completed the picture. Hitherto
the travelling companions had been to each other merely the
embodiment of an undefined figure and a voice; the light of the little
mud lamp, whose curling smoke threw outlines of dancing black
devils on the walls, now introduced them for the first time face to
face. To Honor Gordon stood revealed an unexpectedly good-looking
young man, slight and well built, with severely cut features, and a
pair of handsome hazel eyes, which were surveying her gravely. A
gentleman, not merely in his speech and actions, but in his bearing.
He, on his part, was not in the least surprised to behold a pale but
decidedly pretty girl; by means of some mysterious instinct he had
long made up his mind that the owner of such a delicate hand and
sweet clear voice could not be otherwise than fair to see.
“The apothecary cannot be here for one hour!” exclaimed the
Eurasian, glibly. “He,” pointing to the patient, “is very bad. We have
put some herbs to his arm, and the back of his head; but I, myself,
think that he will die!” he concluded with an air of melancholy
importance.
Some kind of a bandage was the first thing Honor asked for, and
asked for in vain; she then quickly unwound the puggaree from her
topee, and tore it into three parts.
Then she bathed and bandaged the man’s head, with quick and
sympathetic fingers, whilst Jervis held the lamp, offered suggestions,
and looked on, no less impressed than amazed; he had hitherto had
an idea that girls always screamed and shrank away from the sight
of blood and horrors.
This girl, though undeniably white, was as cool and self-
possessed, as firm, yet gentle, as any capable professional nurse.
The scalded arm and hand—a shocking spectacle—were attended
to by both. The great thing was to exclude the air, and give the
sufferer at least temporary relief. With some native flour, a bandage
was deftly applied, the arm placed in a sling, and the patient’s head
was bathed with water and eau-de-Cologne. Fanned assiduously by
the girl’s fan, he began to feel restored, he had been given heart, he
had been assured that his hurts were not mortal, and presently he
languidly declared himself better.
The natives who stood round, whilst the sahib and Miss Sahib
ministered so quickly and effectually to their friend, now changed
their lamentations to loud ejaculations of wonder and praise. Miss
Gordon was amazed to hear her companion giving directions to
these spectators in fluent and sonorous Hindustani, and still more
astounded when, as she took up her topee, preparatory to departure,
the Eurasian turned to him, and said in an impressive squeak—
“Sir, your wife is a saint—an angel of goodness”—and then, as a
hasty afterthought, he added, “and beauty!”
Before Jervis could collect his wits and speak, she had replied—
“I am not this gentleman’s wife; we are only fellow-passengers.
Why should you think so?” she demanded sharply.
“Because—oh, please do not be angry—you looked so suitable,”
he answered with disarming candour. “Truly, I hope you may be
married yet, and I wish you both riches, long life, and great
happiness,” he added, bowing very low, lamp in hand.
Honor passed out of the hut, with her head held extraordinarily
high, scrambled up the bank, and proceeded along the line at a
headlong pace in indignant silence.
She now maintained a considerable distance between herself and
her escort; no doubt her eyes were becoming accustomed to the dim
light, and at any rate there was that in her air which prevented him
offering either arm or hand. In spite of the recent scene in which they
had both been actors, where he had clipped hair and cut plaster, and
she had applied bandages and scanty remedies to the same “case,”
they were not drawn closer together; on the contrary, they were
much further apart than during the first portion of their walk, and the
young lady’s confidences had now entirely ceased. She confined
herself exclusively to a few bald remarks about the patient, and the
climate, remarks issued at intervals of ten minutes, and her answers
to his observations were confined to “Yes” and “No.” At last Okara
station was reached; and, to tell the truth, neither of them was sorry
to bring their tête-à-tête to a conclusion. The dazzling lights on the
platform made their eyes blink, as they threaded their way to the
general refreshment room, discovering it readily enough by sounds
of many and merry voices, who were evidently availing themselves
of its somewhat limited resources.
It was not a very large apartment, but it was full. The table was
covered with a thin native tablecloth, two large lamps with punkah
tops, and two cruet-stands and an American ice-pitcher were placed
at formal intervals down the middle. It was surrounded with people,
who were eating, drinking, and talking. At the further end sat Captain
Waring, supported on either hand by his two fair companions, three
men—young and noisy, whom they evidently knew—and a prim,
elderly woman, who looked inexpressibly shocked at the company,
and had pointedly fenced herself off from Mrs. Bellett with a teapot
and a wine-card. Captain Waring’s friends had not partaken of tea
(as the champagne-bottle testified). The tongue, cake, and fruit had
also evidently received distinguished marks of their esteem. Mrs.
Bellett put up her long eyeglass, and surveyed exhaustively the pair
who now entered.