Creating Good Data
A Guide to Dataset Structure and Data Representation

Harry J. Foxwell
Creating Good Data: A Guide to Dataset Structure and Data Representation

Harry J. Foxwell
Fairfax, VA, USA

ISBN-13 (pbk): 978-1-4842-6102-6
ISBN-13 (electronic): 978-1-4842-6103-3
https://doi.org/10.1007/978-1-4842-6103-3

Copyright © 2020 by Harry J. Foxwell


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with
every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an
editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the
trademark.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not
identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to
proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of publication,
neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or
omissions that may be made. The publisher makes no warranty, express or implied, with respect to the
material contained herein.
Managing Director, Apress Media LLC: Welmoed Spahr
Acquisitions Editor: Susan McDermott
Development Editor: Laura Berendson
Coordinating Editor: Jessica Vakili
Distributed to the book trade worldwide by Springer Science+Business Media New York, 1 NY Plaza,
New York NY 10004. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or
visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is
Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware
corporation.
For information on translations, please e-mail booktranslations@springernature.com; for reprint, paperback,
or audio rights, please e-mail bookpermissions@springernature.com.
Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and
licenses are also available for most titles. For more information, reference our Print and eBook Bulk Sales
web page at http://www.apress.com/bulk-sales.
Any source code or other supplementary material referenced by the author in this book is available to readers
on GitHub via the book’s product page, located at www.apress.com/9781484261026. For more detailed
information, please visit http://www.apress.com/source-code.
Printed on acid-free paper
To Eileen, for her endless love and support.
Table of Contents

About the Author
About the Technical Reviewer
Acknowledgments
Introduction

Chapter 1: The Need for Good Data
    Who This Book Is For
    Assumptions
    The Importance of Getting Data Right
    What Exactly Is “Data” and Where Does It Come From?
    What Is “Good” Data?
    Where “Bad” Data Comes From
    Preventive Action
    Summary
    Chapter References

Chapter 2: Basic Data Types and When to Use Them
    Four Analytic Data Types
    Nominal/Categorical Data
    Ordinal Data
    Ratio Data
    Interval Data
    Other Data Types
    Summary
    Chapter References

Chapter 3: Representing Quantitative Data
    Units of Measurement
    Magnitudes and Quantities
    Time Data
    Money and Currency Data
    Transformations and Indexing
    Measurement Standards
    Other Quantitative Measurement Issues
    Summary
    Chapter References

Chapter 4: Planning Your Data Collection and Analysis
    Describing, Comparing, and Predicting
    Example: Choosing a Data Type
    Plan for Visualizing Your Data and Analysis
    Data Analysis Tools
    Summary
    Chapter References

Chapter 5: Good Datasets
    Sharing Data
    Dataset Dictionaries/Metadata
    Good Metadata
    What’s in a Name?
    Dataset Formats
    Keep It Simple
    Is Your Data Ready?
    Summary
    Chapter References

Chapter 6: Good Data Collection
    What Is Bias?
    Major Types of Bias
    Sampling Bias
    More Data Collection Problems
    Recognizing and Reducing Bias
    Understanding Outliers
    The Consequences of Bias
    Summary
    Chapter References

Chapter 7: Dataset Examples and Use Cases
    The Titanic Survivor Dataset
    The IBM Employee Attrition Dataset
    The Internet Movie Database (IMDb)
    US Hurricane Data
    UFO Sighting Data
    Lessons Learned
    Useful Dataset Sources
    Summary
    Chapter References

Chapter 8: Cleaning Your Data
    Data Cleaning Challenges
    Assessing Data Quality
    Software and Methods for Data Cleaning
    General Procedures
    Microsoft Excel
    R Project
    Python
    Operating System Utilities
    AI/ML-Based Software
    Summary
    Chapter References

Chapter 9: Good Data Analytics
    What Is Good Analytics?
    Big Data Analytics
    Data for Good
    Summary
    Chapter References

Appendix A: Recommended Reading
    Books
    Websites
    Oldies but Goodies

Index

About the Author
Dr. Harry J. Foxwell teaches graduate data analytics courses
at George Mason University’s Department of Information
Sciences and Technology. He draws on his decades of
prior experience as a Principal System Engineer for Oracle
and for other major IT companies to help his students
understand the concepts, tools, and practices of big data
projects. He is a coauthor of several books on operating
systems administration and is a designer of the data analytics
curricula for his university courses. He is also a US Army
combat veteran, having served in Vietnam as a Platoon
Sergeant in the 1st Infantry Division. He lives in Fairfax,
Virginia, with his wife Eileen and two bothersome cats. Find out more about him at
https://cs.gmu.edu/~hfoxwell/.

About the Technical Reviewer
Thomas Plunkett has extensive experience with big data and data analytics. He has
taught university courses on related technical topics.

Acknowledgments
I have benefited greatly from valuable encouragement and support for this work from
numerous colleagues at George Mason University. Dr. James Baldo, Director of the Data
Analytics Engineering program, provided helpful early advice and focus suggestions.
And special thanks to Ms. Vidhyasri Ganapathi, Teaching Assistant for several of my data
analytics courses, for identifying students’ challenges in learning and practicing data
science and for confirming their need for this guidance in preparing good datasets.

Introduction
Extracting actionable knowledge from data is a major ongoing challenge of modern IT in
corporations, governments, and academia. Creating effectively usable datasets requires
an understanding of data quality issues and of data types and the related analytics which
can properly be applied. There are numerous data analytics resources – books, articles,
blogs, and even commercial software – describing how to clean up and transform
data after it has been collected, yet there is little practical guidance on how to avoid or
minimize the typical “data cleaning” tasks beforehand. Such guidance and best practices
are needed to eliminate or reduce lengthy dataset preparation.
Data analysts are often simply presented with datasets for exploration and study
which are poorly designed, leading to difficulties in interpretation and to delays in
producing usable results. In fact, some analysts report spending up to 80% of their
time just getting data ready to be explored so that it can be effectively interpreted.
Much data analytics training, and many published resources, focus on how to clean and
transform datasets before serious analyses can even begin. Problems such as inappropriate or
confusing representations, poor unit of measurement choices, coding errors, missing values,
and outliers can often be avoided through good data item selection, good dataset design and
collection, and an understanding of how data types determine the kinds of analyses that
can be performed.
Why not create good data from the start, keeping in mind how it will be used,
rather than fixing it after it is collected?
Creating Good Data discusses the principles and best practices of dataset creation
and covers basic data types and their related appropriate statistics and visualizations.
Following these guidelines results in more effective analyses and presentations of
your research data. A key focus of this book is why certain data types and structures
are chosen for representing concepts and measurements, in contrast to the usual
discussions of how to analyze a specific data type once it has been selected.

CHAPTER 1

The Need for Good Data


Without data you’re just another person with an opinion.
—W. Edwards Deming, Data Scientist [1]

Learning about data analytics tools and methods typically begins with discussions of
how to prepare a given dataset for analysis. The reason for this is that many datasets
have problems – defects in design, missing or incorrect data items, and non-standard
file formats. This often leads to lengthy and complex tasks required to produce datasets
ready for efficient analysis. Unfortunately, the critical first step – understanding the
nature of data representation – is frequently missing or not sufficiently addressed in
resources about data analytics, especially for practitioners just starting their technical
careers. Thus, in this chapter, we start with a detailed understanding of data – what it
is, how it is expressed, and what we mean by “good” and “bad” data. Only by basing your
analyses on good data will you produce trustworthy interpretations of your research,
leading to good decisions and knowledge-based actions. Let’s get started.

Who This Book Is For


The demand for data analytics professionals is growing dramatically. Universities are
scrambling to train new analysts and scientists, and this is reflected in the number
of new courses, books, and other resources which focus on tools and methods for
extracting knowledge from data. Creating Good Data focuses on the starting point for
analysis – data creation – for those whose tasks include gathering and interpreting data
from any discipline:

• Industry, business, and academic researchers and practitioners – anyone who makes decisions based on data analytics
• New data analysts and data scientists starting their careers
• Corporate trainers and university instructors who teach data analytics
• Students who are learning methods and tools for exploring data

© Harry J. Foxwell 2020
H. J. Foxwell, Creating Good Data, https://doi.org/10.1007/978-1-4842-6103-3_1

Assumptions
We assume you have a basic knowledge of statistical methods and tools for summarizing
and visualizing datasets, including using tools such as R, Python, and SQL, and perhaps
some familiarity with commercial software such as SAS, SPSS, and Tableau. Many of you
likely already have a library of data analytics texts and other resources that cover data
cleaning and presentation but would like “early intervention” in dataset design.
All professionals in the rapidly growing data analytics field can benefit from
instruction on creating data themselves or on guiding others who will create datasets
for their analyses. Data analysts who are called upon to explore and explain other
researchers’ data can thus guide and encourage the creation of better datasets.
Creating Good Data is intended for regular use as a reference by practitioners
as well as by students taking data analytics courses. The book can also serve as a
supplementary textbook for such courses.
By the end of Creating Good Data, you will understand

• Principles and best practices for creating and collecting data
• Basic data types and representations
• How to select data types, anticipating analysis goals
• Dataset formats and best practices for creating and sharing datasets
• Examples and use cases (good and bad)
• Dataset creation and cleaning tools

And you will be able to create datasets that

• Clearly represent the measurements, quantities, and characteristics relevant to your research
• Minimize time-consuming data cleaning prior to analysis
• Permit clear and accurate statistical summaries and visualizations


Brief code examples from R, Python, and SQL will be included, but this book is not
intended to be a complete tutorial for data analysis coding in those languages – there are
plenty of those [2,3,4]. Our focus will be on dataset format and data representation using
those programming tools.

The Importance of Getting Data Right


Research and exploration of any kind frequently start with an idea, inspiration, or
curious observation about some phenomenon. Then some claim is made about the
nature of that phenomenon. Data provides evidence for or against the claim. Without
evidence – good evidence (i.e., good data!) – such claims are essentially worthless. And
we approach the process of validating or falsifying the claim with a scientific attitude [5].
That is, we care about evidence and will change our assumptions and theories if new
evidence requires such change. That’s why the field of data analytics (the “synthesis of
knowledge from information”) is part of data science:
…the extraction of useful knowledge directly from data through a process of
discovery, or of hypothesis formulation and hypothesis testing. [6]
A data scientist must therefore understand and implement the concepts, tools, and
processes necessary to create, manage, and extract value from data, from the creation
of the data through to the decisions and actions based upon the analytical results.
Figure 1-1 illustrates a typical data analytics process. In this book, we focus on the initial
steps needed to produce good data and to minimize time-consuming data cleaning and
transformation tasks.

Figure 1-1. Typical steps in the data analytics process


What Exactly Is “Data” and Where Does It Come From?


Informally, “data” can be thought of as any collection of symbols representing a set of
measurements or observations about some event or occurrence. Other meanings might
include lists of “facts” or “statistics,” although any collection of words, documents, web
pages, and emails can also be considered data. Some such data is purposely designed
and collected, as in scientific studies, but other data might be considered “accidental” –
likely no one purposely designed Twitter as a formal data collection system, yet today it
has evolved into a rich mine of useful knowledge about political and social sentiment,
and even a source of information about public health and disease epidemic outbreaks.
More specifically, and we will say more about this in the next chapters, data consists
of numbers, characters, words, images, and other symbols, which have definitive types
and characteristics that directly imply how to summarize and visualize their meanings
and relationships.
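
The idea that a data item's type implies how it should be summarized can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the item names (eye_color, satisfaction, height_cm), and the type assigned to each item are invented here and do not come from any dataset in this book. The sketch uses the mode for a nominal item, the (low) median for an ordinal item, and the mean for a ratio item.

```python
from statistics import mean, median_low, mode

# Hypothetical survey records; the item names and values are invented for illustration.
records = [
    {"eye_color": "brown", "satisfaction": 4, "height_cm": 170.0},
    {"eye_color": "blue", "satisfaction": 2, "height_cm": 158.5},
    {"eye_color": "brown", "satisfaction": 5, "height_cm": 181.0},
    {"eye_color": "green", "satisfaction": 4, "height_cm": 165.5},
]

# Assumed analytic type of each item; in practice this would come from a data dictionary.
item_types = {"eye_color": "nominal", "satisfaction": "ordinal", "height_cm": "ratio"}

# The type decides which summary is meaningful:
# nominal -> most frequent category, ordinal -> middle rank, ratio -> arithmetic mean.
summarizers = {"nominal": mode, "ordinal": median_low, "ratio": mean}


def summarize(rows, types):
    """Return one type-appropriate summary value per data item."""
    summary = {}
    for name, kind in types.items():
        values = [row[name] for row in rows]
        summary[name] = summarizers[kind](values)
    return summary


summary = summarize(records, item_types)
print(summary)
```

Using median_low rather than median keeps the ordinal summary on one of the recorded scale points instead of averaging two ranks; that is a design choice of this sketch, not a rule.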
Our interconnected, digital world is awash with digital data. Social media,
commerce and business records, scientific measurements, sports statistics, government
records, traffic surveillance, health records, wearable devices – the list is endless. The
sheer amount of that data and the speed with which it comes at us is enormous and
growing rapidly. For example, Figure 1-2 shows that in a single minute of Internet activity,
users generate nearly half a million Tweets, a million Facebook logins, almost two million
emails, and 18 million text messages, and that is mostly just from social media and
from personal and business communications.


Figure 1-2. Data generated during a single Internet minute in 2018 [7]
www.visualcapitalist.com/internet-minute-2018/

What Is “Good” Data?


Good data comes from explicit design and collection decisions about how to represent
individual data items and how to present them in a dataset. It permits timely,
informative, and ethical analytics and conclusions. Good data items have several critical
characteristics needed to ensure valid and useful analysis:

• Accuracy
    • Measurements and characteristics must correctly reflect what is being observed.
• Relevance
    • Items selected for analysis must directly relate to the phenomenon being studied.
• Representative
    • Data types must be chosen appropriately to reflect what is being studied.
• Well-defined
    • Data items’ meanings must be unambiguously defined in a schema, metadata, or data dictionary.
• Complete
    • Selected data items must include all potentially relevant measurements and characteristics.
• Granular
    • Selected data types should have sufficient range and detail to capture the full variability of the data items.
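
Some of these characteristics can be checked mechanically once a data dictionary exists. The following Python sketch is a minimal illustration, not a complete validation tool: the dictionary layout, the item names (patient_id, age_years, temperature_c), and the ranges are all hypothetical. It flags items that are undefined, missing, of the wrong type, or outside the documented range.

```python
# Hypothetical data dictionary: every name, type, and range is an assumption for illustration.
DATA_DICTIONARY = {
    "patient_id": {"type": str, "required": True},
    "age_years": {"type": int, "required": True, "min": 0, "max": 120},
    "temperature_c": {"type": float, "required": False, "min": 30.0, "max": 45.0},
}


def check_record(record, dictionary=DATA_DICTIONARY):
    """Return a list of human-readable problems found in one record."""
    problems = []

    # Well-defined: every recorded item must appear in the dictionary.
    for name in record:
        if name not in dictionary:
            problems.append(f"{name}: not defined in the data dictionary")

    for name, rule in dictionary.items():
        value = record.get(name)

        # Complete: required items must be present.
        if value is None:
            if rule["required"]:
                problems.append(f"{name}: required item is missing")
            continue

        # Representative: the recorded value must have the documented type.
        if not isinstance(value, rule["type"]):
            problems.append(
                f"{name}: expected {rule['type'].__name__}, got {type(value).__name__}"
            )
            continue

        # Accuracy (partially): values outside the documented range are suspect.
        if "min" in rule and value < rule["min"]:
            problems.append(f"{name}: {value} is below the minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{name}: {value} is above the maximum {rule['max']}")

    return problems


good = {"patient_id": "P001", "age_years": 42, "temperature_c": 36.8}
bad = {"patient_id": "P002", "age_years": 430, "weight": 70}

print(check_record(good))
print(check_record(bad))
```

A check like this cannot establish relevance or true accuracy, which depend on the research question and the measurement process; it only catches violations of what the dictionary documents.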

“Data are people” [8]. Getting data “right” can have important, at times life-critical,
consequences – like data from testing the effectiveness of the Ebola vaccine, calculating
consumer financial decisions based on credit scores, or determining sentences for
crimes. Awareness and ethical practices concerning human-relevant data should always
be implemented in data selection and dataset management.
Good data even has the potential to change fundamental beliefs. The astronomer
Kepler was taught and strongly believed that planetary orbits must be perfect circles;
his Mars data proved otherwise and led to his famous formulation of the laws of planetary
motion. And today, climate scientists produce and publish data with the hope of
convincing the world about the dangers of climate change. Bad climate study data and
analysis simply encourage dangerous climate change denial; good climate data has the
potential to change minds.

Where “Bad” Data Comes From


Bad data, on the other hand, hinders or delays analysis and almost certainly results in
misleading, inaccurate, or even harmful conclusions. The well-known phrase “garbage
in, garbage out” succinctly describes this situation. The sources of bad data include
pre-collection design decisions, collection errors, and post-collection interpretation errors.
Understanding these sources and planning to address them is essential to effective and
accurate dataset analysis.

Some Causes of Bad Data


“Bad” data is often the result of human error and poor planning. Obtaining good data is a
complex process with many opportunities for mistakes:
• Creation and pre-collection errors
    • Methodological failures: Poorly designed experiments, surveys, or instrumentation
    • Bad documentation: Unclear definitions of terms and missing or confusing schema, metadata, or data dictionary
    • Misspecification of data types and formats: Misunderstanding the purpose and selection of data types and forgetting or avoiding standard data types
• Collection errors
    • Poor collection instructions and methods: Lack of clear processes for recording data
    • Ineffective enforcement of data recording rules: Lack of monitoring and oversight
    • Misinterpretation of data items: Lack of clarity about item meanings
    • Transcription/typos: No checking or validation of recorded data
    • Fraudulent answers/observations: Purposely misleading or nonsensical responses
    • Missing data: Failure to understand and correct reasons for non-answers
    • Impossible, out-of-range data: No bounds checking on data items


• Post-collection and analysis errors
    • Moving/copying: Recording or storage-related mistakes
    • Misinterpretation: Misunderstanding meaning of data items or responses
    • Timeliness, data “rot”: Expiration of times, locations, or other characteristics
• Some typical recording errors
    • Recording numbers with leading zeros (0013 instead of 13)
    • Using uppercase letter O for number zero (0); hard to spot
    • Transposing letters or numbers (LA for AL, 32 for 23)
    • Inconsistent use of naming conventions (Italy/Italia, US/USA, Germany/Deutschland/Allemagne)
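
Several of the recording errors listed above can be caught at entry time with simple checks. The following Python sketch is one possible approach, not a prescribed method: the function names, the alias table, and the numeric bounds passed in by the caller are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lookup table: maps spelling variants to one canonical name.
COUNTRY_ALIASES = {
    "italy": "Italy", "italia": "Italy",
    "us": "USA", "usa": "USA",
    "germany": "Germany", "deutschland": "Germany", "allemagne": "Germany",
}


def clean_integer(raw, low, high):
    """Parse an integer field and return (value, warnings).

    value is None when the text cannot be read as a whole number.
    """
    warnings = []
    text = raw.strip()
    # Letter O typed in place of the digit zero.
    if re.fullmatch(r"[0-9Oo]+", text) and re.search(r"[Oo]", text):
        warnings.append("letter O used for zero")
        text = text.replace("O", "0").replace("o", "0")
    if not re.fullmatch(r"[0-9]+", text):
        return None, warnings + ["not a number"]
    # Leading zeros, as in 0013 recorded for 13.
    if len(text) > 1 and text.startswith("0"):
        warnings.append("leading zeros")
    value = int(text)
    # Bounds check for impossible or out-of-range values.
    if not low <= value <= high:
        warnings.append("out of range")
    return value, warnings


def canonical_country(raw):
    """Return the canonical country name, or None if the name is unknown."""
    return COUNTRY_ALIASES.get(raw.strip().lower())
```

Transposition errors (32 for 23) usually cannot be detected from the value alone, since both forms are valid; they tend to require double entry or comparison against a second source.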

Preventive Action
Some data analysts report spending the majority of their time on a project cleaning,
transforming, and preparing their assigned datasets [9]. Obviously, this is costly in time,
money, and technical resources. And as Deming also points out, trying to solve this
problem after the data have been created is not an effective solution:
Inspection does not improve the quality, nor guarantee quality. Inspection
is too late. The quality, good or bad, is already in the product. —W. Edwards
Deming [1]
That is, the “product” – data – needs to be created from the start using practices and
components that at least minimize the “bugs” in your datasets. Of course, eliminating
all such problems is probably not possible, but if you can get off to a good start with your
analytical projects’ data, you will produce better and more trusted results.


Summary
In this introductory chapter, we learned about the need for good data, what we mean by
“good” and “bad” data, and the origins of potential dataset problems. Minimizing such
problems requires awareness of how data collection can fail and the use of procedures
that ensure quality project design and execution.
The next chapter examines the numerous data types and formats which can be used
to represent observations. Thoroughly understanding these data characteristics and
using them appropriately will help you significantly in designing and executing your
research.

Chapter References
[1] W. Edwards Deming Quotes, https://quotes.deming.org/

[2] Mailund, Thomas. Beginning Data Science in R. New York NY: Apress, 2017.

[3] Hui, Eric. Learn R for Applied Statistics. New York NY: Apress, 2019.

[4] Nelli, Fabio. Python Data Analytics. New York NY: Apress, 2018.

[5] McIntyre, Lee. The Scientific Attitude. Cambridge MA: MIT Press, 2019.

[6] NIST, Special Publication 1500-1: Big Data Interoperability Framework: Volume 1, Definitions, 2017, https://bigdatawg.nist.gov/_uploadfiles/NIST.SP.1500-1r1.pdf

[7] Desjardins, Jeff; “What Happens in an Internet Minute in 2018?,” Visual Capitalist, May 14, 2018, www.visualcapitalist.com/internet-minute-2018/, used by permission.

[8] Ten simple rules for responsible big data research, https://dash.harvard.edu/bitstream/handle/1/32630692/5373508.pdf

[9] Only 3% of Companies’ Data Meets Basic Quality Standards, https://hbr.org/2017/09/only-3-of-companies-data-meets-basic-quality-standards

CHAPTER 2

Basic Data Types and When to Use Them
Without a systematic way to start and keep data clean, bad data will
happen.
—Donato Diorio [1]

Decisions about how to represent data measurements for your research projects
have important consequences – they directly determine what kinds of statistical and
visualization methods can ultimately be used for analysis and presentation of your
results. This means you need to select representation types thoughtfully with your
analytical goals in mind while at the same time trying to avoid any form of bias in what
you decide to measure and what you anticipate your data exploration and analysis tasks
will look like.
When we consider a “type” for a data item, we need to specify its context and
purpose for our discussion. For programming languages, we define computational
data types (storage formats), such as integer (short and long), floating point (single and
double precision), character (single and multicharacter strings), Boolean (0/1, T/F),
and derived types such as pointers, how many bits or bytes they use, and how they are
referenced by the syntax of the language.
For data analytics, however, we focus on how a data item is to be used and
interpreted and so refer to analytical data types. Additionally, we categorize such data
items as qualitative or quantitative, and then we discuss how varieties within these two
categories represent specific measurement requirements.
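
The difference between a computational type and an analytical type can be seen in a few lines of Python. In this sketch the integer codes for field of study (1 = Engr, 2 = Comp, 3 = Biol) are a hypothetical encoding invented for the example; the point is only that two columns with the same storage type may support very different analyses.

```python
from statistics import mean

# Hypothetical encoding: 1 = Engr, 2 = Comp, 3 = Biol (labels stored as integers).
field_codes = [1, 1, 2, 2, 2, 3]
# Ages in years: genuine quantities.
ages = [40, 39, 36, 42, 39, 38]

# Both columns share the same computational type (int)...
same_storage_type = all(isinstance(v, int) for v in field_codes + ages)

# ...but only one of these two results can be interpreted.
mean_age = mean(ages)          # a meaningful average age
mean_code = mean(field_codes)  # computes without error, yet there is no "mean field of study"
```

The language raises no error for the second calculation, which is why the analytical type has to be recorded and respected by the analyst rather than inferred from the storage format.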

© Harry J. Foxwell 2020
H. J. Foxwell, Creating Good Data, https://doi.org/10.1007/978-1-4842-6103-3_2
Chapter 2 Basic Data Types and When to Use Them

Numeric quantities and non-numeric measurements and characteristics can also
be classified into several subtypes depending on their levels of descriptive precision and
granularity. In this chapter, we discuss the varieties of analytical data types and review
their typical statistical and visual summary methods.

Four Analytic Data Types


Keep in mind that the purpose of data analytics is to describe, compare, and predict
useful insights about some phenomenon. Correctly choosing data types for those tasks
must therefore anticipate and enhance our ability to obtain those insights, and such
choices ultimately will determine the proper and relevant tools and methods that can be
used in our analyses.
Before you can describe a phenomenon and select a representation for it, you must
first understand and define it. Even for something as clear and “obvious” as a physical
quantity (e.g., mass) or as elusive as a mental attitude (e.g., prejudice), precise definition
is needed. This helps to avoid misinterpretation during later analysis. So, for example,
you might specifically define a survey respondent’s age as “the exact number of years
and months since their officially recorded birthday” or define their politics as “their
most recent registered membership in a recognized political party.” Only after properly
defining what you need to characterize can you select an appropriate representation
type. Such definitions must have the potential to capture the full range of possible values
for what is being measured or characterized.

Note There is no absolutely correct way to represent a quality or quantity, only
what is helpful in description and comparison for your analytic requirements.

We will now discuss the four generally accepted data types, Nominal, Ordinal,
Interval, and Ratio (often abbreviated NOIR), and include brief Python or R code
examples to illustrate basic statistical and visualization methods for the data types
presented.
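
As a rough orientation before the individual types are discussed, the NOIR scheme can be sketched as an ordered scale in which each level keeps the summaries of the levels below it and adds some of its own. The table below follows the conventional textbook reading of the scheme and is an assumption of this sketch, not a listing taken from this chapter; the names Level, NEW_SUMMARIES, and allowed_summaries are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four analytic data types, ordered from least to most informative."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Summaries that first become meaningful at each level (conventional reading).
NEW_SUMMARIES = {
    Level.NOMINAL: ["count", "proportion", "mode"],
    Level.ORDINAL: ["median", "percentile"],
    Level.INTERVAL: ["mean", "standard deviation"],
    Level.RATIO: ["ratio", "coefficient of variation"],
}


def allowed_summaries(level):
    """Return every summary usable at the given level, including lower levels."""
    allowed = []
    for candidate in Level:  # iterates in definition order
        if candidate <= level:
            allowed.extend(NEW_SUMMARIES[candidate])
    return allowed
```

For example, allowed_summaries(Level.ORDINAL) includes the median but not the mean, which reflects the usual caution about averaging ranked categories.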


Nominal/Categorical Data
Nominal (also called categorical) measures are used to capture qualitative attributes that
have no size or extent characteristics. They are simply labels or names for some observed
attribute such as country of birth, occupation, manufacturing brand, or language
spoken. This also includes binary (or dichotomous) characteristics such as true/
false, yes/no, or agree/disagree. Such measures explicitly have no implication of order
among the labels. For example, for a person’s primary spoken language (English, French,
Spanish, Farsi, Chinese, etc.), there is no implied order among the languages; you
can’t claim that French is “bigger” than English in any linguistic or mathematical sense
(unless perhaps if you are from France!). Moreover, you can’t do any kind of arithmetical
operations among the category members – there is no concept of a “mean language.”
A nominal data representation must have two important characteristics: its
categories must be exhaustive and mutually exclusive. Exhaustive means that the
categories cover all possible values in some manner, although there might be a generic
“other” category that encompasses multiple cases of low frequency or importance.
Mutually exclusive means there cannot be any cases that belong to more than one
category. Nominal data items might have many categories represented (e.g., country of
birth, where there are nearly 200) or only a few such as gender (male or female).
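
The two requirements above can be enforced when raw responses are coded. In this minimal sketch the category set and the function name code_language are hypothetical: the catch-all “Other” label keeps the scheme exhaustive, and returning exactly one label per response keeps it mutually exclusive.

```python
# Hypothetical category set for a "primary spoken language" item.
CATEGORIES = {"English", "French", "Spanish", "Farsi", "Chinese", "Other"}


def code_language(raw):
    """Map a raw response to exactly one category label."""
    value = raw.strip().title()
    # Anything outside the named categories falls into the generic "Other" group.
    return value if value in CATEGORIES else "Other"
```

A response such as "farsi" is coded as "Farsi", while an unlisted language is coded as "Other". If “Other” starts to collect a large share of responses, that is a sign the category list should be revisited.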

Tip Many studies have used gender as a qualifying nominal variable, but such
classification is not always well-defined and recent usage might include “other” or
specific “non-binary” values. Be aware of any such relevant ambiguities in the data
items you select for your analysis.

Because there is no implied quantity for a nominal data item, this limits the kinds
of summary statistics, visualizations, and comparisons that are allowable for analysis.
Nominal data item collections don’t have means, maxima or minima, or measures of
variability like standard deviations. All that is possible to characterize such data are
frequencies of occurrence – how many there are in each category, which can be expressed
as absolute counts or as percentages or proportions of the total. Visual summaries of
such data include various forms of bar charts and pie* charts. And when counting the
frequency of items in each possible category, the category with the largest number
of items is the mode (although there might be several categories with relatively high
frequencies, referred to as multimodal).


To illustrate nominal data (and subsequent data types), let’s examine a sample
synthetic dataset (constructed for illustration) of hypothetical college graduates,
GD-Data.csv [3]. Only the first ten records of 500 are shown:

gender;age;degree;field;wrkfld;annsal;payfair;jobsat
Female;40;BS;Engr;Yes;78.0;4;4
Male;39;MS;Engr;Yes;64.0;4;4
Male;36;MS;Comp;No;70.0;3;4
Male;42;MS;Comp;Yes;85.0;5;3
NotSay;39;BS;Comp;Yes;71.0;5;3
Male;38;MS;Biol;Yes;113.0;3;4
Male;38;MS;Comp;Yes;84.0;3;3
Male;37;MS;Chem;Yes;61.0;3;2
Other;28;MS;Chem;Yes;72.5;3;2
Female;31;MS;Comp;Yes;73.5;4;3
...

Data fields and types in this dataset are defined as follows:

gender: Male, Female, Other, NotSay (String/Nominal)
age: years old (Integer/Ratio)
degree: BS, MS, PHD (String/Ordinal)
field: Engr, Comp, Biol, Phys, Chem (String/Nominal)
wrkfld: (working in field?) Yes, No (String/Nominal)
annsal: annual salary in $K (Float/Ratio)
payfair: (are paid fairly?) 1-5 (Integer/Likert)
jobsat: (are satisfied in job?) 1-5 (Integer/Likert)
...
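
Field definitions like these can also be written down as a machine-readable data dictionary and used to check each record. The sketch below is one way to do that under stated assumptions: it uses the column names from the file header, while the numeric bounds on age and annsal and the names DATA_DICTIONARY and validate are hypothetical additions made for illustration.

```python
# Data dictionary transcribed from the field definitions above.
# The bounds on age and annsal are assumptions made for this example.
DATA_DICTIONARY = {
    "gender": {"type": str, "allowed": {"Male", "Female", "Other", "NotSay"}},
    "age": {"type": int, "min": 0, "max": 120},
    "degree": {"type": str, "allowed": {"BS", "MS", "PHD"}},
    "field": {"type": str, "allowed": {"Engr", "Comp", "Biol", "Phys", "Chem"}},
    "wrkfld": {"type": str, "allowed": {"Yes", "No"}},
    "annsal": {"type": float, "min": 0.0},
    "payfair": {"type": int, "min": 1, "max": 5},
    "jobsat": {"type": int, "min": 1, "max": 5},
}


def validate(record):
    """Return a list of problems found in one record (a dict of strings)."""
    problems = []
    for name, rule in DATA_DICTIONARY.items():
        raw = record.get(name, "")
        if raw == "":
            problems.append(f"{name}: missing")
            continue
        try:
            value = rule["type"](raw)  # convert text to the declared storage type
        except ValueError:
            problems.append(f"{name}: {raw!r} is not a valid {rule['type'].__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{name}: {raw!r} is not an allowed value")
        if "min" in rule and value < rule["min"]:
            problems.append(f"{name}: {raw!r} is below the minimum")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{name}: {raw!r} is above the maximum")
    return problems
```

Keeping the definitions in one structure means the documentation and the validation rules cannot drift apart, which addresses the “well-defined” requirement for good data.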

As we see from Listing 2-1, field is a character string representing a category, so the
permissible analytics for this data item includes counts, relative frequencies, and bar
chart visualizations, as shown in Figure 2-1.
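
As a small self-contained illustration of these permissible summaries, the sketch below counts the field values in the ten sample records shown earlier, using only the Python standard library. It is not the book's listing; the variable names are hypothetical, and the text bar chart is a stand-in for a plotted one.

```python
import csv
import io
from collections import Counter

# The ten sample records shown above (semicolon-delimited).
SAMPLE = """gender;age;degree;field;wrkfld;annsal;payfair;jobsat
Female;40;BS;Engr;Yes;78.0;4;4
Male;39;MS;Engr;Yes;64.0;4;4
Male;36;MS;Comp;No;70.0;3;4
Male;42;MS;Comp;Yes;85.0;5;3
NotSay;39;BS;Comp;Yes;71.0;5;3
Male;38;MS;Biol;Yes;113.0;3;4
Male;38;MS;Comp;Yes;84.0;3;3
Male;37;MS;Chem;Yes;61.0;3;2
Other;28;MS;Chem;Yes;72.5;3;2
Female;31;MS;Comp;Yes;73.5;4;3
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=";"))

# Absolute counts per category.
counts = Counter(row["field"] for row in rows)
total = sum(counts.values())

# Relative frequencies and a simple text bar chart, most frequent first.
for label, n in counts.most_common():
    print(f"{label:5s} {n:3d} {n / total:6.1%} {'#' * n}")

# The mode is the category with the highest count.
mode_label, mode_count = counts.most_common(1)[0]
```

With only ten records the proportions are not representative of the full 500-record file; the same code applies unchanged once the complete file is read in place of the sample string.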

14
Random documents with unrelated
content Scribd suggests to you:
Evelyn took the handkerchief with a trembling hand, and
examined the corner, close to the lantern. "My husband's,"
she said.

"Then he has taken the wrong turn and gone this way."

"Clever girl," Mr. Trevelyan could not help murmuring. "Lead


on, Jean. Remember, the handkerchief may have been
blown from a distance. However, our business now is to
reach Walters. Take my lantern, if you like. Adams will light
us. But don't go too far."

"If he has been all these hours wandering among the


marshes—" whispered Evelyn. "And no possible shelter—"

"It would be exhausting," assented Mr. Trevelyan. "But I


hope we shall be in time to prevent ill effects."

They pressed on; Jean still the leader, showing a sagacity


which would have fitted her for a mountaineer's or a
backwoodsman's life. Soon they saw a lantern and two dim
figures advancing, and when within hearing distance, Mr.
Trevelyan shouted—

"Found anything?"

"No, sir! Nothing."

Evelyn stood still, conscious of failing power. "May I rest one


moment?" she asked.

"Ah! I was afraid—" Mr. Trevelyan checked himself, for


reproaches now were useless. "Lean all your weight on me,
Mrs. Villiers. So—your whole weight. I must not let you sit
down. It will pass off. Don't speak for a minute, but keep up
a brave heart."
She sighed inaudibly, and closed her eyes. Mr. Trevelyan
stood like a rock, supporting her; and the two men came
up. A few words were exchanged. They had searched in
vain for traces of General Villiers, and had themselves
become what Jean called "entangled among the dykes,"
losing their bearings, and unable to find the path they had
left. Walters was not at home in the marshes, and Ricketts
was by no means a brilliant youth. Jean's approaching
lantern had been their guide.

"No use going farther that way, if you have hunted


thoroughly," said Mr. Trevelyan.

They had done so, Walters averred—thoroughly. "All along


the dykes, and all round about, everywhere."

"Through the furthest corner of the meadow beyond the


second dyke from here?" promptly asked Jean, indicating
the direction.

There was a pause. No, they had not ventured quite so far.
They had only taken a look at the said corner.

"It was an awful bad bit to get over," Ricketts said solemnly.
"The General could never have taken that way, sure! The
stile was leaning to one side, the path almost broken away;
and the piece of marsh beyond was enclosed with dykes—
no second way out of it."

"Father, may I see?"

"Yes! You know what you are about. Only take care. After
that, we must get Mrs. Villiers home."

Jean moved off, and Ricketts, a big awkward boy, straggled


uncertainly in her rear. Like most of the poor in Mr.
Trevelyan's parish, he adored Jean, looking upon her as a
creature of another sphere.

Mr. Trevelyan despatched Adams to the Ricketts' cottage, to


make sure that General Villiers had not meantime found his
way thither; and he kept Walters by him with the other
lantern, till Jean should return. They could follow her swift
sure movements by her lantern, as she climbed the nearest
stile, and struck across the snow beyond.

Then Evelyn roused herself. "Thanks," she said. "I am so


sorry to be a trouble. I think I was faint for a minute. I am
better now. Where is Jean?"

"She will be back directly. She has gone to take a look


beyond where Walters was."

"Gone alone! Are you not afraid? I don't know which is the
most wonderful, you or Jean. Suppose she were to slip into
the water? O do come too."

"Perhaps you had better move; you will be getting chilled.


Gently, there is no hurry," he said. "When Jean comes back,
I am going to send you and her home."

Jean did not return quickly. The three went over the first
dyke, Walters leading; and then they followed Jean's small
footprints. A minute later, Jean's voice rang out from the
distance, clear and thrilling, with a now sound in it.

"She's found something," Walters exclaimed.

"Father! Come!" The distant appeal cut like a blade through


the air, yet Jean did not scream.

"Will you wait here with Walters? I must take the lantern.
Don't stir till I come back."
"O no—I must go too."

To pause for discussion was impossible. The second dyke


had to be reached and crossed, and the crossing, it could
not be denied, was "awful bad." Mr. Trevelyan lifted Evelyn
sheer over the stile, and all but carried her through the half
knee-deep slush beyond—slush just enough frozen to be
slippery, not enough to keep them from sinking into it. A
false step might have plunged both into the dyke; but the
other side was reached, and Jean came to meet them.

Mr. Trevelyan knew in a moment—knew as his eyes fell


upon Jean—what had happened. He had never before seen
her thus. Every trace of colour had left her face, and the
eyes looked out fixedly from two surrounding hollows which
had suddenly sprung into existence. Yet Jean was herself,
which means that she was not thinking of herself.

"Not Evelyn!" broke from her blanched lips, and she


clutched Evelyn's hands, as in a vice, with ice-cold fingers.
"Father—you and Walters—over there—not Evelyn! O not
Evelyn!"

"Why not?" Evelyn was perfectly composed now, not nearly


so pale as Jean.

"I can't tell you! Father, don't let her! Don't let her!" cried
Jean, in smothered agony. "Keep her away! Don't let her
go!"

"Jean, dear, I am not a child to be made. Your father knows


me better. Tell me, have you found him?"

"If you would but wait—till the others have been—"

"No: I am going on."


Jean's arm fell with a despairing gesture. Then she grasped
the lamp which Walters held, thrust it into her father's
hand, muttered hoarsely—

"Come—after us—" and led the way with Walters, urging


him vehemently to a speed far beyond Evelyn's powers.

Mr. Trevelyan, fully understanding, did his best to hold Mrs.


Villiers back.

So Jean and Walters arrived first at the spot, where General


Villiers lay, face downwards, on the snow-covered grass,
which here sloped a little into the almost full dyke. He
seemed to have fallen thus, exhausted, overcome with cold,
and his face was buried in the half frozen mud.

Jean by herself had struggled in vain to move the prostrate


figure, but Walters could do what she had failed to
accomplish. When Evelyn came up, her husband lay asleep,
as it might be, on a winding-sheet of pure snow, "like a
warrior taking his rest," calm and silent; while Jean knelt
beside him, wiping away with her handkerchief the mud
which still clung to the sealed eyes, the rigid and purple
lips.

"My poor master!" groaned Walters.

Evelyn did not at once grasp how things were, or, if in her
heart she knew, she would not at once accept the truth. A
mist over the moon had thickened into cloud, blotting out
most of its light, but now the cloud rolled on, leaving a clear
landscape. The quiet face could be seen plainly—hardly
paler than Jean's. Evelyn's glance went from the one to the
other.

"He is found!" she said; and putting Mr. Trevelyan aside, she
went forward alone.
"Then he lost his way, as Jean thought. And Jean has found
him!—Jean!" with an accent of wonder. "Has he fainted? We
must got him home quickly. He is so cold—only feel him!
Cannot we give him—something—do—something?"
uncertainly, as if she did not know what she said. "William,
dearest! Dear—I have come to you."

Jean, shaken by the shock of her discovery, could not


endure this. One hard sob broke from the girl's lips, racking
her whole frame in its struggle to escape, and startling
herself at least as much as others.

"Jean!" her father breathed, and she had herself in hand


instantly; but that slight sound had done the business.

Evelyn looked across, with a dim smile, full of anguish.

"Jean!" she said. "Why—Jean—"

Then she swayed slowly forward and fell, prone and


senseless, upon her husband's body.

CHAPTER IX.

BROUGHT HOME.

"Life and Thought have gone away


Side by side,
Leaving door and windows wide;
Careless tenants they!"

"All within is dark as night:


In the windows is no light;
And no murmur at the door,
So frequent on its hinge before."
TENNYSON.

WHETHER he had simply lost his way in the storm, and had
wandered to and fro among the marshes, finding himself
again and again turned back by intercepting dykes, till so
exhausted that when he slipped and fell, he had no strength
to rise; or whether some undetected heart-weakness,
rendering him unfit to cope with the icy gale, had resulted
in sudden failure of the heart's action, who of those present
could say? All was over, long before they found him.

He had died, it would seem, a painless death, even though


in some measure, a death of suffocation. He had met the
great change suddenly, quietly, in pursuit of duty, in an act
of unselfish kindness. The look in the dead face was not as
of one conquered, but as of one victorious. To such a man
as General Villiers, living habitually in the presence of his
God, death, however unexpected, could not in effect be
sudden, since he was always ready for it.

Jean would never in future years forget those few minutes,


when she stood alone beside the lifeless body. She had not,
it is true, any very strong liking for the General personally.
He had been kind to her in a ceremonious fashion, and she
had looked upon him as the inevitable appendage to his
wife, whom she passionately loved—not in all respects a
satisfactory appendage, viewed with Jean's fastidious eyes,
because she privately counted that he did not fully
appreciate Evelyn.

Perhaps the parting between husband and wife, witnessed


by her that afternoon, had somewhat shaken this aspect of
matters. In any case, the General had been a familiar figure
in Jean's life; a fine figure always, manly and gentlemanly;
and to see him thus was terrible—lying dead on the cold
white snow, bathed in the cold white moonlight, with the
cold white marshes around—while not another human being
was near. There lay the pull. We are so constituted that the
mere fact of somebody near, at such a moment, is a help—
even though the somebody may be powerless to assist.

Had a mere child stood by, the chill of that icy solitude
would not have entered, as it did, into the very depths of
Jean's organisation. Her actual grief was, indeed, for
Evelyn, not for herself; but nine-tenths of what Jean
suffered in life always had been and always would be for
others: and the suffering was no whit less keen on that
account. Rather, it was more keen, because more pure and
noble in kind.

Evelyn's fainting fit did not last long, and when she rallied,
the native force of her character at once asserted itself.
Instead of giving way to a display of grief, adding to others'
difficulties, she stood resolutely up, insisted on walking, and
decisively set Mr. Trevelyan free, as well as Walters and
Adams—the latter having returned—for the heavy task
before them. Ricketts had been sent to the cottage to
procure a shutter, and if possible, additional help. To convey
such a weight over such ground would be no light matter;
and a man lodging there, but seldom back till late, would
probably be in by this time. The lad's own lameness
rendered him of small avail.
"Jean will give me her arm. I want nothing more," Evelyn
said steadily. "Only Jean, please. I shall not faint again. You
must not think of me at all. We will go on, and—you will
bring him home—quickly, please!" with unutterable
entreaty.

Even Mr. Trevelyan's stoicism was not proof against her


look.

"If—if anything can be done—" But she did not finish her
sentence, for she knew as well as he that it was too late,
that nothing whatever could be done.

The Rector's eyes were full, nay, wet all round.

"I cannot thank you!" she said. "I owe you—so much!
Come, Jean, dear."

That walk always stood out before Jean in after life, as one
of the worst experiences she had ever had to go through.
Her most pressing desire was to keep Evelyn well ahead,
that she might not see aught of what went on behind. There
was to be no delay. Mr. Trevelyan and the two men would
start at once with—it—alas! No longer him—hoping soon to
meet coming aid, which indeed would be needed.

A whisper from Mr. Trevelyan urged Jean to haste; and


Evelyn herself probably felt that she had not strength to
endure the sight. She made no effort to hang back, and
never cast a glance to rear. Weary she must have been, and
the fixed face was white as snow in the moonlight, yet she
walked swiftly, unfalteringly, making no hardship of the
stiles, scarcely pressing on Jean's arm.

No words passed between the two for the greater part of


the way. Even when they encountered young Ricketts and
the lodger, bearing the shutter between them, it was Jean,
not Evelyn, who begged them to make haste.

Evelyn only shivered silently. Jean bent her whole attention


to guiding Evelyn's steps, to giving all possible support:
while Evelyn seemed to be hardly conscious where she was
or what she did.

Not till the marshes were left behind, not till the large final
meadow between marsh and high road were reached, did
Jean venture to say—

"If you would only lean upon me more! You must be so


tired!"

Evelyn's answer, not an answer in reality, came as if wrung


from her: "O Jean, if I had been different! If I had only been
different! If I had never given him pain!"

Jean dared not go into that question. She could trust


neither herself nor Evelyn, after all they had gone through.
She knew by Evelyn's shortened breath and failing steps
that tears were streaming; and it was only by a fierce
bracing of her own powers that she could force herself to
say in everyday accents—

"I think you might make more use of my arm. We shall soon
be at home now. Are you very wet?"

"I don't know."

To keep Evelyn to her earlier pace was no longer possible.


She fell into a slower and slower walk, till Jean began to
fear that the sad procession behind must surely overtake
them. The high road was left, but the ascending avenue-
path through the Park grounds taxed Evelyn to her utmost.
It was all she could do to drag one foot after the other, and
more than once she came to a complete pause, swaying
feebly, as if on the verge of another swoon.

Jean urged her on with touch and voice, and Evelyn


responded in renewed efforts; but when the front door was
reached, and Evelyn stumbled up the two steps, Jean knew
that she could have done no more. Anything more deathlike
than her face, as she came into the lighted hall, could
hardly have been imagined.

The housekeeper, Mrs. Stowe, stood there, and with her


Miss Devereux; the latter, as a matter of course, talking,
the former listening. News of the General's disappearance
having reached Sybella, she had driven at once to the Park,
determined there to remain till the mystery should be
cleared up. Jean had seen fresh carriage-marks on the snow
outside, leading up to the door and round to the stables.

Evelyn saw nothing. Guided by Jean, she reached the great


oaken arm-chair, and dropped into it; her lips white: her
eyes closed.

"My dear Evelyn! Then you have come, and it is all right,"
cried Sybella, starting forward. "And he is found! I said so! I
was sure it was nothing! I knew he must have taken shelter
somewhere. Such an imprudent thing to go out in the snow!
A man of his age! If people will be so foolish—! I shouldn't
wonder if he had a bad cold afterwards—and rheumatism,
of course. How wet you are, both of you! Really, it is quite
madness! I can't think what Mr. Trevelyan was after to let
you go! Such folly! If you had just stayed at home quietly!
It is too imprudent! Look at the state of your skirts. Is she
faint?"—to Jean. "Where is General Villiers? Is he coming? I
drove over, in spite of the weather, when I heard—when
Pearce brought me word—and the horses are put up here."
The first rush of Sybella's effervescence had always to be
endured; it could no more be checked than the rush from a
freshly uncorked champagne-bottle; but neither Stowe or
Jean was idle. Wine and hot water stood ready on the hall-
table, for Stowe had rightly conjectured that they would be
needed: and while Jean pulled off Evelyn's wet gloves, and
rubbed her icy fingers, Stowe brought a tumbler of
steaming liquid.

"Drink it, ma'am—it will do you good," she entreated.

Evelyn was not fainting. She opened her eyes, whispered a


low "Thanks," and made the effort; but after a few sips she
sank back with the same look of powerlessness.

Sybella talked on, wondering, conjecturing, pitying,


blaming.

Evelyn showed no consciousness of her presence.

Jean drew the housekeeper aside.

"We must get her upstairs," she whispered. "They are—


coming—with him! They will be here directly. Send for Dr.
Ingram, please—and oh! Do get her upstairs! Don't you
understand? Oh, don't ask questions, only be quick—only
get her upstairs!" implored Jean. "They are coming—with
him! He was—found—there—on the marsh!"

Stowe understood now, and was stunned with the shock,


unable to act. Before them all, she sat down, shaking
visibly; the first time in her well-regulated life that she had
ever taken such a liberty. She could only stare at Jean; and
Jean knew there was no time to lose.

"Evelyn dear, you must come to your bedroom," she said,


quitting Mrs. Stowe, and bending over the carved oak-chair.
"Come at once! Yes—now—come with me!"

The violet eyes opened slowly.

"Come, dear! Come, Evelyn! Please come!"

"Nonsense, Jean! What do you mean?" demanded Miss


Devereux, nettled by what she counted to be interference.
If Jean had proposed to keep Evelyn downstairs, she would
have been the first to urge an opposite course. Nothing
done by a Trevelyan could possibly be right in Sybella's
eyes. "Much best for her to stay here a few minutes, till she
gets warmer. Oh, you mean—to change her dress. But she
is not fit to walk yet. When General Villiers arrives—"

"O hush!" entreated Jean.

"Really, Jean—"

"Miss Trevelyan says we are to send for Dr. Ingram,"


whispered the housekeeper's tremulous voice, close to Miss
Devereux. The wording of the sentence was unfortunate.

"Send for Dr. Ingram! What for? Mrs. Villiers will be all right
in a few minutes. She is just overdone—as anybody of any
sense might expect her to be. Really, Jean, I think you a
little over-rate your position here," declared Miss Devereux,
in aggrieved accents. "Evelyn has been very kind to you, no
doubt, giving you the run of the Park and all that, but you
are hardly more than a child. I really don't quite see what
you have to do with giving orders. Evelyn ought to take
some more food before she moves. I never heard of
anything so mad, as taking her to the marshes on such a
night. If I had been here!—But some people have no
common-sense. When General Villiers comes in, he will say
—"
Evelyn stood up, her face rigid with anguish. "My own room
—" she said distinctly. "Send for Dr. Ingram at once, Stowe
—and wait here till—No, only Jean with me!" As Sybella
drew near. "Only Jean!"

Sybella fell back.

Evelyn passed away with Jean, and Stowe vanished to obey


the order.

"Well! I really do think—" gasped the astonished lady. "I


really do! And I her aunt! And as for Jean Trevelyan! But
she always was demented about those Trevelyans! Such a
stupid uninteresting girl! And Mr. Trevelyan as stiff as a
poker—the most disagreeable man I ever saw! I am sure
they are the last people I would ever go to in trouble. But
then I am always so sensitive to manner—my feelings are
so easily hurt—I really could not stand that sort of thing. It
would make me quite ill! And the idea of sending for Dr.
Ingram! Evelyn merely wants a good night's rest—and as
for the General, I suppose he is just overdone, and can't get
along fast. What else is to be expected, if people will be so
crazy? And to send for Dr. Ingram, because Jean orders it!
Ridiculous! I can't endure Dr. Ingram for my part."

Sybella had developed this dislike gradually: and no doubt,


at the foundation of it, lay his relationship to the Trevelyans.

Then, as Mrs. Stowe returned.

"What is all the fuss about, Stowe? And why is the General
so long? I suppose he found shelter somewhere, but he
ought to be here by this time. And Mr. Trevelyan—how he
could allow Mrs. Villiers to take that walk, with only Miss
Trevelyan—no proper protection?"

Sybella's flow of remarks was cut short. Mr. Trevelyan's
voice was heard outside, speaking in subdued accents; and
into the lighted hall was brought a silent presence, before
which even Sybella's volubility failed.

For Evelyn was indeed a widow!

Not till midnight, when Dr. Ingram had departed, and when
Evelyn was asleep under the influence of a semi-opiate, did
Jean venture to leave her, and to steal downstairs. She
believed that her father was there; but what might be the
next step for either of them, Jean could not so much as
conjecture. All she knew was that she herself could do and
bear no more.

Mr. Trevelyan stood below, in the hall, as if at that moment
expecting her. He had had a little warning from Dr. Ingram
to "look after Jean!"

And he had also gone through a small passage-of-arms with
Miss Devereux, wherein of course, since he was a
gentleman, the lady had had the last word. Sybella felt it to
be her duty—her positive duty—to circumvent the
machinations of these pushing Trevelyans, and to protect
her dear niece from falling hopelessly into their clutches.

She did not exactly say as much to Mr. Trevelyan, but she
looked it every inch; and there was no mistaking what she
meant, as she professed an eager desire not to be a burden
on Mr. Trevelyan's time—he was always so busy—so much
to do—and she, of course, a single lady, with so few ties—
what more natural than that she should remain at the Park,
and devote herself to her poor niece?

Yes, she would stay over the night, of course—oh, certainly
—and as many nights as her dear niece might require her.
Impossible to leave the young widow alone! Could Mr.
Trevelyan think it of her? Oh, quite impossible! Would Mr.
Trevelyan and Jean like to make use of her carriage to
convey them home? It was so late, and of course they were
fatigued. Grimshaw would think nothing—oh, nothing at all
—of that little extra round on his way to the Brow. So easily
managed! And really, the sooner the house was quiet for
her beloved niece—though none of them could ever forget
the trouble to which Mr. Trevelyan had put himself—still, at
such a time, complete quiet was so very essential—

Mr. Trevelyan bowed assent. He did not wear an attractive
expression at the moment. His bow was most gentlemanly,
but a sardonic sneer lurked in the corners of his mouth, and
his eyes scanned Miss Devereux, as they might have
scanned some uncommon specimen of worm or beetle kind,
from an ineffably superior intellectual height.

Sybella felt the contempt without understanding it, and she
was irritated.

The passage-at-arms ended as she wished. The Trevelyans
would go home that night, and would not even use her
offered carriage—which in itself was a relief, since she stood
greatly in awe of what the stable autocrat, Grimshaw, might
say. But although she had her will, although she was to be
left in undisturbed possession of the field, Sybella was not
satisfied. She could never delude herself into thinking that
she had the mastery of Mr. Trevelyan's iron will. He yielded:
yet if he had chosen not to yield, she could not have made
him.

When he stood waiting in the hall for Jean, he looked
precisely as usual: upright, composed, grim. Not a hair was
disorganised: not a muscle was disturbed. A close observer
might perhaps have noted a slight softening of expression,
as he studied his daughter.

"Where's your hat, child?"

"Are we to walk home?" For the first time within Jean's
recollection, the two miles to be traversed loomed before
her imagination as a gigantic impossibility.

"No," in a suppressed voice. "Ingram undertook to send a
fly, and it is here now. If Miss Devereux were not going to
stay—" and a pause.

"She doesn't want us. But poor Evelyn!"

"Mrs. Villiers will send when she wants you. We can't force
ourselves, even for her sake. Where's your ulster? To be
sure—it went to be dried."

A touch of the bell brought Walters, carrying the ulster. "I
did hope you'd both have stayed over the night, sir," he
murmured, as he helped Jean to put it on.

"No—I think not. Miss Trevelyan has done enough. She will
look round in the morning."

"Mrs. Villiers is asleep now," Jean said kindly to the man.

Mr. Trevelyan stopped to fasten some of Jean's buttons;
then drew her hand within his arm. "Come, we must be off,"
he said. "Mind, Walters—anything we can do for Mrs. Villiers
—"

"Yes, sir—I understand—thank you, sir."

The drive home was altogether silent. Jean could not trust
herself to speak. She had eaten almost nothing since one
o'clock, and the long strain was making itself felt.

"I sent word to your aunt that, if we came at all, we should
be late—that she must not stay up, but might leave a good
fire in the study," remarked Mr. Trevelyan, as they stopped
at the Rectory. "And—tea. I thought you would rather have
something here than at the Park. Walters would have got
anything that I wished—but—jump out!"

Jean was past jumping. She descended somehow, and
made her way to the study, where indeed a cheery fire
blazed, and tea-things were outspread. Madame Collier's
voice over the stairs kept Mr. Trevelyan back; and Jean
could hear an exchange of low-voiced communications.

There was an exclamation or two in Madame Collier's voice,
and then—

"On the marshes!—In the snow!—Too late!—All over!"—at
intervals from her father.

Jean stood over the fire, feeling strangely. It had been such
a terrible day. Only ten hours since she had quitted the
Rectory, light-hearted and joyous—and all this to have come
since! She felt as if ten weeks might have passed over her
head. A vision rose before Jean of the General's tall figure
and kind face, as he had come into his wife's boudoir; and
then of the same, lying stark and cold in the white snow;
and then of Evelyn's desolate misery; and a suffocating
lump rose in her throat.

"Aunt Marie will see you presently, but I can't have talk
to-night. You must go to bed as soon as you have had
something to eat," said Mr. Trevelyan, entering.

He poked the fire carefully, arranged a bed of hot coals with
deft fingers, and placed the kettle thereon.

"It will boil directly. Sandwiches—that's right. Sit down,
Jean."

He pushed a chair towards her, and she obeyed, with a
despairing sense of having come to the end of everything.
Thus far she had kept up with marvellous courage for a girl
of sixteen; but some measure of reaction was almost
inevitable.

"Jean, my dear," said Mr. Trevelyan, looking at her. Then—

"Poor little girl!" came in a tone which she had never heard
from him before.

He had been strongly stirred, and the underlying tenderness
of the man for once pushed its way to the surface.

To Jean's utter amazement, she found herself sobbing, with
her face on his shoulder, and one of his arms round her. Not
only so, but as the paroxysm continued, he held her more
tightly, and she heard him say—

"Never mind; don't be ashamed. You have done splendidly!
—Like my own Jean!"

"O father!—If I could help it—"

"You can't just yet. Never mind. You won't be the worse for
this."

Presently after a judicious pause—

"Now! Have you cried enough? I must make the tea."

Jean struggled manfully, and chained down the rising sobs;
but she clung to him still, drawing long breaths of mingled
pain and comfort. To her renewed amazement, his lips
touched her brow with a light kiss.

"That's right. I am proud of my girl. Now sit up, and be
brave. No, don't stand. I'll be tea-maker for once. You want
something to eat."

"Oh, I can't!"

"You must. It will do nobody any particular good for you to
starve yourself."

The essential common-sense of the remark was so like what
she herself might have said to another, that Jean almost
smiled. She was placed by Mr. Trevelyan in his deep
arm-chair, made to lean back, and supplied with necessaries—
nay, finding how she trembled, he even held the full cup to
her lips. Though the first few mouthfuls threatened to choke
her, a different state of things speedily followed. The inward
shuddering grew less; and she was at length able to say
with some degree of composure—

"Father, you don't think Evelyn will miss me when she
wakes?"

"I don't know, my dear," he answered, too truthful to deny
the possibility. "I only know that no choice was left to us.
Miss Devereux has the rights of kinship; and we have only
the rights of friendship. After all, the matter is in Mrs.
Villiers' own hands. If she chooses, she can dismiss Miss
Devereux and send for you."

"And if she does—"

"Then she shall have you."
