Geographic Data Analysis Using R, by Xindong He

The document is a comprehensive guide to geographic data analysis using the R programming language, authored by Xindong He. It covers various statistical techniques and methods for analyzing geographic data, including descriptive statistics, correlation analysis, linear regression, and geographic network analysis, among others. The book aims to equip geography students with essential skills in R for effective data analysis in the context of environmental and geographical research.

Geographic Data Analysis Using R
Xindong He
College of Geography and Planning
Chengdu University of Technology
Chengdu, Sichuan, China

ISBN 978-981-97-4021-5
ISBN 978-981-97-4022-2 (eBook)
https://doi.org/10.1007/978-981-97-4022-2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore

If disposing of this product, please recycle the paper.


Preface

We live in an era dominated by geographic data and information services, where
global climate change presents diverse challenges to research in sustainable
development. To mitigate and adapt to these changes, an active response from
geographic research is essential. Achieving this goal demands extensive geographic
data encompassing a wide range of environmental elements, such as geology,
topography, water, soil, biology, climate, economy, society, and culture. It also
requires a wealth of longitudinal observational data, including satellite remote
sensing, environmental monitoring, and large-scale mobile and location service
data. However, our objective extends beyond mere data collection; it involves
mining information pertinent to geographic research from this vast body of data.
This task necessitates the use of suitable analytical tools.
Renowned for its statistical computing prowess, the R language is one of the
most widely used platforms for data analysis. For geography students, gaining a
foundational understanding of the R language and developing skills in geographic
data analysis are crucial for keeping up with the rapid advancements in data science. This
need is why I frequently advise students to integrate R into their academic
pursuits.
Traditionally, students majoring in geography have focused significantly on
learning and practicing various analysis methods. Excel and SPSS were the primary
tools in education for data processing and analysis. However, with the advent of big
data, artificial intelligence, and data science, R has emerged as a superior alternative
due to its more advanced capabilities.
I personally favor the R language for analyzing geographic data because it appears
more ‘sophisticated’ than Excel. I believe learning R greatly benefits students,
as it challenges them to deeply engage with data analysis tasks and methodically
understand and apply measurement methods in ‘Quantitative Geography’. Many
students recognize the value of mastering R. However, the complexity of topics like
advanced mathematics, linear algebra, and probability and statistics can make them
hesitant to delve into this statistical analysis language. Therefore, a tutorial filled
with practical cases, enabling progressive practice and discovery in learning R, is
essential for students’ academic development in geographic analysis methods and
proficiency in R.
This book is written primarily from a geographer’s perspective. The following is
a brief chapter summary:
Chapter 1: It focuses on the introduction to geographic data and the basics of the
R language.
Chapter 2: It covers descriptive statistical analysis in R of the 2020 annual
average temperature data from 837 surface meteorological stations across China,
including measures such as the mean, median, mode, and standard deviation, along
with shape measures such as skewness and kurtosis. It also compares histograms
and bar charts, using 2021 population data from various provinces and cities in
China.
Chapter 3: It focuses on the correlation analysis of observation data from 837
surface meteorological stations in China in 2020. Both the cor() and cor.test()
functions, included with base R, are employed to meet our analytical requirements
effectively. The rcorr() function is employed to derive the p-value matrix of
the correlation matrix. Additionally, the corr.test() function is utilized to rapidly
generate a pairwise correlation matrix for an entire dataset, complete with p-values
and confidence intervals.
Chapter 4: It demonstrates the process of conducting linear regression analysis
using R. Two distinct regression models are developed using data from 837 surface
meteorological stations across China in 2020. The first model examines the
relationship between air pressure (Prs) and altitude (Alt). The second, more complex
model uses altitude (Alt), air pressure (Prs), temperature (Tem), longitude (Lon),
and latitude (Lat) as independent variables to predict precipitation (Pre). Each model
undergoes thorough verification.
Chapter 5: It presents a comprehensive guide to conducting Geographically
Weighted Regression (GWR) analysis using R. Data from 837 surface meteorological
stations across mainland China are utilized to develop a GWR model, analyzing
precipitation in relation to temperature and pressure. The insights obtained from
statistical tests and the visual interpretation of generated maps indicate that a local
model is more suitable for capturing the spatially varying relationships inherent in
the data.
Chapter 6: It introduces the processing and analysis of time series data, focusing on
two fundamental forecasting models. These concepts are illustrated through
the analysis of ten years of daily temperature data from a surface meteorological
station in Henan, China. The Holt–Winters and ARIMA models are employed to
predict future temperature trends.
Chapter 7: It provides a detailed introduction to clustering analysis
using elevation and annual average temperature data from surface stations in 2020.
The k-means, k-medoids, and agglomerative hierarchical clustering (AHC) methods
are utilized, with the corresponding R code and maps of the clustering results
presented. The significant role of clustering analysis methods in supporting
geographical delineation is also demonstrated.
Chapter 8: Principal component analysis (PCA) is a frequently used method in
geographic data analysis. Especially with the large-scale application of geographic
big data and hyperspectral remote sensing, data dimensionality reduction supported
by the PCA approach has become one of the key steps. In this chapter, PCA and cluster
analysis are identified as key exploratory methods in the regionalization of
temperatures in China, emphasizing the importance of understanding and interpreting
their results from a geographic research perspective. Although PCA is effective in
reducing dimensionality and enhancing the effectiveness of clustering, its application
should not be considered universal. Direct clustering may be more appropriate when the
original features have clear interpretive value or significance.
Chapter 9: In the study of land use and its changes, the Markov chain is a
commonly used predictive analysis method. In this chapter, the land use changes
between 2005 and 2015 in a specific area in China are analyzed. With 2015 as the base
year, the Markov chain method is employed to predict land use changes over the
subsequent ten years.
Chapter 10: The steps for conducting geographic network analysis, such as
analyzing a road network in R, are outlined, including the associated code. As an
example, the shortest path from Chengdu University of Technology (CDUT) to the
nearest fire station in downtown Chengdu is calculated.
Chapter 11: It demonstrates how to apply inverse distance weighted (IDW), Ordinary
Kriging, proximity polygons, and nearest neighbor interpolation methods in R, using
mean annual temperatures from 837 meteorological stations in China for 2020. Although
Ordinary Kriging offers the most precise and smoothest interpolation results, it is
complex to calculate and requires an appropriate semivariogram model. The proximity
polygons method boasts a long history. The IDW method is straightforward but sensitive
to local extremes. The nearest neighbor method is relatively simple to use.
Notable features of this book are highlighted as follows:
• The book serves as an invaluable resource for geography students engaged in
studying quantitative analysis.
• Geographers seeking to analyze geographic data quantitatively will find this book
particularly useful.
• The organization of the content reflects the author’s experience and the complexity
of the methods frequently employed in recent years for conducting quantitative
geographic data analysis.
• Detailed explanations are provided for the analysis results of the main methods
and functions in R.
• A comprehensive list of the principal functions and packages utilized in this book
is included.
Xindong He, Ph.D.
Associate Professor, Tenured
College of Geography and Planning
Chengdu University of Technology
Chengdu, Sichuan, China
Contents

1 Introduction to Geographic Data and R
1.1 Geographic Data
1.1.1 Spatial Data and Attribute Data
1.1.2 Quantitative and Qualitative Data
1.2 R
1.2.1 Installing R
1.2.2 Updating R
1.2.3 Using RStudio
1.3 Geographic Data Analysis and R
1.4 Data
References
2 Descriptive Analysis of Geographic Data
2.1 An Overview of Descriptive Statistical Analysis
2.2 Preparing Data
2.2.1 Loading Required Packages
2.2.2 Reading and Displaying Data
2.3 Measures of Central Tendencies
2.3.1 Mean
2.3.2 Median
2.3.3 Mode
2.4 Measures of Variability
2.4.1 Range
2.4.2 Variance
2.4.3 Standard Deviation
2.4.4 Coefficient of Variation
2.5 Measures of Shape
2.5.1 Skewness
2.5.2 Kurtosis
2.6 Comprehensive Description
2.6.1 Using describe() Function
2.6.2 Using summary() Function
2.7 Visualization
2.7.1 Scatterplot Chart
2.7.2 Histogram
2.7.3 Bar
Reference
3 Correlation Analysis
3.1 An Overview of Correlation
3.2 Calculating the Correlation
3.2.1 Calculating the Correlation Between Two Variables
3.2.2 Calculating the Correlation Coefficients of a Vector
3.3 Visualizing Correlation Coefficients Matrix
3.3.1 Plotting with corrplot() Function
3.3.2 Plotting with chart.Correlation() Function
References
4 Linear Regression Analysis
4.1 An Overview of Linear Regression
4.2 Simple Linear Regression
4.2.1 Reading and Plotting Data
4.2.2 Building a Model
4.2.3 Analyze the Model
4.2.4 Analyze the Residuals
4.3 Multiple Linear Regression
4.3.1 Examining Bivariate Relationships
4.3.2 Multiple Linear Regression Model
4.3.3 Regression Diagnostics
4.3.4 Improvement Measures
4.3.5 Choose the Best Model
References
5 Geographically Weighted Regression Analysis
5.1 An Overview of Geographically Weighted Regression
5.2 Preprocessing Data
5.2.1 Importing the Datasets
5.2.2 Transforming the Projection
5.2.3 Visualizing the Datasets
5.3 Spatial Autocorrelation Analysis
5.4 Performing GWR
5.4.1 Calculating Bandwidth
5.4.2 Fit and Summarize the GWR Model
5.4.3 Diagnostics and Model Checking
References
6 Time Series Analysis
6.1 An Overview of Time Series
6.2 Preprocessing Data
6.2.1 Reading Data
6.2.2 Convert the Data to a Time Series
6.2.3 Descriptive Statistics Summary
6.3 Identify the Time Series Pattern
6.3.1 Viewing Trends
6.3.2 Removing Seasonality
6.3.3 Seasonal Statistics
6.3.4 Stationarity Test
6.3.5 ACF and PACF Test
6.3.6 White Noise Test
6.3.7 Time Series Decomposition
6.4 Time Series Modeling
6.4.1 Holt-Winters Forecasts
6.4.2 ARIMA Forecasts
References
7 Cluster Analysis
7.1 An Overview of Cluster Analysis
7.2 Partitioning Methods
7.2.1 k-Means
7.2.2 k-Medoids
7.3 Hierarchical Methods
7.3.1 Agglomerative Hierarchical Clustering (AHC)
7.3.2 Divisive Hierarchical Clustering
7.4 The Other Methods
References
8 Principal Component Analysis (PCA)
8.1 An Overview of Principal Component Analysis
8.2 Performing PCA
8.2.1 Reading and Plotting Data
8.2.2 Conducting PCA
8.2.3 View the Results
8.2.4 Visualizing the Results
8.3 Further Analysis
8.3.1 Performing Clustering Analysis
8.3.2 Visualize the Results of Clustering
8.3.3 Evaluating Clustering Quality
8.3.4 Mapping the Cluster Results
References
9 Markov Chain Analysis
9.1 An Overview of Markov Chain
9.2 Reading and Plotting the Data
9.2.1 Reading Land-Use Raster Data
9.2.2 Visualizing Land-Use Raster Data
9.3 Performing Markov Chain Analysis
9.3.1 Checking the Resolution and Extent
9.3.2 Calculating Transition Probabilities
9.3.3 Creating a Markov Chain Object
9.3.4 Forecasting Future Land-Use Changes
9.3.5 Visualizing the Future Land-Use Raster Data
9.4 Further Analysis
9.4.1 Land-Use Changes From 2005 to 2015
9.4.2 Land-Use Changes From 2015 to 2025
References
10 Geographic Network Analysis
10.1 An Overview of Geographic Network Analysis
10.2 Reading and Plotting Spatial Data
10.2.1 Reading Spatial Data
10.2.2 Visualizing Spatial Data
10.3 Performing Geographic Network Analysis
10.3.1 Creating a Unique Index for Each Edge
10.3.2 Creating Nodes for Each Edge
10.3.3 Creating a Unique Index for Each Node
10.3.4 Merging Node Index with Edges
10.3.5 Deleting Duplicate Nodes
10.3.6 Creating a Graph (Network) Object
10.3.7 Combining the Functions in sf and Tidygraph
10.3.8 Plotting the Network
10.4 Measuring the Centrality of the Network
10.5 Shortest Paths
10.6 Calculate the Shortest Path Between Fire Station and CDUT
References
11 Spatial Interpolation
11.1 An Overview of Spatial Interpolation
11.2 Reading and Plotting Spatial Data
11.3 Inverse Distance Weighted Interpolation (IDW)
11.4 Ordinary Kriging Interpolation (OK)
11.4.1 Sample Variogram
11.4.2 Variogram Model Fitting
11.5 Proximity Polygons Interpolation
11.5.1 Using the voronoi() Function
11.5.2 Using the st_voronoi() Function
11.5.3 Rasterizing the Voronoi Diagram
11.6 Nearest Neighbour Interpolation (NN)
References
Index
About the Author

Xindong He serves as an Associate Professor at the College of Geography and
Planning, Chengdu University of Technology, Chengdu, China. He earned a Ph.D. in
architecture with a focus on urban and rural planning from Tsinghua University, a Master’s
degree in physical geography from the University of Chinese Academy of Sciences
(UCAS), and a bachelor’s degree in geography from Ludong University. His primary
research interests include the application of remote sensing, GIS, and quantitative
methods to decision support, and spatial and regional planning. With more than 18
years of experience in scientific research and teaching, he has authored over 20
articles in both Chinese and international journals and contributed to the authoring
of five monographs and two textbooks.
e-mail: hexindong09@cdut.edu.cn
List of Figures

Fig. 1.1 Update R in windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4


Fig. 2.1 Scatter plot of air pressure and altitude at each station
in 2020 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Fig. 2.2 Histogram of temperature in 2020 . . . . . . . . . . . . . . . . . . . . . . . . 17
Fig. 2.3 Histogram of temperature with density and normal curves . . . . 18
Fig. 2.4 Histogram of precipitation in 2020 . . . . . . . . . . . . . . . . . . . . . . . 18
Fig. 2.5 Histogram of precipitation with density and normal curves . . . . 19
Fig. 2.6 Histogram of air pressure in 2020 . . . . . . . . . . . . . . . . . . . . . . . . 20
Fig. 2.7 Histogram of air pressure with density and normal curves . . . . 21
Fig. 2.8 The barplot of 2021 population data for some provinces
and cities in China . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Fig. 2.9 The customized barplot of 2021 population data of some
provinces and cities in China . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Fig. 3.1 The scatterplot for altitude and air pressure of surface
stations in 2020 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Fig. 3.2 The scatterplot for altitude and precipitation of surface
stations in 2020 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Fig. 3.3 The scatterplot for altitude and temperature of surface
stations in 2020 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Fig. 3.4 Q-Q plot of altitude . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Fig. 3.5 Q-Q plot of air pressure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Fig. 3.6 Q-Q plot of precipitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Fig. 3.7 Q-Q plot of temperature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Fig. 3.8 The correlation coefficients matrix plot . . . . . . . . . . . . . . . . . . . . 43
Fig. 3.9 The scatterplot matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Fig. 4.1 The scatterplot chart of 837 surface stations in 2020, China . . . . 48
Fig. 4.2 Scatter plot of air pressure and altitude . . . . . . . . . . . . . . . . . . . . 49
Fig. 4.3 The plot of residuals for lm.reg . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Fig. 4.4 A diagnostic test plot for lm.reg . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Fig. 4.5 Scatter plot matrix of the meteorological data . . . . . . . . . . . . . . 54

Fig. 4.6 A diagnostic test plot for the multiple linear regression
model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Fig. 4.7 Q-Q plot for mfit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Fig. 4.8 The Component+Residual Plots for the model . . . . . . . . . . . . . . 60
Fig. 4.9 Q-Q plot for transformed mfit . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Fig. 4.10 Q-Q plots for three models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Fig. 4.11 Residuals analysis of three models . . . . . . . . . . . . . . . . . . . . . . . 69
Fig. 4.12 All-possible-regressions plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Fig. 4.13 Cp plot for all-possible-regressions . . . . . . . . . . . . . . . . . . . . . . . 77
Fig. 5.1 The precipitation map of 837 ground stations
across China, 2020 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Fig. 5.2 The plot of standardised residuals against the fitted values . . . . . 85
Fig. 5.3 The map of the spatial pattern of the regression residuals . . . . . 86
Fig. 5.4 The map of “Local Moran’s I” for residuals using
k-nearest neighbors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Fig. 5.5 Moran scatterplot for the Pre . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Fig. 5.6 The GWR coefficients plot of Tem . . . . . . . . . . . . . . . . . . . . . . . 96
Fig. 5.7 The GWR coefficients plot of Prs . . . . . . . . . . . . . . . . . . . . . . . . 97
Fig. 5.8 The localR2 map of gwr.model . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Fig. 5.9 The t-values map of Tem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Fig. 5.10 The t-values map of Prs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Fig. 6.1 Time series plot of minimum and maximum daily
temperatures in a China station . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Fig. 6.2 Daily temperatures in a China station from 2010–2019 . . . . . . . 109
Fig. 6.3 Daily temperatures in a China station from 2010–2019 . . . . . . . 110
Fig. 6.4 30-day window for max temperatures . . . . . . . . . . . . . . . . . . . . . 111
Fig. 6.5 365-day window for max temperatures . . . . . . . . . . . . . . . . . . . . 112
Fig. 6.6 365 lagged differences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Fig. 6.7 The plot of differencing TMX . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Fig. 6.8 The plot of differencing TMN . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Fig. 6.9 ACF and PACF plots for the first-differenced TMX
and TMN series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Fig. 6.10 Time series decomposition for daily maximum
temperatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Fig. 6.11 Time series decomposition for daily minimum
temperatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
Fig. 6.12 Holt-Winters forecast of monthly maximum temperatures
at a Chinese station, 2020–2025 . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Fig. 6.13 Holt-Winters forecast of monthly max temperatures
at a Chinese station, 2020–2080 . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Fig. 6.14 The Arima model forecast of monthly maximum
temperatures in a China station, 2020–2025 . . . . . . . . . . . . . . . . 131
Fig. 6.15 The residuals plot of ARIMA model . . . . . . . . . . . . . . . . . . . . . . 132
Fig. 6.16 The plot of ACF and PACF of Residuals . . . . . . . . . . . . . . . . . . . 133
Fig. 7.1 The scatter plot of the K-Means clustering analysis results . . . . 139
Fig. 7.2 The 2D plots of the K-Means clustering results . . . . . . . . . . . . . 140
Fig. 7.3 K-Means cluster analysis map . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Fig. 7.4 The scatter plot of the K-Medoids clustering results . . . . . . . . . 143
Fig. 7.5 K-Medoids clustering results 2D plots . . . . . . . . . . . . . . . . . . . . 144
Fig. 7.6 K-Medoids clustering map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Fig. 7.7 The tree diagram of the AHC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Fig. 7.8 The plots of the AHC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Fig. 7.9 The 2D cluster plots of the AHC . . . . . . . . . . . . . . . . . . . . . . . . . 150
Fig. 7.10 AHC map of ground observation stations . . . . . . . . . . . . . . . . . . 151
Fig. 8.1 The scree plot of the AHC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
Fig. 8.2 2D scatter plot of the PCA results . . . . . . . . . . . . . . . . . . . . . . . . 160
Fig. 8.3 3D scatter plot of the PCA results . . . . . . . . . . . . . . . . . . . . . . . . 160
Fig. 8.4 The scatter plot of K-means cluster for PCA . . . . . . . . . . . . . . . 162
Fig. 8.5 The cluster results map of ground observation stations . . . . . . . 164
Fig. 9.1 Land-use map for the year 2005 . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Fig. 9.2 Land-use map for the year 2015 . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Fig. 9.3 Predicted land-use map for the future . . . . . . . . . . . . . . . . . . . . . 176
Fig. 10.1 The road network and firestations in central urban area,
Chengdu, China . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
Fig. 10.2 The plot of network in central urban area, Chengdu, China . . . . 187
Fig. 10.3 The centrality plot of the network nodes . . . . . . . . . . . . . . . . . . . 189
Fig. 10.4 The betweenness plot of the network nodes . . . . . . . . . . . . . . . . 190
Fig. 10.5 The betweenness plot of the network edges . . . . . . . . . . . . . . . . 191
Fig. 10.6 The plot of the shortest path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
Fig. 10.7 The plot of the two points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Fig. 10.8 The plot of the shortest path between the two points . . . . . . . . . 197
Fig. 11.1 The mean temperatures plot of 837 meteorological
observation stations in China, in 2020 . . . . . . . . . . . . . . . . . . . . . 201
Fig. 11.2 The IDW interpolation for temperatures over China,
in 2020 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Fig. 11.3 Sample variogram plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Fig. 11.4 Sample variogram (circles) with gaussian models fitted
using weighted least squares (solid line) . . . . . . . . . . . . . . . . . . . 208
Fig. 11.5 Sample variogram (circles) with spherical models fitted
using weighted least squares (solid line) . . . . . . . . . . . . . . . . . . . 209
Fig. 11.6 Sample variogram (circles) with exponential models
fitted using weighted least squares (solid line) . . . . . . . . . . . . . . 210
Fig. 11.7 Kriged Annual Mean Temperatures over main China . . . . . . . . 211
Fig. 11.8 Thiessen polygons for ground meteorological observation
sites using voronoi() function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Fig. 11.9 Thiessen polygons for mean annual temperatures
over main China using plot() function . . . . . . . . . . . . . . . . . . . . . 213
Fig. 11.10 Thiessen polygons for mean annual temperatures
over main China using ggplot . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Fig. 11.11 Thiessen polygons for ground meteorological observation
sites using st_voronoi() function . . . . . . . . . . . . . . . . . . . . . . . . . 217
Fig. 11.12 Thiessen polygons for ground meteorological observation
sites in mainland China . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Fig. 11.13 Plotting the rasterized temperatures over main China . . . . . . . . 219
Fig. 11.14 Plotting the NN interpolation result . . . . . . . . . . . . . . . . . . . . . . . 221
Chapter 1
Introduction to Geographic Data and R

This chapter provides an introduction to geographic data and the fundamentals of
the R language, including the classification of geographic data and the installation
of R and RStudio. Additionally, it details the sources of the data used in this book.

1.1 Geographic Data

Geographic data is defined as data that describes geographic entities, events, pro-
cesses, and phenomena related to specific Earth locations (ISO/TC 211 2015). The
sources of geographic data are varied (O’Brien 2005), including personal question-
naires, field surveys, experimental observations, government-published data, satel-
lite remote sensing data, and big data from the Internet and location services. Data
linked to a specific geographic location and space qualifies as geographic data. Geo-
graphic data symbolically represents the relationship between various geographical
features and phenomena, encompassing spatial location, attribute features, and tem-
poral features (Chen et al. 2000). The emergence and widespread use of Geographic
Information System (GIS) in geography have enabled the perfect integration of spa-
tial and attribute data at the geographic data level (Xu 2014).

1.1.1 Spatial Data and Attribute Data

Spatial data describes the connections between geographic entities, events,
processes, and phenomena within specific locations, regions, and spaces. In GIS, data
primarily comes in two forms: raster and vector. Raster geographic data is composed
of a matrix of cells, each containing spatial reference information (Dorman 2014).
Each cell in raster data holds a value that represents information about geographic
entities, like elevation or land use/cover types. This data can include thematic data,
scanned maps, digital aerial photographs, satellite images, and digital pictures.
In contrast, vector geographic data consists of ordered geographic or plane coordinates
(x, y), representing discrete points of geographic entities, phenomena, and events.
Recording these coordinates allows the spatial distribution and relationships of these
geographic entities, phenomena, and events on Earth’s surface to be represented
through points, lines, and polygons.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
X. He, Geographic Data Analysis Using R,
https://doi.org/10.1007/978-981-97-4022-2_1

Attribute data in geography, closely linked to spatial data, is used to describe
the quantitative or qualitative characteristics of geographic entities. This includes
features like elevation, slope, temperature, precipitation, forest cover, regional pop-
ulation, gross domestic product (GDP), and land use/cover classes. There are two
main types of attribute data: quantitative and qualitative (Zhang et al. 1991).
Integrating spatial and attribute data enables the description of complex geo-
graphic entities, phenomena, events, and processes.
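As a minimal sketch of the two data models described above, the base R snippet below stores a tiny raster as a matrix of cell values and a vector feature as ordered (x, y) coordinates with an attribute column. All values, coordinates, and object names are invented for illustration and are not part of the book's datasets; real spatial data would normally be handled with dedicated spatial packages.

```r
# Hypothetical 3 x 4 raster: each cell holds one elevation value (metres)
elev <- matrix(c(512, 530, 547, 560,
                 498, 515, 533, 549,
                 480, 501, 520, 538),
               nrow = 3, byrow = TRUE)
dim(elev)     # number of rows and columns of cells
elev[2, 3]    # value stored in the cell at row 2, column 3

# Hypothetical vector data: ordered (x, y) coordinates of three points,
# with one attribute (temperature) attached to each point
stations <- data.frame(
  id  = c("A", "B", "C"),
  lon = c(104.07, 103.98, 104.15),
  lat = c(30.67, 30.58, 30.72),
  tem = c(16.2, 16.5, 15.9)
)

# The same ordered coordinates can also be read as a line (A -> B -> C)
line_xy <- as.matrix(stations[, c("lon", "lat")])
nrow(line_xy)
```

The matrix carries the spatial layout implicitly through its row and column positions, whereas the data frame carries it explicitly through coordinate columns, which is the essential difference between the raster and vector forms.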

1.1.2 Quantitative and Qualitative Data

Quantitative data, which can be observed and measured in terms of values or counts,
is numerically expressed to indicate “how many”, “how much”, or “how often” (Burt
et al. 2009). Examples include population, rainfall, temperature, and distance.
Qualitative data can be observed but not measured numerically. It represents
types of geographic entities or phenomena and can be categorized using names,
symbols, or codes. These categories, such as land use/land cover, gender, and job
occupation, are distinct and non-overlapping (Fotheringham et al. 2000). Despite the
lack of numerical values in qualitative data, quantitative methods can still be applied
for analysis (Young 1981; Bernard 1996; Mumford et al. 2008; Hetenyi et al. 2019).
In the era of big data, the reliance on professional processing tools for geographic
data has increased significantly. Owing to its simplicity and user-friendliness, the R
language has gained prominence in processing geographic data, making it a vital
tool for geography students and researchers.
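The distinction between quantitative and qualitative data maps naturally onto R's numeric vectors and factors. The short sketch below uses invented rainfall values and land-use classes (both hypothetical) and shows how counting categories yields a simple quantitative summary of qualitative data.

```r
# Quantitative attribute: measured values (hypothetical annual rainfall, mm)
rainfall <- c(843, 1280, 584, 2292, 656)
mean(rainfall)

# Qualitative attribute: land-use classes stored as a factor (hypothetical)
landuse <- factor(c("forest", "cropland", "forest", "urban", "forest"))
levels(landuse)        # the distinct, non-overlapping categories

# Counting categories is one quantitative summary of qualitative data
counts <- table(landuse)
counts
prop.table(counts)     # share of each class
```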

1.2 R

R is a language and environment specifically developed for statistical computing,
data analysis, and visualization. It encompasses a broad range of statistical and
graphical techniques, including some of the latest advancements. A key feature of
R is its open-source and flexible nature, allowing for easy extension with packages
and the addition of new functionality through custom functions. R is cross-platform,
capable of compiling and running on Windows, macOS, and Linux (R Core Team
2022). R is free to use and offers an interactive interface, which is particularly
beneficial for university scholars and students. Students need to learn to handle
geographic data through interactive exploration, processing, visualization,
understanding, and interpretation (Kabacoff 2015).

1.2.1 Installing R

R can be freely downloaded from the Comprehensive R Archive Network (CRAN).
Visit https://cran.r-project.org and click on Mirrors to see the alternative download
sites available worldwide. Choose your preferred mirror; precompiled installers are
available for Windows, macOS, and Linux. Select the installation package suitable for
your operating system. Windows users should click "Download R for Windows".
Once the download is complete, run the installer and follow the default options.
Ensure that there are no spaces in the installation directory’s name.

1.2.2 Updating R

To keep R up to date, it should be updated regularly. Update methods vary across
operating systems. Windows users can update R by using the installr package.
This package is installed with the command install.packages("installr").
Once installed, running the command library("installr") adds an installr menu
to the RGui menu bar. Selecting Update R from this menu initiates the upgrade to
the latest version. Figure 1.1 illustrates this update process.
Mac users typically update R manually.

1.2.3 Using RStudio

Alternatively, R can be used through RStudio, an integrated development environment
(IDE) designed specifically for R that simplifies and accelerates programming in R.
Download RStudio from https://posit.co/downloads/ and follow the official
instructions for installation.
This book employs R version 4.3.1 and RStudio version 2023.12.1-402, under
which all R code has been successfully executed.
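Before running the examples, it can help to check which R version is installed. The sketch below compares the running version with the 4.3.1 release mentioned above; the variable names are illustrative, and the comparison is only a convenience check, not a requirement stated by the book.

```r
# Version the book reports using (taken from the text above)
book_version <- "4.3.1"

# getRversion() returns the running R version as a comparable object
current <- getRversion()
print(R.version.string)

if (current < book_version) {
  message("Installed R (", as.character(current), ") is older than ",
          book_version, "; consider updating before running the examples.")
} else {
  message("Installed R (", as.character(current), ") is at least ",
          book_version, ".")
}
```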
Fig. 1.1 Update R in Windows

1.3 Geographic Data Analysis and R

The R language offers a comprehensive suite of tools for processing and analyzing
geographic data. As big data and artificial intelligence are applied in geography
and GIS, and as geographic data science develops, spreadsheet tools such as Excel
struggle with large or complex data. In contrast, R is well suited for such tasks,
leading to a growing demand for R language applications. Amidst these technological
advancements, students and technicians accustomed to traditional desktop GIS
tools like ArcGIS and QGIS are seeking more powerful programming languages
and command-line methods for processing, analyzing, and visualizing geographic
data. The R language effectively meets these needs. For those proficient in it, R’s
command-line approach can be more efficient and faster than using graphical interfaces.

1.4 Data

The 2020 observation data from 837 meteorological surface stations in China orig-
inates from the “Daily Meteorological Dataset of Basic Meteorological Elements
of China National Surface Weather Stations” product. This dataset, encompassing
multi-year observations from over 2,000 stations nationwide, is published by the
China Meteorological Data Service Centre (National Meteorological Information
Center 2020). Following the author’s site selection and data aggregation process,
the dataset utilized in this book was compiled. The temperature observation data
for time series analysis, covering 2010 to 2019 from a meteorological station in
Henan Province, China, also derives from this dataset. The 2021 population data
for each Chinese province is sourced from the National Bureau of Statistics website
(National Bureau of Statistics 2021). Digital elevation model (DEM) data are available
from the CGIAR-CSI SRTM 90m Database (Jarvis et al. 2008). Land use and vector data of provinces in
China were obtained from the Resource and Environment Science and Data Center
in the Institute of Geographic Sciences and Natural Resources Research, Chinese
Academy of Sciences (XU et al. 2018; Xu 2022). China map vector data is pro-
vided by Northeast Asia Resource and Environment Big Data Center in Northeast
Institute of Geography and Agroecology, Chinese Academy of Sciences (Northeast
Asia Resource and Environment Big Data Center 2020). In Chap. 10, “Geographic
Network Analysis,” the road network data are sourced from OpenStreetMap (Open-
StreetMap contributors 2023), and the Chengdu fire stations are derived from Gaode
Map POI data (Amap 2023).

References

Amap. 2023. Chengdu Fire Station POI Data. Available from Amap - Data extracted from Amap
services. https://www.amap.com.
Bernard, H Russell. 1996. Qualitative Data, Quantitative Analysis. CAM Journal 8 (1): 9–11.
Burt, James E, Gerald M Barber, and David L Rigby. 2009. Elementary Statistics for Geographers.
Guilford Press.
Chen, Shupeng, Xuejun Lu, and Chenghu Zhou. 2000. Introduction to Geographic Information
Systems. Beijing, China: Science Press.
Dorman, Michael. 2014. Learning R for Geospatial Analysis. Packt Publishing Ltd.
Fotheringham, A Stewart, Chris Brunsdon, and Martin Charlton. 2000. Quantitative Geography:
Perspectives on Spatial Data Analysis. United States: Sage.
Hetenyi, Gabor, Attila Lengyel, and Magdolna Szilasi. 2019. Quantitative Analysis of Qualitative
Data: Using Voyant Tools to Investigate the Sales-Marketing Interface. Journal of Industrial
Engineering and Management (JIEM) 12 (3): 393–404.
ISO/TC 211. 2015. ISO 19109:2015 Geographic Information — Rules for Application Schema.
ISO/TC 211 Secretariat. https://www.iso.org/obp/ui/#iso:std:iso:19109:ed-2:v1:en.
Jarvis, A., H. I. Reuter, A. D. Nelson, and E. Guevara. 2008. Hole-Filled SRTM for the Globe
Version 4. CGIAR Consortium for Spatial Information. http://srtm.csi.cgiar.org.
Kabacoff, Robert I. 2015. R in Action: Data Analysis and Graphics with R. 2nd ed. Manning
Publications.
Mumford, Michael D., Katrina E. Bedell-Avers, Samuel T. Hunter, Jazmine Espejo, Dawn
Eubanks, and Mary Shane Connelly. 2008. Violence in Ideological and Non-Ideological Groups:
A Quantitative Analysis of Qualitative Data. Journal of Applied Social Psychology 38 (6): 1521–
61.
National Bureau of Statistics. 2021. Annual Population Data by Province. National Bureau of
Statistics. https://data.stats.gov.cn/easyquery.htm?cn=E0103.
National Meteorological Information Center. 2020. Daily Meteorological Dataset of Basic Mete-
orological Elements of China National Surface Weather Station. National Meteorological Infor-
mation Center. https://data.cma.cn/data/cdcdetail/dataCode/A.0012.0001.html.
Northeast Asia Resource and Environment Big Data Center. 2020. China Map Vector Data.
http://wetland.igadc.cn.
O’Brien, Larry. 2005. Introducing Quantitative Geography: Measurement Methods and Gener-
alised Linear Models. London: Routledge.
OpenStreetMap contributors. 2023. OpenStreetMap. Data extracted from the OpenStreetMap
database, available under the Open Database License (ODbL). https://www.openstreetmap.org.
R Core Team. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria:
R Foundation for Statistical Computing.
Xu, Jianhua. 2006. Quantitative Geography. Beijing, China: High Education Press.
Xu, Xinliang. 2022. Multi-Year Provincial Administrative Division Boundaries Data of China.
Resource and Environment Science and Data Center (http://www.resdc.cn/DOI).
https://doi.org/10.12078/2023010103.
Xu, Xinliang, Jiyuan Liu, Shuwen Zhang, Rendong Li, Changzhen Yan, and Shixin
Wu. 2018. China Multi-Period Land Use Remote Sensing Monitoring Dataset (CNLUCC).
Resource and Environmental Science Data Registration and Publication System.
https://doi.org/10.12078/2018070201.
Young, Forrest W. 1981. Quantitative Analysis of Qualitative Data. Psychometrika 46 (4): 357–88.
Zhang, Chao, and Binggeng Yang. 1991. Fundamentals of Quantitative Geography, 2nd ed. Bei-
jing, China: Higher Education Press.
Chapter 2
Descriptive Analysis of Geographic Data

This chapter uses R to carry out a descriptive statistical analysis of the annual average
temperature data from 837 surface meteorological stations across China in 2020.
It covers measures of central tendency and variability, such as the mean, median,
mode, and standard deviation, along with shape measures like skewness and kurtosis.
It also compares histograms and bar plots for data plotting, using the 2021 population
data from various provinces and cities in China.

2.1 An Overview of Descriptive Statistical Analysis

Upon completing the collection of geographic data tailored to the research and
project requirements, the initial step in data analysis involves observing and analyz-
ing key characteristics of the data, including mean, median, mode, and standard
deviation. Representing and graphically displaying these data characteristics con-
stitutes descriptive statistical analysis. Descriptive statistics encompass a range of
numerical and graphical techniques for organizing, presenting, and analyzing data.
The form used to describe a variable in a sample is contingent upon the measurement
level applied (Fisher et al. 2009).

2.2 Preparing Data

To illustrate the use of R in descriptive analysis of geographic data, consider the
observation data from 837 surface meteorological stations in China in 2020 as
an example. Meteorological observation data is often stored in Comma-Separated
Values (CSV) format, also known as character-separated values. This format is
popular for storing small and medium-sized datasets due to its simplicity and versatility.
Its widespread software support makes it a common choice for data storage and
exchange. The essence of a CSV file is its simplicity: data is saved as text, line
by line, with each record divided into fields by a separator, and each record
follows the same field sequence. R can read CSV data directly with the read_csv()
function from the readr package, which returns the data as a tibble, a type of
data frame. (Base R’s read.csv() function also reads CSV files and returns an
ordinary data frame.) The simplest way to obtain the readr package is by installing
the tidyverse package.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
X. He, Geographic Data Analysis Using R,
https://doi.org/10.1007/978-981-97-4022-2_2

The recent introduction of the Tidyverse package has significantly simplified
the learning process of the R language. Tidyverse comprises a suite of R packages,
such as readr, dplyr, ggplot2, tidyr, and stringr, covering a range from data
import and preprocessing to advanced transformation, visualization, modeling, and
display.
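To make the CSV structure concrete, the sketch below writes three illustrative records to a temporary file and reads them back. The file contents and object names are hypothetical stand-ins, not the book's climate2020.csv. The base R reader returns a plain data frame; the readr reader, used only if that package happens to be installed, returns a tibble.

```r
# Write a tiny hypothetical CSV file: one record per line, comma-separated fields
csv_path <- tempfile(fileext = ".csv")
writeLines(c("site,Tem,Pre",
             "58251,16,1113",
             "58143,16,1280",
             "59673,24,2292"), csv_path)

# Base R reader: returns a plain data.frame
df <- read.csv(csv_path)
str(df)

# readr reader (only if the package is installed): returns a tibble
if (requireNamespace("readr", quietly = TRUE)) {
  tb <- readr::read_csv(csv_path, show_col_types = FALSE)
  print(class(tb))
}
```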

2.2.1 Loading Required Packages

packages <- c("tidyverse", "readr", "moments", "e1071", "psych", "dplyr", "ggplot2",
              "extrafont", "forcats", "showtext", "shadowtext")

# Install any package that cannot be loaded yet
for (pkg in packages) {
  if (!require(pkg, character.only = TRUE)) {
    install.packages(pkg, repos = "https://mirrors.tuna.tsinghua.edu.cn/CRAN/")
  }
}

# Load packages after installing
sapply(packages, require, character.only = TRUE)
 tidyverse      readr    moments      e1071      psych      dplyr    ggplot2
      TRUE       TRUE       TRUE       TRUE       TRUE       TRUE       TRUE
 extrafont    forcats   showtext shadowtext
      TRUE       TRUE       TRUE       TRUE

According to the tidyverse documentation, readr is one of the core tidyverse
packages: it is installed together with tidyverse and attached by library(tidyverse).
Its read_csv() function can therefore be used directly to load CSV format data.
Alternatively, the readr package can be installed independently.

2.2.2 Reading and Displaying Data

To read and display the data, use the following commands:
data <- read_csv(file = "./data/climate2020.csv")
dplyr::glimpse(data)
Rows: 837
Columns: 8
$ FID <dbl> 673, 662, 829, 317, 349, 704, 719, 786, 808, 832, 328, 689, 325, ~
$ site <dbl> 58251, 58143, 59673, 54471, 54776, 58472, 58569, 59134, 59321, 59~
$ Pre <dbl> 1113, 1280, 2292, 584, 834, 1282, 987, 656, 769, 627, 558, 1649, ~
$ Prs <dbl> 1017, 1017, 1013, 1017, 1016, 1011, 1015, 1013, 1014, 1011, 1017,~
$ Tem <dbl> 16, 16, 24, 11, 13, 18, 19, 23, 23, 26, 13, 18, 13, 15, 17, 12, 1~
$ Alt <dbl> -3, -1, -1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3,~
$ Lon <dbl> 120.2833, 119.8500, 112.7667, 122.1667, 122.7000, 122.4500, 121.9~
$ Lat <dbl> 32.85000, 33.80000, 21.73333, 40.65000, 37.40000, 30.73333, 29.20~

In the dataset, site represents the meteorological observation site number, Alt
indicates the altitude, Tem denotes the annual mean temperature, Pre refers to the
annual mean precipitation, and Prs signifies the annual mean air pressure. Lon and
Lat hold each station’s longitude and latitude.
Next, we will demonstrate how to use R to calculate specific summary statistics for
descriptive statistical analysis.

2.3 Measures of Central Tendencies

Mean, median, and mode represent the three primary measures of central tendency
in statistics. These measures help identify the central position within a data set, a
concept known as central tendency.

2.3.1 Mean

The mean is calculated as the sum of all values divided by their count. In R, this can
be computed using the mean() function:
# calculate the mean of the temperature:Temmean
Temmean <- mean(data$Tem)
# display Temmean
Temmean
[1] 12.58901

2.3.2 Median

The median represents the middle value in a data set, when arranged in ascending
order. It can be calculated in R using the median() function in the stats package:
# calculate the median of the temperature:Temmedian
Temmedian <- median(data$Tem)
# display Temmedian
Temmedian
[1] 14

2.3.3 Mode

The mode refers to the value that occurs most frequently in a data set. In R, there
is no standard built-in function to compute the statistical mode (the built-in mode()
function returns an object’s storage mode instead). Therefore, it’s necessary to
create a custom function for this purpose. The code is as follows:
# create a customized function that returns the most frequent value(s)
mymode <- function(x) {
  counts <- table(x)  # frequency of each distinct value
  return(as.numeric(names(counts)[counts == max(counts)]))
}
mymode(data$Tem)
[1] 17
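The three measures can be compared side by side on a small vector. The sketch below uses invented values in which two numbers are tied for the highest frequency, to show that a table-based mode function of this kind returns every tied value; the data and the helper name all_modes() are hypothetical.

```r
# Hypothetical sample with two equally frequent values (3 and 7)
x <- c(2, 3, 3, 5, 7, 7, 10)

mean(x)     # sum of the values divided by their count
median(x)   # middle value of the sorted data

# Return every value that attains the highest frequency
all_modes <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[counts == max(counts)])
}
all_modes(x)
```

When a data set has several modes, downstream code should therefore be prepared to receive a vector rather than a single number.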

2.4 Measures of Variability

2.4.1 Range

The range is defined as the difference between the largest and smallest data values
in a given set. The range can be computed in R using the following code:
# calculate the range of the temperature
max(data$Tem) - min(data$Tem)
[1] 31

Alternatively, the maximum and minimum values can be determined using the
range() function, and then the range can be calculated:
r <- range(data$Tem)
r[2] - r[1]
[1] 31
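A slightly more compact variant, sketched here on invented temperatures rather than the book's data, passes the result of range() to diff(), which subtracts the minimum from the maximum in one call.

```r
# Hypothetical temperatures (degrees Celsius)
tem <- c(-5, 3, 11, 14, 17, 26)

range(tem)         # smallest and largest value
diff(range(tem))   # largest minus smallest, in a single call
```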

2.4.2 Variance

Variance quantifies the dispersion of data points around the mean. Low variance
suggests that the data points are generally similar and closely clustered around the
mean. Conversely, high variance indicates greater variability, with data points more
widely spread out from the mean.
The formula to calculate variance for a population is as follows:

$$\text{Variance} = \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n} \tag{2.1}$$

where $\sigma^2$ represents the variance of a population, $x_i$ denotes the $i$th data
point, $\mu$ is the population mean, and $n$ is the population size.
Alternatively, the formula to calculate variance for a sample data set is:

$$\text{Variance} = S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \tag{2.2}$$

where $S^2$ denotes the variance of a sample, $x_i$ represents the $i$th observation,
$\bar{x}$ is the sample mean, and $n$ indicates the size of the sample.

2.4.3 Standard Deviation

Standard deviation (SD) is a statistical measure that quantifies the dispersion or
variability within a data set. A low standard deviation indicates that data points are
generally clustered close to the mean (average value). Conversely, a high standard
deviation signifies greater variability in data points, implying a wider spread from
the mean.
To calculate the population standard deviation, take the square root of the
population variance:

$$\text{Population standard deviation} = \sigma = \sqrt{\sigma^2} \tag{2.3}$$

Alternatively, the sample standard deviation is obtained by taking the square root of
the sample variance:

$$\text{Sample standard deviation} = S = \sqrt{S^2} \tag{2.4}$$

In R, the functions var() and sd() calculate the sample variance and the sample
standard deviation, that is, they divide by n − 1 rather than by n.
# calculate the variance of the temperature
var(data$Tem)
[1] 43.03184
# calculate the standard deviation of the temperature
sd(data$Tem)
[1] 6.559866
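The difference between the population and sample formulas can be checked by hand. The sketch below, on invented data, computes both variances from the sum of squared deviations and confirms that var() and sd() correspond to the sample (n − 1) versions; the variable names are illustrative.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # hypothetical data
n <- length(x)
ss <- sum((x - mean(x))^2)       # sum of squared deviations from the mean

var_sample <- ss / (n - 1)       # sample variance, denominator n - 1
var_pop    <- ss / n             # population variance, denominator n

all.equal(var_sample, var(x))        # var() matches the sample formula
all.equal(sqrt(var_sample), sd(x))   # sd() is its square root
sqrt(var_pop)                        # population standard deviation
```

If a population value is needed, it can be obtained from the built-in result by rescaling, for example var(x) * (n - 1) / n.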

2.4.4 Coefficient of Variation

The coefficient of variation (CV) is defined as the ratio of the standard deviation to
the mean. It quantifies the degree of variability in relation to the mean.
A higher CV indicates a greater level of dispersion. The coefficient of variation is
particularly useful since it is dimensionless, meaning it is independent of the units
of measurement, allowing for comparison between datasets with different units or
significantly different means.

$$C_v = \frac{S}{\bar{x}} \tag{2.5}$$

To calculate the coefficient of variation for a dataset in R, use the following
commands, which multiply the ratio by 100 to express it as a percentage:
# calculate CV
CV <- sd(data$Tem) / Temmean * 100
# display CV
CV
[1] 52.10788
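The unit independence of the CV can be illustrated with a short sketch. The altitudes below are invented, and cv_percent() is a hypothetical helper, not a function from any package; the same values expressed in metres and in kilometres give different standard deviations but the same CV.

```r
# Coefficient of variation in percent; cv_percent() is a hypothetical helper
cv_percent <- function(x) {
  sd(x) / mean(x) * 100
}

alt_m  <- c(371, 860, 1200, 2500, 5052)   # hypothetical altitudes in metres
alt_km <- alt_m / 1000                    # the same altitudes in kilometres

sd(alt_m)            # standard deviation in metres
sd(alt_km)           # standard deviation in kilometres (1000 times smaller)
cv_percent(alt_m)
cv_percent(alt_km)   # the CV is unchanged by the change of unit
```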

2.5 Measures of Shape

The shape of a dataset’s distribution, including its symmetry or asymmetry, can be
described using measures of shape, specifically skewness and kurtosis, which compare
the dataset’s shape to a normal distribution curve.

2.5.1 Skewness

Skewness is a statistical measure that indicates whether a distribution is symmetric.
A distribution is considered symmetric if its right side mirrors the left side. In a
symmetric distribution, the skewness value is 0. For instance, in a symmetric (normal)
distribution, median equals mean equals mode, resulting in a skewness value of 0.
If skewness is greater than 0, the distribution is right-skewed, meaning the right tail
is longer. If skewness is less than 0, the distribution is left-skewed, with a longer
left tail.
The formula for calculating skewness is as follows:

$$g_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{(n - 1) \cdot S^3} \tag{2.6}$$

where $S$ represents the standard deviation, and $\bar{x}$ denotes the mean of the sample.

2.5.2 Kurtosis

Kurtosis is a measure indicating whether a distribution is taller or flatter compared
to a normal distribution. A kurtosis value of 0 signifies a distribution similar to the
normal distribution. A value greater than 0 indicates a distribution with a sharper
peak, while a value less than 0 suggests a flatter distribution than the normal curve.
The formula for calculating kurtosis is:

$$g_2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{(n - 1) \cdot S^4} \tag{2.7}$$

where $S$ denotes the standard deviation, and $\bar{x}$ represents the mean of the sample.
Note that this ratio is approximately 3 for a normal distribution; the values compared
against 0 in this section are excess kurtosis, obtained by subtracting 3 from it.
In R, the skewness() and kurtosis() functions from the moments package
can be utilized to calculate skewness and kurtosis, respectively. The corresponding
commands are as follows:
# calculate skewness
skewness(data$Tem)
[1] -0.2369317
# calculate kurtosis
kurtosis(data$Tem)
[1] -0.8060247

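Alternatively, the skewness() and kurtosis() functions from the e1071 package
can be used to calculate these values. Because e1071 is attached after moments in
Sect. 2.2.1, unqualified calls resolve to the e1071 versions, so both listings return
the same values; a specific package can be selected explicitly with the :: operator,
for example e1071::skewness(data$Tem). The respective commands are as follows: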
# calculate skewness
skewness(data$Tem)
[1] -0.2369317
# calculate kurtosis
kurtosis(data$Tem)
[1] -0.8060247

The skewness of the Tem variable is -0.2369317, indicating a slight leftward
(negative) skew in the distribution of the annual average temperature data from 837
meteorological observation stations in 2020. The kurtosis of the Tem variable is
-0.8060247. This value is an excess kurtosis, for which a normal distribution scores 0
(equivalently, a kurtosis of 3 before subtracting 3), so the negative value indicates
that the distribution is platykurtic, that is, flatter than the normal curve.

2.6 Comprehensive Description

2.6.1 Using describe() Function

Additionally, in R, the describe() function in the psych package can be used to
calculate various statistics of geographic data, including the number of observations,
mean, standard deviation, median, trimmed mean, median absolute deviation, minimum,
maximum, range, skewness, kurtosis, and standard error of the mean.
The corresponding commands are as follows:
Tem <- data$Tem
Pre <- data$Pre
Prs <- data$Prs
Alt <- data$Alt

describe(Tem)
vars n mean sd median trimmed mad min max range skew kurtosis se
X1 1 837 12.59 6.56 14 12.75 7.41 -5 26 31 -0.24 -0.81 0.23

describe(Pre)
vars n mean sd median trimmed mad min max range skew kurtosis
X1 1 837 953.5 595.52 843 923.11 624.17 0 2922 2922 0.45 -0.56
se
X1 20.58

describe(Prs)
vars n mean sd median trimmed mad min max range skew kurtosis
X1 1 837 923.34 110.26 969 945.47 63.75 511 1017 506 -1.55 1.7
se
X1 3.81

describe(Alt)
vars n mean sd median trimmed mad min max range skew kurtosis
X1 1 837 860.81 1104 371 621.82 517.43 -3 5052 5055 1.77 2.56
se
X1 38.16

The results produced by the describe() function align with those obtained from
the previous calculation steps. Initially, when learning to analyze geographic data
with R, it is beneficial to calculate each descriptive statistic individually. However,
with proficiency, the describe() function can efficiently generate all required
values in a single step.
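To see what sits behind several of the describe() columns, the sketch below recomputes them with base R on a small hypothetical vector. It assumes the documented describe() default of trim = 0.1 for the trimmed mean; skewness and kurtosis are left out because their exact values depend on the estimator type.

```r
# Hypothetical values (not the book's data)
x <- c(3, 7, 8, 10, 12, 13, 15, 18, 21, 40)

stats <- c(
  n       = length(x),
  mean    = mean(x),
  sd      = sd(x),
  median  = median(x),
  trimmed = mean(x, trim = 0.1),  # drops 10% of the values from each end
  mad     = mad(x),               # scaled by 1.4826 by default
  min     = min(x),
  max     = max(x),
  range   = max(x) - min(x),
  se      = sd(x) / sqrt(length(x))
)
round(stats, 2)
```

The trimmed mean and the median absolute deviation are less sensitive to the single large value (40) than the mean and standard deviation, which is why describe() reports both kinds of statistics.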

2.6.2 Using summary() Function

Alternatively, the summary() function in R can be used. It reports the minimum, lower
quartile, median, mean, upper quartile, and maximum of a numeric vector or of each
numeric column of a data frame, and it produces model-specific summaries for objects
such as regression or ANOVA models.
The basic syntax for this function is as follows:
summary(object, ...)
Here, object refers to the entity for which a summary is desired.
summary(Tem)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-5.00 8.00 14.00 12.59 18.00 26.00

summary(Pre)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0 513.0 843.0 953.5 1400.0 2922.0

summary(Prs)
Min. 1st Qu. Median Mean 3rd Qu. Max.
511.0 883.0 969.0 923.3 1005.0 1017.0

summary(Alt)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-3.0 83.0 371.0 860.8 1185.0 5052.0
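The quartiles reported by summary() are the same values that quantile() returns with its default settings. A minimal sketch on a hypothetical vector (not the station data):

```r
# Hypothetical temperatures (not the book's data)
x <- c(-5, 8, 11, 14, 18, 20, 26)

s <- summary(x)
q <- quantile(x, probs = c(0.25, 0.5, 0.75))  # default interpolation (type = 7)

s
q
```

Calling quantile() directly is useful when other percentiles, for example the 5th and 95th, are needed in addition to the quartiles.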

2.7 Visualization

2.7.1 Scatterplot Chart

A scatterplot illustrates the relationship between two numerical variables,
where each dot represents an individual observation. The positioning on the x
(horizontal) and y (vertical) axes reflects the values of these two variables. In geography,
the negative correlation between altitude and air pressure is well established from
extensive empirical observations.
To exemplify this, a scatterplot illustrating the negative correlation
between air pressure and altitude at meteorological observation stations in 2020
is presented.
In R, the plot() function in the graphics package is used to create scatterplots.
The basic syntax is as follows:

plot(x, y, type = "p", main, xlab, ylab, xlim, ylim)

Here, x and y represent the datasets for the x-axis and y-axis, respectively, "p"
indicates plotting with points, main sets the overall title, xlab and ylab label the
axes, while xlim and ylim define the axes' value ranges.
plot(Alt, Prs, type = "p", xlab = "Altitude (meters)", ylab = "Air pressure (hPa)",
     xlim = c(-5, 5200), ylim = c(500, 1100), cex = 0.5, cex.axis = 0.5, cex.lab = 0.5,
     pch = 1)

The distribution of points in Fig. 2.1 reveals a strong negative correlation between
air pressure and altitude, indicating that air pressure decreases as altitude increases.

Fig. 2.1 Scatter plot of air pressure and altitude at each station in 2020 (x-axis: Altitude (meters); y-axis: Air pressure (hPa))

2.7.2 Histogram

A histogram, in contrast, is a frequency graph that displays the distribution of
continuous quantitative data. Each bar's height corresponds to the frequency
in a specific range. Unlike a bar chart, which displays categorical data with gaps
between bars, a histogram groups values into continuous ranges without gaps. In
R, it's crucial to differentiate between histograms and bar charts when visualizing
geographic data.
The hist() function in the graphics package is used for creating histograms.
The syntax is as follows:
hist(v, main, xlab, ylab, xlim, ylim, breaks, col, border)
In this syntax, v is the vector containing histogram values, main sets the title,
xlab and ylab label the axes, xlim and ylim define the axes' ranges, breaks controls
the binning (a single number is taken as a suggested number of bins, while a vector
gives the exact breakpoints), col determines the bar color, and border sets the border color.
Below is an example that demonstrates how to create histograms in R using
temperature, precipitation, and air pressure data from 2020 gathered at various ground
stations. In this example, the par() function is utilized to adjust the line thickness
of the histogram bins. The lwd parameter's value can be modified as required to
achieve lines of varying thicknesses. After several adjustments, setting lwd = 0.2
has been found to slightly improve the line width of the bins in this instance.
Additionally, the text() function is used to append frequency labels atop each bin, with
the cex parameter allowing for the adjustment of label sizes (Fig. 2.2).
# make the histogram
# thin bin outlines; the previous graphical settings are saved in 'line'
line <- par(lwd = 0.2)
h <- hist(Tem, breaks = 10, xlab = "Temperature (\u00B0C)", ylab = "Frequency",
          xlim = c(-10, 40), ylim = c(0, 300), labels = FALSE, main = "",
          col = "lightgray", lwd = 0.2, cex = 0.5, cex.axis = 0.5, cex.lab = 0.5)
# add the frequency label above each bin
text(x = h$mids, y = h$counts, labels = h$counts, pos = 3, cex = 0.5)

In Fig. 2.2, the x-axis denotes temperature ranges, while the y-axis represents the
frequency of observations within each range. The numbers atop each bar specify
the count of observations in the respective temperature bins. The highest frequency
occurred in the 15–20 °C range with 252 observations, followed by 183 observations
in the 5–10 °C range, and 162 in the 10–15 °C range.
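The bin ranges and counts read off the figure are also stored in the object returned by hist(). The sketch below uses simulated values (an assumption for illustration, not the station data) and explicit 5-degree breakpoints to list each bin with its count, without drawing anything.

```r
set.seed(1)                                # arbitrary seed
x <- rnorm(200, mean = 12, sd = 6)         # hypothetical temperatures

# Explicit breakpoints give exact 5-degree bins; a single number such as
# breaks = 10 is only a suggestion that hist() may adjust.
brks <- seq(floor(min(x) / 5) * 5, ceiling(max(x) / 5) * 5, by = 5)
h <- hist(x, breaks = brks, plot = FALSE)  # compute the bins without drawing

bins <- data.frame(
  lower = head(h$breaks, -1),
  upper = tail(h$breaks, -1),
  mid   = h$mids,
  count = h$counts
)
bins[order(-bins$count), ]                 # bins ranked by frequency
```

Ranking the table by count gives the same ordering that is otherwise read from the bar labels.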
To enhance a histogram with probability density lines, begin by adjusting the
y-axis to density scale, achieved by setting the probability argument to TRUE
in the hist() function call. Subsequently, a probability density line can be crafted
using the density() function to determine the curve’s position, and then integrated
into the histogram with the lines() function. Additionally, for overlaying a normal
curve, employ the dnorm() function, which necessitates creating a grid of values
based on the data’s mean and standard deviation. This normal curve is then
seamlessly added atop the histogram, also using the lines() function, offering a
comprehensive view by combining both theoretical and observed data distributions
in a single visualization (Fig. 2.3).
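Switching the y-axis to the density scale is what makes the histogram comparable with density() and dnorm() curves, because the bar areas then sum to 1. A minimal sketch with simulated values (an assumption for illustration, not the station data):

```r
set.seed(1)                          # arbitrary seed
x <- rnorm(200, mean = 12, sd = 6)   # hypothetical temperatures

h <- hist(x, plot = FALSE)

# On the density scale, bar height times bin width is the share of
# observations in the bin, so the bar areas sum to 1.
area <- sum(h$density * diff(h$breaks))
area

# The same heights can be recovered from the counts
manual_density <- h$counts / (sum(h$counts) * diff(h$breaks))
all.equal(h$density, manual_density)
```

On the frequency scale the bars would be taller by a factor of the sample size times the bin width, and the curves would appear as a nearly flat line at the bottom of the plot.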
Fig. 2.2 Histogram of temperature in 2020 (x-axis: Temperature (°C); y-axis: Frequency; bar labels give the count in each bin)

# Calculate the density of 'Tem'
density_values <- density(Tem)

# Calculate mean and standard deviation for 'Tem'
mean_tem <- mean(Tem, na.rm = TRUE)
sd_tem <- sd(Tem, na.rm = TRUE)

# Create a sequence of values from min to max of 'Tem'
x_values <- seq(min(Tem, na.rm = TRUE), max(Tem, na.rm = TRUE))
# Generate the normal distribution curve
y_values <- dnorm(x_values, mean = mean_tem, sd = sd_tem)

# Plot the histogram for 'Tem' on the density scale
line <- par(lwd = 0.2)
h <- hist(Tem, breaks = 10, xlab = "Temperature (\u00B0C)", ylab = "Density", main = "",
          xlim = c(-10, 40), ylim = c(0, 0.07), labels = FALSE, freq = FALSE,
          col = "lightgray", lwd = 0.2, cex = 0.5, cex.axis = 0.5, cex.lab = 0.5)

# add the density curve
lines(density_values, col = "blue", lwd = 1)
# add a normal curve on the histogram
lines(x_values, y_values, lty = 2, col = 2, lwd = 1)

Figure 2.3 presents a histogram with the overlay of a density line and a normal
distribution curve. The blue curve represents the kernel density estimate. The red dashed
line illustrates a normal distribution curve for comparison purposes. The figure suggests
that the temperature data might not follow a normal distribution, as indicated by the
misalignment of the density peaks with the dashed curve and by the differences between
the two curves in the tails.
Fig. 2.3 Histogram of temperature with density and normal curves (x-axis: Temperature (°C); y-axis: Density)

Similarly, histograms for precipitation and air pressure can be created, each
complemented by an overlay of the respective kernel density curve and a normal
distribution curve for comparative analysis (Fig. 2.4).
# make the histogram
line <- par(lwd = 0.2)
h <- hist(Pre, xlab = "Precipitation (mm)", ylab = "Frequency", xlim = c(0, 3000),
          ylim = c(0, 140), labels = FALSE, main = "", col = "lightgray", lwd = 0.2,
          cex = 0.5, cex.axis = 0.5, cex.lab = 0.5)
# add the frequency label above each bin
text(x = h$mids, y = h$counts, labels = h$counts, pos = 3, cex = 0.5)

Fig. 2.4 Histogram of precipitation in 2020 (x-axis: Precipitation (mm); y-axis: Frequency; bar labels give the count in each bin)

In Fig. 2.4, most of the precipitation values cluster around the lower end of the scale,
with the frequency decreasing as the precipitation amount increases. The highest
frequencies are for precipitation amounts roughly between 375 and 812.5 mm, after
which the frequency drops significantly. There are very few instances of very high
precipitation amounts (over 2500 mm), as indicated by the low bars towards the right
end of the x-axis.
Additionally, density and normal curves are superimposed on the precipitation
histogram to enhance the analysis. The density() and dnorm() functions from the stats
package and the lines() function from the graphics package are employed for this
purpose (Fig. 2.5).
# Calculate the density of 'Pre'
density_values <- density(Pre)

# Calculate mean and standard deviation for 'Pre'
mean_pre <- mean(Pre, na.rm = TRUE)
sd_pre <- sd(Pre, na.rm = TRUE)

# Create a sequence of values from min to max of 'Pre'
x_values <- seq(min(Pre, na.rm = TRUE), max(Pre, na.rm = TRUE))
# Generate the normal distribution curve
y_values <- dnorm(x_values, mean = mean_pre, sd = sd_pre)

# Plot the histogram for 'Pre' on the density scale
line <- par(lwd = 0.2)
hist(Pre, xlab = "Precipitation (mm)", ylab = "Density", xlim = c(0, 3000),
     ylim = c(0, 8e-04), labels = FALSE, freq = FALSE, main = "", col = "lightgray",
     lwd = 0.2, cex = 0.5, cex.axis = 0.5, cex.lab = 0.5)

# Add the density curve
lines(density_values, col = "blue", lwd = 1)
# Add the normal curve as a red dashed line
lines(x_values, y_values, col = "red", lty = 2, lwd = 1)

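In Fig. 2.5, the blue and red curves have the same meaning as in Fig. 2.3: the blue curve
is the kernel density estimate and the red dashed curve is the fitted normal distribution.
The tail on the right side of the histogram is longer than that on the left, and the mass
of the distribution is concentrated on the left, which indicates a right-skewed distribution.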
Fig. 2.5 Histogram of precipitation with density and normal curves (x-axis: Precipitation (mm); y-axis: Density)

Similar to temperature and precipitation, a histogram can also be constructed for
air pressure (Fig. 2.6).
# make the histogram of air pressure
line <- par(lwd = 0.2)
h <- hist(Prs, xlab = "Air pressure (hPa)", ylab = "Frequency", xlim = c(500, 1100),
          ylim = c(0, 300), labels = FALSE, main = "", col = "lightgray", lwd = 0.2,
          cex = 0.5, cex.axis = 0.5, cex.lab = 0.5)
# add the frequency label above each bin
text(x = h$mids, y = h$counts, labels = h$counts, pos = 3, cex = 0.5)

Fig. 2.6 Histogram of air pressure in 2020 (x-axis: Air pressure (hPa); y-axis: Frequency; bar labels give the count in each bin)

In Fig. 2.6, the histogram illustrates the distribution of air pressure readings in the
dataset. The majority of observations are clustered between 850 and 1050 hPa. The
distribution is left-skewed, which is apparent from the tail stretching towards lower
pressure values. The bar representing the 1000–1050 hPa range has the highest
frequency, with 256 occurrences, making it the most common air pressure range
recorded in the dataset.
Following the method applied to temperature and precipitation, density and normal
curves are likewise overlaid on the air pressure histogram, offering data analysts
an enhanced perspective of the data distribution (Fig. 2.7).
# Calculate the density of 'Prs'
density_values <- density(Prs)

# Calculate mean and standard deviation for 'Prs'
mean_prs <- mean(Prs, na.rm = TRUE)
sd_prs <- sd(Prs, na.rm = TRUE)

# Create a sequence of values from min to max of 'Prs'
x_values <- seq(min(Prs, na.rm = TRUE), max(Prs, na.rm = TRUE))
# Generate the normal distribution curve
y_values <- dnorm(x_values, mean = mean_prs, sd = sd_prs)

# Plot the histogram for 'Prs' on the density scale
# (the y-axis shows density, so it is labeled accordingly)
line <- par(lwd = 0.2)
hist(Prs, xlab = "Air pressure (hPa)", ylab = "Density", xlim = c(500, 1100),
     ylim = c(0, 0.008), labels = FALSE, freq = FALSE, main = "", col = "lightgray",
     lwd = 0.2, cex = 0.5, cex.axis = 0.5, cex.lab = 0.5)

# Add the density curve
lines(density_values, col = "blue", lwd = 1)
# Add the normal distribution curve as a red dashed line
lines(x_values, y_values, col = "red", lty = 2, lwd = 1)

Fig. 2.7 Histogram of air pressure with density and normal curves (x-axis: Air pressure (hPa); y-axis: Density)

In Fig. 2.7, this histogram with overlaid density curves showcases the distribution
of air pressure values within the dataset. The blue line is the kernel density estimate.
The red dashed line represents a normal distribution curve. The graph indicates that
while there is a concentration of values around the mean, the distribution is not
perfectly normal as depicted by the discrepancy between the density estimate and
the normal curve.

2.7.3 Bar

To illustrate the distinction between a histogram and a bar graph, this section uses
2021 population data from various provinces and cities in China (measured in tens
of millions) as an example.
In R, the ggplot() function from the ggplot2 package can be used to create a
bar graph, as shown below:
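# load the packages used in this section and read the data
library(tidyverse)   # provides read_csv(), arrange(), mutate(), and ggplot()
library(shadowtext)  # provides geom_shadowtext()
df <- read_csv(file = "./data/chinapopulation2021en.csv")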
dplyr::glimpse(df)
Rows: 31
Columns: 2
$ Pname <chr> "Guangdong", "Shandong", "Henan", "Jiangsu", "Sichuan", "Hebei",~
$ Pop <dbl> 12.60, 10.15, 9.94, 8.47, 8.37, 7.46, 6.64, 6.46, 6.10, 5.78, 5.~

Additionally, the subset() function from the base package is used to selectively
extract a specific subset of data from the original dataset, df, which contains the
2021 population data for China.
# PN holds the province names, PP the province populations (in ten millions)
PN <- df$Pname[1:31]
PP <- df$Pop[1:31]
# Pb: provinces below 40 million, labeled in blue to the right of the bar
# Pw: provinces with 40 million or more, labeled in white inside the bar
Pb <- subset(df, PP < 4)
Pw <- subset(df, PP >= 4)
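The effect of the two subset() calls is easiest to see on a few rows. The sketch below uses a hypothetical data frame (the names and values are invented for illustration, not taken from the population file) and shows that the boundary value of 4 falls into the second group.

```r
# Hypothetical populations in tens of millions (not the book's file)
toy <- data.frame(
  Pname = c("A", "B", "C", "D"),
  Pop   = c(12.6, 4.0, 3.9, 0.4),
  stringsAsFactors = FALSE
)

small <- subset(toy, Pop < 4)    # below 40 million
large <- subset(toy, Pop >= 4)   # 40 million or more (boundary included)

small$Pname
large$Pname
```

Because the two conditions are complementary, every row ends up in exactly one of the two subsets.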

When employing the ggplot() function for plotting, the arrange() and the
mutate() functions from the dplyr package are utilized to more effectively
process the data (Fig. 2.8).
df1 <- data.frame(
population = PP,
name = factor(PN),
y = seq(length(PN)) * 0.9
)

# sort by population, then fix the factor levels in that order
df2 <- arrange(df1, population)
df3 <- mutate(df2, name = factor(name, levels = name))
plt <- ggplot(df3) +
  geom_col(aes(population, name), fill = "#076fa2", width = 0.8,
           position = position_dodge(0.9)) +
  labs(x = "Population (ten million)", y = "Province/City") +
  theme(text = element_text(size = 8))
plt
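The arrange() and mutate() steps matter because a discrete axis follows the order of the factor levels, which is alphabetical unless set explicitly. The following base R sketch mirrors the same idea on a hypothetical data frame; order() and factor() stand in for the dplyr calls here.

```r
# Hypothetical data (not the book's file)
toy <- data.frame(
  name       = c("B", "A", "C"),
  population = c(2.5, 9.1, 0.4),
  stringsAsFactors = FALSE
)

# Sort rows by population, then freeze that order in the factor levels
toy_sorted <- toy[order(toy$population), ]
toy_sorted$name <- factor(toy_sorted$name, levels = toy_sorted$name)

levels(toy_sorted$name)    # order that a discrete axis would follow
levels(factor(toy$name))   # default: alphabetical
```

With the levels fixed this way, the bars are drawn in order of population rather than in alphabetical order of the names.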

As depicted in Fig. 2.8, the ggplot() function effectively generates a bar chart.
While the default rendering is often sufficient, additional enhancements can
significantly improve the chart's aesthetics. These enhancements include modifying the
axes configuration, altering the background color, removing tick marks, adding grid
lines, and changing fonts. They are implemented using the scale_x_continuous(),
scale_y_discrete(), theme(), and geom_text() functions from the ggplot2 package, and
the geom_shadowtext() function from the shadowtext package.
plt1 <- plt +
scale_x_continuous(
limits = c(0, 14),
breaks = seq(0, 14, by = 2),
expand = c(0, 0),
position = "top"
) +
scale_y_discrete(expand = expansion(add = c(0, 0.5))) +
theme(
panel.background = element_rect(fill = "white"),
panel.grid.major.x = element_line(color = "#A8BAC4", linewidth = 0.3),
axis.ticks.length = unit(0, "mm"),
axis.title = element_blank(),
axis.line.y.left = element_line(color = "black"),
axis.text.y = element_blank(),
axis.text.x = element_text(family = "Times New Roman", size = 8)
)
Fig. 2.8 The barplot of 2021 population data for some provinces and cities in China (x-axis: Population (ten million); y-axis: Province/City, ordered from Guangdong at the top to Xizang at the bottom)

The geom_text() and geom_shadowtext() functions are utilized in the following
example. geom_text() is employed for adding text within the bars representing
populations of 40 million or more, while geom_shadowtext() is used for creating
shadowed text to the right of bars representing populations below 40 million. This
approach effectively visualizes the population of each province or city.
plt2 <- plt1 +
geom_shadowtext(
data = Pb,
aes(Pop, y = Pname, label = Pname),
hjust = 0,
nudge_x = 0.3,
colour = "#076fa2",
bg.colour = "white",
bg.r = 0.2,
family = "Times New Roman",
size = 2
) +
geom_text(
data = Pw,
aes(0, y = Pname, label = Pname),
hjust = 0,
nudge_x = 0.3,
colour = "white",
family = "Times New Roman",
size = 2
)

Adding a title and subtitle to the chart can greatly enhance its overall quality.
extrafont::loadfonts(quiet = TRUE)
library(showtext)
showtext_auto()
plt3 <- plt2 +
labs(
title = "Population",
subtitle = "Population of some provinces and cities in China, 2021 (in ten millions)"
) +
theme(
plot.title = element_text(
family = "Arial",
face = "bold",
size = 11
),
plot.subtitle = element_text(
family = "Arial",
size = 8
)
)
plt3

The final version of the chart, shown in Fig. 2.9, is more aesthetically pleasing and
easier to read compared to Fig. 2.8.

Fig. 2.9 The customized barplot of 2021 population data of some provinces and cities in China

Reference

Fisher, Murray J., and Andrea P. Marshall. 2009. Understanding Descriptive Statistics. Australian
Critical Care 22 (2): 93–97.
Chapter 3
Correlation Analysis

This chapter focuses on the correlation analysis of observation data from 837 surface
meteorological stations in China in 2020. Both the cor() and cor.test()
functions, included in R's stats package, are employed to meet our analytical
requirements effectively. The rcorr() function from the Hmisc package is employed to
derive the p-value matrix of the correlation matrix. Additionally, the corr.test()
function from the psych package is utilized to rapidly generate a pairwise correlation
matrix for an entire dataset, complete with p-values and confidence intervals.

3.1 An Overview of Correlation

Correlation is a statistical measure that represents the relationship between two
variables. In geographic data analysis, correlation tests are often utilized to assess the
relationships among two or more variables, thereby revealing the closeness of
relationships between geographic features (Xu 2006). These relationships are primarily
quantified using the correlation coefficient. For example, using the 2020
meteorological data from China's national meteorological observation stations, we can explore
relationships among variables like altitude, latitude, longitude, air pressure,
precipitation, and temperature by calculating their correlation coefficients. The
correlation coefficient describes the degree of association between two variables and
ranges between −1 and 1. A value closer to 0 indicates a weaker relationship, while
values near −1 or 1 indicate a strong negative or positive relationship, respectively.
Three types of correlation coefficients are commonly used in geographic data
analysis: the Pearson, Spearman, and Kendall correlation coefficients. The
Pearson coefficient, a parametric measure, assesses the linear relationship between
two variables and is appropriate for continuous variables following a normal
distribution. In contrast, Spearman and Kendall coefficients are non-parametric, are
based on ranks, and are suitable for both continuous and ordinal variables.
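The difference between the parametric and rank-based coefficients can be made concrete in a few lines. The sketch below uses simulated altitude and temperature values (an assumption for illustration, not the station data): Pearson is the covariance scaled by both standard deviations, and Spearman is the Pearson correlation of the ranks.

```r
set.seed(7)                           # arbitrary seed
x <- runif(50, 0, 5000)               # hypothetical altitudes
y <- 25 - 0.005 * x + rnorm(50)       # hypothetical temperatures

# Pearson: covariance scaled by both standard deviations
r_pearson <- cov(x, y) / (sd(x) * sd(y))

# Spearman: Pearson correlation of the ranks
r_spearman <- cor(rank(x), rank(y))

c(pearson  = r_pearson,
  spearman = r_spearman,
  kendall  = cor(x, y, method = "kendall"))
```

Because Spearman and Kendall use only the ordering of the values, they are unaffected by monotone transformations of either variable, whereas Pearson is not.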

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 27
X. He, Geographic Data Analysis Using R,
https://doi.org/10.1007/978-981-97-4022-2_3

3.2 Calculating the Correlation

In R, the cor() function in the stats package calculates the correlation. Its syntax is as
follows:
cor(x, y, method = c("pearson", "kendall", "spearman"))
Here, x and y are the variables, and method specifies the calculation approach, with
"pearson" being the default.
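While cor() returns only the coefficient, cor.test() from the same package also reports a p-value and, for Pearson, a confidence interval. A minimal sketch on simulated values (an assumption for illustration, not the station data):

```r
set.seed(7)                           # arbitrary seed
x <- runif(50, 0, 5000)               # hypothetical altitudes
y <- 25 - 0.005 * x + rnorm(50)       # hypothetical temperatures

ct <- cor.test(x, y, method = "pearson")

ct$estimate   # correlation coefficient, identical to cor(x, y)
ct$p.value    # p-value for the null hypothesis of zero correlation
ct$conf.int   # 95% confidence interval (available for Pearson)
```

Extracting the components by name, rather than reading them from the printed output, makes it easier to collect results for several variable pairs in one table.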

3.2.1 Calculating the Correlation Between Two Variables

This section demonstrates how to calculate the correlation coefficients between two
variables in R, utilizing observation data from 837 surface stations across China in
2020.
# install and load the packages
packages <- c("ggpubr", "corrplot", "Hmisc", "tidyverse", "nortest", "dplyr", "psych",
              "PerformanceAnalytics")

# install any package that is not yet available
for (pkg in packages) {
  if (!require(pkg, character.only = TRUE)) {
    install.packages(pkg, repos = "https://mirrors.tuna.tsinghua.edu.cn/CRAN/")
  }
}

# Load packages after installing
sapply(packages, require, character.only = TRUE)
ggpubr corrplot Hmisc
TRUE TRUE TRUE
tidyverse nortest dplyr
TRUE TRUE TRUE
psych PerformanceAnalytics
TRUE TRUE

# read meteorological observation data


data <- read_csv(file = "./data/climate2020.csv")
dplyr::glimpse(data)
Rows: 837
Columns: 8
$ FID <dbl> 673, 662, 829, 317, 349, 704, 719, 786, 808, 832, 328, 689, 325, ~
$ site <dbl> 58251, 58143, 59673, 54471, 54776, 58472, 58569, 59134, 59321, 59~
$ Pre <dbl> 1113, 1280, 2292, 584, 834, 1282, 987, 656, 769, 627, 558, 1649, ~
$ Prs <dbl> 1017, 1017, 1013, 1017, 1016, 1011, 1015, 1013, 1014, 1011, 1017,~
$ Tem <dbl> 16, 16, 24, 11, 13, 18, 19, 23, 23, 26, 13, 18, 13, 15, 17, 12, 1~
$ Alt <dbl> -3, -1, -1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3,~
$ Lon <dbl> 120.2833, 119.8500, 112.7667, 122.1667, 122.7000, 122.4500, 121.9~
$ Lat <dbl> 32.85000, 33.80000, 21.73333, 40.65000, 37.40000, 30.73333, 29.20~

The cor() function in R calculates correlations between various pairs of
variables, such as altitude (Alt) and air pressure (Prs), altitude and precipitation (Pre),
as well as altitude and temperature (Tem). Prior to calculation, it's crucial to choose
an appropriate method for computing the correlation coefficient. This involves
creating scatterplots to assess the linearity of the relationships and employing statistical
tests to check for normal distribution in the data. Following these assessments, the
most suitable method for calculating the correlation coefficient can be selected.

3.2.1.1 Visualizing Data

Prior to calculating the correlation coefficients among altitude (Alt), air pres-
sure (Prs), precipitation (Pre), and temperature (Tem), scatterplots are generated
using the ggscatter() function from the ggpubr package to visually assess the
relationships between these variable pairs (Fig. 3.1).
# draw the scatterplot for altitude and air pressure
ggscatter(data, x = "Alt", y = "Prs", color = "grey37", add = "reg.line",
          add.params = list(color = "blue"), conf.int = TRUE, cor.coef = TRUE,
          cor.coeff.args = list(method = "pearson", label.x.npc = "center",
                                label.y.npc = "top"),
          cor.coef.size = 3, xlab = "Altitude (meters)", ylab = "Air pressure (hPa)",
          point.size = 0.5, shape = 1) +
  theme(axis.line = element_line(size = 0.2), axis.title.x = element_text(size = 8),
        axis.title.y = element_text(size = 8), axis.text.x = element_text(size = 6),
        axis.text.y = element_text(size = 6))

In Fig. 3.1, the scatterplot illustrates the relationship between altitude and air
pressure, demonstrating a negative correlation: as altitude increases, air pressure
decreases. This inverse relationship is visually reinforced by the downward-sloping
blue regression line. The depicted correlation coefficient, R = −1, would imply a perfect
linear relationship, an unlikely scenario in natural datasets. The more plausible
explanation is that the plot label is rounded to two digits, so that a coefficient very
close to −1 is displayed as −1; this can be checked by computing the unrounded value
with cor(). The notably low p-value underscores the statistical significance of this
negative correlation, indicating that it is unlikely to be a product of random
variation.
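How a strong but imperfect correlation can appear as −1 in a plot label is shown in the sketch below. It uses simulated altitude and pressure values (the linear trend and noise level are assumptions chosen for illustration, not estimates from the station data) and assumes the label is rounded to two digits.

```r
set.seed(1)                                      # arbitrary seed
alt <- runif(500, 0, 5000)                       # hypothetical altitudes (m)
prs <- 1013 - 0.1 * alt + rnorm(500, sd = 10)    # hypothetical pressures (hPa)

r <- cor(alt, prs)
r              # very close to, but not exactly, -1
round(r, 2)    # what a two-digit label would display
```

Printing the unrounded coefficient alongside the figure avoids the impression of a perfectly linear relationship.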
# draw the scatterplot for altitude and precipitation
ggscatter(data, x = "Alt", y = "Pre", color = "grey37", add = "reg.line",
          add.params = list(color = "blue"), conf.int = TRUE, cor.coef = TRUE,
          cor.coeff.args = list(method = "pearson", label.x.npc = "center",
                                label.y.npc = "top"),
          cor.coef.size = 3, xlab = "Altitude (meters)", ylab = "Precipitation (mm)",
          point.size = 0.5, shape = 1) +
  theme(axis.line = element_line(size = 0.2), axis.title.x = element_text(size = 8),
        axis.title.y = element_text(size = 8), axis.text.x = element_text(size = 6),
        axis.text.y = element_text(size = 6))

Figure 3.2 features a scatterplot detailing the correlation between altitude and
precipitation, where each data point represents an individual observation. The
correlation coefficient of −0.41 suggests a moderate negative correlation, indicating
a general decrease in precipitation with increasing altitude. The exceptionally low
p-value validates the statistical significance of this correlation, affirming its
non-random nature. The blue regression line, along with its confidence interval, outlines
the overall trend and the predictive reliability between altitude and precipitation.
Fig. 3.1 The scatterplot for altitude and air pressure of surface stations in 2020 (x-axis: Altitude (meters); y-axis: Air pressure (hPa); label: R = −1, p < 2.2e−16)

# draw the scatterplot for altitude and temperature
ggscatter(data, x = "Alt", y = "Tem", color = "grey37", add = "reg.line",
          add.params = list(color = "blue"), conf.int = TRUE, cor.coef = TRUE,
          cor.coeff.args = list(method = "pearson", label.x.npc = "center",
                                label.y.npc = "top"),
          cor.coef.size = 3, xlab = "Altitude (meters)", ylab = "Temperature (\u00B0C)",
          point.size = 0.5, shape = 1) +
  theme(axis.line = element_line(size = 0.2), axis.title.x = element_text(size = 8),
        axis.title.y = element_text(size = 8), axis.text.x = element_text(size = 6),
        axis.text.y = element_text(size = 6))

In Fig. 3.3, the scatterplot's blue regression line delineates the general trend in the
dataset, highlighting the variation in temperature with altitude. The negative slope
of the line, coupled with a correlation coefficient (R) of −0.59 and a p-value below
2.2e-16, indicates a substantial negative correlation. Simplified, this suggests that
temperature typically decreases as altitude increases.

3.2.1.2 Normal Distribution Test

Figures 3.1, 3.2, and 3.3 reveal certain linear relationships between altitude and air
pressure, altitude and precipitation, as well as altitude and temperature, respectively.
To further analyze these variables, each is subjected to a normality test. A
common approach for this assessment is the Q-Q (Quantile-Quantile) plot. Utilizing the
ggqqplot() function from the ggpubr package, we can visually inspect if the data
distributions align with a normal distribution.

Fig. 3.2 The scatterplot for altitude and precipitation of surface stations in 2020 (x-axis: Altitude (meters); y-axis: Precipitation (mm); label: R = −0.41, p < 2.2e−16)

Q-Q plot: The Q-Q plot is a graphical tool used to determine if the distribution of
a dataset approximates a theoretical distribution, typically a normal distribution. It
compares the observed quantiles against theoretically expected quantiles of a normal
distribution (Figs. 3.4, 3.5, 3.6 and 3.7).
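The coordinates behind a normal Q-Q plot can be computed directly. The sketch below uses a simulated right-skewed sample (an assumption for illustration, not the station data) and base R functions only; ggqqplot() draws a styled version of the same comparison.

```r
set.seed(3)                          # arbitrary seed
x <- rexp(100)                       # hypothetical right-skewed sample

n <- length(x)
theoretical <- qnorm(ppoints(n))     # expected standard normal quantiles
observed <- sort(x)                  # ordered sample values

qq <- qqnorm(x, plot.it = FALSE)     # base R computes the same coordinates

head(cbind(theoretical, observed))
```

If the sample were normal, the observed values plotted against the theoretical quantiles would fall close to a straight line; for a right-skewed sample, the largest points bend upwards away from it.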
# Customize a ggplot2 theme
my_theme <- theme_pubr() +
theme(
axis.line = element_line(size = 0.2),
axis.text.x = element_text(size = 6),
axis.text.y = element_text(size = 6),
axis.title.x = element_text(size = 8),
axis.title.y = element_text(size = 8)
)

# Plotting with ggqqplot for altitude
ggqqplot(data$Alt, ylab = "Altitude", size = 0.5, shape = 1, ggtheme = my_theme)

Fig. 3.3 The scatterplot for altitude and temperature of surface stations in 2020 (x-axis: Altitude (meters); y-axis: Temperature (°C); label: R = −0.59, p < 2.2e−16)
In Fig. 3.4, data points near the middle of the plot (around the theoretical
quantile of 0) align closely with the reference line, suggesting normal distribution in this
region. However, deviations at the plot's ends, notably on the right (higher
theoretical quantiles), indicate non-normality in the data's tails, potentially suggesting a
heavy-tailed distribution. Therefore, while the Alt data's central part appears
normally distributed, the tail deviations imply that the overall data may not strictly
adhere to a normal distribution.
# Plotting with ggqqplot for air pressure
ggqqplot(data$Prs, ylab = "Air pressure", size = 0.5, shape = 1, ggtheme = my_theme)

Figure 3.5 reveals that the central part of the plot aligns with the reference line,
indicating that the Prs data distribution in this area resembles a normal distribution.
Nevertheless, deviations in the tails, especially in the right tail, suggest heavier tails
than those of a normal distribution, implying that Prs is approximately normal with
potential tail deviations.
# Plotting with ggqqplot for precipitation
ggqqplot(data$Pre, ylab = "Precipitation", size = 0.5, shape = 1, ggtheme = my_theme)
Fig. 3.4 Q-Q plot of altitude (x-axis: Theoretical; y-axis: Altitude)
In Fig. 3.6, central data points on the Q-Q plot adhere to the reference line,
indicating normal distribution within this range. However, tail deviations, particularly in
the right tail where data points diverge upwards, suggest a possible right-skewed
distribution with more high-value outliers than expected in a normal distribution. Most
data points fall within the confidence band, indicating that the data is close to a
normal distribution. Thus, while not perfectly normal, especially with right skewness
evidence, a significant portion of the data could be considered normally distributed,
particularly around the mean.
# Plotting with ggqqplot for temperature
ggqqplot(data$Tem, ylab = "Temperature", size = 0.5, shape = 1, ggtheme = my_theme)

Figure 3.7 shows that data points in the plot’s center closely follow the reference
line, hinting at normality within the central range of the distribution. However,
tail deviations, especially in the right tail, suggest potential slight skewness
or heavier tails than a normal distribution. The slight S shape in the data points
implies lighter tails at the lower end and heavier tails at the upper end. Most points
within the band support the assumption of normality, despite slight tail deviations.
Therefore, while the data does not perfectly conform to a normal distribution,
especially in the tails, the central mass largely fits normality, indicating that the
dataset is approximately normal, barring outliers or slight skewness.
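To make the reading of these Q-Q plots more concrete, the following base R sketch shows what such a plot compares: the sorted sample values against the quantiles a normal distribution would produce at the same positions. It uses simulated right-skewed data rather than the book's dataset, and all variable names are illustrative assumptions. The reference line here is drawn through the first and third quartiles, as base R's qqline() does; the line drawn by ggqqplot() may differ in detail.

```r
# Simulated right-skewed sample (an assumption; not the book's data)
set.seed(42)
x <- rexp(200)
n <- length(x)

# Sample quantiles (sorted data) and matching theoretical normal quantiles
sample_q <- sort(x)
theor_q  <- qnorm(ppoints(n))

# Reference line through the first and third quartiles
probs     <- c(0.25, 0.75)
slope     <- unname(diff(quantile(x, probs)) / diff(qnorm(probs)))
intercept <- unname(quantile(x, probs[1])) - slope * qnorm(probs[1])

# Vertical distance of each point from the reference line
deviation <- sample_q - (intercept + slope * theor_q)

# Average gap for the ten largest observations; a positive value means the
# right tail lies above the line, the pattern described as right skew
upper_tail_gap <- mean(tail(deviation, 10))
upper_tail_gap
```

Points near the centre have deviations close to zero, while a right-skewed sample shows increasingly positive deviations in the upper tail.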
Fig. 3.5 Q-Q plot of air pressure (axes: Theoretical vs. Air pressure)

The Q-Q plot serves as a visual method for assessing data distribution, but to gain
a more accurate understanding, it’s often necessary to complement it with additional
statistical tests for assessing normality.
Given that the sample size exceeds 50, the Anderson-Darling test, a well-regarded
goodness-of-fit test, is suitable for this purpose.
Anderson-Darling normality test: This test evaluates how closely the observed data
follow a specified distribution, and is commonly employed to assess normality.
The ad.test() function from the nortest package performs this test.
# Anderson-Darling normality test of altitude
ad.test(data$Alt)

Anderson-Darling normality test

data: data$Alt
A = 68.68, p-value < 2.2e-16
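As a rough illustration of where the reported A value comes from, the sketch below computes the Anderson-Darling statistic for normality in base R, with the mean and standard deviation estimated from the sample. The helper name ad_statistic() is hypothetical and the data are simulated, so this is a sketch under those assumptions rather than the package's implementation; it returns only the statistic and does not reproduce the p-value calculation.

```r
# Hypothetical helper (not part of any package): Anderson-Darling A statistic
# for normality, with mean and sd estimated from the sample.
ad_statistic <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  z <- (x - mean(x)) / sd(x)
  log_lower <- pnorm(z, log.p = TRUE)    # log of Phi(z) for each sorted value
  log_upper <- pnorm(-z, log.p = TRUE)   # log of 1 - Phi(z) for each sorted value
  # Pair the i-th smallest value with the i-th largest, weighted by (2i - 1)
  h <- (2 * seq_len(n) - 1) * (log_lower + rev(log_upper))
  -n - mean(h)
}

# Simulated data (an assumption): one normal sample, one right-skewed sample
set.seed(1)
normal_sample <- rnorm(500)
skewed_sample <- rexp(500)

a_normal <- ad_statistic(normal_sample)
a_skewed <- ad_statistic(skewed_sample)
c(normal = a_normal, skewed = a_skewed)
```

Larger values of A indicate a poorer fit to the normal distribution, which is why a strongly non-normal variable produces a large statistic and a very small p-value.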

# Anderson-Darling normality test of air pressure
ad.test(data$Prs)

Anderson-Darling normality test


palace unless absolutely necessary to smoke Sarros out. Scattered
over the grounds Ricardo counted some twenty-odd Government
soldiers, all wearing that pathetically flat, crumpled appearance
which seems inseparable from the bodies of men killed in action.
The first shrapnel had probably commenced to drop in the grounds
just as a portion of the palace garrison had been marching out to
join the troops fighting at the cantonment barracks. Evidently the
men had scattered like quail, only to be killed as they ran.
From this grim scene Ricardo raised his eyes to the palace, the
castellated towers of which, looming through the tufted palms, were
reflecting the setting sun. Over the balustrade of one of the upper
balconies the limp body of a Sarros sharpshooter, picked off from the
street, drooped grotesquely, his arms hanging downward as if in
ironical welcome to the son of Ruey the Beloved. The sight induced
in Ricardo a sense of profound sadness; his Irish imagination awoke;
to him that mute figure seemed to call upon him for pity, for
kindness, for forbearance, for understanding and sympathy. Those
outflung arms of the martyred peon symbolized to Ricardo Ruey the
spirit of liberty, shackled and helpless, calling upon him for
deliverance; they brought to his alert mind a clearer realization of
the duty that was his than he had ever had before. He had a great
task to perform, a task inaugurated by his father, and which Ricardo
could not hope to finish in his lifetime. He must solve the agrarian
problem; he must develop the rich natural resources of his country;
he must provide free, compulsory education and evolve from the
ignorance of the peon an intelligence that would build up that which
Sobrante, in common with her sister republics, so woefully lacked—
the great middle class that stands always as a buffer between the
aggression and selfishness of the upper class and the helplessness
and childishness of the lower.
Ricardo bowed his head. “Help me, O Lord,” he prayed. “Thou hast
given me in Thy wisdom a man's task. Help me that I may not prove
unworthy.”
CHAPTER XXVIII

MOTHER JENKS, grown impatient at the lack of news
concerning Webster, left Dolores to her grief in the room
across the hall and sought the open air, for of late she had
been experiencing with recurring frequency a slight feeling of
suffocation. She sat down on the broad granite steps, helped herself
to a much-needed “bracer” from her brandy flask and was gazing
pensively at the scene around her when Ricardo came up the stairs.
“'Ello!” Mother Jenks saluted him. “W'ere 'ave you been, Mr.
Bowers?”
“I have just returned from capturing Sarros, Mrs. Jenks. He is on
his way to the arsenal under guard.”
“Gor' strike me pink!” the old lady cried. “'Ave I lived to see this
day!” Her face was wreathed in a happy smile. “I wonder 'ow the
beggar feels to 'ave the shoe on the other foot, eh—
the 'eartless 'ound! I'm 'opin' this General Ruey will 'ave the blighter
shot.”
“You need have no worry on that score, Mrs. Jenks. I'm General
Ruey. Andrew Bowers was just my summer name, as it were.”
“Angels guard me! Wot the bloomin' 'ell surprise won't we 'ave
next. Wot branch o' the Ruey tribe do you belong to? Are you a
nephew o' him that was president before Sarros shot 'im? Antonio
Ruey, who was 'arf brother to the president, 'ad a son 'e called
Ricardo. Are you 'im, might I arsk?”
“I am the son of Ricardo the Beloved,” he answered proudly.
“Not the lad as was away at school when 'is father was
hexecuted?”
“I am that same lad, Mrs. Jenks. And who are you? You seem to
know a deal of my family history.”
“I,” the old publican replied with equal pride, “am Mrs. Colonel
'Enery Jenks, who was your father's chief of hartillery an' 'ad the
hextreme honour o' dyin' in front o' the same wall with 'im. By the
w'y, 'ow's Mr. Webster?” she added, suddenly remembering the
subject closest to her heart just then.
“His wounds are trifling. He'll live, Mrs. Jenks.”
“Well, that's better than gettin' poked in the eye with a sharp
stick,” the old dame decided philosophically.
“Do you remember my little sister, Mrs. Jenks?” Ricardo continued.
“She was in the palace when Sarros attacked it; she perished there.”
“I believe I 'ave got a slight recollection o' the nipper, sir,” Mother
Jenks answered cautiously. To herself she said: “I s'y, 'Enrietta, 'ere's
a pretty go. 'E don't know the lamb is livin' an' in the next room! My
word, wot a riot w'en 'e meets 'er!”
“I will see you again, Mrs. Jenks. I must have a long talk with
you,” Ricardo told her, and passed on into the palace; whereupon
Mother Jenks once more fervently implored the Almighty to strike
her pink, and the iron restraint of a long, hard, exciting day being
relaxed at last, the good soul bowed her gray head in her arms and
wept, moving her body from side to side the while and demanding,
of no one in particular, a single legitimate reason why she, a
blooming old baggage and not fit to live, should be the recipient of
such manifold blessings as this day had brought forth.
In the meantime Ricardo, with his hand on the knob of the door
leading to the room where Webster was having his wounds dressed,
paused suddenly, his attention caught by the sound of a sob, long-
drawn and inexpressibly pathetic. He listened and made up his mind
that a woman in the room across the entrance-hall was bewailing
the death of a loved one who answered to the name of Caliph and
John darling. Further eavesdropping convinced him that Caliph, John
darling, and Mr. John Stuart Webster were one and the same person,
and so he tilted his head on one side like a cock-robin and
considered.
“By jingo, that's most interesting,” he decided. “The wounded hero
has a sweetheart or a wife—and an American, too. She must be a
recent acquisition, because all the time we were together on the
steamer coming down here he never spoke of either, despite the fact
that we got friendly enough for such confidences. Something funny
about this. I'd better sound the old boy before I start passing out
words of comfort to that unhappy female.”
He passed on into the room. John Stuart Webster had, by this
time, been washed and bandaged, and one of the Sarros servants
(for the ex-dictator's retinue still occupied the palace) had, at Doctor
Pacheco's command, prepared a guest-chamber upstairs and
furnished a nightgown of ample proportions to cover Mr. Webster's
bebandaged but otherwise naked person. A stretcher had just
arrived, and the wounded man was about to be carried upstairs. The
late financial backer of the revolution was looking very pale and
dispirited; for once in his life his whimsical, bantering nature was
subdued. His eyes were closed, and he did not open them when
Ricardo entered.
“Well, I have Sarros,” the latter declared. Webster paid not the
slightest attention to this announcement. Ricardo bent over him.
“Jack, old boy,” he queried, “do you know a person of feminine
persuasion who calls you Caliph?”
John Stuart Webster's eyes and mouth flew wide open. “What the
devil!” he tried to roar. “You haven't been speaking to her, have you?
If you have, I'll never forgive you, because you've spoiled my little
surprise party.”
“No, I haven't been speaking to her, but she's in the next room
crying fit to break her heart because she thinks you've been killed.”
“You scoundrel! Aren't you human? Go tell her it's only a couple of
punctures, not a blowout.” He sighed. “Isn't it sweet of her to weep
over an old hunks like me!” he added softly. “Bless her tender heart!”
“Who is she?” Ricardo was very curious.
“That's none of your business. You wait and I'll tell you. She's the
guest I told you I was going to bring to dinner, and that's enough for
you to know for the present. Vaya, you idiot, and bring her in here,
so I can assure her my head is bloody but unbowed. Doctor, throw
that rug over my shanks and make me look pretty. I'm going to
receive company.”
His glance, bent steadily on the door, had in it some of the alert,
bright wistfulness frequently to be observed in the eyes of a terrier
standing expectantly before a rat-hole. The instant the door opened
and Dolores's tear-stained face appeared, he called to her with the
old-time camaraderie, for he had erased from his mind, for the
nonce, the memory of the tragedy of poor Don Juan Cafetéro and
was concerned solely with the task of banishing the tears from those
brown eyes and bringing the joy of life back to that sweet face.
“Hello, Seeress,” he called weakly. “Little Johnny's been fighting
again, and the bad boys gave him an all-fired walloping.”
There was a swift rustle of skirts, and she was bending over him,
her hot little palms clasping eagerly his pale, rough cheeks. “Oh, my
dear, my dear!” she whispered, and then her voice choked with the
happy tears and she was sobbing on his wounded shoulder. Ricardo
stooped to draw her away, but John Stuart bent upon him a look of
such frightfulness that he drew back abashed. After all, the past
twenty-four hours had been quite exciting, and Ricardo reflected
that John's inamorata was tired and frightened and probably hadn't
eaten anything all day long, so there was ample excuse for her
hysteria.
“Come, come, buck up,” Webster soothed her, and helped himself
to a long whiff of her fragrant hair. “Old man Webster had one leg in
the grave, but they've pulled it out again.”
Still she sobbed.
“Now, listen to me, lady,” he commanded with mock severity. “You
just stop that. You're wasting your sympathy; and while, of course, I
enjoy your sympathy a heap, just pause to reflect on the result if
those salt tears should happen to drop into one of my numerous
wounds.”
“I'm so sorry for you, Caliph,” she murmured brokenly. “You poor,
harmless boy! I don't see how any one could be so fiendish as to
hurt you when you were so distinctly a non-combatant.”
“Thank you. Let us forget the Hague Conference for the present,
however. Have you met your brother?” he whispered.
“No, Caliph.”
“Ricardo.”
“Yes, Jack.”
“Come here. Rick, you scheming, unscrupulous, bloodthirsty
adventurer, I have a tremendous surprise in store for you. The
sweetest girl in the world—and she's right here——”
Ricardo laughingly held up his hand. “Jack, my friend,” he
interrupted, “you're too weak to make a speech. Don't do it. Besides,
you do not have to.” He turned and bowed gracefully to Dolores. “I
can see for myself she's the sweetest girl in the world, and that she's
right here.” He held out his hand to her. “Jack thinks he's going to
spring a surprise,” he continued maliciously, “quite forgetting that a
good soldier never permits himself to be taken by surprise. I know
all about his little secret, because I heard you mourning for him
when you thought he was dead.” Ricardo favoured her with a
knowing wink. “I am delighted to meet the future Mrs. Webster. I
quite understand why you fell in love with him, because, you see, I
love him myself and so does everybody else.”
With typical Castilian courtliness he took her hand, bowed low
over it, and kissed it. “I am Ricardo Luiz Ruey,” he said, anxious to
spare his friend the task of further exhausting conversation. “And
you are——”
“You're a consummate jackass!” groaned Webster. “I'm only a dear
old family friend, and Dolores is going to marry Billy Geary. You
impetuous idiot! She's your own sister Dolores Ruey. She, Mark
Twain, and I have ample cause for common complaint against the
world because the reports of our death have been grossly
exaggerated. She didn't perish when your father's administration
crumbled. Miss Ruey, this is your brother Ricardo. Kiss her you
damn' fool—forgive me, Miss Ruey—oh, Lord, nothing matters any
more. He's gummed everything up and ruined my party. I wish I
were dead.”
Ricardo stared from the outraged Webster to his sister and back
again.
“Jack Webster,” he declared, “you aren't crazy, are you?”
“Of course he is—the old dear,” Dolores cried happily, “but I'm
not.” She stepped up to her brother, and her arms went around his
neck. “Oh, Rick,” she cried, “I'm your sister. Truly, I am.”
“Dolores. My little lost sister Dolores? Why, I can't believe it!”
“Well, you'd better believe it,” John Stuart Webster growled feebly.
“Of course, you can doubt my word and get away with it, now that
I'm flat on my back, but if you dare cast aspersions on that girl's
veracity, I'll murder you a month from now.”
He closed his eyes, feeling instinctively that he ought not spy on
such a sacred family scene. When, however, the affecting meeting
was over and Dolores was ruffling the Websterian foretop while her
brother pressed the Websterian hand and tried to say all the things
he felt but couldn't express, John Stuart Webster brought them both
back to a realization of present conditions.
“Don't thank me, sir,” he piped in pathetic imitation of the small
boy of melodrama. “I have only done me duty, and for that I cannot
accept this purse of gold, even though my father and mother are
starving.”
“Oh, Caliph, do be serious,” Dolores pleaded.
He looked up at her fondly. “Take your brother out to Mother
Jenks and prove your case, Miss Ruey,” he advised her. “And while
you're at it, I certainly hope somebody will remember I'm not
accustomed to reposing on a centre table. Rick, if you can persuade
some citizen of this conquered commonwealth to put me to bed, I'd
be obliged. I'm dead tired, old horse. I'm—ah—sleepy——”
His head rolled weakly to one side, for he had been playing a part
and had nerved himself to finish it gracefully, even in his weakened
condition. He sighed, moaned slightly, and slipped into
unconsciousness.
CHAPTER XXIX

THROUGHOUT the night there was sporadic firing here and
there in the city, as the Ruey followers relentlessly hunted
down the isolated detachment of Government troops which
had escaped annihilation and capture in the final rout and fallen
back on the city, where, concealing themselves according to their
nature and inclination, they indulged in more or less sniping from
windows and the roofs of buildings. The practice of taking no
prisoners was an old one in Sobrante, and few presidents had done
more than Sarros to keep that custom alive; ergo, firm in the
conviction that to surrender was tantamount to facing a firing squad
at daylight, the majority of these stragglers, with consummate
courage, fought to the death.
The capture of Buenaventura was alone sufficient to insure a brief
revolution, but the capture of Sarros was ample guarantee that the
resistance to the new order of things was already at an end.
However, Ricardo Ruey felt that the prompt execution of Sarros
would be an added guarantee of peace by effectually discouraging
any opposition to the rebel cause in the outlying districts, where a
few isolated garrisons still remained in ignorance of the momentous
events being enacted in the capital. For the time being, Ricardo was
master of life and death in Sobrante, and all of his advisers and
supporters agreed with him that a so-called trial of the ex-dictator
would be a rather useless affair. His life was forfeit a hundred times
for murder and treason, and to be ponderous over his elimination
would savour of mockery. Accordingly, at midnight, a priest entered
the room in the arsenal where Sarros was confined, and shrived him.
Throughout the night the priest remained with him, and when that
early morning march to the cemetery commenced, he walked beside
Sarros, repeating the prayers for the dying.
Upon reaching the cemetery there was a slight wait until a
carriage drove up and discharged Ricardo Ruey and Mother Jenks.
The sergeant in command of the squad saluted and was briefly
ordered to proceed with the matter in hand; whereupon he turned
to Sarros, who with the customary sang froid of his kind upon such
occasions was calmly smoking, and bowed deprecatingly. Sarros
actually smiled upon him. “Adios, amigos,” he murmured. Then, as
an afterthought and probably because he was sufficient of an egoist
to desire to appear a martyr, he added heroically: “I die for my
country. May God have mercy on my enemies.”
“If you'd cared to play a gentleman's game, you blighter, you
might 'ave lived for your bally country,” Mother Jenks reminded him
in English. “Wonder if the beggar 'll wilt or will 'e go through smilin'
like my sainted 'Enery on the syme spot.”
She need not have worried. It requires a strong man to be dictator
of a Roman-candle republic for fifteen years, and whatever his sins
of omission or commission, Sarros did not lack animal courage.
Alone and unattended he limped away among the graves to the wall
on the other side of the cemetery and placed his back against it,
negligently in the attitude of a devil-may-care fellow without a worry
in life. The sergeant waited respectfully until Sarros had finished his
cigarette; when he tossed it away and straightened to attention, the
sergeant knew he was ready to die. At his command there was a
sudden rattle of bolts as the cartridges slid from the magazines into
the breeches; there followed a momentary halt, another command;
the squad was aiming when Ricardo Ruey called sharply:
“Sergeant, do not give the order to fire.”
The rifles were lowered and the men gazed wonderingly at
Ricardo. “He's too brave,” Ricardo complained. “Damn him, I can't
kill him as I would a mad-dog. I've got to give him a chance.” The
sergeant raised his brows expressively. Ah, the ley fuga, that popular
form of execution where the prisoner is given a running chance, and
the firing-squad practises wing shooting. If the prisoner manages,
miraculously, to escape, he is not pursued!
A doubt, however, crossed the sergeant's mind. “But, my general,”
he expostulated, “Senor Sarros cannot accept the ley fuga. He is
very lame. That is not giving him the chance your Excellency desires
he should have.”
“I wasn't thinking of that,” Ricardo replied. “I was thinking I'm
killing him without a fair trial for the reason that he's so infernally
ripe for the gallows that a trial would have been a joke.
Nevertheless, I am really killing him because he killed my father—
and that is scarcely fair. My father was a gentleman. Sergeant, is
your pistol loaded?”
“Yes, General.”
“Give it to Senor Sarros.”
As the sergeant started forward to comply Ricardo drew his own
service revolver and then motioned Mother Jenks and the firing-
squad to stand aside while he crossed to the centre of the cemetery.
“Sarros,” he called, “I am going to let God decide which one of us
shall live. When the sergeant gives the command to fire, I shall open
fire on you, and you are free to do the same to me. Sergeant, if he
kills me and escapes unhurt, my orders are to escort him to the bay
in my carriage and put him safely aboard the steamer.”
Mother Jenks sat down on a tombstone. “Gord's truth!” she
gasped, “but there's a rare plucked 'un.” Aloud she croaked: “Don't
be a bally ass, sir.”
“Silence!” he commanded.
The sergeant handed Sarros the revolver. “You heard what I said?”
Ricardo called.
Sarros bowed gravely.
“You understand your orders, Sergeant?”
“Yes, General.”
“Very well. Proceed. If this prisoner fires before you give the word,
have your squad riddle him.” The sergeant backed away and gazed
owlishly from the prisoner to his captor. “Ready!” he called. Both
revolvers came up. “Fire!” he shouted, and the two shots were
discharged simultaneously. Ricardo's cap flew off his head, but he
remained standing, while Sarros staggered back against the wall and
there recovering himself gamely, fired again. He scored a clean miss,
and Ricardo's gun barked three times; Sarros sprawled on his face,
rose to his knees, raised his pistol halfway, fired into the sky and slid
forward on his face. Ricardo stood beside the body until the sergeant
approached and stood to attention, his attitude saying:
“It is over. What next, General?”
“Take the squad back to the arsenal, Sergeant,” Ricardo ordered
him coolly, and walked back to recover his uniform cap. He was
smiling as he ran his finger through a gaping hole in the upper half
of the crown.
“Well, Mrs. Jenks,” he announced when he rejoined the old lady,
“that was better than executing him with a firing-squad. I gave him
a square deal. Now his friends can never say that I murdered him.”
He extended his hand to help Mother Jenks to her feet. She stood
erect and felt again that queer swelling of the heart, the old feeling
of suffocation.
“Steady, lass!” she mumbled. “'Old on to me, sir. It's my bally
haneurism. Gor'—I'm—chokin'——”
He caught her in his arms as she lurched toward him. Her face
was purple, and in her eyes there was a queer fierce light that went
out suddenly, leaving them dull and glazed. When she commenced
to sag in his arms, he eased her gently to the ground and laid her on
her back in the grass.
“The nipper's safe, 'Enery,” he heard her murmur. “I've raised 'er a
lydy, s'elp me—she's back where—you found 'er— 'Enery——”
She quivered, and the light came creeping back into her eyes
before it faded forever. “Comin', 'Enery—darlin',” she whispered; and
then the soul of Mother Jenks, who had a code and lived up to it
(which is more than the majority of us do), had departed upon the
ultimate journey. Ricardo gazed down on the hard old mouth,
softened now by a little half-smile of mingled yearning and gladness:
“What a wonderful soul you had,” he murmured, and kissed her.
In the end she slept in the niche in the wall of the Catedral de la
Vera Cruz, beside her sainted 'Enery.
CHAPTER XXX

THREE days passed. Don Juan Cafetéro had been buried with
all the pomp and circumstance of a national hero; Mother
Jenks, too, had gone to her appointed resting-place, and El
Buen Amigo had been closed forever. Ricardo had issued a
proclamation announcing himself provisional president of Sobrante;
a convention of revolutionary leaders had been held, and a
provisional cabinet selected. A day for the national elections had
been named; the wreckage of the brief revolution had been cleared
away, and the wheels of government were once more revolving
freely and noiselessly. And while all of this had been going on, John
Stuart Webster had lain on his back, staring at the palace ceiling and
absolutely forbidden to receive visitors. He was still engaged in this
mild form of gymnastics on the third day when the door of his room
opened and Dolores looked in on him.
“Good evening, Caliph,” she called. “Aren't you dead yet?”
It was exactly the tone she should have adopted to get the best
results, for Webster had been mentally and physically ill since she
had seen him last, and needed some such pleasantry as this to lift
him out of his gloomy mood. He grinned at her boyishly.
“No, I'm not dead. On the contrary, I'm feeling real chirpy. Won't
you come in and visit for a while, Miss Ruey?”
“Well, since you've invited me, I shall accept.” Entering, she stood
beside his bed and took the hand he extended toward her. “This is
the first opportunity I've had, Miss Ruey,” he began, “to apologize for
the shock I gave you the other day. I should have come back to you
as I promised, instead of getting into a fight and scaring you half to
death. I hope you'll forgive me, because I'm paying for my fun now
—with interest.”
“Very well, Caliph. I'll forgive you—on one condition.”
“Who am I to resist having a condition imposed upon me? Name
your terms. I shall obey.”
“I'm weary of being called Miss Ruey. I want to be Dolores—to
you.”
“By the toenails of Moses,” he reflected, “there is no escape. She's
determined to rock the boat.” Aloud he said: “All right, Dolores. I
suppose I may as well take the license of the old family friend. I
guess Bill won't mind.”
“Billy hasn't a word to say about it,” she retorted, regarding him
with that calm, impersonal, yet vitally interested look that always
drove him frantic with the desire for her.
“Well, of course, I understand that,” he countered. “Naturally,
since Bill is only a man, you'll have to manage him and he'll have to
take orders.”
“Caliph, you're a singularly persistent man, once you get an idea
into your head. Please understand me, once for all: Billy Geary is a
dear, and it's a mystery to me why every girl in the world isn't
perfectly crazy about him, but every rule has its exceptions—and
Billy and I are just good friends. I'd like to know where you got the
idea we're engaged to be married.”
“Why—why—well, aren't you?”
“Certainly not.”
“Well, you—er—you ought to be. I expected—that is, I planned—I
mean Bill told me and—and—and—er—it never occurred to me you
could possibly have the—er—crust—to refuse him. Of course you're
going to marry him when he asks you?”
“Of course I am not.”
“Ah-h-h-h!” John Stuart Webster gazed at her in frank amazement.
“Not going to marry Bill Geary!” he cried, highly scandalized.
“I know you think I ought to, and I suppose it will appear quite
incomprehensible to you when I do not——”
“Why, Dolores, my dear girl! This is most amazing. Didn't Bill ask
you to marry him before he left?”
“Yes, he did me that honour, and I declined him.”
“You what!”
She smiled at him so maternally that his hand itched to drag her
down to him and kiss her curving lips.
“Do you mind telling me just why you took this extraordinary
attitude?”
“You have no right to ask, but I'll tell you. I refused Billy because I
didn't love him enough—that way. What's more, I never could.”
He rolled his head to one side and softly, very softly, whistled two
bars of “The Spanish Cavalier” through his teeth. He was properly
thunder-struck—so much so, in fact, that for a moment he actually
forgot her presence the while he pondered this most incredible state
of affairs.
“I see it all now. It's as clear as mud,” he announced finally. “You
refused poor old Bill and broke his heart, and so he went away and
hasn't had the courage to write me since. I'm afraid Bill and I both
regarded this fight as practically won—all over but the wedding-
march, as one might put it. I might as well confess I hustled the boy
down from the mine just so you two could get married and light out
on your honeymoon. I figured Bill could kill two birds with one stone
—have his honeymoon and get rid of his malaria, and return here in
three or four months to relieve me, after I had the mine in
operation. Poor boy. That was a frightful song-and-dance you gave
him.”
“I suspected you were the matchmaker in this case. I must say I
think you're old enough to know better, Caliph John.”
“You did, eh? Well, what made you think so?”
She chuckled. “Oh, you're very obvious—to a woman.”
“I forgot that you reveal the past and foretell the future.”
“You are really very clumsy, Caliph. You should never try to direct
the destiny of any woman.”
“I'm on the sick list,” he pleaded, “and it isn't sporting of you to
discuss me. You're healthy—so let us discuss you. Dolores, do you
figure Bill's case to be absolutely hopeless?”
“Absolutely, Caliph.”
“Hum-m-m!”
Again Webster had recourse to meditation, seeing which, Dolores
walked to the pier-glass in the corner, satisfied herself that her
coiffure was just so and returned to his side, singing softly a little
song that had floated out over the transom of Webster's room door
into the hall one night:

A Spanish cavalier,
Went out to rope a steer,
Along with his paper cigar-r-ro!
“Caramba!” said he.
“Manana you will be
Muchù bueno carne por mio”

He turned his head and looked up at her suddenly, searchingly. “Is
there anybody else in Bill's way?” he demanded. “I admit it's none of
my business, but——”
“Yes, Caliph, there is some one else.”
“I thought so.” This rather viciously. “I'm willing to gamble a
hundred to one, sight unseen, that whoever he is, he isn't half the
man Bill is.”
“That,” she replied coldly, “is a matter of personal opinion.”
“And Bill's clock is fixed for keeps?”
“Yes, Caliph. And he never had a chance from the start.”
“Why not?”
“Well, I met the other man first, Caliph.”
“Oh! Do you mind telling me what this other man does for a
living?”
“He's a mining man, like Billy.”
“All right! Has the son of a horsethief got a mine like Bill's? That's
something to consider, Dolores.”
“He has a mine fully as good as Billy's. Like Billy, he owns a half
interest in it, too.”
“Hum-m-m! How long have you known him?”
“Not very long.”
“Be sure you're right—then go ahead,” John Stuart Webster
warned her. “Don't marry in haste and repent at leisure, Dolores.
Know your man before you let him buy the wedding ring. There's a
heap of difference, my dear, between sentiment and sentimentality.”
“I'm sure of my man, Caliph.”
He was silent again, thinking rapidly. “Well, of course,” he began
again presently, “while there was the slightest possibility of Bill
winning you, I would have died before saying that which I am about
to say to you now, Dolores, because Bill is my friend, and I'd never
double-cross him. With reference to this other man, however, I have
no such code to consider. I'm pretty well convinced I'm out of the
running, but I'll give that lad a race if it's the last act of my life. He's
a stranger to me, and he isn't on the job to protect his claim, so why
shouldn't I stake it if I can? But are you quite certain you aren't
making a grave mistake in refusing Billy? He's quite a boy, my dear. I
know him from soul to suspenders, and he'd be awfully good to you.
He's kind and gentle and considerate, and he's not a mollycoddle,
either.”
“I can't help it, Caliph. Please don't talk about him any more. I
know somebody who is kinder and nobler and gentler.” She ceased
abruptly, fearful of breaking down her reserve and saying too much.
“Well, if Bill's case is hopeless”—his hand came groping for hers,
while he held her with his searching, wistful glance—“I wonder what
mine looks like. That is, Dolores, I—I——”
“Yes, John?”
“I've played fair with my friend,” he whispered eagerly. “I'm not
going to ask you to marry me, but I want to tell you that to me
you're such a very wonderful woman I can't help loving you with my
whole heart and soul.”
“I have suspected this, John,” she replied gravely.
“I suppose so. I'm such an obvious old fool. I've had my dream,
and I've put it behind me, but I—I just want you to know I love you;
so long as I live, I shall want to serve you when you're married to
this other man, and things do not break just right for you both—if I
have something he wants, in order to make you happy, I want you
to know it's yours to give to him. I—I—I guess that's all, Dolores.”
“Thank you, John. Would you like to know this man I'm going to
marry?”
“Yes, I think I'd like to congratulate the scoundrel.”
“Then I'll introduce you to him, John. I first met him on a train in
Death Valley, California. He was a shaggy old dear, all whiskers and
rags, but his whiskers couldn't hide his smile, and his rags couldn't
hide his manhood, and when he thrashed a drummer because the
man annoyed me, I just couldn't help falling in love with him. Even
when he fibbed to me and disputed my assertion that we had met
before——”
“Good land of love—and the calves got loose!” he almost shouted
as he held up his one sound arm to her. “My dear, my dear——”
“Oh, sweetheart,” she whispered, laying her hot cheek against his,
“it's taken you so long to say it, but I love you all the more for the
dear thoughts that made you hesitate.”
He was silent a few moments, digesting his amazement,
speechless with the great happiness that was his—and then Dolores
was kissing the back of the hand of that helpless, bandaged arm
lying across his breast. He had a tightening in his throat, for he had
not expected love; and that sweet, benignant, humble little kiss
spelled adoration and eternal surrender; when she looked at him
again the mists of joy were in his eyes.
“Dear old Caliph John!” she crooned. “He's never had a woman to
understand his funny ways and appreciate them and take care of
him, has he?” She patted his cheek. “And bless his simple old heart,
he would rather give up his love than be false to his friend. Yes,
indeed. Johnny Webster respects 'No Shooting' signs when he sees
them, but he tells fibs and pretends to be very stupid when he really
isn't. So you wouldn't be false to Billy—eh, dear? I'm glad to know
that, because the man who cannot be false to his friend can never
be false to his wife.”
He crushed her down to him and held her there for a long time.
“My dear,” he said presently, “isn't there something you have to say
to me?”
“I love you, John,” she whispered, and sealed the sweet
confession with a true lover's kiss.
“All's well with the world,” John Stuart Webster announced when
he could use his lips once more for conversation. “And,” he added,
“owing to the fact that I started a trifle late in life, I believe I could
stand a little more of the same.”
The door opened and Ricardo looked in on them. “Killjoy!”
Webster growled. “Old Killjoy the Thirteenth, King of Sobrante. Is
this a surprise to you?”
“Not a bit of it, Jack. I knew it was due.”
“Am I welcome in the Ruey family?”
Ricardo came over and kissed his sister. “Don't be a lobster, Jack,”
he protested. “I dislike foolish questions.” And he pressed his friend's
hand with a fervour that testified to his pleasure.
“I'm sorry to crowd in at a time like this, Jack,” he continued, with
a hug for Dolores, “but Mr. What-you-may-call-him, the American
consul, has called to pay his respects. As a fellow-citizen of yours, he
is vitally interested in your welfare. Would you care to receive him
for a few minutes?”
“One minute will do,” Webster declared with emphasis. “Show the
human slug up, Rick.”
Mr. Lemuel Tolliver tripped breezily in with outstretched hand. “My
dear Mr. Webster,” he began, but Webster cut him short with a
peremptory gesture.
“Listen, friend Tolliver,” he said. “The only reason I received you
was to tell you I'm going to remain in this country awhile and help
develop it. I may even conclude to grow up with it. I shall not, of
course, renounce my American citizenship; and of course, as an
American citizen, I am naturally interested in the man my country
sends to Sobrante to represent it. I might as well be frank and tell
you that you won't do. I called on you once to do your duty, and you
weren't there; I told you then I might have something to say about
your job later on, and now I'm due to say it. Mr. Tolliver, I'm the
power behind the throne in this little Jim-crow country, and to quote
your own elegant phraseology, you, as American consul, are nux
vomica to the Sobrantean government. Moreover, as soon as the
Sobrantean ambassador reaches Washington, he's going to tell the
President that you are, and then the President will be courteous
enough to remove you. In the meantime, fare thee well, Mr. Consul.”
“But, Mr. Webster——”
“Vaya!”
Mr. Tolliver, appreciating the utter futility of argument, bowed and
departed.
“Verily, life grows sweeter with each passing day,” Webster
murmured whimsically. “Rick, old man, I think you had better escort
the Consul to the front door. Your presence is nux vomica to me
also. See that you back me up and dispose of that fellow Tolliver, or
you can't come to our wedding—can he, sweetheart?”
When Ricardo had taken his departure, laughing, John Stuart
Webster looked up quite seriously at his wife-to-be. “Can you explain
to me, Dolores,” he asked, “how it happened that your relatives and
your father's old friends here in Sobrante, whom you met shortly
after your arrival, never informed you that Ricardo was living?”
“They didn't know any more about him than I did, and he left here
as a mere boy. He was scarcely acquainted with his relatives, all of
whom bowed quite submissively to the Sarros yoke. Indeed, my
father's half-brother, Antonio Ruey, actually accepted a portfolio
under the Sarros régime and held it up to his death. Ricardo has a
wholesome contempt for his relatives, and as for his father's old
friends, none of them knew anything about his plans. Apparently his
identity was known only to the Sarros intelligence bureau, and it did
not permit the information to leak out.”
“Funny mix-up,” he commented. “And by the way, where did you
get all the inside dope about Neddy Jerome?”
She laughed and related to him the details of Neddy's perfidy.
“And you actually agreed to deliver me, hog-tied and helpless, to
that old schemer, Dolores?”
“Why not, dear? I loved you; I always meant to marry you, if you'd
let me; and ten thousand dollars would have lasted me for pin
money a long time.”
“Well, you and Neddy have both lost out. Better send the old
pelican a cable and wake him out of his day-dream.”
“I sent the cable yesterday, John dear.”
“Extraordinary woman!”
“I've just received an answer. Neddy has spent nearly fifty dollars
telling me by cable what a fine man you are and how thankful I
ought to be to the good Lord for permitting you to marry me.”
“Dolores, you are perfectly amazing. I only proposed to you a
minute ago.”
“I know you did, slow-poke, but that is not your fault. You would
have proposed to me yesterday, only I thought best not to disturb
you until you were a little stronger. This evening, however, I made
up my mind to settle the matter, and so I——”
“But suppose I hadn't proposed to you, after all?”
“Then, John, I should have proposed to you, I fear.”
“But you were running an awful risk, sending that telegram to
Neddy Jerome.”
She took one large red ear in each little hand and shook his head
lovingly. “Silly,” she whispered, “don't be a goose. I knew you loved
me; I would have known it, even if Neddy Jerome hadn't told me so.
So I played a safe game all the way through, and oh, dear Caliph
John, I'm so happy I could cry.”
“God bless my mildewed soul,” John Stuart Webster murmured
helplessly. The entire matter was quite beyond his comprehension!

THE END
*** END OF THE PROJECT GUTENBERG EBOOK WEBSTER—MAN'S
MAN ***

Updated editions will replace the previous one—the old editions will
be renamed.

Creating the works from print editions not protected by U.S.
copyright law means that no one owns a United States copyright in
these works, so the Foundation (and you!) can copy and distribute it
in the United States without permission and without paying
copyright royalties. Special rules, set forth in the General Terms of
Use part of this license, apply to copying and distributing Project
Gutenberg™ electronic works to protect the PROJECT GUTENBERG™
concept and trademark. Project Gutenberg is a registered trademark,
and may not be used if you charge for an eBook, except by following
the terms of the trademark license, including paying royalties for use
of the Project Gutenberg trademark. If you do not charge anything
for copies of this eBook, complying with the trademark license is
very easy. You may use this eBook for nearly any purpose such as
creation of derivative works, reports, performances and research.
Project Gutenberg eBooks may be modified and printed and given
away—you may do practically ANYTHING in the United States with
eBooks not protected by U.S. copyright law. Redistribution is subject
to the trademark license, especially commercial redistribution.

START: FULL LICENSE


THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK

To protect the Project Gutenberg™ mission of promoting the free
distribution of electronic works, by using or distributing this work (or
any other work associated in any way with the phrase “Project
Gutenberg”), you agree to comply with all the terms of the Full
Project Gutenberg™ License available with this file or online at
www.gutenberg.org/license.

Section 1. General Terms of Use and Redistributing Project Gutenberg™ electronic works

1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand, agree
to and accept all the terms of this license and intellectual property
(trademark/copyright) agreement. If you do not agree to abide by all
the terms of this agreement, you must cease using and return or
destroy all copies of Project Gutenberg™ electronic works in your
possession. If you paid a fee for obtaining a copy of or access to a
Project Gutenberg™ electronic work and you do not agree to be
bound by the terms of this agreement, you may obtain a refund
from the person or entity to whom you paid the fee as set forth in
paragraph 1.E.8.

1.B. “Project Gutenberg” is a registered trademark. It may only be
used on or associated in any way with an electronic work by people
who agree to be bound by the terms of this agreement. There are a
few things that you can do with most Project Gutenberg™ electronic
works even without complying with the full terms of this agreement.
See paragraph 1.C below. There are a lot of things you can do with
Project Gutenberg™ electronic works if you follow the terms of this
agreement and help preserve free future access to Project
Gutenberg™ electronic works. See paragraph 1.E below.
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright law
in the United States and you are located in the United States, we do
not claim a right to prevent you from copying, distributing,
performing, displaying or creating derivative works based on the
work as long as all references to Project Gutenberg are removed. Of
course, we hope that you will support the Project Gutenberg™
mission of promoting free access to electronic works by freely
sharing Project Gutenberg™ works in compliance with the terms of
this agreement for keeping the Project Gutenberg™ name associated
with the work. You can easily comply with the terms of this
agreement by keeping this work in the same format with its attached
full Project Gutenberg™ License when you share it without charge
with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside the
United States, check the laws of your country in addition to the
terms of this agreement before downloading, copying, displaying,
performing, distributing or creating derivative works based on this
work or any other Project Gutenberg™ work. The Foundation makes
no representations concerning the copyright status of any work in
any country other than the United States.

1.E. Unless you have removed all references to Project Gutenberg:

1.E.1. The following sentence, with active links to, or other
immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project Gutenberg™
work (any work on which the phrase “Project Gutenberg” appears,
or with which the phrase “Project Gutenberg” is associated) is
accessed, displayed, performed, viewed, copied or distributed:
This eBook is for the use of anyone anywhere in the
United States and most other parts of the world at no
cost and with almost no restrictions whatsoever. You
may copy it, give it away or re-use it under the terms
of the Project Gutenberg License included with this
eBook or online at www.gutenberg.org. If you are not
located in the United States, you will have to check the
laws of the country where you are located before using
this eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is derived
from texts not protected by U.S. copyright law (does not contain a
notice indicating that it is posted with permission of the copyright
holder), the work can be copied and distributed to anyone in the
United States without paying any fees or charges. If you are
redistributing or providing access to a work with the phrase “Project
Gutenberg” associated with or appearing on the work, you must
comply either with the requirements of paragraphs 1.E.1 through
1.E.7 or obtain permission for the use of the work and the Project
Gutenberg™ trademark as set forth in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is posted
with the permission of the copyright holder, your use and distribution
must comply with both paragraphs 1.E.1 through 1.E.7 and any
additional terms imposed by the copyright holder. Additional terms
will be linked to the Project Gutenberg™ License for all works posted
with the permission of the copyright holder found at the beginning
of this work.

1.E.4. Do not unlink or detach or remove the full Project
Gutenberg™ License terms from this work, or any files containing a
part of this work or any other work associated with Project
Gutenberg™.

1.E.5. Do not copy, display, perform, distribute or redistribute this
electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the Project
Gutenberg™ License.

1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if you
provide access to or distribute copies of a Project Gutenberg™ work
in a format other than “Plain Vanilla ASCII” or other format used in
the official version posted on the official Project Gutenberg™ website
(www.gutenberg.org), you must, at no additional cost, fee or
expense to the user, provide a copy, a means of exporting a copy, or
a means of obtaining a copy upon request, of the work in its original
“Plain Vanilla ASCII” or other form. Any alternate format must
include the full Project Gutenberg™ License as specified in
paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,
performing, copying or distributing any Project Gutenberg™ works
unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or providing
access to or distributing Project Gutenberg™ electronic works
provided that:

• You pay a royalty fee of 20% of the gross profits you derive
from the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who
notifies you in writing (or by e-mail) within 30 days of receipt
that s/he does not agree to the terms of the full Project
Gutenberg™ License. You must require such a user to return or
destroy all copies of the works possessed in a physical medium
and discontinue all use of and all access to other copies of
Project Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of
any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project Gutenberg™
electronic work or group of works on different terms than are set
forth in this agreement, you must obtain permission in writing from
the Project Gutenberg Literary Archive Foundation, the manager of
the Project Gutenberg™ trademark. Contact the Foundation as set
forth in Section 3 below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend
considerable effort to identify, do copyright research on, transcribe
and proofread works not protected by U.S. copyright law in creating
the Project Gutenberg™ collection. Despite these efforts, Project
Gutenberg™ electronic works, and the medium on which they may
be stored, may contain “Defects,” such as, but not limited to,
incomplete, inaccurate or corrupt data, transcription errors, a
copyright or other intellectual property infringement, a defective or
damaged disk or other medium, a computer virus, or computer
codes that damage or cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except for
the “Right of Replacement or Refund” described in paragraph 1.F.3,
the Project Gutenberg Literary Archive Foundation, the owner of the
Project Gutenberg™ trademark, and any other party distributing a
Project Gutenberg™ electronic work under this agreement, disclaim
all liability to you for damages, costs and expenses, including legal
fees. YOU AGREE THAT YOU HAVE NO REMEDIES FOR
NEGLIGENCE, STRICT LIABILITY, BREACH OF WARRANTY OR
BREACH OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE TRADEMARK
OWNER, AND ANY DISTRIBUTOR UNDER THIS AGREEMENT WILL
NOT BE LIABLE TO YOU FOR ACTUAL, DIRECT, INDIRECT,
CONSEQUENTIAL, PUNITIVE OR INCIDENTAL DAMAGES EVEN IF
YOU GIVE NOTICE OF THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If you
discover a defect in this electronic work within 90 days of receiving
it, you can receive a refund of the money (if any) you paid for it by
sending a written explanation to the person you received the work
from. If you received the work on a physical medium, you must
return the medium with your written explanation. The person or
entity that provided you with the defective work may elect to provide
a replacement copy in lieu of a refund. If you received the work
electronically, the person or entity providing it to you may choose to
give you a second opportunity to receive the work electronically in
lieu of a refund. If the second copy is also defective, you may
demand a refund in writing without further opportunities to fix the
problem.

1.F.4. Except for the limited right of replacement or refund set forth
in paragraph 1.F.3, this work is provided to you ‘AS-IS’, WITH NO
OTHER WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of damages.
If any disclaimer or limitation set forth in this agreement violates the
law of the state applicable to this agreement, the agreement shall be
interpreted to make the maximum disclaimer or limitation permitted
by the applicable state law. The invalidity or unenforceability of any
provision of this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the Foundation,
the trademark owner, any agent or employee of the Foundation,
anyone providing copies of Project Gutenberg™ electronic works in
accordance with this agreement, and any volunteers associated with
the production, promotion and distribution of Project Gutenberg™
electronic works, harmless from all liability, costs and expenses,
including legal fees, that arise directly or indirectly from any of the
following which you do or cause to occur: (a) distribution of this or
any Project Gutenberg™ work, (b) alteration, modification, or
additions or deletions to any Project Gutenberg™ work, and (c) any
Defect you cause.

Section 2. Information about the Mission of Project Gutenberg™

Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new computers.
It exists because of the efforts of hundreds of volunteers and
donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project Gutenberg™’s
goals and ensuring that the Project Gutenberg™ collection will