Geographic Data Analysis Using R
Xindong He
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface
big data and hyperspectral remote sensing, data dimensionality reduction supported
by the PCA approach becomes one of the key steps. In this chapter, PCA and cluster
analysis are identified as key exploratory methods in the regionalization of temperatures in
China, emphasizing the importance of understanding and interpreting their results
from a geographic research perspective. Although PCA is effective in reducing
dimensionality and improving the effectiveness of clustering, it is not
universally appropriate. Direct clustering may be preferable when the
original features are themselves clearly interpretable or important.
Chapter 9: In the study of land uses and their changes, the Markov chain is a
commonly used predictive analysis method. In this chapter, the land use changes
for 2005 and 2015 in a specific area in China are analyzed. With 2015 as the base
year, the Markov chain method is employed to predict land use changes over the
subsequent ten years.
Chapter 10: The steps for conducting geographic network analysis, such as
analyzing a road network in R, are outlined, including the associated code. As an
example, the shortest path from Chengdu University of Technology (CDUT) to the
nearest fire station in downtown Chengdu is calculated.
Chapter 11: This chapter demonstrates how to apply IDW, Ordinary Kriging, proximity
polygons, and nearest neighbor interpolation methods in R, using mean annual
temperatures from 837 meteorological stations in China for 2020. Although Ordinary
Kriging offers the most precise and smoothest interpolation results, it is complex
to calculate and requires an appropriate semivariogram model. The proximity polygons
method boasts a long history. The IDW method is straightforward but sensitive
to local extremes. The nearest neighbor method is relatively simple to use.
Notable features of this book are highlighted as follows:
• The book serves as an invaluable resource for geography students engaged in
studying quantitative analysis.
• Geographers seeking to analyze geographic data quantitatively will find this book
particularly useful.
• The organization of the content reflects the author’s experience and the complexity
of the methods frequently employed in recent years for conducting quantitative
geographic data analysis.
• Detailed explanations are provided for the analysis results of the main methods
and functions in R.
• A comprehensive list of the principal functions and packages utilized in this book
is included.
Xindong He, Ph.D.
Associate Professor, Tenured
College of Geography and Planning
Chengdu University of Technology
Chengdu, Sichuan, China
List of Figures
Fig. 4.6 A diagnostic test plot for the multiple linear regression model
Fig. 4.7 Q-Q plot for mfit
Fig. 4.8 The Component+Residual Plots for the model
Fig. 4.9 Q-Q plot for transformed mfit
Fig. 4.10 Q-Q plots for three models
Fig. 4.11 Residuals analysis of three models
Fig. 4.12 All-possible-regressions plot
Fig. 4.13 Cp plot for all-possible-regressions
Fig. 5.1 The precipitation map of 837 ground stations across China, 2020
Fig. 5.2 The plot of standardised residuals against the fitted values
Fig. 5.3 The map of the spatial pattern of the regression residuals
Fig. 5.4 The map of “Local Moran’s I” for residuals using k-nearest neighbors
Fig. 5.5 Moran scatterplot for the Pre
Fig. 5.6 The GWR coefficients plot of Tem
Fig. 5.7 The GWR coefficients plot of Prs
Fig. 5.8 The localR2 map of gwr.model
Fig. 5.9 The t-values map of Tem
Fig. 5.10 The t-values map of Prs
Fig. 6.1 Time series plot of minimum and maximum daily temperatures in a China station
Fig. 6.2 Daily temperatures in a China station from 2010–2019
Fig. 6.3 Daily temperatures in a China station from 2010–2019
Fig. 6.4 30-day window for max temperatures
Fig. 6.5 365-day window for max temperatures
Fig. 6.6 365 lagged differences
Fig. 6.7 The plot of differencing TMX
Fig. 6.8 The plot of differencing TMN
Fig. 6.9 ACF and PACF plots for the first-differenced TMX and TMN series
Fig. 6.10 Time series decomposition for daily maximum temperatures
Fig. 6.11 Time series decomposition for daily minimum temperatures
Fig. 6.12 Holt-Winters forecast of monthly maximum temperatures at a Chinese station, 2020–2025
Fig. 6.13 Holt-Winters forecast of monthly max temperatures at a Chinese station, 2020–2080
Fig. 6.14 The Arima model forecast of monthly maximum temperatures in a China station, 2020–2025
Fig. 6.15 The residuals plot of ARIMA model
Fig. 6.16 The plot of ACF and PACF of Residuals
Fig. 7.1 The scatter plot of the K-Means clustering analysis results
Geographic data is defined as data that describes geographic entities, events, pro-
cesses, and phenomena related to specific Earth locations (ISO/TC 211 2015). The
sources of geographic data are varied (O’Brien 2005), including personal question-
naires, field surveys, experimental observations, government-published data, satel-
lite remote sensing data, and big data from the Internet and location services. Data
linked to a specific geographic location and space qualifies as geographic data. Geo-
graphic data symbolically represents the relationship between various geographical
features and phenomena, encompassing spatial location, attribute features, and tem-
poral features (Chen et al. 2000). The emergence and widespread use of Geographic
Information System (GIS) in geography have enabled the perfect integration of spa-
tial and attribute data at the geographic data level (Xu 2014).
Spatial data describes the connections between geographic entities, events, pro-
cesses, and phenomena within specific locations, regions, and spaces. In GIS, data
primarily comes in two forms: raster and vector. Raster geographic data is composed
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
X. He, Geographic Data Analysis Using R,
https://doi.org/10.1007/978-981-97-4022-2_1
1 Introduction to Geographic Data and R
Quantitative data, which can be observed and measured in terms of values or counts,
is numerically expressed to indicate “how many”, “how much”, or “how often” (Burt
et al. 2009). Examples include population, rainfall, temperature, and distance.
Qualitative data can be observed but not measured numerically. It represents
types of geographic entities or phenomena and can be categorized using names,
symbols, or codes. These categories, such as land use/land cover, gender, and job
occupation, are distinct and non-overlapping (Fotheringham et al. 2000). Despite the
lack of numerical values in qualitative data, quantitative methods can still be applied
for analysis (Young 1981; Bernard 1996; Mumford et al. 2008; Hetenyi et al. 2019).
In the era of big data, the reliance on professional processing tools for geographic
data has increased significantly. Owing to its simplicity and user-friendliness, the R
language has gained prominence in processing geographic data, making it a vital
tool for geography students and researchers.
1.2 R
1.2.1 Installing R
1.2.2 Updating R
The R language offers a comprehensive suite of tools for processing and analyzing
geographic data. As big data and artificial intelligence are applied in
geography and GIS, and as geographic data science develops, spreadsheet tools such as
Excel struggle with large or complex data. R is well suited to such tasks,
and demand for R language applications is growing. Students and technicians
accustomed to traditional desktop GIS tools such as ArcGIS and QGIS are seeking more
powerful programming languages and command-line methods for processing, analyzing,
and visualizing geographic data, and R meets these needs. For proficient users, R’s
command-line approach can be more efficient than working through graphical interfaces.
1.4 Data
The 2020 observation data from 837 meteorological surface stations in China orig-
inates from the “Daily Meteorological Dataset of Basic Meteorological Elements
of China National Surface Weather Stations” product. This dataset, encompassing
multi-year observations from over 2,000 stations nationwide, is published by the
China Meteorological Data Service Centre (National Meteorological Information
Center 2020). Following the author’s site selection and data aggregation process,
the dataset utilized in this book was compiled. The temperature observation data
for time series analysis, covering 2010 to 2019 from a meteorological station in
Henan Province, China, also derives from this dataset. The 2021 population data
for each Chinese province is sourced from the National Bureau of Statistics website
(National Bureau of Statistics 2021). Digital elevation model (DEM) data are available
from the CGIAR-CSI SRTM 90m Database (Jarvis et al. 2008). Land use and vector data of provinces in
China were obtained from the Resource and Environment Science and Data Center
in the Institute of Geographic Sciences and Natural Resources Research, Chinese
Academy of Sciences (XU et al. 2018; Xu 2022). China map vector data is pro-
vided by Northeast Asia Resource and Environment Big Data Center in Northeast
Institute of Geography and Agroecology, Chinese Academy of Sciences (Northeast
Asia Resource and Environment Big Data Center 2020). In Chap. 10, “Geographic
Network Analysis,” the road network data are sourced from OpenStreetMap (Open-
StreetMap contributors 2023), and the Chengdu fire stations are derived from Gaode
Map POI data (Amap 2023).
References
Amap. 2023. Chengdu Fire Station POI Data. Available from Amap - Data extracted from Amap
services. https://www.amap.com.
Bernard, H Russell. 1996. Qualitative Data, Quantitative Analysis. CAM Journal 8 (1): 9–11.
Burt, James E, Gerald M Barber, and David L Rigby. 2009. Elementary Statistics for Geographers.
Guilford Press.
Chen, Shupeng, Xuejun Lu, and Chenghu Zhou. 2000. Introduction to Geographic Information
Systems. Beijing, China: Science Press.
Dorman, Michael. 2014. Learning R for Geospatial Analysis. Packt Publishing Ltd.
Fotheringham, A Stewart, Chris Brunsdon, and Martin Charlton. 2000. Quantitative Geography:
Perspectives on Spatial Data Analysis. United States: Sage.
Hetenyi, Gabor, Attila Lengyel, and Magdolna Szilasi. 2019. Quantitative Analysis of Qualitative
Data: Using Voyant Tools to Investigate the Sales-Marketing Interface. Journal of Industrial
Engineering and Management (JIEM) 12 (3): 393–404.
ISO/TC 211. 2015. ISO 19109:2015 Geographic Information — Rules for Application Schema.
ISO/TC 211 Secretariat. https://www.iso.org/obp/ui/#iso:std:iso:19109:ed-2:v1:en.
Jarvis, A., H. I. Reuter, A. D. Nelson, and E. Guevara. 2008. Hole-Filled SRTM for the Globe
Version 4. CGIAR Consortium for Spatial Information. http://srtm.csi.cgiar.org.
Kabacoff, Robert I. 2015. R in Action: Data Analysis and Graphics with R. 2nd ed. Manning
Publications.
Mumford, Michael D., Katrina E. Bedell-Avers, Samuel T. Hunter, Jazmine Espejo, Dawn
Eubanks, and Mary Shane Connelly. 2008. Violence in Ideological and Non-Ideological Groups:
A Quantitative Analysis of Qualitative Data. Journal of Applied Social Psychology 38 (6): 1521–
61.
National Bureau of Statistics. 2021. Annual Population Data by Province. National Bureau of
Statistics. https://data.stats.gov.cn/easyquery.htm?cn=E0103.
National Meteorological Information Center. 2020. Daily Meteorological Dataset of Basic Mete-
orological Elements of China National Surface Weather Station. National Meteorological Infor-
mation Center. https://data.cma.cn/data/cdcdetail/dataCode/A.0012.0001.html.
Northeast Asia Resource and Environment Big Data Center. 2020. China Map Vector Data. http://
wetland.igadc.cn.
O’Brien, Larry. 2005. Introducing Quantitative Geography: Measurement Methods and Gener-
alised Linear Models. London: Routledge.
This chapter covers using R for descriptive statistical analysis of the annual average
temperature data from 837 surface meteorological stations across China in 2020,
including measures such as the mean, median, mode, and standard deviation, along with
shape measures such as skewness and kurtosis. It also compares the histogram and the bar
graph, using the 2021 population data from various provinces and cities in China.
Upon completing the collection of geographic data tailored to the research and
project requirements, the initial step in data analysis involves observing and analyz-
ing key characteristics of the data, including mean, median, mode, and standard
deviation. Representing and graphically displaying these data characteristics con-
stitutes descriptive statistical analysis. Descriptive statistics encompass a range of
numerical and graphical techniques for organizing, presenting, and analyzing data.
The form used to describe a variable in a sample is contingent upon the measurement
level applied (Fisher et al. 2009).
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
X. He, Geographic Data Analysis Using R,
https://doi.org/10.1007/978-981-97-4022-2_2
2 Descriptive Analysis of Geographic Data
Its widespread software support makes it a common choice for data storage and
exchange. The essence of a CSV file is its simplicity: data is saved as text, line
by line, with each record divided into fields by a separator, and each record follows
the same field sequence. R can read data in CSV format with the read_csv() function
from the readr package, which returns the data as a tibble, a modern form of data
frame. (The base R function read.csv() reads the same files but returns a plain data
frame.) The simplest way to obtain the readr package is by installing the tidyverse package.
The recent introduction of the Tidyverse package has significantly simplified
the learning process of the R language. Tidyverse comprises a suite of R packages,
such as readr, dplyr, ggplot2, tidyr, and stringr, covering a range from data
import and preprocessing to advanced transformation, visualization, modeling, and
display.
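The difference between the two CSV readers can be seen in a minimal sketch. The file, its three rows, and the object names df_base and df_tbl are made up for illustration and are not the book's dataset; the readr part runs only if that package is installed.

```r
# Write a tiny CSV to a temporary file so the example is self-contained.
# Column names mimic the book's dataset; the values are invented.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("site,Tem,Alt",
             "A001,16,-3",
             "A002,24,-1",
             "A003,11,0"), csv_path)

# Base R: read.csv() returns a plain data.frame.
df_base <- read.csv(csv_path)
class(df_base)

# readr: read_csv() returns a tibble (skipped if readr is not installed).
if (requireNamespace("readr", quietly = TRUE)) {
  df_tbl <- suppressMessages(readr::read_csv(csv_path))
  class(df_tbl)
}
```

Both objects hold the same values; the tibble mainly changes how the data is printed and subset.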
$ Tem <dbl> 16, 16, 24, 11, 13, 18, 19, 23, 23, 26, 13, 18, 13, 15, 17, 12, 1~
$ Alt <dbl> -3, -1, -1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3,~
$ Lon <dbl> 120.2833, 119.8500, 112.7667, 122.1667, 122.7000, 122.4500, 121.9~
$ Lat <dbl> 32.85000, 33.80000, 21.73333, 40.65000, 37.40000, 30.73333, 29.20~
In the dataset, site represents the meteorological observation site number, Alt
indicates the altitude, Tem denotes the annual mean temperature, Pre refers to the
annual mean precipitation, and Prs signifies the annual mean air pressure.
Next, we demonstrate how to use R to calculate the individual summary statistics used
in descriptive statistical analysis.
Mean, median, and mode represent the three primary measures of central tendency
in statistics. These measures help identify the central position within a data set, a
concept known as central tendency.
2.3.1 Mean
The mean is calculated as the sum of all values divided by their count. In R, this can
be computed using the mean() function:
# calculate the mean of the temperature:Temmean
Temmean <- mean(data$Tem)
# display Temmean
Temmean
[1] 12.58901
2.3.2 Median
The median represents the middle value in a data set, when arranged in ascending
order. It can be calculated in R using the median() function in the stats package:
# calculate the median of the temperature:Temmedian
Temmedian <- median(data$Tem)
# display Temmedian
Temmedian
[1] 14
2.3.3 Mode
The mode refers to the value that occurs most frequently in a data set. In R, there
is no standard built-in function to compute the mode. Therefore, it’s necessary to
create a custom function for this purpose. The code is as follows:
# create a customized function
mymode <- function(x) {
return(as.numeric(names(table(x))[table(x) == max(table(x))]))
}
mymode(data$Tem)
[1] 17
2.4.1 Range
The range is defined as the difference between the largest and smallest data values
in a given set. The range can be computed in R using the following code:
# calculate the range of the temperature
max(data$Tem) - min(data$Tem)
[1] 31
Alternatively, the maximum and minimum values can be determined using the
range() function, and then the range can be calculated:
r <- range(data$Tem)
r[2] - r[1]
[1] 31
2.4.2 Variance
Variance quantifies the dispersion of data points around the mean. Low variance
suggests that the data points are generally similar and closely clustered around the
mean. Conversely, high variance indicates greater variability, with data points more
widely spread out from the mean.
The formula to calculate variance for a population is as follows:
$$\text{Variance} = \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n} \tag{2.1}$$
where $\sigma^2$ represents the variance of a population, $x_i$ denotes the $i$th data point, $\mu$ is
the population mean, and $n$ is the population size.
Alternatively, the formula to calculate variance for a sample data set is:
$$\text{Variance} = S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \tag{2.2}$$
where $S^2$ denotes the variance of a sample, $x_i$ represents the $i$th observation, $\bar{x}$ is the
sample mean, and $n$ indicates the size of the sample.
Alternatively, the standard deviation can be obtained by taking the square root of
the sample variance:
$$\text{Sample standard deviation} = \sqrt{S^2} \tag{2.4}$$
In R, the functions var() and sd() calculate the sample variance and the sample
standard deviation, that is, they use the n − 1 denominator of Eq. (2.2).
# calculate the variance of the temperature
var(data$Tem)
[1] 43.03184
# calculate the standard deviation of the temperature
sd(data$Tem)
[1] 6.559866
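Because var() and sd() use the n − 1 denominator, the population versions in Eq. (2.1) have to be computed by hand. The sketch below does this on five made-up values; the vector x and the names s2 and sigma2 are illustrative, not taken from the book.

```r
# Made-up temperatures (degrees C), for illustration only.
x <- c(12, 15, 9, 18, 21)
n <- length(x)

# Sample variance, Eq. (2.2): divide by n - 1. This is what var() returns.
s2 <- sum((x - mean(x))^2) / (n - 1)

# Population variance, Eq. (2.1): divide by n.
sigma2 <- sum((x - mean(x))^2) / n

all.equal(s2, var(x))                     # hand calculation agrees with var()
all.equal(sigma2, var(x) * (n - 1) / n)   # rescaling var() gives the population value

sqrt(s2)       # sample standard deviation, same as sd(x)
sqrt(sigma2)   # population standard deviation
```

With 837 stations the two denominators differ by only about 0.1%, so the distinction matters mainly for small samples.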
The coefficient of variation (CV) is defined as the ratio of the standard deviation to
the mean. It quantifies the degree of variability in relation to the population’s mean.
A higher CV indicates a greater level of dispersion. The coefficient of variation is
particularly useful since it is dimensionless, meaning it is independent of the units
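As a minimal sketch of the coefficient of variation defined above, the ratio can be computed directly from sd() and mean(). The values and the names cv and cv_km are made up for illustration; the second calculation shows that rescaling the units leaves the ratio unchanged.

```r
# Made-up distances in metres, for illustration only.
x <- c(1200, 1500, 900, 1800, 2100)

# Coefficient of variation: standard deviation divided by the mean.
cv <- sd(x) / mean(x)
cv_percent <- 100 * cv   # often reported as a percentage

# The same distances in kilometres give the same CV,
# because the unit cancels in the ratio.
cv_km <- sd(x / 1000) / mean(x / 1000)
all.equal(cv, cv_km)
```

The ratio is only meaningful when the mean is well away from zero, so it is less suited to variables such as temperature in °C that can take negative values.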
2.5.1 Skewness
where S represents the standard deviation, and $\bar{x}$ denotes the mean of the sample.
2.5.2 Kurtosis
where S denotes the standard deviation, and $\bar{x}$ represents the mean of the sample.
In R, the skewness() and kurtosis() functions from the moments package
can be utilized to calculate skewness and kurtosis, respectively. The corresponding
commands are as follows:
# calculate skewness
skewness(data$Tem)
[1] -0.2369317
# calculate kurtosis
kurtosis(data$Tem)
[1] -0.8060247
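Alternatively, the skewness() and kurtosis() functions from the e1071 package
can be used to calculate these values. Because both packages export functions with
the same names, the package attached last masks the other; writing moments::kurtosis()
or e1071::kurtosis() makes the choice explicit. The two packages also follow different
conventions: e1071::kurtosis() reports excess kurtosis (0 for a normal distribution),
whereas moments::kurtosis() reports Pearson's kurtosis (3 for a normal distribution),
so a negative value such as the one shown here is an excess kurtosis. The respective
commands are as follows: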
# calculate skewness
skewness(data$Tem)
[1] -0.2369317
# calculate kurtosis
kurtosis(data$Tem)
[1] -0.8060247
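To make the two kurtosis conventions concrete, the sketch below computes moment-based skewness and kurtosis by hand in base R. It uses one common moment-based definition, which is an assumption here rather than a reproduction of the book's equations, and the data and variable names are made up.

```r
# Made-up, right-skewed values for illustration only.
x <- c(1, 2, 2, 3, 7)

# Central moments (divide by n).
d  <- x - mean(x)
m2 <- mean(d^2)
m3 <- mean(d^3)
m4 <- mean(d^4)

# Moment-based skewness: positive for a longer right tail.
skew <- m3 / m2^1.5

# Pearson's kurtosis (3 for a normal distribution) and
# excess kurtosis (0 for a normal distribution).
kurt_pearson <- m4 / m2^2
kurt_excess  <- kurt_pearson - 3

c(skewness = skew, pearson = kurt_pearson, excess = kurt_excess)
```

When comparing output from different packages, it is worth checking which of the two kurtosis conventions is being reported, since they differ by exactly 3 under this definition.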
describe(Tem)
vars n mean sd median trimmed mad min max range skew kurtosis se
X1 1 837 12.59 6.56 14 12.75 7.41 -5 26 31 -0.24 -0.81 0.23
describe(Pre)
vars n mean sd median trimmed mad min max range skew kurtosis
X1 1 837 953.5 595.52 843 923.11 624.17 0 2922 2922 0.45 -0.56
se
X1 20.58
describe(Prs)
vars n mean sd median trimmed mad min max range skew kurtosis
X1 1 837 923.34 110.26 969 945.47 63.75 511 1017 506 -1.55 1.7
se
X1 3.81
describe(Alt)
vars n mean sd median trimmed mad min max range skew kurtosis
X1 1 837 860.81 1104 371 621.82 517.43 -3 5052 5055 1.77 2.56
se
X1 38.16
The results produced by the describe() function align with those obtained from
the previous calculation steps. Initially, when learning to analyze geographic data
with R, it is beneficial to calculate each descriptive statistic individually. However,
with proficiency, the describe() function can efficiently generate all required val-
ues in a single step.
Alternatively, the summary() function in R can be used. For a numeric vector, or for
each numeric column of a data frame, it reports the minimum, lower quartile, median,
mean, upper quartile, and maximum; for a regression model or an ANOVA model it returns
a summary suited to that object.
The basic syntax for this function is as follows:
summary(object, ...)
Here, object refers to the entity for which a summary is desired.
summary(Tem)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-5.00 8.00 14.00 12.59 18.00 26.00
summary(Pre)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0 513.0 843.0 953.5 1400.0 2922.0
summary(Prs)
Min. 1st Qu. Median Mean 3rd Qu. Max.
511.0 883.0 969.0 923.3 1005.0 1017.0
summary(Alt)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-3.0 83.0 371.0 860.8 1185.0 5052.0
2.7 Visualization
Here, x and y represent the datasets for the x-axis and y-axis, respectively, "p"
indicates plotting with points, main sets the overall title, xlab and ylab label the
axes, while xlim and ylim define the axes’ value ranges.
plot(Alt, Prs, type = "p", xlab = "Altitude (meters)", ylab = "Air pressure (hpa)",
     xlim = c(-5, 5200), ylim = c(500, 1100), cex = 0.5, cex.axis = 0.5, cex.lab = 0.5,
     pch = 1)
The distribution of points in Fig. 2.1 reveals a strong negative correlation between
air pressure and altitude, indicating that air pressure decreases as altitude increases.
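The visual impression of a strong negative relationship can be checked numerically with cor(). The sketch below uses synthetic data, not the station data: the altitudes are random, and the pressure values come from a rough exponential decay with an assumed scale height of 8400 m plus noise.

```r
set.seed(1)

# Hypothetical altitudes (m) and a rough pressure model (hPa) with noise.
alt <- runif(200, min = 0, max = 5000)
prs <- 1013 * exp(-alt / 8400) + rnorm(200, sd = 5)

# Pearson correlation (the default) measures linear association.
r <- cor(alt, prs)

# Spearman correlation measures monotonic association and is
# less affected by the slight curvature of the relationship.
r_s <- cor(alt, prs, method = "spearman")

c(pearson = r, spearman = r_s)
```

Both coefficients come out close to −1 for data of this kind, matching the downward-sloping point cloud in a scatter plot.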
2.7.2 Histogram
In Fig. 2.2, the x-axis denotes temperature ranges, while the y-axis represents the
frequency of observations within each range. The numbers atop each bar specify
the count of observations in respective temperature bins. The highest frequency
occurred in the 15–20 °C range with 252 observations, followed by 183 observations
in the 5–10 °C range, and 162 in the 10–15 °C range.
To add a probability density line to a histogram, first switch the
y-axis to the density scale by setting the probability argument to TRUE
in the hist() function call. Then estimate the density with the density() function
and draw it on the histogram with the lines() function. To overlay a normal
curve, evaluate the dnorm() function on a grid of values using the data’s mean and
standard deviation, and add the result with the lines() function as well. The
combined plot shows the observed and theoretical distributions
in a single visualization (Fig. 2.3).
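The steps just described can be sketched as follows. The vector tem is synthetic (normally distributed with a mean and standard deviation close to the reported temperature statistics), so the picture will not match Fig. 2.3; the colours and line types are chosen to mirror the blue density line and red dashed normal curve described in the text.

```r
set.seed(42)

# Synthetic stand-in for the temperature variable.
tem <- rnorm(500, mean = 12.6, sd = 6.6)

# 1. Histogram on the density scale.
h <- hist(tem, probability = TRUE, main = "",
          xlab = "Temperature (degrees C)", col = "lightgray")

# 2. Kernel density estimate of the observed data.
lines(density(tem), col = "blue", lwd = 2)

# 3. Normal curve with the same mean and standard deviation,
#    evaluated on a grid spanning the data range.
grid_x <- seq(min(tem), max(tem), length.out = 200)
lines(grid_x, dnorm(grid_x, mean = mean(tem), sd = sd(tem)),
      col = "red", lty = 2, lwd = 2)
```

On the density scale the bar areas sum to 1, which is what allows the two curves to be compared with the bars directly.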
[Fig. 2.2: histogram of temperature in 2020; x-axis: Temperature (°C); y-axis: Frequency]
Figure 2.3 presents a histogram with the overlay of a density line and a normal distri-
bution curve. The blue curve represents the kernel density estimate. The red dotted
line illustrates a normal distribution curve for comparison purposes. It suggests that
the temperature data might not follow a normal distribution, as indicated by the mis-
alignment of the distribution’s peaks with the dotted line and the tails appearing
broader than those typically seen in a normal distribution.
Similarly, histograms for precipitation and air pressure can be created, each com-
plemented by an overlay of the respective kernel density curve and a normal distri-
bution curve for comparative analysis (Fig. 2.4).
Fig. 2.3 Histogram of temperature with density and normal curves (x-axis: Temperature (°C); y-axis: Density)
# make the histogram
line <- par(lwd = 0.2)
h <- hist(Pre, xlab = "Precipitation (mm)", ylab = "Frequency", xlim = c(0, 3000),
          ylim = c(0, 140), labels = FALSE, main = "", col = "lightgray", lwd = 0.2,
          cex = 0.5, cex.axis = 0.5, cex.lab = 0.5)
text(x = h$mids, y = h$counts, labels = h$counts, pos = 3, cex = 0.5)
[Fig. 2.4: histogram of precipitation; x-axis: Precipitation (mm); y-axis: Frequency]
In Fig. 2.4, most of the precipitation values cluster toward the lower end of the scale,
and the frequency decreases as the precipitation amount increases. The highest
frequencies occur for precipitation amounts roughly between 375 and 812.5 mm, after
which the frequency drops significantly. Very high precipitation amounts (over 2500 mm)
are rare, as indicated by the low bars toward the right
end of the x-axis.
In Fig. 2.5, the blue and red curves have the same meaning as in Fig. 2.3. The tail on
the right side of the histogram is longer than the left side, and the mass of the
distribution is concentrated on the left.
Similar to temperature and precipitation, a histogram can also be constructed for
air pressure (Fig. 2.6).
# make the histogram of air pressure
line <- par(lwd = 0.2)
h <- hist(Prs, xlab = "Air pressure (hpa)", ylab = "Frequency", xlim = c(500, 1100),
          ylim = c(0, 300), labels = FALSE, main = "", col = "lightgray", lwd = 0.2,
          cex = 0.5, cex.axis = 0.5, cex.lab = 0.5)
text(x = h$mids, y = h$counts, labels = h$counts, pos = 3, cex = 0.5)
[Fig. 2.6: histogram of air pressure in 2020; x-axis: Air pressure (hpa); y-axis: Frequency]
The histogram in Fig. 2.6 illustrates the distribution of the air pressure readings in
the dataset. The majority of observations are clustered between 850
and 1050 hPa. The distribution is left-skewed, as shown by the tail stretching toward
lower pressure values. The bar representing the 1000–1050 hPa range has the highest
frequency, with 256 observations, making this the most common air pressure range
recorded in the dataset.
Following the method applied to temperature and precipitation, density and nor-
mal curves are likewise overlaid on the air pressure histogram, offering data analysts
an enhanced perspective of the data distribution (Fig. 2.7).
# Calculate the density of 'Prs'
density_values <- density(Prs)
[Fig. 2.7: histogram of air pressure with density and normal curves]
The histogram in Fig. 2.7 shows the distribution of air pressure values in the dataset
with overlaid density curves. The blue line is the kernel density estimate, and
the red dashed line represents a normal distribution curve. Although values are
concentrated around the mean, the discrepancy between the density estimate and
the normal curve shows that the distribution is not perfectly normal.
2.7.3 Bar
To illustrate the distinction between a histogram and a bar graph, this section uses
2021 population data from various provinces and cities in China (measured in tens
of millions) as an example.
In R, the ggplot() function from the ggplot2 package can be used to create a
bar graph, as shown below:
# read the data
df <- read_csv(file = "./data/chinapopulation2021en.csv")
dplyr::glimpse(df)
Rows: 31
Columns: 2
$ Pname <chr> "Guangdong", "Shandong", "Henan", "Jiangsu", "Sichuan", "Hebei",~
$ Pop <dbl> 12.60, 10.15, 9.94, 8.47, 8.37, 7.46, 6.64, 6.46, 6.10, 5.78, 5.~
Additionally, the subset() function from the base package is used to selectively
extract a specific subset of data from the original dataset, df, which contains the
2021 population data for China.
# PN holds the province names, PP holds the province populations
PN <- df$Pname[1:31]
PP <- df$Pop[1:31]
# Pb is the subset for the blue color, Pw is the subset for white
Pb <- subset(df, Pop < 4)
Pw <- subset(df, Pop >= 4)
When employing the ggplot() function for plotting, the arrange() and the
mutate() functions from the dplyr package are utilized to more effectively pro-
cess the data (Fig. 2.8).
df1 <- data.frame(
population = PP,
name = factor(PN),
y = seq(length(PN)) * 0.9
)
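Because the plotting code itself is not shown at this point, the following is a minimal sketch of how arrange() and mutate() might prepare the data for a horizontal bar chart. The data frame df_demo is a hypothetical five-row excerpt built from the values printed by glimpse() above, and the object names, fill colour and bar width are assumptions, not the settings behind Fig. 2.8.

```r
library(dplyr)
library(ggplot2)

# Hypothetical excerpt: the first five rows shown in the glimpse() output
df_demo <- data.frame(
  Pname = c("Guangdong", "Shandong", "Henan", "Jiangsu", "Sichuan"),
  Pop   = c(12.60, 10.15, 9.94, 8.47, 8.37)
)

# Sort by population, then fix the factor levels in that order so that
# ggplot2 draws the largest province at the top of the y-axis
df_sorted <- df_demo %>%
  arrange(Pop) %>%
  mutate(Pname = factor(Pname, levels = Pname))

# Horizontal bars: population on x, province on y
plt_demo <- ggplot(df_sorted, aes(x = Pop, y = Pname)) +
  geom_col(fill = "steelblue", width = 0.6) +
  labs(x = "Population (ten million)", y = "Province/City")

plt_demo
```

Fixing the factor levels after sorting is what keeps the bars in population order; without it, ggplot2 orders a character column alphabetically.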
As depicted in Fig. 2.8, the ggplot() function effectively generates a bar chart.
While the default rendering is often sufficient, additional enhancements can
significantly improve the chart’s aesthetics. These enhancements include modifying the
axes configuration, altering the background color, removing tick marks, adding grid
lines, and changing fonts. They are implemented using the scale_x_continuous(),
scale_y_discrete(), theme() and geom_text() functions from the ggplot2 package, and
the geom_shadowtext() function from the shadowtext package.
plt1 <- plt +
  scale_x_continuous(
    limits = c(0, 14),
    breaks = seq(0, 14, by = 2),
    expand = c(0, 0),
    position = "top"
  ) +
  scale_y_discrete(expand = expansion(add = c(0, 0.5))) +
  theme(
    panel.background = element_rect(fill = "white"),
    panel.grid.major.x = element_line(color = "#A8BAC4", linewidth = 0.3),
    axis.ticks.length = unit(0, "mm"),
    axis.title = element_blank(),
    axis.line.y.left = element_line(color = "black"),
    axis.text.y = element_blank(),
    axis.text.x = element_text(family = "Times New Roman", size = 8)
  )
2.7 Visualization 23
Fig. 2.8 The barplot of 2021 population data for some provinces and cities in China (x-axis: Population (ten million); y-axis: Province/City)
Adding a title and subtitle to the chart can greatly enhance its overall quality.
extrafont::loadfonts(quiet = TRUE)
library(showtext)
showtext_auto()
plt3 <- plt2 +
labs(
title = "Population",
subtitle = "Population of some provinces and cities in China, 2021 (in ten millions)"
) +
theme(
plot.title = element_text(
family = "Arial",
face = "bold",
size = 11
),
plot.subtitle = element_text(
family = "Arial",
size = 8
)
)
plt3
The final version of the chart, shown in Fig. 2.9, is more aesthetically pleasing and
easier to read compared to Fig. 2.8.
Fig. 2.9 The customized barplot of 2021 population data of some provinces and cities in China
Reference
Fisher, Murray J., and Andrea P. Marshall. 2009. Understanding Descriptive Statistics. Australian
Critical Care 22 (2): 93–97.
Chapter 3
Correlation Analysis
This chapter focuses on the correlation analysis of observation data from 837 surface
meteorological stations in China in 2020. The cor() and cor.test() functions, both
included in the stats package that ships with base R, cover most of the analytical
requirements. The rcorr() function from the Hmisc package is employed to derive the
p-value matrix of the correlation matrix. Additionally, the corr.test() function from
the psych package is utilized to rapidly generate a pairwise correlation matrix for an
entire dataset, complete with p-values and confidence intervals.
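To make the idea of a p-value matrix concrete, it can also be assembled with base R alone. The sketch below is not the Hmisc or psych implementation; it simply loops over cor.test() for each pair of columns. The data frame d and its columns a, b and c are hypothetical, simulated values.

```r
# Hypothetical data: b depends on a, c is independent noise
set.seed(3)
d <- data.frame(a = rnorm(50))
d$b <- d$a + rnorm(50)
d$c <- rnorm(50)

vars  <- names(d)
r_mat <- cor(d)   # Pearson correlation matrix

# Empty matrix with the same row and column names, filled pair by pair
p_mat <- matrix(NA_real_, nrow = length(vars), ncol = length(vars),
                dimnames = list(vars, vars))
for (i in seq_along(vars)) {
  for (j in seq_along(vars)) {
    if (i != j) {
      p_mat[i, j] <- cor.test(d[[i]], d[[j]])$p.value
    }
  }
}

round(r_mat, 2)
signif(p_mat, 2)
```

The diagonal of p_mat is left as NA because testing a variable against itself carries no information.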
Correlation is a statistical measure of the relationship between two variables. In
geographic data analysis, correlation tests are often used to assess the
relationships among two or more variables, thereby revealing how closely geographic
features are related (Xu 2006). These relationships are primarily
quantified using the correlation coefficient. For example, using the 2020
meteorological data from China’s national meteorological observation stations, we can
explore relationships among variables like altitude, latitude, longitude, air pressure,
precipitation, and temperature by calculating their correlation coefficients. The
correlation coefficient describes the degree of association between two variables and
ranges between −1 and 1. A value close to −1 or 1 indicates a strong negative or
positive relationship, while a value closer to 0 indicates a weaker one.
Three types of correlation coefficients are commonly used in geographic data
analysis: the Pearson, Spearman, and Kendall correlation coefficients. The
Pearson coefficient, a parametric measure, assesses the linear relationship between
two variables and is appropriate for continuous variables following a normal
distribution. In contrast, the Spearman and Kendall coefficients are non-parametric,
rank-based measures of monotonic association and are suitable for both continuous
and ordinal (ranked) variables.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 27
X. He, Geographic Data Analysis Using R,
https://doi.org/10.1007/978-981-97-4022-2_3
In R, the cor() function in the stats package calculates the correlation. Its syntax is as
follows:
cor(x, y = NULL, use = "everything", method = c("pearson", "kendall", "spearman"))
Here, x and y are the variables, use controls how missing values are handled, and
method specifies the calculation approach, with pearson being the default.
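As a brief illustration of the method argument, the sketch below computes all three coefficients on simulated data. The vectors x and y are hypothetical and are not taken from the station dataset.

```r
# Hypothetical data with a roughly linear relationship
set.seed(1)
x <- rnorm(100)
y <- 0.8 * x + rnorm(100, sd = 0.5)

r_pearson  <- cor(x, y)                       # method = "pearson" is the default
r_spearman <- cor(x, y, method = "spearman")  # rank-based
r_kendall  <- cor(x, y, method = "kendall")   # rank-based

# A strictly increasing transform changes Pearson's r but leaves
# the rank-based Spearman coefficient unchanged
r_pearson_exp  <- cor(x, exp(y))
r_spearman_exp <- cor(x, exp(y), method = "spearman")

c(pearson = r_pearson, spearman = r_spearman, kendall = r_kendall)
```

The last two lines show why the rank-based coefficients are the safer choice when a relationship is monotonic but not linear.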
This section demonstrates how to calculate the correlation coefficients between two
variables in R, utilizing observation data from 837 surface stations across China in
2020.
# install and load the packages
packages <- c("ggpubr", "corrplot", "Hmisc", "tidyverse", "nortest", "dplyr", "psych",
              "PerformanceAnalytics")
an appropriate method for computing the correlation coefficient. This involves creat-
ing scatterplots to assess the linearity of the relationships and employing statistical
tests to check for normal distribution in the data. Following these assessments, the
most suitable method for calculating the correlation coefficient can be selected.
Prior to calculating the correlation coefficients among altitude (Alt), air pres-
sure (Prs), precipitation (Pre), and temperature (Tem), scatterplots are generated
using the ggscatter() function from the ggpubr package to visually assess the
relationships between these variable pairs (Fig. 3.1).
# draw the scatterplot for altitude and air pressure
ggscatter(data, x = "Alt", y = "Prs", color = "grey37", add = "reg.line",
          add.params = list(color = "blue"), conf.int = TRUE, cor.coef = TRUE,
          cor.coeff.args = list(method = "pearson", label.x.npc = "center",
                                label.y.npc = "top"),
          cor.coef.size = 3, xlab = "Altitude (meters)", ylab = "Air pressure (hPa)",
          point.size = 0.5, shape = 1) +
  theme(axis.line = element_line(size = 0.2),
        axis.title.x = element_text(size = 8), axis.title.y = element_text(size = 8),
        axis.text.x = element_text(size = 6), axis.text.y = element_text(size = 6))
In Fig. 3.1, the scatterplot illustrates the relationship between altitude and air
pressure, demonstrating a negative correlation: as altitude increases, air pressure
decreases. This inverse relationship is visually reinforced by the downward-sloping
blue regression line. The displayed correlation coefficient, r = −1, would imply a
perfect linear relationship, which is unlikely in natural datasets. It most plausibly
reflects rounding of a coefficient very close to −1 in the plot label, although a
calculation error cannot be ruled out without checking the unrounded value. The very
low p-value indicates that this negative correlation is statistically significant and
unlikely to be a product of random variation.
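When a plot label shows a coefficient of exactly −1, the unrounded value, its confidence interval and the p-value can be inspected with cor.test(). The sketch below uses simulated data only: Alt is drawn uniformly between 0 and 5000 m, and Prs follows a standard-atmosphere approximation plus random noise. Both choices are assumptions for illustration and do not reproduce the station data.

```r
# Hypothetical altitude (m) and air pressure (hPa) for 837 stations
set.seed(2020)
n   <- 837
Alt <- runif(n, min = 0, max = 5000)
Prs <- 1013.25 * (1 - 2.25577e-5 * Alt)^5.25588 + rnorm(n, sd = 5)

# Pearson correlation test
ct <- cor.test(Alt, Prs, method = "pearson")

ct$estimate                      # unrounded correlation coefficient
ct$conf.int                      # 95% confidence interval
ct$p.value                       # p-value of the test
signif(unname(ct$estimate), 2)   # the value as it would look with two digits
```

Comparing the first and last lines of output shows how a strong but imperfect correlation can be printed as −1 once it is rounded.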
# draw the scatterplot for altitude and precipitation
ggscatter(data, x = "Alt", y = "Pre", color = "grey37", add = "reg.line",
          add.params = list(color = "blue"), conf.int = TRUE, cor.coef = TRUE,
          cor.coeff.args = list(method = "pearson", label.x.npc = "center",
                                label.y.npc = "top"),
          cor.coef.size = 3, xlab = "Altitude (meters)", ylab = "Precipitation (mm)",
          point.size = 0.5, shape = 1) +
  theme(axis.line = element_line(size = 0.2),
        axis.title.x = element_text(size = 8), axis.title.y = element_text(size = 8),
        axis.text.x = element_text(size = 6), axis.text.y = element_text(size = 6))
Figure 3.2 shows a scatterplot of altitude against precipitation, where each data
point represents an individual observation. The correlation coefficient of −0.41
suggests a moderate negative correlation, indicating a general decrease in
precipitation with increasing altitude. The very low p-value indicates that this
correlation is statistically significant and unlikely to have arisen by chance. The
blue regression line, along with its confidence interval, outlines the overall trend
between altitude and precipitation.
Fig. 3.1 The scatterplot for altitude and air pressure of surface stations in 2020
In Fig. 3.3, the scatterplot’s blue regression line delineates the general trend in the
dataset, highlighting the variation in temperature with altitude. The negative slope
of the line, coupled with a correlation coefficient (R) of −0.59 and a p-value below
2.2e-16, indicates a substantial negative correlation: temperature typically
decreases as altitude increases.
Figures 3.1, 3.2, and 3.3 reveal certain linear relationships between altitude and air
pressure, altitude and precipitation, as well as altitude and temperature, respectively.
3.2 Calculating the Correlation 31
Fig. 3.2 The scatterplot for altitude and precipitation of surface stations in 2020
Fig. 3.3 The scatterplot for altitude and temperature of surface stations in 2020
In Fig. 3.4, data points near the middle of the plot (around the theoretical quan-
tile of 0) align closely with the reference line, suggesting normal distribution in this
region. However, deviations at the plot’s ends, notably on the right (higher theoret-
ical quantiles), indicate non-normality in the data’s tails, potentially suggesting a
heavy-tailed distribution. Therefore, while the Alt data’s central part appears nor-
mally distributed, the tail deviations imply that the overall data may not strictly
adhere to a normal distribution.
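The way a Q-Q plot is read can be illustrated with base R. The sketch below compares a simulated normal sample with a simulated right-skewed sample. Both vectors are hypothetical, and qqnorm() and qqline() from the stats package are used here as a stand-in for ggqqplot().

```r
# Hypothetical samples of the same size as the station data
set.seed(7)
x_norm <- rnorm(837)   # normally distributed
x_skew <- rexp(837)    # right-skewed

# Theoretical (x) and sample (y) quantiles, without drawing anything yet
qq_norm <- qqnorm(x_norm, plot.it = FALSE)
qq_skew <- qqnorm(x_skew, plot.it = FALSE)

# Points that follow the reference line closely give a correlation near 1
r_norm <- cor(qq_norm$x, qq_norm$y)
r_skew <- cor(qq_skew$x, qq_skew$y)
c(normal = r_norm, skewed = r_skew)

# Draw the Q-Q plot of the skewed sample with its reference line
qqnorm(x_skew, main = "Q-Q plot of a right-skewed sample")
qqline(x_skew, col = "blue")
```

In the skewed sample the points bend upwards away from the line in the right tail, which is the same pattern described for the station variables.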
# Plotting with ggqqplot for air pressure
ggqqplot(data$Prs, ylab = "Air pressure", size = 0.5, shape = 1, ggtheme = my_theme)
Figure 3.5 reveals that the central part of the plot aligns with the reference line,
indicating that the Prs data distribution in this area resembles a normal distribution.
Nevertheless, deviations in the tails, especially in the right tail, suggest heavier tails
than those of a normal distribution, implying that Prs is approximately normal with
potential tail deviations.
# Plotting with ggqqplot for precipitation
ggqqplot(data$Pre, ylab = "Precipitation", size = 0.5, shape = 1, ggtheme = my_theme)
[Fig. 3.4: Q-Q plot of altitude; x-axis: Theoretical, y-axis: Altitude]
In Fig. 3.6, central data points on the Q-Q plot adhere to the reference line, indi-
cating normal distribution within this range. However, tail deviations, particularly in
the right tail where data points diverge upwards, suggest a possible right-skewed dis-
tribution with more high-value outliers than expected in a normal distribution. Most
data points fall within the confidence band, indicating that the data is close to a nor-
mal distribution. Thus, while not perfectly normal, especially with right skewness
evidence, a significant portion of the data could be considered normally distributed,
particularly around the mean.
# Plotting with ggqqplot for temperature
ggqqplot(data$Tem, ylab = "Temperature", size = 0.5, shape = 1, ggtheme = my_theme)
Figure 3.7 shows that data points in the plot’s center closely follow the refer-
ence line, hinting at normality within the central range of the distribution. How-
ever, tail deviations, especially in the right tail, suggest potential slight skewness
or heavier tails than a normal distribution. The slight S shape in the data points
implies lighter tails at the lower end and heavier tails at the upper end. Most points
within the band support the assumption of normality, despite slight tail deviations.
Therefore, while the data does not perfectly conform to a normal distribution, espe-
cially in the tails, the central mass largely fits normality, indicating that the dataset
is approximately normal, barring outliers or slight skewness.
[Fig. 3.5: Q-Q plot of air pressure; x-axis: Theoretical, y-axis: Air pressure]
The Q-Q plot serves as a visual method for assessing data distribution, but to gain
a more accurate understanding, it’s often necessary to complement it with additional
statistical tests for assessing normality.
Given that the sample size exceeds 50, the Anderson-Darling test, a well-regarded
goodness-of-fit test, is suitable for this purpose.
Anderson-Darling normality test: This test evaluates how closely a given distribution
approximates the observed data, and is commonly employed to assess normality.
The ad.test() function from the nortest package facilitates this analysis.
# Anderson-Darling normality test of altitude
ad.test(data$Alt)
data: data$Alt
A = 68.68, p-value < 2.2e-16
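To show what the statistic A measures, it can be computed by hand in base R. The function ad_statistic() below is a hypothetical helper written for this illustration, not part of nortest. It returns only the statistic and leaves out the small-sample adjustment and p-value that ad.test() reports. The two samples are simulated.

```r
# Anderson-Darling statistic for normality, with mean and sd estimated
# from the sample (hypothetical helper; no p-value is computed)
ad_statistic <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  z <- (x - mean(x)) / sd(x)
  log_p_lower <- pnorm(z, log.p = TRUE)    # log F(x_i)
  log_p_upper <- pnorm(-z, log.p = TRUE)   # log(1 - F(x_i))
  i <- seq_len(n)
  -n - mean((2 * i - 1) * (log_p_lower + rev(log_p_upper)))
}

set.seed(11)
a_normal <- ad_statistic(rnorm(837))   # small: sample is close to normal
a_skewed <- ad_statistic(rexp(837))    # large: sample departs from normality
c(normal = a_normal, skewed = a_skewed)
```

A large value of A, such as the one reported for altitude above, indicates a clear departure from the normal distribution.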
MOTHER JENKS, grown impatient at the lack of news
concerning Webster, left Dolores to her grief in the room
across the hall and sought the open air, for of late she had
been experiencing with recurring frequency a slight feeling of
suffocation. She sat down on the broad granite steps, helped herself
to a much-needed “bracer” from her brandy flask and was gazing
pensively at the scene around her when Ricardo came up the stairs.
“'Ello!” Mother Jenks saluted him. “W'ere 'ave you been, Mr.
Bowers?”
“I have just returned from capturing Sarros, Mrs. Jenks. He is on
his way to the arsenal under guard.”
“Gor' strike me pink!” the old lady cried. “'Ave I lived to see this
day!” Her face was wreathed in a happy smile. “I wonder 'ow the
beggar feels to 'ave the shoe on the other foot, eh—the 'eartless
'ound! I'm 'opin' this General Ruey will 'ave the blighter
shot.”
“You need have no worry on that score, Mrs. Jenks. I'm General
Ruey. Andrew Bowers was just my summer name, as it were.”
“Angels guard me! Wot the bloomin' 'ell surprise won't we 'ave
next. Wot branch o' the Ruey tribe do you belong to? Are you a
nephew o' him that was president before Sarros shot 'im? Antonio
Ruey, who was 'arf brother to the president, 'ad a son 'e called
Ricardo. Are you 'im, might I arsk?”
“I am the son of Ricardo the Beloved,” he answered proudly.
“Not the lad as was away at school when 'is father was
hexecuted?”
“I am that same lad, Mrs. Jenks. And who are you? You seem to
know a deal of my family history.”
“I,” the old publican replied with equal pride, “am Mrs. Colonel
'Enery Jenks, who was your father's chief of hartillery an' 'ad the
hextreme honour o' dyin' in front o' the same wall with 'im. By the
w'y, 'ow's Mr. Webster?” she added, suddenly remembering the
subject closest to her heart just then.
“His wounds are trifling. He'll live, Mrs. Jenks.”
“Well, that's better than gettin' poked in the eye with a sharp
stick,” the old dame decided philosophically.
“Do you remember my little sister, Mrs. Jenks?” Ricardo continued.
“She was in the palace when Sarros attacked it; she perished there.”
“I believe I 'ave got a slight recollection o' the nipper, sir,” Mother
Jenks answered cautiously. To herself she said: “I s'y, 'Enrietta, 'ere's
a pretty go. 'E don't know the lamb is livin' an' in the next room! My
word, wot a riot w'en 'e meets 'er!”
“I will see you again, Mrs. Jenks. I must have a long talk with
you,” Ricardo told her, and passed on into the palace; whereupon
Mother Jenks once more fervently implored the Almighty to strike
her pink, and the iron restraint of a long, hard, exciting day being
relaxed at last, the good soul bowed her gray head in her arms and
wept, moving her body from side to side the while and demanding,
of no one in particular, a single legitimate reason why she, a
blooming old baggage and not fit to live, should be the recipient of
such manifold blessings as this day had brought forth.
In the meantime Ricardo, with his hand on the knob of the door
leading to the room where Webster was having his wounds dressed,
paused suddenly, his attention caught by the sound of a sob, long-
drawn and inexpressibly pathetic. He listened and made up his mind
that a woman in the room across the entrance-hall was bewailing
the death of a loved one who answered to the name of Caliph and
John darling. Further eavesdropping convinced him that Caliph, John
darling, and Mr. John Stuart Webster were one and the same person,
and so he tilted his head on one side like a cock-robin and
considered.
“By jingo, that's most interesting,” he decided. “The wounded hero
has a sweetheart or a wife—and an American, too. She must be a
recent acquisition, because all the time we were together on the
steamer coming down here he never spoke of either, despite the fact
that we got friendly enough for such confidences. Something funny
about this. I'd better sound the old boy before I start passing out
words of comfort to that unhappy female.”
He passed on into the room. John Stuart Webster had, by this
time, been washed and bandaged, and one of the Sarros servants
(for the ex-dictator's retinue still occupied the palace) had, at Doctor
Pacheco's command, prepared a guest-chamber upstairs and
furnished a nightgown of ample proportions to cover Mr. Webster's
bebandaged but otherwise naked person. A stretcher had just
arrived, and the wounded man was about to be carried upstairs. The
late financial backer of the revolution was looking very pale and
dispirited; for once in his life his whimsical, bantering nature was
subdued. His eyes were closed, and he did not open them when
Ricardo entered.
“Well, I have Sarros,” the latter declared. Webster paid not the
slightest attention to this announcement. Ricardo bent over him.
“Jack, old boy,” he queried, “do you know a person of feminine
persuasion who calls you Caliph?”
John Stuart Webster's eyes and mouth flew wide open. “What the
devil!” he tried to roar. “You haven't been speaking to her, have you?
If you have, I'll never forgive you, because you've spoiled my little
surprise party.”
“No, I haven't been speaking to her, but she's in the next room
crying fit to break her heart because she thinks you've been killed.”
“You scoundrel! Aren't you human? Go tell her it's only a couple of
punctures, not a blowout.” He sighed. “Isn't it sweet of her to weep
over an old hunks like me!” he added softly. “Bless her tender heart!”
“Who is she?” Ricardo was very curious.
“That's none of your business. You wait and I'll tell you. She's the
guest I told you I was going to bring to dinner, and that's enough for
you to know for the present. Vaya, you idiot, and bring her in here,
so I can assure her my head is bloody but unbowed. Doctor, throw
that rug over my shanks and make me look pretty. I'm going to
receive company.”
His glance, bent steadily on the door, had in it some of the alert,
bright wistfulness frequently to be observed in the eyes of a terrier
standing expectantly before a rat-hole. The instant the door opened
and Dolores's tear-stained face appeared, he called to her with the
old-time camaraderie, for he had erased from his mind, for the
nonce, the memory of the tragedy of poor Don Juan Cafetéro and
was concerned solely with the task of banishing the tears from those
brown eyes and bringing the joy of life back to that sweet face.
“Hello, Seeress,” he called weakly. “Little Johnny's been fighting
again, and the bad boys gave him an all-fired walloping.”
There was a swift rustle of skirts, and she was bending over him,
her hot little palms clasping eagerly his pale, rough cheeks. “Oh, my
dear, my dear!” she whispered, and then her voice choked with the
happy tears and she was sobbing on his wounded shoulder. Ricardo
stooped to draw her away, but John Stuart bent upon him a look of
such frightfulness that he drew back abashed. After all, the past
twenty-four hours had been quite exciting, and Ricardo reflected
that John's inamorata was tired and frightened and probably hadn't
eaten anything all day long, so there was ample excuse for her
hysteria.
“Come, come, buck up,” Webster soothed her, and helped himself
to a long whiff of her fragrant hair. “Old man Webster had one leg in
the grave, but they've pulled it out again.”
Still she sobbed.
“Now, listen to me, lady,” he commanded with mock severity. “You
just stop that. You're wasting your sympathy; and while, of course, I
enjoy your sympathy a heap, just pause to reflect on the result if
those salt tears should happen to drop into one of my numerous
wounds.”
“I'm so sorry for you, Caliph,” she murmured brokenly. “You poor,
harmless boy! I don't see how any one could be so fiendish as to
hurt you when you were so distinctly a non-combatant.”
“Thank you. Let us forget the Hague Conference for the present,
however. Have you met your brother?” he whispered.
“No, Caliph.”
“Ricardo.”
“Yes, Jack.”
“Come here. Rick, you scheming, unscrupulous, bloodthirsty
adventurer, I have a tremendous surprise in store for you. The
sweetest girl in the world—and she's right here——”
Ricardo laughingly held up his hand. “Jack, my friend,” he
interrupted, “you're too weak to make a speech. Don't do it. Besides,
you do not have to.” He turned and bowed gracefully to Dolores. “I
can see for myself she's the sweetest girl in the world, and that she's
right here.” He held out his hand to her. “Jack thinks he's going to
spring a surprise,” he continued maliciously, “quite forgetting that a
good soldier never permits himself to be taken by surprise. I know
all about his little secret, because I heard you mourning for him
when you thought he was dead.” Ricardo favoured her with a
knowing wink. “I am delighted to meet the future Mrs. Webster. I
quite understand why you fell in love with him, because, you see, I
love him myself and so does everybody else.”
With typical Castilian courtliness he took her hand, bowed low
over it, and kissed it. “I am Ricardo Luiz Ruey,” he said, anxious to
spare his friend the task of further exhausting conversation. “And
you are——”
“You're a consummate jackass!” groaned Webster. “I'm only a dear
old family friend, and Dolores is going to marry Billy Geary. You
impetuous idiot! She's your own sister Dolores Ruey. She, Mark
Twain, and I have ample cause for common complaint against the
world because the reports of our death have been grossly
exaggerated. She didn't perish when your father's administration
crumbled. Miss Ruey, this is your brother Ricardo. Kiss her you
damn' fool—forgive me, Miss Ruey—oh, Lord, nothing matters any
more. He's gummed everything up and ruined my party. I wish I
were dead.”
Ricardo stared from the outraged Webster to his sister and back
again.
“Jack Webster,” he declared, “you aren't crazy, are you?”
“Of course he is—the old dear,” Dolores cried happily, “but I'm
not.” She stepped up to her brother, and her arms went around his
neck. “Oh, Rick,” she cried, “I'm your sister. Truly, I am.”
“Dolores. My little lost sister Dolores? Why, I can't believe it!”
“Well, you'd better believe it,” John Stuart Webster growled feebly.
“Of course, you can doubt my word and get away with it, now that
I'm flat on my back, but if you dare cast aspersions on that girl's
veracity, I'll murder you a month from now.”
He closed his eyes, feeling instinctively that he ought not spy on
such a sacred family scene. When, however, the affecting meeting
was over and Dolores was ruffling the Websterian foretop while her
brother pressed the Websterian hand and tried to say all the things
he felt but couldn't express, John Stuart Webster brought them both
back to a realization of present conditions.
“Don't thank me, sir,” he piped in pathetic imitation of the small
boy of melodrama. “I have only done me duty, and for that I cannot
accept this purse of gold, even though my father and mother are
starving.”
“Oh, Caliph, do be serious,” Dolores pleaded.
He looked up at her fondly. “Take your brother out to Mother
Jenks and prove your case, Miss Ruey,” he advised her. “And while
you're at it, I certainly hope somebody will remember I'm not
accustomed to reposing on a centre table. Rick, if you can persuade
some citizen of this conquered commonwealth to put me to bed, I'd
be obliged. I'm dead tired, old horse. I'm—ah—sleepy——”
His head rolled weakly to one side, for he had been playing a part
and had nerved himself to finish it gracefully, even in his weakened
condition. He sighed, moaned slightly, and slipped into
unconsciousness.
CHAPTER XXIX
THROUGHOUT the night there was sporadic firing here and
there in the city, as the Ruey followers relentlessly hunted
down the isolated detachment of Government troops which
had escaped annihilation and capture in the final rout and fallen
back on the city, where, concealing themselves according to their
nature and inclination, they indulged in more or less sniping from
windows and the roofs of buildings. The practice of taking no
prisoners was an old one in Sobrante, and few presidents had done
more than Sarros to keep that custom alive; ergo, firm in the
conviction that to surrender was tantamount to facing a firing squad
at daylight, the majority of these stragglers, with consummate
courage, fought to the death.
The capture of Buenaventura was alone sufficient to insure a brief
revolution, but the capture of Sarros was ample guarantee that the
resistance to the new order of things was already at an end.
However, Ricardo Ruey felt that the prompt execution of Sarros
would be an added guarantee of peace by effectually discouraging
any opposition to the rebel cause in the outlying districts, where a
few isolated garrisons still remained in ignorance of the momentous
events being enacted in the capital. For the time being, Ricardo was
master of life and death in Sobrante, and all of his advisers and
supporters agreed with him that a so-called trial of the ex-dictator
would be a rather useless affair. His life was forfeit a hundred times
for murder and treason, and to be ponderous over his elimination
would savour of mockery. Accordingly, at midnight, a priest entered
the room in the arsenal where Sarros was confined, and shrived him.
Throughout the night the priest remained with him, and when that
early morning march to the cemetery commenced, he walked beside
Sarros, repeating the prayers for the dying.
Upon reaching the cemetery there was a slight wait until a
carriage drove up and discharged Ricardo Ruey and Mother Jenks.
The sergeant in command of the squad saluted and was briefly
ordered to proceed with the matter in hand; whereupon he turned
to Sarros, who with the customary sang froid of his kind upon such
occasions was calmly smoking, and bowed deprecatingly. Sarros
actually smiled upon him. “Adios, amigos,” he murmured. Then, as
an afterthought and probably because he was sufficient of an egoist
to desire to appear a martyr, he added heroically: “I die for my
country. May God have mercy on my enemies.”
“If you'd cared to play a gentleman's game, you blighter, you
might 'ave lived for your bally country,” Mother Jenks reminded him
in English. “Wonder if the beggar 'll wilt or will 'e go through smilin'
like my sainted 'Enery on the syme spot.”
She need not have worried. It requires a strong man to be dictator
of a Roman-candle republic for fifteen years, and whatever his sins
of omission or commission, Sarros did not lack animal courage.
Alone and unattended he limped away among the graves to the wall
on the other side of the cemetery and placed his back against it,
negligently in the attitude of a devil-may-care fellow without a worry
in life. The sergeant waited respectfully until Sarros had finished his
cigarette; when he tossed it away and straightened to attention, the
sergeant knew he was ready to die. At his command there was a
sudden rattle of bolts as the cartridges slid from the magazines into
the breeches; there followed a momentary halt, another command;
the squad was aiming when Ricardo Ruey called sharply:
“Sergeant, do not give the order to fire.”
The rifles were lowered and the men gazed wonderingly at
Ricardo. “He's too brave,” Ricardo complained. “Damn him, I can't
kill him as I would a mad-dog. I've got to give him a chance.” The
sergeant raised his brows expressively. Ah, the ley fuga, that popular
form of execution where the prisoner is given a running chance, and
the firing-squad practises wing shooting. If the prisoner manages,
miraculously, to escape, he is not pursued!
A doubt, however, crossed the sergeant's mind. “But, my general,”
he expostulated, “Senor Sarros cannot accept the ley fuga. He is
very lame. That is not giving him the chance your Excellency desires
he should have.”
“I wasn't thinking of that,” Ricardo replied. “I was thinking I'm
killing him without a fair trial for the reason that he's so infernally
ripe for the gallows that a trial would have been a joke.
Nevertheless, I am really killing him because he killed my father—
and that is scarcely fair. My father was a gentleman. Sergeant, is
your pistol loaded?”
“Yes, General.”
“Give it to Senor Sarros.”
As the sergeant started forward to comply Ricardo drew his own
service revolver and then motioned Mother Jenks and the firing-
squad to stand aside while he crossed to the centre of the cemetery.
“Sarros,” he called, “I am going to let God decide which one of us
shall live. When the sergeant gives the command to fire, I shall open
fire on you, and you are free to do the same to me. Sergeant, if he
kills me and escapes unhurt, my orders are to escort him to the bay
in my carriage and put him safely aboard the steamer.”
Mother Jenks sat down on a tombstone. “Gord's truth!” she
gasped, “but there's a rare plucked 'un.” Aloud she croaked: “Don't
be a bally ass, sir.”
“Silence!” he commanded.
The sergeant handed Sarros the revolver. “You heard what I said?”
Ricardo called.
Sarros bowed gravely.
“You understand your orders, Sergeant?”
“Yes, General.”
“Very well. Proceed. If this prisoner fires before you give the word,
have your squad riddle him.” The sergeant backed away and gazed
owlishly from the prisoner to his captor. “Ready!” he called. Both
revolvers came up. “Fire!” he shouted, and the two shots were
discharged simultaneously. Ricardo's cap flew off his head, but he
remained standing, while Sarros staggered back against the wall and
there recovering himself gamely, fired again. He scored a clean miss,
and Ricardo's gun barked three times; Sarros sprawled on his face,
rose to his knees, raised his pistol halfway, fired into the sky and slid
forward on his face. Ricardo stood beside the body until the sergeant
approached and stood to attention, his attitude saying:
“It is over. What next, General?”
“Take the squad back to the arsenal, Sergeant,” Ricardo ordered
him coolly, and walked back to recover his uniform cap. He was
smiling as he ran his finger through a gaping hole in the upper half
of the crown.
“Well, Mrs. Jenks,” he announced when he rejoined the old lady,
“that was better than executing him with a firing-squad. I gave him
a square deal. Now his friends can never say that I murdered him.”
He extended his hand to help Mother Jenks to her feet. She stood
erect and felt again that queer swelling of the heart, the old feeling
of suffocation.
“Steady, lass!” she mumbled. “'Old on to me, sir. It's my bally
haneurism. Gor'—I'm—chokin'——”
He caught her in his arms as she lurched toward him. Her face
was purple, and in her eyes there was a queer fierce light that went
out suddenly, leaving them dull and glazed. When she commenced
to sag in his arms, he eased her gently to the ground and laid her on
her back in the grass.
“The nipper's safe, 'Enery,” he heard her murmur. “I've raised 'er a
lydy, s'elp me—she's back where—you found 'er— 'Enery——”
She quivered, and the light came creeping back into her eyes
before it faded forever. “Comin', 'Enery—darlin',” she whispered; and
then the soul of Mother Jenks, who had a code and lived up to it
(which is more than the majority of us do), had departed upon the
ultimate journey. Ricardo gazed down on the hard old mouth,
softened now by a little half-smile of mingled yearning and gladness:
“What a wonderful soul you had,” he murmured, and kissed her.
In the end she slept in the niche in the wall of the Catedral de la
Vera Cruz, beside her sainted 'Enery.
CHAPTER XXX
THREE days passed. Don Juan Cafetéro had been buried with
all the pomp and circumstance of a national hero; Mother
Jenks, too, had gone to her appointed resting-place, and El
Buen Amigo had been closed forever. Ricardo had issued a
proclamation announcing himself provisional president of Sobrante;
a convention of revolutionary leaders had been held, and a
provisional cabinet selected. A day for the national elections had
been named; the wreckage of the brief revolution had been cleared
away, and the wheels of government were once more revolving
freely and noiselessly. And while all of this had been going on, John
Stuart Webster had lain on his back, staring at the palace ceiling and
absolutely forbidden to receive visitors. He was still engaged in this
mild form of gymnastics on the third day when the door of his room
opened and Dolores looked in on him.
“Good evening, Caliph,” she called. “Aren't you dead yet?”
It was exactly the tone she should have adopted to get the best
results, for Webster had been mentally and physically ill since she
had seen him last, and needed some such pleasantry as this to lift
him out of his gloomy mood. He grinned at her boyishly.
“No, I'm not dead. On the contrary, I'm feeling real chirpy. Won't
you come in and visit for a while, Miss Ruey?”
“Well, since you've invited me, I shall accept.” Entering, she stood
beside his bed and took the hand he extended toward her. “This is
the first opportunity I've had, Miss Ruey,” he began, “to apologize for
the shock I gave you the other day. I should have come back to you
as I promised, instead of getting into a fight and scaring you half to
death. I hope you'll forgive me, because I'm paying for my fun now
—with interest.”
“Very well, Caliph. I'll forgive you—on one condition.”
“Who am I to resist having a condition imposed upon me? Name
your terms. I shall obey.”
“I'm weary of being called Miss Ruey. I want to be Dolores—to
you.”
“By the toenails of Moses,” he reflected, “there is no escape. She's
determined to rock the boat.” Aloud he said: “All right, Dolores. I
suppose I may as well take the license of the old family friend. I
guess Bill won't mind.”
“Billy hasn't a word to say about it,” she retorted, regarding him
with that calm, impersonal, yet vitally interested look that always
drove him frantic with the desire for her.
“Well, of course, I understand that,” he countered. “Naturally,
since Bill is only a man, you'll have to manage him and he'll have to
take orders.”
“Caliph, you're a singularly persistent man, once you get an idea
into your head. Please understand me, once for all: Billy Geary is a
dear, and it's a mystery to me why every girl in the world isn't
perfectly crazy about him, but every rule has its exceptions—and
Billy and I are just good friends. I'd like to know where you got the
idea we're engaged to be married.”
“Why—why—well, aren't you?”
“Certainly not.”
“Well, you—er—you ought to be. I expected—that is, I planned—I
mean Bill told me and—and—and—er—it never occurred to me you
could possibly have the—er—crust—to refuse him. Of course you're
going to marry him when he asks you?”
“Of course I am not.”
“Ah-h-h-h!” John Stuart Webster gazed at her in frank amazement.
“Not going to marry Bill Geary!” he cried, highly scandalized.
“I know you think I ought to, and I suppose it will appear quite
incomprehensible to you when I do not——”
“Why, Dolores, my dear girl! This is most amazing. Didn't Bill ask
you to marry him before he left?”
“Yes, he did me that honour, and I declined him.”
“You what!”
She smiled at him so maternally that his hand itched to drag her
down to him and kiss her curving lips.
“Do you mind telling me just why you took this extraordinary
attitude?”
“You have no right to ask, but I'll tell you. I refused Billy because I
didn't love him enough—that way. What's more, I never could.”
He rolled his head to one side and softly, very softly, whistled two
bars of “The Spanish Cavalier” through his teeth. He was properly
thunder-struck—so much so, in fact, that for a moment he actually
forgot her presence the while he pondered this most incredible state
of affairs.
“I see it all now. It's as clear as mud,” he announced finally. “You
refused poor old Bill and broke his heart, and so he went away and
hasn't had the courage to write me since. I'm afraid Bill and I both
regarded this fight as practically won—all over but the wedding-
march, as one might put it. I might as well confess I hustled the boy
down from the mine just so you two could get married and light out
on your honeymoon. I figured Bill could kill two birds with one stone
—have his honeymoon and get rid of his malaria, and return here in
three or four months to relieve me, after I had the mine in
operation. Poor boy. That was a frightful song-and-dance you gave
him.”
“I suspected you were the matchmaker in this case. I must say I
think you're old enough to know better, Caliph John.”
“You did, eh? Well, what made you think so?”
She chuckled. “Oh, you're very obvious—to a woman.”
“I forgot that you reveal the past and foretell the future.”
“You are really very clumsy, Caliph. You should never try to direct
the destiny of any woman.”
“I'm on the sick list,” he pleaded, “and it isn't sporting of you to
discuss me. You're healthy—so let us discuss you. Dolores, do you
figure Bill's case to be absolutely hopeless?”
“Absolutely, Caliph.”
“Hum-m-m!”
Again Webster had recourse to meditation, seeing which, Dolores
walked to the pier-glass in the corner, satisfied herself that her
coiffure was just so and returned to his side, singing softly a little
song that had floated out over the transom of Webster's room door
into the hall one night:
A Spanish cavalier,
Went out to rope a steer,
Along with his paper cigar-r-ro!
“Caramba!” said he.
“Manana you will be
Mucho bueno carne por mio.”
THE END
*** END OF THE PROJECT GUTENBERG EBOOK WEBSTER—MAN'S
MAN ***