Machine Learning Refined

With its intuitive yet rigorous approach to machine learning, this text provides students with the fundamental knowledge and practical tools needed to conduct research and build data-driven products. The authors prioritize geometric intuition and algorithmic thinking, and include detail on all the essential mathematical prerequisites, to offer a fresh and accessible way to learn. Practical applications are emphasized, with examples from disciplines including computer vision, natural language processing, economics, neuroscience, recommender systems, physics, and biology. Over 300 color illustrations are included and have been meticulously designed to enable an intuitive grasp of technical concepts, and over 100 in-depth coding exercises (in Python) provide a real understanding of crucial machine learning algorithms. A suite of online resources including sample code, data sets, interactive lecture slides, and a solutions manual are provided, making this an ideal text both for graduate courses on machine learning and for individual reference and self-study.

Jeremy Watt received his PhD in Electrical Engineering from Northwestern University,
and is now a machine learning consultant and educator. He teaches machine learning,
deep learning, mathematical optimization, and reinforcement learning at Northwestern
University.

Reza Borhani received his PhD in Electrical Engineering from Northwestern University, and is now a machine learning consultant and educator. He teaches a variety of courses in machine learning and deep learning at Northwestern University.

Aggelos K. Katsaggelos is the Joseph Cummings Professor at Northwestern University, where he heads the Image and Video Processing Laboratory. He is a Fellow of IEEE, SPIE, EURASIP, and OSA and the recipient of the IEEE Third Millennium Medal (2000).
Machine Learning Refined

Foundations, Algorithms, and Applications

J E R E M Y W AT T
Northwestern University, Illinois

REZA BORHANI
Northwestern University, Illinois

A G G E L O S K . K AT S A G G E L O S
Northwestern University, Illinois
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge.


It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning, and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781108480727
DOI: 10.1017/9781108690935
© Cambridge University Press 2020
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2020
Printed and bound in Great Britain by Clays Ltd, Elcograf S.p.A.
A catalogue record for this publication is available from the British Library.
ISBN 978-1-108-48072-7 Hardback
Additional resources for this publication at www.cambridge.org/watt2
Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
To our families:

Deb, Robert, and Terri

Soheila, Ali, and Maryam

Ειρήνη, Ζωή, Σοφία, and Ειρήνη
Contents

Preface page xii
Acknowledgements xxii
1 Introduction to Machine Learning 1
1.1 Introduction 1
1.2 Distinguishing Cats from Dogs: a Machine Learning Approach 1
1.3 The Basic Taxonomy of Machine Learning Problems 6
1.4 Mathematical Optimization 16
1.5 Conclusion 18
Part I Mathematical Optimization 19
2 Zero-Order Optimization Techniques 21
2.1 Introduction 21
2.2 The Zero-Order Optimality Condition 23
2.3 Global Optimization Methods 24
2.4 Local Optimization Methods 27
2.5 Random Search 31
2.6 Coordinate Search and Descent 39
2.7 Conclusion 40
2.8 Exercises 42
3 First-Order Optimization Techniques 45
3.1 Introduction 45
3.2 The First-Order Optimality Condition 45
3.3 The Geometry of First-Order Taylor Series 52
3.4 Computing Gradients Efficiently 55
3.5 Gradient Descent 56
3.6 Two Natural Weaknesses of Gradient Descent 65
3.7 Conclusion 71
3.8 Exercises 71
4 Second-Order Optimization Techniques 75
4.1 The Second-Order Optimality Condition 75

4.2 The Geometry of Second-Order Taylor Series 78


4.3 Newton’s Method 81
4.4 Two Natural Weaknesses of Newton’s Method 90
4.5 Conclusion 91
4.6 Exercises 92
Part II Linear Learning 97
5 Linear Regression 99
5.1 Introduction 99
5.2 Least Squares Linear Regression 99
5.3 Least Absolute Deviations 108
5.4 Regression Quality Metrics 111
5.5 Weighted Regression 113
5.6 Multi-Output Regression 116
5.7 Conclusion 120
5.8 Exercises 121
5.9 Endnotes 124
6 Linear Two-Class Classification 125
6.1 Introduction 125
6.2 Logistic Regression and the Cross Entropy Cost 125
6.3 Logistic Regression and the Softmax Cost 135
6.4 The Perceptron 140
6.5 Support Vector Machines 150
6.6 Which Approach Produces the Best Results? 157
6.7 The Categorical Cross Entropy Cost 158
6.8 Classification Quality Metrics 160
6.9 Weighted Two-Class Classification 167
6.10 Conclusion 170
6.11 Exercises 171
7 Linear Multi-Class Classification 174
7.1 Introduction 174
7.2 One-versus-All Multi-Class Classification 174
7.3 Multi-Class Classification and the Perceptron 184
7.4 Which Approach Produces the Best Results? 192
7.5 The Categorical Cross Entropy Cost Function 193
7.6 Classification Quality Metrics 198
7.7 Weighted Multi-Class Classification 202
7.8 Stochastic and Mini-Batch Learning 203
7.9 Conclusion 205
7.10 Exercises 205

8 Linear Unsupervised Learning 208

8.1 Introduction 208

8.2 Fixed Spanning Sets, Orthonormality, and Projections 208

8.3 The Linear Autoencoder and Principal Component Analysis 213

8.4 Recommender Systems 219

8.5 K-Means Clustering 221

8.6 General Matrix Factorization Techniques 227

8.7 Conclusion 230

8.8 Exercises 231

8.9 Endnotes 233

9 Feature Engineering and Selection 237

9.1 Introduction 237

9.2 Histogram Features 238

9.3 Feature Scaling via Standard Normalization 249

9.4 Imputing Missing Values in a Dataset 254

9.5 Feature Scaling via PCA-Sphering 255

9.6 Feature Selection via Boosting 258

9.7 Feature Selection via Regularization 264

9.8 Conclusion 269

9.9 Exercises 269

Part III Nonlinear Learning 273

10 Principles of Nonlinear Feature Engineering 275

10.1 Introduction 275

10.2 Nonlinear Regression 275

10.3 Nonlinear Multi-Output Regression 282

10.4 Nonlinear Two-Class Classification 286

10.5 Nonlinear Multi-Class Classification 290

10.6 Nonlinear Unsupervised Learning 294

10.7 Conclusion 298

10.8 Exercises 298

11 Principles of Feature Learning 304

11.1 Introduction 304

11.2 Universal Approximators 307

11.3 Universal Approximation of Real Data 323


11.4 Naive Cross-Validation 335

11.5 Efficient Cross-Validation via Boosting 340

11.6 Efficient Cross-Validation via Regularization 350

11.7 Testing Data 361

11.8 Which Universal Approximator Works Best in Practice? 365

11.9 Bagging Cross-Validated Models 366



11.10 K-Fold Cross-Validation 373


11.11 When Feature Learning Fails 378
11.12 Conclusion 379
11.13 Exercises 380
12 Kernel Methods 383
12.1 Introduction 383
12.2 Fixed-Shape Universal Approximators 383
12.3 The Kernel Trick 386
12.4 Kernels as Measures of Similarity 396
12.5 Optimization of Kernelized Models 397
12.6 Cross-Validating Kernelized Learners 398
12.7 Conclusion 399
12.8 Exercises 399
13 Fully Connected Neural Networks 403
13.1 Introduction 403
13.2 Fully Connected Neural Networks 403
13.3 Activation Functions 424
13.4 The Backpropagation Algorithm 427
13.5 Optimization of Neural Network Models 428
13.6 Batch Normalization 430
13.7 Cross-Validation via Early Stopping 438
13.8 Conclusion 440
13.9 Exercises 441
14 Tree-Based Learners 443
14.1 Introduction 443
14.2 From Stumps to Deep Trees 443
14.3 Regression Trees 446
14.4 Classification Trees 452
14.5 Gradient Boosting 458
14.6 Random Forests 462
14.7 Cross-Validation Techniques for Recursively Defined Trees 464
14.8 Conclusion 467
14.9 Exercises 467
Part IV Appendices 471
Appendix A Advanced First- and Second-Order Optimization Methods 473
A.1 Introduction 473
A.2 Momentum-Accelerated Gradient Descent 473
A.3 Normalized Gradient Descent 478
A.4 Advanced Gradient-Based Methods 485

A.5 Mini-Batch Optimization 487

A.6 Conservative Steplength Rules 490

A.7 Newton’s Method, Regularization, and Nonconvex Functions 499

A.8 Hessian-Free Methods 502

Appendix B Derivatives and Automatic Differentiation 511

B.1 Introduction 511

B.2 The Derivative 511

B.3 Derivative Rules for Elementary Functions and Operations 514

B.4 The Gradient 516

B.5 The Computation Graph 517

B.6 The Forward Mode of Automatic Differentiation 520

B.7 The Reverse Mode of Automatic Differentiation 526

B.8 Higher-Order Derivatives 529

B.9 Taylor Series 531

B.10 Using the autograd Library 536

Appendix C Linear Algebra 546

C.1 Introduction 546

C.2 Vectors and Vector Operations 546

C.3 Matrices and Matrix Operations 553

C.4 Eigenvalues and Eigenvectors 556

C.5 Vector and Matrix Norms 559

References 564

Index 569
Preface

For eons we humans have sought out rules or patterns that accurately describe how important systems in the world around us work, whether these systems be agricultural, biological, physical, financial, etc. We do this because such rules allow us to understand a system better, accurately predict its future behavior and ultimately, control it. However, the process of finding the "right" rule that seems to govern a given system has historically been no easy task. For most of our history data (glimpses of a given system at work) has been an extremely scarce commodity. Moreover, our ability to compute, to try out various rules to see which most accurately represents a phenomenon, has been limited to what we could accomplish by hand. Both of these factors naturally limited the range of phenomena scientific pioneers of the past could investigate and inevitably forced them to use philosophical and/or visual approaches to rule-finding. Today, however, we live in a world awash in data, and have colossal computing power at our fingertips. Because of this, we lucky descendants of the great pioneers can tackle a much wider array of problems and take a much more empirical approach to rule-finding than our forebears could. Machine learning, the topic of this textbook, is a term used to describe a broad (and growing) collection of pattern-finding algorithms designed to properly identify system rules empirically and by leveraging our access to potentially enormous amounts of data and computing power.

In the past decade the user base of machine learning has grown dramatically. From a relatively small circle in computer science, engineering, and mathematics departments the users of machine learning now include students and researchers from every corner of the academic universe, as well as members of industry, data scientists, entrepreneurs, and machine learning enthusiasts. This textbook is the result of a complete tearing down of the standard curriculum of machine learning into its most fundamental components, and a curated reassembly of those pieces (painstakingly polished and organized) that we feel will most benefit this broadening audience of learners. It contains fresh and intuitive yet rigorous descriptions of the most fundamental concepts necessary to conduct research, build products, and tinker.



Book Overview
The second edition of this text is a complete revision of our first endeavor, with virtually every chapter of the original rewritten from the ground up and eight new chapters of material added, doubling the size of the first edition. Topics from the first edition, from expositions on gradient descent to those on One-versus-All classification and Principal Component Analysis, have been reworked and polished. A swath of new topics have been added throughout the text, from derivative-free optimization to weighted supervised learning, feature selection, nonlinear feature engineering, boosting-based cross-validation, and more.

While heftier in size, the intent of our original attempt has remained unchanged: to explain machine learning, from first principles to practical implementation, in the simplest possible terms. A big-picture breakdown of the second edition text follows below.

Part I: Mathematical Optimization (Chapters 2–4)

Mathematical optimization is the workhorse of machine learning, powering not only the tuning of individual machine learning models (introduced in Part II) but also the framework by which we determine appropriate models themselves via cross-validation (discussed in Part III of the text).

In this first part of the text we provide a complete introduction to mathematical optimization, from basic zero-order (derivative-free) methods detailed in Chapter 2 to fundamental and advanced first-order and second-order methods in Chapters 3 and 4, respectively. More specifically this part of the text contains complete descriptions of local optimization, random search methodologies, gradient descent, and Newton's method.

Part II: Linear Learning (Chapters 5–9)


In this part of the text we describe the fundamental components of cost function

based machine learning, with an emphasis on linear models.

This includes a complete description of supervised learning in Chapters 5–7

including linear regression, two-class, and multi-class classification. In each of

these chapters we describe a range of perspectives and popular design choices

made when building supervised learners.

In Chapter 8 we similarly describe unsupervised learning, and Chapter 9 con-

tains an introduction to fundamental feature engineering practices including pop-

ular histogram features as well as various input normalization schemes, and

feature selection paradigms.



Part III: Nonlinear Learning (Chapters 10–14)

In the final part of the text we extend the fundamental paradigms introduced in Part II to the general nonlinear setting.

We do this carefully beginning with a basic introduction to nonlinear supervised and unsupervised learning in Chapter 10, where we introduce the motivation, common terminology, and notation of nonlinear learning used throughout the remainder of the text.

In Chapter 11 we discuss how to automate the selection of appropriate nonlinear models, beginning with an introduction to universal approximation. This naturally leads to detailed descriptions of cross-validation, as well as boosting, regularization, ensembling, and K-folds cross-validation.

With these fundamental ideas in hand, in Chapters 12–14 we then dedicate an individual chapter to each of the three popular universal approximators used in machine learning: fixed-shape kernels, neural networks, and trees, where we discuss the strengths, weaknesses, technical eccentricities, and usages of each popular universal approximator.

To get the most out of this part of the book we strongly recommend that Chapter 11 and the fundamental ideas therein are studied and understood before moving on to Chapters 12–14.

Part IV: Appendices

This shorter set of appendix chapters provides a complete treatment of advanced optimization techniques, as well as a thorough introduction to a range of subjects that readers will need to understand in order to make full use of the text.

Appendix A continues our discussion from Chapters 3 and 4, and describes advanced first- and second-order optimization techniques. This includes a discussion of popular extensions of gradient descent, including mini-batch optimization, momentum acceleration, gradient normalization, and the result of combining these enhancements in various ways (producing, e.g., the RMSProp and Adam first-order algorithms) – and Newton's method – including regularization schemes and Hessian-free methods.

Appendix B contains a tour of computational calculus including an introduction to the derivative/gradient, higher-order derivatives, the Hessian matrix, numerical differentiation, forward and backward (backpropagation) automatic differentiation, and Taylor series approximations.

Appendix C provides a suitable background in linear and matrix algebra, including vector/matrix arithmetic, the notions of spanning sets and orthogonality, as well as eigenvalues and eigenvectors.



Readers: How To Use This Book

This textbook was written with first-time learners of the subject in mind, as well as for more knowledgeable readers who yearn for a more intuitive and serviceable treatment than what is currently available today. To make full use of the text one needs only a basic understanding of vector algebra (mathematical functions, vector arithmetic, etc.) and computer programming (for example, basic proficiency with a dynamically typed language like Python). We provide complete introductory treatments of other prerequisite topics including linear algebra, vector calculus, and automatic differentiation in the appendices of the text. Example "roadmaps," shown in Figures 0.1–0.4, provide suggested paths for navigating the text based on a variety of learning outcomes and university courses (ranging from a course on the essentials of machine learning to special topics – as described further under "Instructors: How To Use This Book" below).

We believe that intuitive leaps precede intellectual ones, and to this end defer the use of probabilistic and statistical views of machine learning in favor of a fresh and consistent geometric perspective throughout the text. We believe that this perspective not only permits a more intuitive understanding of individual concepts in the text, but also that it helps establish revealing connections between ideas often regarded as fundamentally distinct (e.g., the logistic regression and Support Vector Machine classifiers, kernels and fully connected neural networks, etc.). We also highly emphasize the importance of mathematical optimization in our treatment of machine learning. As detailed in the "Book Overview" section above, optimization is the workhorse of machine learning and is fundamental at many levels – from the tuning of individual models to the general selection of appropriate nonlinearities via cross-validation. Because of this a strong understanding of mathematical optimization is requisite if one wishes to deeply understand machine learning, and if one wishes to be able to implement fundamental algorithms.

To this end, we place significant emphasis on the design and implementation of algorithms throughout the text with implementations of fundamental algorithms given in Python. These fundamental examples can then be used as building blocks for the reader to help complete the text's programming exercises, allowing them to "get their hands dirty" and "learn by doing," practicing the concepts introduced in the body of the text. While in principle any programming language can be used to complete the text's coding exercises, we highly recommend using Python for its ease of use and large support community. We also recommend using the open-source Python libraries NumPy, autograd, and matplotlib, as well as the Jupyter notebook editor to make implementing and testing code easier. A complete set of installation instructions, datasets, as well as starter notebooks for many exercises can be found at

https://github.com/jermwatt/machine_learning_refined

Instructors: How To Use This Book

Chapter slides associated with this textbook, datasets, along with a large array of instructional interactive Python widgets illustrating various concepts throughout the text, can be found on the github repository accompanying this textbook at

https://github.com/jermwatt/machine_learning_refined

This site also contains instructions for installing Python as well as a number of other free packages that students will find useful in completing the text's exercises.

This book has been used as a basis for a number of machine learning courses at Northwestern University, ranging from introductory courses suitable for undergraduate students to more advanced courses on special topics focusing on optimization and deep learning for graduate students. With its treatment of foundations, applications, and algorithms this text can be used as a primary resource or as a fundamental component for courses such as the following.

Machine learning essentials treatment: an introduction to the essentials of machine learning is ideal for undergraduate students, especially those in quarter-based programs and universities where a deep dive into the entirety of the book is not feasible due to time constraints. Topics for such a course can include: gradient descent, logistic regression, Support Vector Machines, One-versus-All and multi-class logistic regression, Principal Component Analysis, K-means clustering, the essentials of feature engineering and selection, cross-validation, regularization, ensembling, bagging, kernel methods, fully connected neural networks, and trees. A recommended roadmap for such a course – including recommended chapters, sections, and corresponding topics – is shown in Figure 0.1.

Machine learning full treatment: a standard machine learning course based on this text expands on the essentials course outlined above both in terms of breadth and depth. In addition to the topics mentioned in the essentials course, instructors may choose to cover Newton's method, Least Absolute Deviations, multi-output regression, weighted regression, the Perceptron, the Categorical Cross Entropy cost, weighted two-class and multi-class classification, online learning, recommender systems, matrix factorization techniques, boosting-based feature selection, universal approximation, gradient boosting, random forests, as well as a more in-depth treatment of fully connected neural networks involving topics such as batch normalization and early-stopping-based regularization. A recommended roadmap for such a course – including recommended chapters, sections, and corresponding topics – is illustrated in Figure 0.2.

Mathematical optimization for machine learning and deep learning: such a course entails a comprehensive description of zero-, first-, and second-order optimization techniques from Part I of the text (as well as Appendix A) including: coordinate descent, gradient descent, Newton's method, quasi-Newton methods, stochastic optimization, momentum acceleration, fixed and adaptive steplength rules, as well as advanced normalized gradient descent schemes (e.g., Adam and RMSProp). These can be followed by an in-depth description of the feature engineering processes (especially standard normalization and PCA-sphering) that speed up (particularly first-order) optimization algorithms. All students in general, and those taking an optimization for machine learning course in particular, should appreciate the fundamental role optimization plays in identifying the "right" nonlinearity via the processes of boosting and regularization based cross-validation, the principles of which are covered in Chapter 11. Select topics from Chapter 13 and Appendix B – including backpropagation, batch normalization, and forward/backward mode of automatic differentiation – can also be covered. A recommended roadmap for such a course – including recommended chapters, sections, and corresponding topics – is given in Figure 0.3.

Introductory portion of a course on deep learning: such a course is best suited for students who have had prior exposure to fundamental machine learning concepts, and can begin with a discussion of appropriate first-order optimization techniques, with an emphasis on stochastic and mini-batch optimization, momentum acceleration, and normalized gradient schemes such as Adam and RMSProp. Depending on the audience, a brief review of fundamental elements of machine learning may be needed using selected portions of Part II of the text. A complete discussion of fully connected networks, including a discussion of backpropagation and forward/backward mode of automatic differentiation, as well as special topics like batch normalization and early-stopping-based cross-validation, can then be made using Chapters 11, 13, and Appendices A and B of the text. A recommended roadmap for such a course – including recommended chapters, sections, and corresponding topics – is shown in Figure 0.4. Additional recommended resources on topics to complete a standard course on deep learning – like convolutional and recurrent networks – can be found by visiting the text's github repository.



Figure 0.1 Recommended study roadmap for a course on the essentials of machine learning, including requisite chapters (left column), sections (middle column), and corresponding topics (right column). This essentials plan is suitable for time-constrained courses (in quarter-based programs and universities) or self-study, or where machine learning is not the sole focus but a key component of some broader course of study. Note that chapters are grouped together visually based on text layout detailed under "Book Overview" in the Preface. See the section titled "Instructors: How To Use This Book" in the Preface for further details.



Figure 0.2 Recommended study roadmap for a full treatment of standard machine learning subjects, including chapters, sections, as well as corresponding topics to cover. This plan entails a more in-depth coverage of machine learning topics compared to the essentials roadmap given in Figure 0.1, and is best suited for senior undergraduate/early graduate students in semester-based programs and passionate independent readers. See the section titled "Instructors: How To Use This Book" in the Preface for further details.

Figure 0.3 Recommended study roadmap for a course on mathematical optimization for machine learning and deep learning, including chapters, sections, as well as topics to cover. See the section titled "Instructors: How To Use This Book" in the Preface for further details.

Figure 0.4 Recommended study roadmap for an introductory portion of a course on deep learning, including chapters, sections, as well as topics to cover. See the section titled "Instructors: How To Use This Book" in the Preface for further details.
Acknowledgements

This text could not have been written in anything close to its current form without the enormous work of countless genius-angels in the Python open-source community, particularly authors and contributors of NumPy, Jupyter, and matplotlib. We are especially grateful to the authors and contributors of autograd including Dougal Maclaurin, David Duvenaud, Matt Johnson, and Jamie Townsend, as autograd allowed us to experiment and iterate on a host of new ideas included in the second edition of this text that greatly improved it as well as, we hope, the learning experience for its readers.

We are also very grateful for the many students over the years that provided insightful feedback on the content of this text, with special thanks to Bowen Tian who provided copious amounts of insightful feedback on early drafts of the work.

Finally, a big thanks to Mark McNess Rosengren and the entire Standing Passengers crew for helping us stay caffeinated during the writing of this text.
1 Introduction to Machine Learning

1.1 Introduction

Machine learning is a unified algorithmic framework designed to identify computational models that accurately describe empirical data and the phenomena underlying it, with little or no human involvement. While still a young discipline with much more awaiting discovery than is currently known, today machine learning can be used to teach computers to perform a wide array of useful tasks including automatic detection of objects in images (a crucial component of driver-assisted and self-driving cars), speech recognition (which powers voice command technology), knowledge discovery in the medical sciences (used to improve our understanding of complex diseases), and predictive analytics (leveraged for sales and economic forecasting), to just name a few.

In this chapter we give a high-level introduction to the field of machine learning as well as the contents of this textbook.

1.2 Distinguishing Cats from Dogs: a Machine Learning Approach

To get a big-picture sense of how machine learning works, we begin by discussing a toy problem: teaching a computer how to distinguish between pictures of cats and those of dogs. This will allow us to informally describe the terminology and procedures involved in solving the typical machine learning problem.

Do you recall how you first learned about the difference between cats and dogs, and how they are different animals? The answer is probably no, as most humans learn to perform simple cognitive tasks like this very early on in the course of their lives. One thing is certain, however: young children do not need some kind of formal scientific training, or a zoological lecture on felis catus and canis familiaris species, in order to be able to tell cats and dogs apart. Instead, they learn by example. They are naturally presented with many images of what they are told by a supervisor (a parent, a caregiver, etc.) are either cats or dogs, until they fully grasp the two concepts. How do we know when a child can successfully distinguish between cats and dogs? Intuitively, when they encounter new (images of) cats and dogs, and can correctly identify each new example or, in other words, when they can generalize what they have learned to new, previously unseen, examples.

Like human beings, computers can be taught how to perform this sort of task in a similar manner. This kind of task where we aim to teach a computer to distinguish between different types or classes of things (here cats and dogs) is referred to as a classification problem in the jargon of machine learning, and is done through a series of steps which we detail below.

1. Data collection. Like human beings, a computer must be trained to recognize the difference between these two types of animals by learning from a batch of examples, typically referred to as a training set of data. Figure 1.1 shows such a training set consisting of a few images of different cats and dogs. Intuitively, the larger and more diverse the training set the better a computer (or human) can perform a learning task, since exposure to a wider breadth of examples gives the learner more experience.

Figure 1.1 A training set consisting of six images of cats (highlighted in blue) and six images of dogs (highlighted in red). This set is used to train a machine learning model that can distinguish between future images of cats and dogs. The images in this figure were taken from [1].

2. Feature design. Think for a moment about how we (humans) tell the difference between images containing cats from those containing dogs. We use color, size, the shape of the ears or nose, and/or some combination of these features in order to distinguish between the two. In other words, we do not just look at an image as simply a collection of many small square pixels. We pick out grosser details, or features, from images like these in order to identify what it is that we are looking at. This is true for computers as well. In order to successfully train a computer to perform this task (and any machine learning task more generally) we need to provide it with properly designed features or, ideally, have it find or learn such features itself.

Designing quality features is typically not a trivial task as it can be very application dependent. For instance, a feature like color would be less helpful in discriminating between cats and dogs (since many cats and dogs share similar hair colors) than it would be in telling grizzly bears and polar bears apart! Moreover, extracting the features from a training dataset can also be challenging. For example, if some of our training images were blurry or taken from a perspective where we could not see the animal properly, the features we designed might not be properly extracted.

However, for the sake of simplicity with our toy problem here, suppose we can easily extract the following two features from each image in the training set: size of nose relative to the size of the head, ranging from small to large, and shape of ears, ranging from round to pointy.


Figure 1.2 Feature space representation of the training set shown in Figure 1.1 where the horizontal and vertical axes represent the features nose size and ear shape, respectively. The fact that the cats and dogs from our training set lie in distinct regions of the feature space reflects a good choice of features.

Examining the training images shown in Figure 1.1, we can see that all cats have small noses and pointy ears, while dogs generally have large noses and round ears. Notice that with the current choice of features each image can now be represented by just two numbers: a number expressing the relative nose size, and another number capturing the pointiness or roundness of the ears. In other words, we can represent each image in our training set in a two-dimensional feature space where the features nose size and ear shape are the horizontal and vertical coordinate axes, respectively, as illustrated in Figure 1.2.
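
To make this concrete, the short sketch below builds such a two-dimensional feature space in Python (the language used throughout the text). The feature values are hypothetical stand-ins for measurements extracted from the twelve training images, not data taken from the book.

    # A minimal sketch of the feature space in Figure 1.2, using made-up
    # (hypothetical) feature values for six cats and six dogs.
    import numpy as np
    import matplotlib.pyplot as plt

    # each row: (relative nose size, ear pointiness), both scaled to [0, 1]
    cats = np.array([[0.10, 0.90], [0.20, 0.80], [0.15, 0.95],
                     [0.25, 0.85], [0.10, 0.80], [0.20, 0.90]])
    dogs = np.array([[0.70, 0.20], [0.80, 0.30], [0.90, 0.25],
                     [0.75, 0.10], [0.85, 0.20], [0.80, 0.15]])

    plt.scatter(cats[:, 0], cats[:, 1], c="blue", label="cats")
    plt.scatter(dogs[:, 0], dogs[:, 1], c="red", label="dogs")
    plt.xlabel("nose size")
    plt.ylabel("ear shape")
    plt.legend()
    plt.show()

Because the two classes occupy distinct regions of this space, a simple line or curve can separate them, which is precisely what model training (discussed next) aims to find.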

3. Model training. With our feature representation of the training data the machine learning problem of distinguishing between cats and dogs is now a simple geometric one: have the machine find a line or a curve that separates the cats from the dogs in our carefully designed feature space. Supposing for simplicity that we use a line, we must find the right values for its two parameters – a slope and vertical intercept – that define the line's orientation in the feature space. The process of determining proper parameters relies on a set of tools known as mathematical optimization detailed in Chapters 2 through 4 of this text, and the tuning of such a set of parameters to a training set is referred to as the training of a model.

Figure 1.3 shows a trained linear model (in black) which divides the feature space into cat and dog regions. This linear model provides a simple computational rule for distinguishing between cats and dogs: when the feature representation of a future image lies above the line (in the blue region) it will be considered a cat by the machine, and likewise any representation that falls below the line (in the red region) will be considered a dog.


Figure 1.3 A trained linear model (shown in black) provides a computational rule for distinguishing between cats and dogs. Any new image received in the future will be classified as a cat if its feature representation lies above this line (in the blue region), and a dog if the feature representation lies below this line (in the red region).
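
This rule is simple enough to express in a few lines of code. The sketch below assumes a line with hypothetical slope and intercept values standing in for trained parameters; it illustrates the rule itself, not the book's implementation.

    # The classification rule described above: call a point a cat if it
    # lies above the trained line, a dog if it lies below. The slope and
    # intercept are hypothetical stand-ins for trained parameters.
    def predict(nose_size, ear_shape, slope=-1.0, intercept=1.0):
        line_height = slope * nose_size + intercept
        return "cat" if ear_shape > line_height else "dog"

    print(predict(0.15, 0.90))  # small nose, pointy ears -> "cat"
    print(predict(0.80, 0.20))  # large nose, round ears  -> "dog"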

Figure 1.4 A validation set of cat and dog images (also taken from [1]). Notice that the images in this set are not highlighted in red or blue (as was the case with the training set shown in Figure 1.1) indicating that the true identity of each image is not revealed to the learner. Notice that one of the dogs, the Boston terrier in the bottom right corner, has both a small nose and pointy ears. Because of our chosen feature representation the computer will think this is a cat!

4. Model validation. To validate the efficacy of our trained learner we now show the computer a batch of previously unseen images of cats and dogs, referred to generally as a validation set of data, and see how well it can identify the animal in each image. In Figure 1.4 we show a sample validation set for the problem at hand, consisting of three new cat and dog images. To do this, we take each new image, extract our designed features (i.e., nose size and ear shape), and simply check which side of our line (or classifier) the feature representation falls on. In this instance, as can be seen in Figure 1.5, all of the new cats and all but one dog from the validation set have been identified correctly by our trained model.

The misidentification of the single dog (a Boston terrier) is largely the result of our choice of features, which we designed based on the training set in Figure 1.1, and to some extent our decision to use a linear model (instead of a nonlinear one). This dog has been misidentified simply because its features, a small nose and pointy ears, match those of the cats from our training set. Therefore, while it first appeared that a combination of nose size and ear shape could indeed distinguish cats from dogs, we now see through validation that our training set was perhaps too small and not diverse enough for this choice of features to be completely effective in general.

We can take a number of steps to improve our learner. First and foremost we should collect more data, forming a larger and more diverse training set. Second, we can consider designing/including more discriminating features (perhaps eye color, tail shape, etc.) that further help distinguish cats from dogs using a linear model. Finally, we can also try out (i.e., train and validate) an array of nonlinear models with the hopes that a more complex rule might better distinguish between cats and dogs. Figure 1.6 compactly summarizes the four steps involved in solving our toy cat-versus-dog classification problem.



Figure 1.5 Identification of (the feature representation of) validation images using our trained linear model. The Boston terrier (pointed to by an arrow) is misclassified as a cat since it has pointy ears and a small nose, just like the cats in our training set.

[Figure 1.6 pipeline: data collection → feature design → model training → model validation]

Figure 1.6 The schematic pipeline of our toy cat-versus-dog classification problem. The same general pipeline is used for essentially all machine learning problems.

1.3 The Basic Taxonomy of Machine Learning Problems

The sort of computational rules we can learn using machine learning generally fall into two main categories called supervised and unsupervised learning, which we discuss next.

1.3.1 Supervised learning

Supervised learning problems (like the prototypical problem outlined in Section 1.2) refer to the automatic learning of computational rules involving input/output relationships. Applicable to a wide array of situations and data types, this type of problem comes in two forms, called regression and classification, depending on the general numerical form of the output.

Regression
Suppose we wanted to predict the share price of a company that is about to go public. Following the pipeline discussed in Section 1.2, we first gather a training set of data consisting of a number of corporations (preferably active in the same domain) with known share prices. Next, we need to design feature(s) that are thought to be relevant to the task at hand. The company's revenue is one such potential feature, as we can expect that the higher the revenue the more expensive a share of stock should be. To connect the share price (output) to the revenue (input) we can train a simple linear model or regression line using our training data.
Figure 1.7 (top-left panel) A toy training dataset consisting of ten corporations' share price and revenue values. (top-right panel) A linear model is fit to the data. This trend line models the overall trajectory of the points and can be used for prediction in the future as shown in the bottom-left and bottom-right panels.

The top panels of Figure 1.7 show a toy dataset comprising share price versus revenue information for ten companies, as well as a linear model fit to this data. Once the model is trained, the share price of a new company can be predicted based on its revenue, as depicted in the bottom panels of this figure. Finally, comparing the predicted price to the actual price for a validation set of data we can test the performance of our linear regression model and apply changes as needed, for example, designing new features (e.g., total assets, total equity, number of employees, years active, etc.) and/or trying more complex nonlinear models.
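
The sketch below shows what fitting such a regression line can look like in code, using ordinary least squares via NumPy. The revenue and share-price figures are hypothetical, invented only to stand in for the toy dataset of Figure 1.7.

    # A minimal least-squares fit of a line to (revenue, share price) pairs.
    # The numbers are hypothetical, not taken from the figure.
    import numpy as np

    revenue = np.array([1.0, 2.0, 3.0, 4.0, 5.0])            # e.g., in billions
    share_price = np.array([10.0, 19.0, 31.0, 42.0, 48.0])   # e.g., in dollars

    # fit a degree-one polynomial: share_price ~ slope * revenue + intercept
    slope, intercept = np.polyfit(revenue, share_price, 1)

    # predict the share price of a new company from its revenue
    new_revenue = 3.5
    print(slope * new_revenue + intercept)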

This sort of task, i.e., fitting a model to a set of training data so that predictions about a continuous-valued output (here, share price) can be made, is referred to as regression. We begin our detailed discussion of regression in Chapter 5 with the linear case, and move to nonlinear models starting in Chapter 10 and throughout Chapters 11–14. Below we describe several additional examples of regression to help solidify this concept.

Example 1.1 The rise of student loan debt in the United States

Figure 1.8 (data taken from [2]) shows the total student loan debt (that is, money borrowed by students to pay for college tuition, room and board, etc.) held by citizens of the United States from 2006 to 2014, measured quarterly. Over the eight-year period reflected in this plot the student debt has nearly tripled, totaling over one trillion dollars by the end of 2014. The regression line (in black) fits this dataset quite well and, with its sharp positive slope, emphasizes the point that student debt is rising dangerously fast. Moreover, if this trend continues, we can use the regression line to predict that total student debt will surpass two trillion dollars by the year 2026 (we revisit this problem later in Exercise 5.1).

Figure 1.8 Figure associated with Example 1.1, illustrating total student loan debt in the United States (in trillions of dollars) measured quarterly from 2006 to 2014. The rapid increase rate of the debt, measured by the slope of the trend line fit to the data, confirms that student debt is growing very fast. See text for further details.
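
As a quick illustration of this kind of extrapolation, once a trend line of the form debt = slope × year + intercept has been fit, predictions follow by evaluating it at future years. The parameter values below are hypothetical, chosen only to mimic the roughly linear growth visible in Figure 1.8, not fit to the actual data.

    # Extrapolating a fitted trend line, as in Example 1.1. The slope and
    # intercept are hypothetical stand-ins for values fit to the real data.
    slope, intercept = 0.08, -159.7  # trillions of dollars per year (assumed)

    def predicted_debt(year):
        return slope * year + intercept

    print(predicted_debt(2014))  # roughly 1.4 trillion
    print(predicted_debt(2026))  # roughly 2.4 trillion, past two trillion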


Other documents randomly have
different content
I wished to go out again to look for my friend the bull elephant, but
I was unable to put my foot on the ground in consequence of my
injured instep. After our evening meal, which we had taken under the
trees outside the tent, George and I had an interesting chat with El
Hakim about elephant-hunting, upon which subject he was a
veritable mine of information. He had shot elephants persistently for
the previous four years in Somaliland, Galla-land, and the country
round Lake Rudolph, having killed over 150, on one occasion
shooting twenty-one elephants in twenty-one days—a fairly good
record. Commenting on the size of the tusks obtainable in the
districts north of the Waso Nyiro River, he mentioned that his largest
pair weighed just over 218 lbs., and measured 9 feet in length.
Naturally, exciting incidents, when in pursuit of his favourite quarry,
were numerous. Once he sighted a solitary bull feeding in the open
plain some little distance away from his camp. Snatching up an 8-
bore rifle and two or three cartridges, he started in pursuit. On
proceeding to load his weapon, he found that in his hurry he had
brought away the wrong cartridges! They were by a different maker
than those usually used in the rifle, and there was a slight difference
in the turning of the flange, which caused them to jam a little. He
forced them in, and, by an exercise of strength, closed the breech.
After a careful stalk he reached a favourable position for a shot,
and, taking aim, banged off. The rifle exploded with a terrific report,
the barrels blowing off in his hands, fortunately without doing him
any injury—the explosion of 10 drams of powder being too much for
the incompletely closed breech-locking grip. There was El Hakim
with the butt of his rifle in his hand and the barrels in the other,
vaguely wondering in what manner the beast would kill him, and, no
doubt, feeling very much de trop. The elephant, who was hit in the
shoulder, turned towards him, and, after regarding him with a
prolonged stare, turned away again, and moved slowly off as if a
bullet in the shoulder was of little or no consequence, leaving his
discomfited assailant considerably relieved.
Another time he took the same 8-bore—which, by the way, had not
been repaired—and started in pursuit of a herd of elephants. He
loaded the weapons, and, after closing the breech, bound it round
and round very tightly with a leather bootlace. On the first discharge,
stock and barrels again parted company; whereupon he handed the
useless weapon to one of his bearers, and, taking an old Martini in
exchange, rushed off after the herd, and bagged three more
elephants.
In Somaliland, one of the favourite amusements of his party was
riding out, mounted on light Somali ponies, to bait wild elephants.
Their shikaries would perhaps locate a couple of the animals in a
small clump of trees, where they were resting during the heat of the
day. One of the party would then ride up and fire a pistol at one of
them. The result, of course, would be a scream of rage, and a
furious charge by the insulted animal. Horse and rider would at once
make themselves scarce. The elephant would seldom charge more
than 100 yards or so away from cover, but at that distance, or under,
would halt and then slowly return, thus giving another member of the
party a chance. With a wild shout another horse and rider would
gallop at full speed across the elephant’s path, just out of reach.
Round would come the huge beast in another attempt to put an end
to what it justly considered a nuisance—an attempt foredoomed to
failure. One after another the horsemen would gallop up to the now
thoroughly infuriated beast, shouting and firing pistols, provoking
ugly rushes first at one and then another of them—for all the world
like a lot of schoolboys playing touch. Sometimes one or other of
them had a narrow escape, but somebody would nip in at the critical
moment and divert the elephant’s attention. A slip or a fall would
have meant a horrible death from the feet and tusks of the enraged
pachyderm; but the ponies were as agile as their riders, and enjoyed
the fun every whit as much.
We had no ponies, and playing with elephants in that manner
would not have been sufficiently amusing when mounted on a mule,
which had a habit of violently shying whenever it was urged faster
than a moderate trot. El Hakim once had a very unpleasant
experience through this mule’s aggravating peculiarity. He was riding
ahead of the safari, when he noticed a herd of elephants feeding a
mile or so in front. Taking his rifle from the bearer, he trotted after
them. The elephants moved slowly on, and disappeared over a ridge
some distance ahead. El Hakim urged the mule faster, but, in spite of
his efforts, on gaining the top of the ridge, he had the mortification of
seeing his quarry moving off at an ever-increasing speed. Fearing
that he would lose them after all, he jammed his spurs into the mule,
and raced away down the slope for all he was worth.
It was fairly steep, the ground being covered with loose stones,
some of which, displaced by the mule’s hoofs, rolled and clattered
downhill after him, and so frightened the animal that she
incontinently bolted. El Hakim’s whole energies were now
concentrated on keeping his seat, his rifle, and his presence of mind.
Just as he felt that he was gradually succeeding in getting his
agitated steed under control, she shied at a clump of cactus, and
shot him clean out of the saddle, and over the cactus, into the
clinging embrace of a well-developed wait-a-bit thorn which was
growing on the other side. When the men had finally cut him out, he
had quite given up the idea of shooting elephants that day, turning
his attention instead to his numerous abrasions. Besides, the
elephants were by that time miles away.
After the evening meal, when we generally sat in front of the
camp-fire smoking, George and I used, figuratively speaking, to sit at
the feet of El Hakim and listen for hours to his yarns of elephant-
hunting. It was very seldom we could get him to speak about his
experiences, but when in the mood to talk, his tales were well worth
listening to.
We had some hazy idea that elephants were shot at something
like a hundred yards’ range with a powerful large-bore rifle, which
mortally wounded them at the first discharge. Once I asked El
Hakim, off-hand, at what range he generally killed his elephants.
“Oh,” he replied, “anything from five to twenty yards!” and went on
to explain that it was much safer to shoot big game at short range.
“Always stalk your beast carefully,” said he, “and get close enough
to be certain of your shot; then hit him hard in the right place, and
there you are!”
It certainly sounded very simple, and I must say that El Hakim puts
his own precepts into practice with conspicuous success; but a
beginner does not find it so very easy. The temptation to fire at say
eighty or a hundred yards, is well-nigh irresistible. It seems so much
safer, though in reality it is much more dangerous—a fact which is
rather difficult of assimilation by the novice.
“Besides,” El Hakim would remark in conclusion, with the air of
one propounding an unanswerable argument, “it is more
sportsmanlike.”
Another advantage of the short-range shot is this: Suppose a herd
of elephants is located. If the conditions of wind, etc., are favourable,
one can, with ordinary care, get right up to them, near enough to pick
out the finest pair of tusks, and drop their owner with a bullet through
the brain. If a ·303 is used there is no smoke, while it makes a
comparatively small report, which is most likely attributed by the rest
of the herd to the effect, and not the cause, of their comrade’s fall. A
second and even a third elephant can often be obtained under these
circumstances, before the herd realizes what is happening and
stampedes.
This rule of careful stalking till near enough to make the result of
the shot certain holds good with all big game, though there are
certain other factors to be considered, such as the angle to your line
of sight at which the beast aimed at is standing, and also light, etc.
One can go into any club or hotel billiard-room in those parts of
Africa where big game is to be found, and listen to conversation on,
say, lion-shooting. The chances are that nine out of ten men present
have “had a shot at a lion;” but only a very small percentage have
actually bagged their beast. In these days of small-bore, high-power
rifles, a man can shoot at a stray lion at six hundred yards, and he
may be lucky enough to wound it or even, perhaps, kill it; but surely
that is not “playing the game.”
On the afternoon of the day after the Somalis left for the Waso
Nyiro, N’Dominuki came into camp with a chief named “Karama,”
who wished to make “muma,” or blood-brotherhood, with me, to
which I consented. It was rather a long affair. They brought a sheep
with them, which was killed, and the liver cut out and toasted.
Karama and I then squatted on the ground facing each other, while
our men on the one side, and Karama’s friends on the other, formed
a circle round us. A spear and a rifle were then crossed over our
heads, and N’Dominuki, as master of the ceremonies, then took a
knife and sharpened it alternately on the spear-blade and the gun-
barrel, reciting the oath of “muma” meanwhile. It was a long,
rambling kind of oath, amounting in fact to an offensive and
defensive alliance, with divers pains and penalties attached, which
came into operation in the event of either or both the blood-brothers
breaking the said oath. At the conclusion of N’Dominuki’s speech the
assembled spectators shouted the words “Orioi muma” three times.
Three incisions were then made in my chest, just deep enough to
allow the blood to flow, and a similar operation was performed on
Karama. N’Dominuki then ordered the toasted sheep’s liver to be
brought, which, on its arrival, was cut into small pieces, and a piece
handed to both Karama and me. A further recitation of the penalties
of breaking the oath was made by N’Dominuki, and again the
spectators shouted “Orioi muma.” Karama and I then dipped our
pieces of liver in our own blood, and amid breathless silence
exchanged pieces and devoured them. This was repeated three
times to the accompaniment of renewed shouts from the spectators.
The remainder of the liver was then handed round to the witnesses,
who ate it, and the ceremony was concluded, it only remaining for
me to make my new blood-brother a present.
The next morning our final preparations were completed, and
N’Dominuki having come over early, we turned all the animals we
were leaving behind over to him. He bade us adieu, with a wish that
we might return safe and sound, and, what is more, he sincerely
meant what he said.
After leaving our late camp we plunged once again into the thorn
forest, which we soon crossed, emerging into the sparsely vegetated
highland I have mentioned before as extending to the northward.
The sun was very hot, and travelling slow and laborious, not so
much from the nature of the ground, perhaps, as from the soft
condition of the men after their long rest. The ground, nevertheless,
made walking a wearisome task, as the loose pebbles and quartz
blocks turned our ankles and bruised our shins.

[Illustration: THE AUTHOR MAKING BLOOD-BROTHERHOOD WITH KARAMA.]

[Illustration: THE “GREEN CAMP.” (See page 162.)]

After two hours’ toiling we found ourselves on the edge of the
tableland looking down a sharp declivity to the plain beneath, which
stretched out in desolate barrenness as far as the eye could reach. It
was a dreary khaki-coloured landscape, with peculiarly shaped hills
in the extreme background. In the middle distance were belts of
dusty-looking thorn trees, while here and there mounds of broken
lava reared up their ugly masses to add to the general air of
desolation. Somewhere ahead of us, about four days’ march, was
the Waso Nyiro; and beyond that lay the desert again, stretching
away up towards Lakes Rudolph and Stephanie, and thence onward
to the hills of Abyssinia and Somaliland. The country we should have
to cross in order to reach the Waso Nyiro was, as far as we knew,
waterless, with the exception of one tiny brook, which flowed
northward from M’thara, probably emptying itself into the Waso
Nyiro. We followed it, therefore, in all its multitudinous windings, as,
without it, we should have been in a sorry plight indeed.
As we descended to the plain the heat appreciably increased. We
met several rhinoceros on the road, but we discreetly left them to
their meditations. Apparently there had once been grass on the
plain, but it had been burnt, and during the passage of our safari a
fine, choking black dust arose, which, in combination with the dust
from the dry red soil, formed a horrible compound that choked up our
ears, eyes, noses, and throats in a most uncomfortable manner. For
four hours we marched, and then camped on the banks of the
stream.
Innumerable rhino tracks crossed in every direction, leading us to
suppose that we were camped at the place where the brutes usually
drank. George, hearing the shrill cries of some guinea-fowl from the
opposite bank, sallied forth with the shot-gun, and soon the sound of
many shots in quick succession showed that his energy was reaping
its reward. He returned presently with eight birds, which were a very
welcome addition to our larder.
We turned in early. During the night I was awakened by the sound
of torrents of rain beating down on the tent. I rose and looked
cautiously out. A noise from El Hakim’s tent at once attracted my
attention, and gazing in that direction I saw El Hakim himself, clad
only in a diminutive shirt, busily engaged in placing the ground-sheet
of his tent over the stacked loads. He was getting splashed
considerably. I did not disturb him, but retired once more to my
blankets, perfectly satisfied that the loads were being properly
looked after.
In the morning the sky was as clear as crystal, while the parched
earth showed no traces of the heavy shower that had fallen during
the night. We travelled over the same kind of country as that
traversed the day before, dry brown earth, burnt grass, and loose
stones being the most noticeable features, if I except the ubiquitous
rhinoceros, of which truly there were more than “a genteel
sufficiency.” In fact, they proved a terrible nuisance, as we had
sometimes to make long détours in order to avoid them. They were
not only capable of upsetting our safari, but seemed only too anxious
to do so. The men were mortally afraid of them, and much
preferred their room to their company.
After a couple of hours on the road we saw in the distance a large
swamp, which we had not previously noticed, surrounded for a
radius of a mile or so by thorn-bush, which grew a great deal thicker
than on other parts of the plain. The quantity of game we saw on the
road was simply incredible. Vast herds of oryx, zebra, and grantei,
roamed over the landscape; ostriches and giraffes were also in sight,
and, of course, rhinoceros. It is a sportsman’s paradise, and as yet,
with one or two exceptions, untouched.
When we reached the swamp the safari was halted to allow the
stragglers to come up. While waiting I saw something sticking out of
the grass a hundred yards away, to which I called El Hakim’s
attention. He observed it attentively through the binoculars for a
moment, and then turned to me with an exclamation of satisfaction,
softly observing, “Buffaloes, lying down.” Taking his ·450 express,
and motioning the few men with us to be silent, he started to stalk
them, followed by myself with the Martini rifle. We crawled down very
cautiously to leeward, and after half an hour’s careful stalking, during
which we advanced only fifty yards, we ensconced ourselves in a
favourable position in the reeds fringing the swamp. We were
considering the advisability of a further advance, when our fools of
men who had been in the rear reached the spot where we left the
others, and on learning that a whole herd of the dreaded “mbogo”
(buffalo) were in such close proximity, promptly climbed the adjacent
trees, from which safe and elevated position they carried on an
animated discourse on the merits of buffalo meat as an article of
diet. As a consequence we had the mortification of seeing the old
bull prick up his ears and listen, then slowly rise and sniff the air. The
indications were apparently unsatisfactory, for the whole herd rose
slowly to their feet, and, after a preliminary sniff, moved slowly off
over a rise in the ground, and out of range. Words would not express
our feelings!
El Hakim and I vehemently consigned our indiscreet followers to
the hottest possible place known to theology, but even that did not
comfort us. We decided not to give up, but to go on and follow the
herd, although it was extremely unlikely that they would allow us to
get within range, as the buffalo is a very keen beast, especially when
once alarmed. However, to our surprise and delight, we found, when
we had breasted the rise, that the herd (about thirty head) had halted
about two hundred yards away. We then noticed several very young
calves among them, which at once explained why they were so
deliberate in their movements. They were, however, on the look-out,
and directly we appeared they saw us. The cows with their calves
took up their station in the centre of the herd, while the bulls faced
outwards, something after the manner of soldiers forming square.
Most noble and majestic they appeared, with their huge, powerful
bodies and immense frontal development of horns. They had an air
of savage grandeur and ferocity about them that commanded my
highest admiration.
There were a few stunted thorn trees standing about, and we took
up a position behind one of them. As I have said, we were about two
hundred yards away, and as they showed no disposition to run, we
thought we might venture to walk boldly to another tree some
distance nearer to them. There was a certain amount of risk of being
charged in so doing, but we chanced it, and were perfectly
successful in our design, though our quarry were manifestly uneasy.
Sitting down, we waited patiently in the scorching sun for over an
hour, in order to let them settle down again, so that we might
approach still nearer. They gradually resumed their feeding, but not
without much sniffing of the air on the part of the bulls, coupled with
many suspicious glances in our direction.
El Hakim thought that the best thing to do would be for me to go to
another tree a hundred yards to the right. Once there we would both
crawl gradually within range, and then act as circumstances might
direct. I started off for the tree, and arrived without accident, although
the old bull, the guardian of the herd, sniffed severe disapproval.
They were evidently getting used to our presence, but it was highly
improbable they would tolerate our nearer approach, should they
observe it. We again waited, and then, watching El Hakim, I saw him
crawl stealthily on his stomach towards another tree fifty yards
nearer the herd. I followed suit on my side, suffering considerably in
so doing.
The vertical sun beat fiercely down, and, flattened out as I was, I
felt its full effects on my back, which was protected only by a flannel
shirt. The ground was covered with sharp pebbles and quartz
crystals; and the long sharp thorns, blown down from the trees,
pricked me cruelly, while I was tormented by a raging thirst. That fifty
yards’ crawl took us twenty minutes; it seemed an age. When I
arrived, panting and gasping, at my tree, I was bleeding freely from
numerous cuts and scratches on my chest, elbows, and knees.
However, we were now within easy range of the herd, and after
resting a few minutes to steady ourselves, we prepared for action.
Looking over to my left, I saw El Hakim raise his rifle, so, taking aim
at the largest bull I could pick out, I let drive, followed a fraction of a
second later by El Hakim. My beast jumped, staggered a few paces,
with the blood streaming in showers from his mouth and nostrils, and
then toppled over dead, shot through the lungs. El Hakim’s beast
also staggered a few paces and went down, evidently mortally
wounded. We had neither of us shot at the big bull, as at the moment
of firing he was behind some of the other animals. We had then two
magnificent beasts down, and did not want more, but the herd would
not move away. They smelt the two carcases, stamping and pawing
the ground, but did not budge an inch.
The big bull gazed round, seeking an assailant; but we were well
under cover. Suddenly he turned, exposing his shoulder. Two rifles
spoke simultaneously, but he did not go down. Once more we fired
together, and again he was struck, but still kept his legs. Yet again
we fired, and had the satisfaction of seeing him settle down on his
hind quarters. To our great delight, the herd then moved off, and we
were able to walk cautiously up to within ten feet of the big bull, as
he sat propped up on his fore legs, bellowing defiance. Such a
spectacle of impotent rage I had never previously witnessed. He
made most herculean efforts to rise, but being unable to do so, he
rolled his blood-shot eyes, while foam dripped from his massive
jaws. He was the very picture of helpless though majestic rage. I
took pity on the noble beast, and planted a Martini bullet in his neck,
smashing the spine, thereby finishing him for good and all. We
carefully examined him, and, as an instance of the splendid vitality of
an old bull buffalo, found all our six bullets planted in his left shoulder,
so close together that they could have been covered with an
ordinary-sized plate. Three were mine and three were El Hakim’s.
We could easily distinguish them, as, though our rifles were of the
same bore (·450), those from El Hakim’s Holland and Holland were
clean-cut and symmetrical, while my heavier Martini bullets,
propelled by half the charge of powder used by El Hakim, made a
more ragged hole. We tossed for the head, and I won.
Four men were required to carry it into camp, when it was severed
from the body. The horns were magnificently proportioned, and in
perfect condition. The horns of my first beast also were quite up to
the average.
As by this time it was long after midday (our stalk having lasted
three hours), we determined to camp near the edge of the swamp.
We dubbed it “Buffalo Camp,” and decided to stop there the next day
in order that the men might cut up the dead buffaloes and dry their
meat into biltong. We left their entrails where the beasts had been
shot, with men to protect them from the vultures till sundown, in the
hope that during the night they might attract lions.
Jumbi reported in the afternoon that two of the porters had
deserted on the road, and, worst of all, they were carrying, one a
load of food, and the other a load of the Venetian beads which were
to buy us food from the Rendili. We sent Jumbi with six men back to
endeavour to apprehend them.
Our camp was situated only about half a mile or so from the grave
of Dr. Kolb, whom Mr. Neumann met at M’thara. In reading
Neumann’s book[7] a pathetic paragraph (in the light of after events)
met my eye; it ran thus:—
“Here I had the honour of introducing my companion (Dr. Kolb) to
my esteemed brother N’Dominuki, and to the rhino, an animal whose
acquaintance he had not yet made. He had shot hippos in the Tana,
but felt rather desponding about his chances of bagging a ‘faro.’
However, I promised him that he should have that satisfaction, and
my pledge was fulfilled the first time he went out with me. After that
he shot many. He was, I believe, a first-rate shot, though somewhat
hampered in the bush by the necessity of wearing spectacles.”
Soon after those words were written Dr. Kolb was killed by a
rhinoceros under particularly affecting circumstances. El Hakim was
travelling in company with him at the time, but on the fatal morning
he was some half hour’s march in the rear, and arrived only in time
to see the end. I got the story from El Hakim, and can vouch for its
truth as far as he was concerned in it.
It appeared that Dr. Kolb was walking at the head of his men,
when he saw a half-grown rhinoceros in the path. He was carrying a
Mannlicher rifle, the magazine loaded with soft-nosed bullets. He
immediately fired, dropping the rhinoceros dead in its tracks. The
mother rhino then sprang up from the grass, where she had been
lying until then unobserved and probably asleep, and charged down
on to Dr. Kolb and his party. She caught his gun-bearer first, and
tossed him two or three times, her horn transfixing both the man’s
thighs. Dr. Kolb meanwhile was pouring magazine fire into her, but
failed to stop her, and she charged him in turn. He turned and fled,
but was overtaken in a very few yards, and hoisted into the air, falling
behind the rhinoceros, who passed on and disappeared. Her long
sharp horn entered the lower part of his body from behind, and
penetrated upwards for some distance. His men carried him into the
shade of a bush, and there El Hakim found him half an hour later. He
was quite conscious, and in no pain. El Hakim urged him to permit
him to examine his injuries, but Dr. Kolb assured him that he was
fatally wounded, and, like a true scientist, detailed his symptoms for
El Hakim’s benefit. He was quite calm and collected, and asked El
Hakim for a stimulant, and brandy was immediately supplied. Dr.
Kolb then referred to his watch, and calmly remarked that he had
twenty minutes more of consciousness and half an hour of life, his
prognosis proving correct in every particular.
The next morning as we were occupied in superintending the
manufacture of the biltong, a shout of “Simba! simba!” (Lions! lions!)
caused us to eagerly examine the landscape. Trotting unconcernedly
past our camp, not more than four hundred yards away, were a
superb lion and lioness. El Hakim, George, and I followed at once,
and discovered them loitering about some distance from the buffalo
entrails. We lay down near the remains, hoping they would come for
them, and so give us a shot, and watched them for some time.
They were a magnificent pair. Although the lion is known to be
rather a skulking brute than otherwise, there is such a suggestion of
latent power combined with careless grace in its carriage, that it
compels one’s admiration and causes lion-shooting to appear an
eminently desirable method of passing one’s time. These two lions
came gradually nearer, evidently attracted by the buffalo meat, but
when they were about two hundred yards away, in spite of our
caution, the lioness spotted us, and she immediately growled, and so
put her lord and master on the alert. Presently, to our great
disappointment, they turned and walked slowly away, stopping now
and again to look round and growl. We followed them, and at times
when they halted a little longer than usual, we almost got within
range—almost, but not quite, they invariably moving on again when
we approached closer than they judged expedient.
This game continued until we were several miles from camp, and,
notwithstanding our ardour, we were getting tired. Eventually they
retired to a patch of bush, but just as we were making arrangements
to beat it, the lioness emerged, and lay down in the grass out of
range, being presently joined by her mate. The old game of follow-
my-leader then recommenced, and after six hours of this we got
rather sick of it. On the way they were joined by another male, a
beautiful black-maned brute, the sight of which revived our flagging
energies, and we continued the chase, but to no purpose. In spite of
our efforts they kept a long way ahead, and finally went on at a trot,
leaving us far in the rear, quite out-distanced, and extremely
disgusted. We returned to camp after a fruitless tramp of about
seven hours.
Jumbi returned in the evening with one of the deserters; he had
been unable to secure the other. The captured culprit was the man
who had carried the load of food, which he had deliberately burnt. It
was really wicked. Food which was so hard to obtain, and which
before long would be so sorely needed by our men, had been
deliberately destroyed, and for no object, that we could ascertain,
beyond sheer perversity. The delinquent was ordered a flogging—
and got it. The other deserter, who had not been recaptured, had
also burnt his load of Venetian beads, which were particularly
valuable in view of our proposed stay among the Rendili.
I had the three buffalo heads buried in a large ant-heap against
our return, as we were unable to carry them about with us, and to
have hung them in the trees would have exposed them to theft from
wandering Wandorobbo or stragglers from the Somali caravan. The
ants were very large, being quite an inch in length, and of a bright
scarlet colour; they died on exposure to the air and light. They bit
very fiercely, drawing blood whenever they fastened their immensely
powerful jaws. The men who buried the horns suffered considerably
about the legs, but I was consoled by the thought that the horns
would be safe from the hyænas while in charge of such powerful little
warriors.

FOOTNOTES:
[7] “Elephant Hunting in East Equatorial Africa,” by Arthur H.
Neumann (1898), p. 126.
CHAPTER IX.
JOURNEY DOWN THE WASO NYIRO.

Arrival at the Waso Nyiro—The “Green Camp”—The “cinder-heap”—The
camp on fire—Scarcity of game—Hunting a rhino on mule-back.
Next morning we continued to follow the course of the little stream
which issued, greatly diminished in size, from the opposite side of
the swamp. The country grew more barren as we advanced. Great
gravelly areas alternated with brown earth, and now and again an
outcrop of quartz or lava occurred. The universal thorn tree was the
only member of the vegetable world that seemed to be able to draw
any sustenance from the arid soil, with the exception of a few cacti
and small aloes. Rhinoceros there were in plenty, and several giraffe
loomed on the horizon. We were also greatly excited to observe
elephant tracks, two or three days old, trending north-eastward
towards the Waso Nyiro. In the distance we could see frowning cliffs
of pink gneiss, and due north, some peculiarly shaped hills, one in
particular being an almost exact replica of the Great Pyramid of
Cheops at Ghizeh. Another to the left of it consisted of a pyramidal
base, surmounted by a columnar peak that by some agency or other
had been split vertically into two unequal portions, which remained
sticking boldly upwards like a couple of gigantic teeth. Standing out
prominently to the north-north-west was the massive outline of
Mount Lololokwe, 3000 feet above the level of the surrounding plain,
while behind it, one point more to the westward, Mount Gwarguess
reared its stately head 2000 feet higher.
To our great annoyance and dismay, the little stream we were
following, which had been dwindling in size for some miles, now
disappeared completely into a subterranean passage. It was eleven
o’clock in the forenoon, so we crept into the scanty shade afforded
by some thorn trees, and rested in preparation for a long march to
the Waso Nyiro in the afternoon.
The heat was intense, and the atmosphere most remarkably dry
and clear. Small objects at long distances stood out with remarkable
distinctness. The hills, at the foot of which flowed the Waso Nyiro,
seemed not more than an hour’s march distant.
About two o’clock, having rested sufficiently, we once more forged
ahead, bearing more to the north-east than in the direction we had
hitherto followed. We encountered the same soft crumbling brown
earth, with loose stones on the surface. Aloes, morio trees, and thorn
trees were the only vegetation, and even they were only sparsely
distributed. The country was formed of long rolling ridges, which we
traversed at right angles. It was a weary and tiresome march. Each
time we climbed a ridge we looked eagerly forward for a sight of the
longed-for Waso Nyiro. Again and again we were disappointed, each
ridge exactly resembling the last. At four o’clock in the afternoon we
entered a small belt of thorn trees and dodged a couple of rhinos
who were love-making just inside, and would no doubt have
resented being disturbed. When we once more emerged from the
thorn belt we gazed over a broad plain which sloped gently down to
a range of dun-coloured hills some miles away. The Waso Nyiro, we
knew, flowed at the foot of these hills, and once more we pressed
forward, momentarily forgetting our fatigue in our eagerness to reach
the desired goal.
I was walking with my Martini over my shoulder, when I was
considerably startled by a noise from my left, which caused me to
hurriedly bring my rifle to the ready. It was a long-drawn growling
grunt, and my first thought was of lions. Closer attention, however,
solved the mystery. It was the cry of a zebra, one of a herd of
Grevy’s beautiful zebra which were congregated over half a mile
away. The cry of the zebra is very like a long-drawn growling whistle,
and in the distance, when too far off to hear the whistle, the growl
very much resembles that of a lion. There were large herds of oryx in
sight, and a few rhinoceros and water-buck.
The sun sank gradually lower in the western heavens, and we
were still apparently no nearer the range of hills we were making for,
so deceptive are the apparent distances in the clear atmosphere. But
just as dusk had fallen, our eyes were gladdened by the sight of a
large clump of Doum palms growing in the centre of an open green
space a mile away. We made towards them, and feasted our weary
eyes on the beautiful green expanse stretching out before us.
A spring emerged from the earth here, perhaps the same stream
we had followed in the morning, and which had so disappointed us
by suddenly disappearing. The water was quite warm, and
impregnated with mineral salts; so much so as to be almost
undrinkable. It welled up into a hole in the rocks about 20 feet long
and 12 feet wide by 4 feet deep, forming a lovely natural bath,
overgrown with varicoloured mosses and ferns. The overflow
meandered through the grass for 100 yards or so in a little stream a
foot deep with a pebbly bottom fringed by dark green rushes, and
then spread out into a swamp overgrown with tall papyrus reeds 10
or 12 feet high. There were two or three acres of good green grass
on one side of the swamp, to which our animals rushed with
whinnies of delight the instant they caught sight of it, and ate and ate
as if they would never stop. We crossed this little stream, and
pitched the tents under some large thorn trees. We christened the
place “Green Camp.” It was about 3500 feet above sea-level, and
over 1000 feet lower than M’thara.
There was a splendid specimen of the Doum palm on the other
side of the camp, which can be seen in the photograph of the “Green
Camp.” The Doum palm (Hyphæne Thebaica) is called m’lala by the
Swahilis. It is a very graceful palm, and grows to a great height on
the Waso Nyiro, and was to be found everywhere along the banks of
the river. The stem divides into two branches a few feet from the
ground, each branch again and again dividing and being crowned
with its canopy of broad, flat, fan-shaped leaves. The fruit, about the
size of a potato, is mostly hard uneatable kernel, with a layer of
moist fibre, about half an inch thick, contained in a reddish bitter rind.
It reminds one of eating chopped cocoanut fibre, with a sweetish,
slightly astringent flavour. George and I ate quantities of it later on,
as also did the men, when we were without other vegetables.
While we were pitching the tents a rhinoceros emerged from
among the papyrus, where he had been wallowing in the swamp,
and trotted towards us. A shout soon caused him to change his
mind, and off he went full gallop for the Waso Nyiro, which, I should
have remarked, was about half a mile distant.
The country outside the camp was covered in places with large
white patches of mineral salts, principally carbonate of soda and
sulphate of magnesia; but we searched in vain for any common salt.
There was very little soil, and the men who were driving in the tent-
pegs struck rock three or four inches below the surface. A violent
gale of wind came on at sundown, and it needed the most
extraordinary precautions, in the way of extra guy-ropes to the tents,
to prevent them being blown bodily away.
After supper we held a consultation to decide what form our plans
should take. First and foremost, we wished to find the Burkeneji and
Rendili peoples, in order to trade for ivory. These people are
nomads, and wander at will over the immense tract of desert country
bounded, roughly, on the north by Southern Somaliland, on the south
by the Waso Nyiro, on the west by Lake Rudolph, and on the east by
the fortieth degree of longitude. They have one or two permanent
settlements, notably at Marsabit, some eight or ten days’ journey to the
north of the Waso Nyiro, and at Mount Nyiro, situated some two or
three marches south of Lake Rudolph. There was every sign that
there had been a long drought (we found afterwards that no rain had
fallen for three years), and it was more than likely that they had
come south to the Waso Nyiro, as was their habit when water was
scarce in the arid country to the north.
After a little deliberation, therefore, we determined to follow the
course of the Waso Nyiro down-stream—that being, of course, to the
eastward—in order to try to discover the Rendili, whom we were very
anxious to find.
We started soon after daybreak the following morning. The
weather was perfect, being dry, warm, and clear; we felt it a pleasure
to be alive. We followed the river, as there was, of course, no other
water. The course of the Waso Nyiro is always clearly defined by the
belts of Doum palms that fringe the banks, and by the greater
greenness of the vegetation in its immediate vicinity. At first we
thought that if we followed the general direction of the river, viz.
eastward, we should never be far from the water, whether it was in
sight at the moment or not. Two or three days’ journey, however,
undeceived us on that point. The river, as a matter of fact, winds
about in a most extraordinary manner, and on several occasions
when, thinking we were near the river, we halted for the purpose of
camping, we found, owing to an utterly unexpected turn, that it was
really miles away. Consequently we adopted the more fatiguing but
safer course of following it in all its windings.
Just such an experience befell us on the morning we left “Green
Camp.” Away to the eastward of that place, and about ten miles
distant, was a mass of gneiss rock known as Mount Sheba, towering
500 feet above the plain, and 3500 feet above sea-level. We knew
the river flowed within a mile or two of it, but on which side, whether
to the north or south, we were uncertain. We therefore made for the
north end of the mountain, as, if the river flowed to the south, we
should necessarily meet it, while if it went to the north we should still
be going right.
The first hour’s march was fairly easy. Level stretches of sand
covered with patches of mineral salts, and dotted with stunted thorn
trees, offered no great impediment to our progress. Several
rhinoceros were browsing about, one brute being right in our path.
We cautiously approached and shouted at him, but he did not seem
disposed to move. On approaching nearer we saw that he was
wounded, a great hole in his ribs showing that he had been fighting
his brother rhinoceros, and had, apparently, considerably the worst
of the argument. Rhinoceros are inveterate fighters amongst
themselves; and of all the animals shot during the expedition there
was not one who did not show healed or partially healed wounds
somewhere in the region of the ribs. As this particular beast would
not move, I started forward with the intention of shooting him, but he