Transactions on Computer Systems and Networks
Gyanendra K. Verma
Badal Soni
Salah Bourennane
Alexandre C. B. Ramos Editors
Data Science
Theory, Algorithms, and Applications
Transactions on Computer Systems
and Networks
Series Editor
Amlan Chakrabarti, Director and Professor, A. K. Choudhury School of
Information Technology, Kolkata, West Bengal, India
Transactions on Computer Systems and Networks is a unique series that aims
to capture advances in the evolution of computer hardware and software systems
and progress in computer networks. Computing systems today span from miniature
IoT nodes and embedded computing systems to large-scale cloud infrastructures,
which necessitates developing systems architecture, storage infrastructure, and
process management that work at various scales. Present-day networking
technologies provide pervasive global coverage and enable a multitude of
transformative technologies. The new landscape of computing comprises self-aware
autonomous systems, which are built upon a software-hardware collaborative
framework. These systems are designed to execute critical and non-critical tasks
involving a variety of processing resources such as multi-core CPUs, reconfigurable
hardware, GPUs, and TPUs, which are managed through virtualisation, real-time
process management, and fault tolerance. While AI, machine learning, and deep
learning tasks are predominantly increasing in the application space, computing
systems research aims toward efficient means of data processing, memory
management, real-time task scheduling, and scalable, secure, and energy-aware
computing. The paradigm of computer networks also extends its support to this
evolving application scenario through various advanced protocols, architectures,
and services. This series aims to present leading works on advances in the theory,
design, behaviour, and applications of computing systems and networks.
The series accepts research monographs, introductory and advanced textbooks,
professional books, reference works, and select conference proceedings.
Data Science
Theory, Algorithms, and Applications

Editors

Gyanendra K. Verma
Department of Computer Engineering
National Institute of Technology Kurukshetra
Kurukshetra, India

Badal Soni
Department of Computer Science and Engineering
National Institute of Technology Silchar
Silchar, India
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
We dedicate this work to all those who directly or
indirectly contributed to its accomplishment.
Preface
Digital information influences our everyday lives in various ways. Data science
provides the tools and techniques to comprehend and analyze data. It is one of
the fastest-growing multidisciplinary fields and deals with the acquisition,
analysis, integration, modeling, visualization, and interaction of large amounts
of data.
Currently, each sector of the economy produces a huge amount of data in
unstructured formats. Data is available from various sources such as web services,
databases, and online repositories; however, preprocessing this data and
extracting meaningful information from it remain challenging tasks. Artificial
intelligence plays a pivotal role in the analysis of such data.
With the evolution of artificial intelligence, it has become possible to analyze
and interpret information in real time. Deep learning models are widely used in
the analysis of big data for various applications, particularly in the area of
image processing.
This book aims to develop an understanding of data science theory and concepts
and of data modeling using various machine learning algorithms for a wide range
of real-world applications. In addition to providing basic principles of data
processing, the book teaches standard models and algorithms for data analysis.
Acknowledgements
We are thankful to all the contributors who have generously given time and material
to this book. We would also like to extend our appreciation to those who have
played their part in continuously inspiring us.
We are extremely thankful to the reviewers, who have carried out the most important
and critical part of any technical book: the evaluation of each of the submitted
chapters assigned to them.
We also express our sincere gratitude to our publication partner, Springer,
especially Ms. Kamiya Khatter and the Springer book production team, for
continuous support and guidance in completing this book project.
Thank you.
Introduction
This book aims to provide readers with an understanding of data science, its
architectures, and its applications in various domains. Data science helps in the
extraction of meaningful information from unstructured data; its major aspects
are data modeling, analysis, and visualization. This book covers major models,
algorithms, and prominent applications of data science for solving real-world
problems. By the end of the book, we hope that our readers will have an
understanding of the concepts, different approaches, and models, and familiarity
with the implementation of data science tools and libraries.
Artificial intelligence has had a major impact on research and has raised the
performance bar substantially in many of the standard evaluations. Moreover, new
challenges can be tackled using artificial intelligence in the decision-making
process. However, it is very difficult to comprehend, let alone guide, the process
of learning in deep learning. There is an air of uncertainty about exactly what
and how these models learn, and this book is an effort to fill those gaps.
Target Audience
The book is divided into three parts comprising a total of 27 chapters. Parts,
distinct groups of chapters, as well as single chapters are meant to be fairly
independent and self-contained, and the reader is encouraged to study only
relevant parts or chapters. This book is intended for a broad readership. The
first part provides the theory and concepts of learning and thus addresses readers
wishing to gain an overview of learning frameworks. Subsequent parts delve deeper
into research topics and are aimed at the more advanced reader, in particular
graduate and PhD students as well as junior researchers. The target audience of
this book includes academicians, professionals, researchers, and students at
engineering and medical institutions working in the areas of data science and
artificial intelligence.
Book Organization
This book is organized into three parts. Part I includes eight chapters that deal
with the theory and concepts of data science, Part II deals with data design and
analysis, and finally, Part III is based on the major applications of data science.
This book contains invited as well as contributed chapters.
The first part of the book exclusively focuses on the fundamentals of data science.
The chapters in this part cover active learning and ensemble learning concepts,
along with language processing concepts.
Chapter 1 describes a general active learning framework proposed for network
intrusion detection. The authors have experimented with different learning and
sampling strategies on the KDD Cup 1999 dataset. The results show that complex
learning models outperform relatively simple ones, and that uncertainty and
entropy sampling outperform random sampling. Chapter 2 describes a bagging
classifier, an ensemble learning approach, for student outcome prediction
employing base and meta-classifiers. Additionally, performance analysis of various
classifiers has been carried out with an oversampling approach using SMOTE and an
undersampling approach using spread subsampling. Chapter 3 presents patients'
medical data security via bi-chaos, bi-order Fourier transform. In this work, the
authors have used three techniques for medical or clinical image encryption,
i.e., FRFT, logistic map, and Arnold map. The results suggest that the complex
hybrid combination makes the system more robust and secure against different
cryptographic attacks than these methods alone. In Chap. 4, word-sense
disambiguation (WSD) for the Nepali language is performed using variants of the
Lesk algorithm such as direct overlap, frequency-based scoring, and
frequency-based scoring after dropping of the target word. Performance analysis
based on the elimination of stop words, the number of senses, and context window
size has been carried out. Chapter 5 presents a performance analysis of different
branch prediction schemes incorporated in the ARM big.LITTLE architecture. The
comparison of these branch predictors has been carried out based on performance,
power dissipation, conditional branch mispredictions, IPC, execution time, power
consumption, etc. The results show that TAGE-LSC and perceptron achieve the
highest accuracy among the simulated predictors. Chapter 6 presents global feature
representation using a new architecture, SEANet, built over SENet. An aggregate
block implemented after the SE block aids in global feature representation and in
reducing redundancies. SEANet has been found to outperform ResNet and SENet on
two benchmark datasets: CIFAR-10 and CIFAR-100.
The subsequent chapters in this part are devoted to analyzing images. Chapter 7
presents improved super-resolution of a single image through external dictionary
formation for training and a neighbor embedding technique for reconstruction.
Dictionary formation is carried out so as to capture maximum structural variation
with a minimal number of images. The reconstruction stage is carried out by
selection of overlapping pixels at a particular location. In Chap. 8, single-step
image super-resolution and denoising of SAR images are proposed using a generative
adversarial network (GAN) model. The model shows improvement in VGG16 loss as it
preserves relevant features and reduces noise in the image. The quality of results
produced by the proposed approach is compared with a two-step upscaling and
denoising model and the baseline method.
The second part of the book focuses on models and algorithms for data science.
Deep learning models, discrete wavelet transforms, principal component analysis,
SenLDA, a color-based classification model, and the gray-level co-occurrence
matrix (GLCM) are used to model real-world problems.
Chapter 9 explores a deep learning technique based on OCR-SSD for car detection
and tracking in images. It also presents a solution for real-time license plate
recognition on a quadcopter in autonomous flight. Chapter 10 describes an
algorithm for gender identification from biometric palm prints using binarized
statistical image features. The filter size is varied with a fixed length of
8 bits to capture information from the ROI palm prints. The proposed method
outperforms baseline approaches with an accuracy of 98%. Chapter 11 describes a
Sudoku puzzle recognition and solution study. Puzzle recognition is carried out
using a deep belief network for feature extraction. The puzzle solution is given
by serialization of two approaches: parallel rule-based methods and ant colony
optimization. Chapter 12 describes a novel profile generation approach for human
action recognition, in which DWT & PC is proposed to detect energy variation for
feature extraction in video frames. The proposed method is applied to various
existing classifiers and tested on the Weizmann dataset. The results outperform
baselines like the MACH filter.
The subsequent chapters in this part are devoted to more research-oriented models
and algorithms. Chapter 13 presents a novel filter and color-based classification
model to assess the ripeness of tobacco leaves for harvesting. Ripeness detection
is performed by a spot detection approach using a first-order edge extractor and
second-order high-pass filtering. A simple thresholding classifier is then
proposed for the classification task. Chapter 14 proposes an automatic deep
learning framework for breast cancer detection and classification from hematoxylin
and eosin (H&E)-stained breast histopathology images, with 80.4% accuracy, for
supplementing the analysis of medical professionals to prevent false negatives.
Experimental results show that the proposed architecture provides better
classification results than benchmark methods. Chapter 15 specifies a technique
for indoor flying of autonomous drones using image processing and neural networks.
The route for the drone is determined through the location of the detected object
in the captured image. The first detection technique relies on image-based
filters, while the second focuses on the use of a CNN to replicate a real
environment. Chapter 16 describes the use of the gray-level co-occurrence matrix
(GLCM) for feature detection in SAR images. The features detected in SAR images
by GLCM find many applications, as they identify various terrain types such as
water, urban areas, and forests, and any changes in these areas.
The third part of the book covers major applications of data science in various
fields like biometrics, robotics, medical imaging, affective computing, and
security.
Chapter 17 deals with signature verification using a Galois field operator. The
features are obtained by building a normalized cumulative histogram. Offline
signature verification is also implemented using the K-NN classifier. Chapter 18
details a face recognition approach for videos using 3D residual networks,
comparing the accuracy for different depths of residual networks. A CVBL video
dataset has been developed for the purpose of experimentation. The proposed
approach achieves the highest accuracy of 97% with DenseNets on the CVBL dataset.
In Chap. 19, microcontroller units (MCUs) with auto firmware communicate with the
fog layer through a smart edge node. The robot employs approaches such as
simultaneous localization and mapping (SLAM) and other path-finding algorithms,
and IR sensors for obstacle detection. ML techniques and FastAi aid in the
classification of the dataset. Chapter 20 describes an automatic tumor
identification approach to classify brain MRI scans. An advanced CNN model
consisting of convolution and dense layers is employed to correctly classify
brain tumors. The results exhibit the proposed model's effectiveness in brain
tumor image classification. Chapter 21 presents a vision-based sensor mechanism
for phase lane detection in IVS. The lane markings on a structured road are
detected using image processing techniques such as edge detection and Hough space
transformation on KITTI data. Qualitative and quantitative analysis shows
satisfactory results. Chapter 22 proposes an implementation of a deep
convolutional neural network (DCNN) for micro-expression recognition, as DCNNs
have established their presence in different image processing applications.
CASME-II, a benchmark database for micro-expression recognition, has been used
for experimentation. The experimental results reveal recognition accuracies of
90% and 88% for four and six classes, respectively, which is beyond the regular
methods.
In Chapter 23, the proposed semantic classification model employs modern
embedding and aggregating methods which considerably enhance feature
discriminability and boost the performance of CNNs. The performance of this
framework is exhaustively tested across a wide dataset. The intuitive and robust
systems that use these techniques play a vital role in various sectors like
security and military.
Editors and Contributors
Salah Bourennane received his Ph.D. degree from the Institut National
Polytechnique de Grenoble, France. Currently, he is a Full Professor at the Ecole
Centrale Marseille, France. He is the head of the Multidimensional Signal
Processing Group of the Fresnel Institute. His research interests are statistical
signal processing, remote sensing,
Part I
Theory and Concepts
Chapter 1
Active Learning for Network Intrusion Detection
Amir Ziai
Abstract Network operators are generally aware of common attack vectors that they
defend against. For most networks, the vast majority of traffic is legitimate.
However, new attack vectors are continually designed and attempted by bad actors,
which bypass detection and go unnoticed due to low volume. One strategy for
finding such activity is to look for anomalous behavior. Investigating anomalous
behavior requires significant time and resources. Collecting a large number of
labeled examples for training supervised models is both prohibitively expensive
and subject to obsolescence as new attacks surface. A purely unsupervised
methodology would be ideal; however, research has shown that even a very small
number of labeled examples can significantly improve the quality of anomaly
detection. A methodology that minimizes the number of required labels while
maximizing the quality of detection is desirable. False positives in this context
result in wasted effort or blockage of legitimate traffic, and false negatives
translate to undetected attacks. We propose a general active learning framework
and experiment with different choices of learners and sampling strategies.
1.1 Introduction
Detecting anomalous activity is an active area of research in the security space.
Tuor et al. use an online anomaly detection method based on deep learning to
detect anomalies. This methodology is compared to traditional anomaly detection
algorithms such as isolation forest (IF) and a principal component analysis
(PCA)-based approach and found to be superior. However, no comparison is provided
with semi-supervised or active learning approaches, which leverage a small amount
of labeled data (Tuor et al. 2017). The authors later propose another unsupervised
methodology leveraging recurrent neural networks (RNNs) to ingest log-level event
data as opposed to aggregated data (Tuor et al. 2018). Pimentel et al. propose a
generalized framework for unsupervised anomaly detection (Pimentel et al. 2018).
They argue that purely unsupervised anomaly
A. Ziai (B)
Stanford University, 450 Serra Mall, Stanford, CA 94305, USA
e-mail: amirziai@stanford.edu
Table 1.1 Prevalence and number of attacks for each of the 10 attack types

Label         Attacks   Prevalence   Prevalence (overall)   Records
smurf.        280,790   0.742697     0.568377               378,068
neptune.      107,201   0.524264     0.216997               204,479
back.         2,203     0.022145     0.004459               99,481
satan.        1,589     0.016072     0.003216               98,867
ipsweep.      1,247     0.012657     0.002524               98,525
portsweep.    1,040     0.010578     0.002105               98,318
warezclient.  1,020     0.010377     0.002065               98,298
teardrop.     979       0.009964     0.001982               98,257
pod.          264       0.002707     0.000534               97,542
nmap.         231       0.002369     0.000468               97,509
1.2 Dataset
We have used the KDD Cup 1999 dataset, which consists of about 500K records
representing network connections in a military environment. Each record is either
"normal" or one of 22 different types of intrusion such as smurf, IP sweep, and
teardrop. Out of these 22 categories, only 10 have at least 100 occurrences, and
the rest were removed. Each record has 41 features including duration, protocol,
and bytes exchanged. Prevalence of attack types varies substantially, with smurf
being the most pervasive at about 57% of total records and Nmap at less than
0.05% of total records (Table 1.1).
We generated 10 separate datasets consisting of normal traffic and each of the attack
vectors. This way we can study the proposed approach over 10 different attack vectors
with varying prevalence and ease of detection. Each dataset is then split into train,
development, and test partitions with 80%, 10%, and 10% proportions. All algorithms
are trained on the train set and evaluated on the development set. The winning strategy
is tested on the test set to generate an unbiased estimate of generalization. Categorical
features are one-hot encoded, and missing values are filled with zero.
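As a concrete illustration, the preparation steps above could look as follows; this is a minimal sketch assuming pandas and scikit-learn, and the file name and the categorical column names (the standard KDD Cup 1999 ones) are illustrative rather than taken from the chapter.

```python
# Sketch of the dataset preparation described above. Assumes pandas and
# scikit-learn; file name and column names are illustrative.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("kddcup99.csv")   # ~500K connection records, 41 features
df = df.fillna(0)                  # missing values are filled with zero

# one-hot encode the categorical features
df = pd.get_dummies(df, columns=["protocol_type", "service", "flag"])

# 80/10/10 train/development/test split
train, rest = train_test_split(df, test_size=0.2, random_state=0)
dev, test = train_test_split(rest, test_size=0.5, random_state=0)
```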
1.3 Approach
Since labeled data is very hard to come by in this space, we have decided to treat this
problem as an active learning one. Therefore, the machine learning model receives a
subset of the labeled data. We will use the F1 score to capture the trade-off between
precision and recall:
F1 = 2PR / (P + R)    (1.1)

A detector can achieve high precision simply by being very selective about what
it flags; however, this usually comes at the cost of being overly conservative
and not catching anomalous activity that is indeed an intrusion.
Labeling effort is a major factor in this analysis and a dimension along which we
will define the upper and lower bounds of the quality of our detection systems. A
purely unsupervised approach would be ideal, as there is no labeling involved. We
will use an isolation forest (Zhou et al. 2004) to establish our baseline.
Isolation forests (IFs) are widely, and very successfully, used for anomaly
detection. An IF consists of a number of isolation trees, each of which is
constructed by selecting random features to split on and then selecting a random
split value (a random value in the range of a continuous variable, or a random
category for a categorical variable). Only a small random subset of the data is
used for growing the trees, and usually a maximum allowable depth is enforced to
curb computational cost. We have used 10 trees for each IF. Intuitively, anomalous
data points are easier to isolate with a smaller average number of splits and
therefore tend to be closer to the root. The average closeness to the root is
proportional to the anomaly score (i.e., the lower this score, the more anomalous
the data point).
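A minimal sketch of this unsupervised baseline, assuming scikit-learn's IsolationForest; the variable names (X_train, X_dev) are illustrative:

```python
# Unsupervised baseline: an isolation forest with 10 trees, as in the text.
from sklearn.ensemble import IsolationForest

iforest = IsolationForest(n_estimators=10, random_state=0)
iforest.fit(X_train)                      # X_train: training feature matrix

# score_samples is higher for normal points, so lower (more negative)
# scores correspond to more anomalous records
scores = iforest.score_samples(X_dev)
predicted_attack = iforest.predict(X_dev) == -1   # -1 flags anomalies
```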
A completely supervised approach would incur maximum cost, as we would have to
label every data point. We have used a random forest classifier with 10 estimators
trained on the entire training dataset to establish the upper bound (i.e., the
Oracle). In Table 1.3, the F1 scores are reported for evaluation on the
development set.
The proposed approach starts with training a classifier on a small random subset
of the data (i.e., 1000 samples) and then continually queries a security analyst
for the next record to label. There is a maximum budget of 100 queries (Fig. 1.1).
This approach is highly flexible. The choice of classifier can range from logistic
regression all the way up to deep networks, as well as any ensemble of those
models. Moreover, the hyper-parameters of the classifier can be tuned on every
round of training to improve the quality of predictions. The sampling strategy
can range from simply picking random records to using classifier uncertainty or
other elaborate schemes. Once a record is labeled, it is removed from the pool of
unlabeled data and placed into the labeled record database. We are assuming that
labels are trustworthy, which may not necessarily be true. In other words, the
analyst might make a mistake in labeling, or there may be low consensus among
analysts around labeling. In the presence of those issues, we would need to
extend this approach to query multiple analysts and to build label consensus into
the framework. A sketch of the resulting query loop is given below.
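The sketch assumes a pool-based setup with numpy arrays, a random forest learner, and uncertainty sampling; all names are illustrative, and the analyst's answer is stood in for by indexing into y_pool:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def active_learning_loop(X_pool, y_pool, n_initial=1000, budget=100):
    """Seed with a random labeled subset, then query the most
    informative unlabeled record each round (uncertainty sampling)."""
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(X_pool), size=n_initial, replace=False))
    unlabeled = [i for i in range(len(X_pool)) if i not in set(labeled)]

    clf = RandomForestClassifier(n_estimators=10)
    for _ in range(budget):
        clf.fit(X_pool[labeled], y_pool[labeled])   # retrain every round
        # assumes both classes appear in the labeled seed set
        p = clf.predict_proba(X_pool[unlabeled])[:, 1]
        # query the point closest to the decision boundary
        query = unlabeled[int(np.argmin(np.abs(p - 0.5)))]
        labeled.append(query)       # its label comes from the analyst
        unlabeled.remove(query)
    return clf
```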
1.4 Experiments
We used a logistic regression (LR) classifier with an L2 penalty, as well as a
random forest (RF) classifier with 10 estimators, Gini impurity as the splitting
criterion, and unlimited depth, for our choice of learners. We also chose three
sampling strategies. The first is a random strategy that selects a data point
from the unlabeled pool uniformly at random. The second is uncertainty sampling,
which scores the entire pool of unlabeled data and selects the data point with
the highest uncertainty. The third is entropy sampling, which calculates the
entropy over the positive and negative classes and selects the highest-entropy
data point. Ties are broken randomly for both uncertainty and entropy sampling.
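All three strategies can be expressed as scores over the learner's predicted probability p of the positive class, with the maximal-score point queried next; a minimal illustrative sketch (the epsilon guard against log(0) is our addition):

```python
import numpy as np

def sampling_score(p, strategy, rng=np.random.default_rng(0)):
    """p: predicted probability of the positive class per unlabeled point."""
    if strategy == "random":
        return rng.random(len(p))
    if strategy == "uncertainty":
        return -np.abs(p - 0.5)        # highest near the decision boundary
    if strategy == "entropy":
        eps = 1e-12                    # guard against log(0)
        return -(p * np.log(p + eps) + (1 - p) * np.log(1 - p + eps))
    raise ValueError(strategy)
```

Note that for a binary problem, the entropy score is a monotonic function of the distance from 0.5, so uncertainty and entropy sampling rank points almost identically; this is consistent with their near-identical rows in Table 1.4.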
Table 1.4 shows the F1 score immediately after the initial training (F1 initial),
followed by the F1 score after 10, 50, and 100 queries to the analyst, across
different learners and sampling strategies, aggregated over the 10 attack types:

Table 1.4 Effects of learner and sampling strategy on detection quality and latency

Learner  Sampling strategy  F1 initial   F1 after 10  F1 after 50  F1 after 100  Train time (s)  Query time (s)
LR       Random             0.76±0.32    0.76±0.32    0.79±0.31    0.86±0.17     0.05±0.01       0.09±0.08
LR       Uncertainty                     0.83±0.26    0.85±0.31    0.88±0.20                     0.10±0.08
LR       Entropy                         0.83±0.26    0.85±0.31    0.88±0.20                     0.08±0.08
RF       Random             0.90±0.14    0.91±0.12    0.84±0.31    0.95±0.07     0.11±0.00       0.09±0.07
RF       Uncertainty                     0.98±0.03    0.99±0.03    0.99±0.03                     0.16±0.06
RF       Entropy                         0.98±0.04    0.98±0.03    0.99±0.03                     0.12±0.08

Random forests are strictly superior to logistic regression from a detection
perspective, regardless of the sampling strategy. It is also clear that
uncertainty and entropy sampling are superior to random sampling, which suggests
that judiciously sampling the unlabeled dataset can have a significant impact on
detection quality, especially in the earlier queries (F1 goes from 0.90 to 0.98
with just 10 queries). It is important to note that the query time might become a
bottleneck. In our experiments, the unlabeled pool of data is not very large, but
as this set grows, these sampling strategies have to scale accordingly. The good
news is that scoring is embarrassingly parallelizable.
Figure 1.2 depicts the evolution of detection quality as the system makes queries
to the analyst for an attack with high prevalence (i.e., the majority of traffic
is an attack).
The random forest learner combined with an entropy sampler can get to perfect
detection within 5 queries, which suggests high data efficiency (Mussmann and
Liang 2018). We will compare this to the Nmap attack with significantly lower
prevalence (i.e., less than 0.05% of the dataset is an attack) (Fig. 1.3).
We know from our Oracle evaluations that a random forest model can achieve
perfect detection for this attack type; however, we see that an entropy sampler
is not guaranteed to query the optimal sequence of data points. The fact that the
prevalence of attacks is very low means that the initial training dataset
probably does not have a representative set of positive labels that can be
exploited by the model to generalize. The failure of uncertainty sampling has
been documented (Zhu et al. 2008), and more elaborate schemes can be designed to
exploit other information about the unlabeled dataset that the sampling strategy
is ignoring. To gain some intuition into these deficiencies, we will unpack a
step of entropy sampling for the Nmap attack. Figure 1.4 compares (a) the relative
feature importance after the initial training to (b) the Oracle (Fig. 1.5).
The Oracle graph suggests that "src_bytes" is a feature that the model relies on
heavily for prediction. However, our initial training is not reflecting this; we
will compute the z-score for each of the positive labels in our development set:

z_{f_i} = |μ_{R,f_i} − μ_{W,f_i}| / σ_{R,f_i}    (1.2)

where μ_{R,f_i} is the average value of the true positives for feature i (i.e.,
f_i), μ_{W,f_i} is the average value of the false positives or false negatives,
and σ_{R,f_i} is the standard deviation of the values in the case of true
positives.
The higher this value is for a feature, the more our learner needs to learn about
it to correct the discrepancy; a sketch of the computation follows below. The
score for "src_bytes" is an order of magnitude larger than for other features,
yet we see that the next query made by the strategy does not involve a decision
around this fact. The model continues to make uncertainty queries while staying
oblivious to information about specific features that it needs to correct for.
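A sketch of Eq. (1.2) in code, with X_dev, y_true, and y_pred as illustrative names for the development features, labels, and model predictions (the epsilon is our addition to avoid division by zero):

```python
import numpy as np

# R: true positives; W: mispredicted points (false positives and negatives)
R = X_dev[(y_true == 1) & (y_pred == 1)]
W = X_dev[y_true != y_pred]
# per-feature z-score from Eq. (1.2)
z = np.abs(R.mean(axis=0) - W.mean(axis=0)) / (R.std(axis=0) + 1e-12)
```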
Fig. 1.4 Random forest feature importance for a initial training and b Oracle

We also explored whether an ensemble of learners improves detection quality,
using a weighted majority vote over the individual predictions:

Prediction_Ensemble = I[ Σ_{e∈E} w_e · Prediction_e > (1/2) Σ_{e∈E} w_e ]    (1.3)

where Prediction_e ∈ {0, 1} is the binary prediction associated with classifier
e ∈ E = {RF, GB, LR, IF} and w_e is the weight of the classifier in the ensemble.
The weights are proportional to the level of confidence we have in each of the
learners. We have added a gradient boosting classifier with 10 estimators.
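Eq. (1.3) amounts to a confidence-weighted majority vote; a minimal sketch, with the weights purely illustrative:

```python
import numpy as np

def ensemble_vote(predictions, weights):
    """predictions: dict name -> binary 0/1 prediction array;
    weights: dict name -> confidence weight w_e from Eq. (1.3)."""
    total = sum(weights[e] * predictions[e] for e in predictions)
    return (total > sum(weights.values()) / 2).astype(int)

# e.g. preds = {"RF": rf_pred, "GB": gb_pred, "LR": lr_pred, "IF": if_pred}
# y_hat = ensemble_vote(preds, {"RF": 2.0, "GB": 1.0, "LR": 1.0, "IF": 0.5})
```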
Unfortunately, the results of this experiment suggest that this particular
ensemble is not adding any additional value. Figure 1.6 shows that at best the
results match those of the random forest alone (a), and in the worst case they
can be significantly worse (b).
The majority of the error associated with this ensemble approach relative to only
using random forests can be attributed to a high false negative rate. The other four
algorithms are in most cases conspiring to generate a negative class prediction which
overrides the positive prediction of the random forest.
Finally, we explore whether we can use an unsupervised method for finding the
most anomalous data points to query. If this methodology is successful, the
sampling strategy is decoupled from active learning, and we can simply precompute
and cache the most anomalous data points for the analyst to label.
We compared a sampling strategy based on isolation forest with entropy sampling
(Table 1.5). In both cases, we are using a random forest learner. The results
suggest that entropy sampling is superior, since it samples the most uncertain
data points in the context of the current learner rather than the global notion
of anomaly that the isolation forest provides.
1.5 Conclusion
We have proposed a general active learning framework for network intrusion
detection. We experimented with different learners and observed that more complex
learners can achieve higher detection quality with significantly less labeling
effort for most attack types. We did not explore other complex models such as
deep neural networks and did not attempt to tune the hyper-parameters of our
models. Since the bottleneck associated with this task is the labeling effort,
we can add model tuning while staying within acceptable latency requirements.
We then explored a few sampling strategies and discovered that uncertainty and
entropy sampling can have a significant benefit over unsupervised or random
sampling. However, we also realized that these strategies are not optimal, and we
can extend them to incorporate available information about the distribution of
the features for mispredicted data points. We attempted a semi-supervised
approach called label spreading, which builds an affinity matrix over the
normalized graph Laplacian and can be used to create pseudo-labels for unlabeled
data points (Zhou et al. 2004). However, this methodology is very
memory-intensive, and we could not successfully train and evaluate it on all of
the attack types.
References
Mussmann S, Liang P (2018) On the relationship between data efficiency and error
for uncertainty sampling. arXiv preprint arXiv:1806.06123
Pimentel T, Monteiro M, Viana J, Veloso A, Ziviani N (2018) A generalized active learning approach
for unsupervised anomaly detection. arXiv preprint arXiv:1805.09411
Tuor A, Kaplan S, Hutchinson B, Nichols N, Robinson S (2017) Deep learning for unsupervised
insider threat detection in structured cybersecurity data streams. arXiv preprint arXiv:1710.00811
Tuor A, Baerwolf R, Knowles N, Hutchinson B, Nichols N, Jasper R (2018) Recurrent neural
network language models for open vocabulary event-level cyber anomaly detection. Workshops
at the thirty-second AAAI conference on artificial intelligence
Veeramachaneni K, Arnaldo I, Korrapati V, Bassias C, Li K (2016) AI: training a big data machine
to defend. Big Data Security on Cloud (BigDataSecurity), IEEE international conference on high
performance and smart computing (HPSC), and IEEE international conference on intelligent data
and security (IDS), IEEE 2nd international conference, pp 49–54
Zainal A, Maarof MA, Shamsuddin SM (2009) Ensemble classifiers for network intrusion detection
system. J Inf Assur Secur 4(3):217–225
Zhou D, Bousquet O, Lal TN, Weston J, Schölkopf B (2004) Learning with local and global
consistency. In: Advances in neural information processing systems, pp 321–328
Zhu J, Wang H, Yao T, Tsou BK (2008) Active learning with sampling by uncertainty and density
for word sense disambiguation and text classification. In: Proceedings of the 22nd international
conference on computational linguistics, vol 1, pp 1137–1144
Chapter 2
Educational Data Mining Using Base (Individual) and Ensemble Learning Approaches to Predict the Performance of Students
2.1 Introduction
M. Ashraf (B)
School of CS and IT, Jain University, Bangalore 190006, India
Y. K. Salal · S. M. Abdullaev
Department of System Programming, South Ural State University, Chelyabinsk, Russia
e-mail: yasskhudheirsalal@gmail.com
S. M. Abdullaev
e-mail: abdullaevsm@susu.ru
In this study, we primarily applied four learning classifiers, namely j48, random
tree, naïve bayes, and knn, to the academic dataset. Thereafter, the dataset was
subjected to oversampling and undersampling methods to verify whether either
improves the prediction of student outcomes. Correspondingly, the analogous
procedure was applied to ensemble methodologies, including bagging and boosting,
to establish which learning classifiers, base or meta, demonstrate the more
compelling results.
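A minimal sketch of one arm of this pipeline (SMOTE oversampling followed by bagging with naïve bayes), assuming scikit-learn and imbalanced-learn; the file name, label column, and numeric-feature assumption are illustrative, not taken from the chapter:

```python
import pandas as pd
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import BaggingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

df = pd.read_csv("academic_records.csv")             # hypothetical dataset
X, y = df.drop(columns=["outcome"]), df["outcome"]   # hypothetical label;
                                                     # features assumed numeric

# oversample the minority class with SMOTE
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)

# bagging with naïve bayes as the base classifier
bag = BaggingClassifier(GaussianNB(), n_estimators=10, random_state=0)
print(cross_val_score(bag, X_res, y_res, scoring="accuracy").mean())
```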
Table 2.1 portrays the outcomes accomplished after running these machine learning
classifiers on the educational dataset. It is unequivocal that naïve bayes
achieved a notable prediction accuracy of 95.50% in classifying the actual
occurrences, an incorrect classification error of 4.45%, and a minimum relative
absolute error of 7.94%, in contrast to the remaining classifiers. The
supplementary measures associated with the learning algorithm, such as TP rate,
FP rate, precision, recall, f-measure, and ROC area, were also found to be
significant. Conversely, random tree produced a substantial classification
accuracy of 90.03%, with 9.69% incorrectly classified instances, a relative
absolute error (RAE) of 15.46%, and the supplementary parameters connected with
the algorithm were found
Under this subsection, bagging has been utilized with the various classifiers
highlighted in Table 2.4. After employing bagging, the prediction accuracy
demonstrated paramount success over the base learning mechanism. The correctly
classified rates in Table 2.4, when contrasted with the initial prediction rates
of the different classifiers in Table 2.1, show substantial improvement for three
learning algorithms: j48 (92.20% to 94.87%), random tree (90.30% to 94.76%), and
knn (91.80% to 93.81%).
In addition, the incorrectly classified instances have come down to a
considerable degree for these classifiers, and as a consequence, the
supplementary parameters, viz. TP rate, FP rate, precision, recall, ROC area, and
f-measure, associated with these classifiers have also rendered admirable
results. However, naïve bayes has not revealed any significant gain in prediction
accuracy with the bagging approach, and moreover, the relative absolute error
associated with each meta classifier has increased while synthesizing different
classifiers.
2.4 Conclusion
In this research study, the central focus has been the early prediction of
students' outcomes using various individual (base) and meta classifiers to
provide timely guidance for weak students. The individual learning algorithms
employed on the pedagogical data, including j48, random tree, naïve bayes, and
knn, have evidenced phenomenal prediction accuracy of students' final outcomes.
Among the base learning algorithms, naïve bayes attained the highest accuracy of
95.50%. As the dataset in this investigation was imbalanced, which could have
otherwise culminated in inaccurate and biased outcomes, the academic dataset was
subjected to filtering approaches, namely the synthetic minority oversampling
technique (SMOTE) and spread subsampling.
In this study, a comparative analysis was conducted with base and meta learning
algorithms, followed by oversampling (SMOTE) and undersampling (spread
subsampling) techniques, to gain comprehensive knowledge of which classifiers can
be more precise and decisive in generating predictions. The above-mentioned base
learning algorithms were subjected to the oversampling and undersampling methods.
Naïve bayes yet again demonstrated noteworthy improvement, reaching 97.15% with
the oversampling technique. With the undersampling technique, knn showed
exceptional improvement, at 93.94% prediction accuracy, over the other base
learning algorithms. In the case of ensemble learning, bagging with naïve bayes
accomplished a convincing correctness of 95.32% in predicting the exact instances.
When the bagging algorithm was put into effect together with oversampling and
undersampling, the ensembles generated from j48 and naïve bayes demonstrated
significant accuracy and the least classification error (95.21% for bagging with
j48 and 96.07% for bagging with naïve bayes, respectively).