
Introduction to Datafication
Implement Datafication Using AI and ML Algorithms

Shivakumar R. Goniwada
Introduction to Datafication: Implement Datafication Using AI and
ML Algorithms
Shivakumar R. Goniwada
Gubbalala, Bangalore, Karnataka, India

ISBN-13 (pbk): 978-1-4842-9495-6 ISBN-13 (electronic): 978-1-4842-9496-3


https://doi.org/10.1007/978-1-4842-9496-3
Copyright © 2023 by Shivakumar R. Goniwada

This work is subject to copyright. All rights are reserved by the publisher, whether the whole or
part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way,
and transmission or information storage and retrieval, electronic adaptation, computer software,
or by similar or dissimilar methodology now known or hereafter developed.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark
symbol with every occurrence of a trademarked name, logo, or image we use the names, logos,
and images only in an editorial fashion and to the benefit of the trademark owner, with no
intention of infringement of the trademark.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if
they are not identified as such, is not to be taken as an expression of opinion as to whether or not
they are subject to proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal
responsibility for any errors or omissions that may be made. The publisher makes no warranty,
express or implied, with respect to the material contained herein.
Managing Director, Apress Media LLC: Welmoed Spahr
Acquisitions Editor: Celestin Suresh John
Development Editor: Laura Berendson
Coordinating Editor: Mark Powers
Copy Editor: April Rondeau
Cover designed by eStudioCalamar
Cover image by Pawel Czerwinsk on Unsplash (www.unsplash.com)
Distributed to the book trade worldwide by Apress Media, LLC, 1 New York Plaza, New York, NY
10004, U.S.A. Phone 1-800-SPRINGER, fax (201) 348-4505, email orders-ny@springer-sbm.com,
or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member
(owner) is Springer Science+Business Media Finance Inc. (SSBM Finance Inc.). SSBM Finance
Inc. is a Delaware corporation.
For information on translations, please e-mail booktranslations@springernature.com;
for reprint, paperback, or audio rights, please e-mail bookpermissions@springernature.com.
Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook
versions and licenses are also available for most titles. For more information, reference our Print
and eBook Bulk Sales web page at http://www.apress.com/bulk-sales.
Any source code or other supplementary material referenced by the author in this book is
available to readers on GitHub (https://github.com/Apress). For more detailed information,
please visit http://www.apress.com/source-code.
Printed on acid-free paper
This book is dedicated to those who may need access to the
resources and opportunities many take for granted. May
this book serve as a reminder that knowledge and learning
are powerful tools that can transform lives and create new
opportunities for those who seek them.
Table of Contents
About the Author
About the Technical Reviewer
Acknowledgments
Introduction

Chapter 1: Introduction to Datafication
What Is Datafication?
Why Is Datafication Important?
Data for Datafication
Datafication Steps
Digitization vs. Datafication
Types of Data in Datafication
Elements of Datafication
Data Harvesting
Data Curation
Data Storage
Data Analysis
Cloud Computing
Datafication Across Industries
Summary

Chapter 2: Datafication Principles and Patterns
What Are Architecture Principles?
Datafication Principles
Data Integration Principle
Data Quality Principle
Data Governance Principles
Data Is an Asset
Data Is Shared
Data Trustee
Ethical Principle
Security by Design Principle
Datafication Patterns
Data Partitioning Pattern
Data Replication
Stream Processing
Change Data Capture (CDC)
Data Mesh
Machine Learning Patterns
Summary

Chapter 3: Datafication Analytics
Introduction to Data Analytics
What Is Analytics?
Big Data and Data Science
Datafication Analytical Models
Content-Based Analytics
Data Mining
Text Analytics
Sentiment Analytics
Audio Analytics
Video Analytics
Comparison in Analytics
Datafication Metrics
Datafication Analysis
Data Sources
Data Gathering
Introduction to Algorithms
Supervised Machine Learning
Linear Regression
Support Vector Machines (SVM)
Decision Trees
Neural Networks
Naïve Bayes Algorithm
K-Nearest Neighbor (KNN) Algorithm
Random Forest
Unsupervised Machine Learning
Clustering
Association Rule Learning
Dimensionality Reduction
Reinforcement Machine Learning
Summary

Chapter 4: Datafication Data-Sharing Pipeline
Introduction to Data-Sharing Pipelines
Steps in Data Sharing
Data-Sharing Process
Data-Sharing Decisions
Data-Sharing Styles
Unidirectional, Asynchronous Push Integration Style
Real-Time and Event-based Integration Style
Bidirectional, Synchronous, API-led Integration Style
Mediated Data Exchange with an Event-Driven Approach
Designing a Data-Sharing Pipeline
Types of Data Pipeline
Batch Processing
Extract, Transform, and Load Data Pipeline (ETL)
Extract, Load, and Transform Data Pipeline (ELT)
Streaming and Event Processing
Change Data Capture (CDC)
Lambda Data Pipeline Architecture
Kappa Data Pipeline Architecture
Data as a Service (DaaS)
Data Lineage
Data Quality
Data Integration Governance
Summary

Chapter 5: Data Analysis
Introduction to Data Analysis
Data Analysis Steps
Prepare a Question
Prepare Cleansed Data
Identify a Relevant Algorithm
Build a Statistical Model
Match Result
Create an Analysis Report
Summary

Chapter 6: Sentiment Analysis
Introduction to Sentiment Analysis
Use of Sentiment Analysis
Types of Sentiment Analysis
Document-Level Sentiment Analysis
Aspect-Based Sentiment Analysis
Multilingual Sentiment Analysis
Pros and Cons of Sentiment Analysis
Pre-Processing of Data
Tokenization
Stop Words Removal
Stemming and Lemmatization
Handling Negation and Sarcasm
Rule-Based Sentiment Analysis
Lexicon-Based Approaches
Sentiment Dictionaries
Pros and Cons of Rule-Based Approaches
Machine Learning–Based Sentiment Analysis
Supervised Learning Techniques
Unsupervised Learning Techniques
Pros and Cons of the Machine Learning–Based Approach
Best Practices for Sentiment Analysis
Summary

Chapter 7: Behavioral Analysis
Introduction to Behavioral Analytics
Data Collection
Behavioral Science
Importance of Behavioral Science
How Behavioral Analysis and Analytics Are Processed
Cognitive Theory and Analytics
Biological Theories and Analytics
Integrative Model
Behavioral Analysis Methods
Funnel Analysis
Cohort Analysis
Customer Lifetime Value (CLV)
Churn Analysis
Behavioral Segmentation
Analyzing Behavioral Analysis
Descriptive Analysis with Regression
Causal Analysis with Regression
Causal Analysis with Experimental Design
Challenges and Limitations of Behavioral Analysis
Summary

Chapter 8: Datafication Engineering
Steps of AI and ML Engineering
AI and ML Development
Understanding the Problem to Be Solved
Choosing the Appropriate Model
Preparing and Cleaning Data
Feature Selection and Engineering
Model Training and Optimization
AI and ML Testing
Unit Testing
Integration Testing
Non-Functional Testing
Performance
Security Testing
DataOps
MLOps
Summary

Chapter 9: Datafication Governance
Importance of Datafication Governance
Why Is Datafication Governance Required?
Datafication Governance Framework
Oversight and Accountability
Model Risk, Risk Assessment, and Regulatory Guidance
Roles and Responsibilities
Monitoring and Reporting
Datafication Governance Guidelines and Principles
Ethical and Legal Aspects
Datafication Governance Action Framework
Datafication Governance Challenges
Summary

Chapter 10: Datafication Security
Introduction to Datafication Security
Datafication Security Framework
Regulations
Organization Concerns
Governance and Compliance
Business Access Needs
Datafication Security Measures
Encryption
Data Masking
Penetration Testing
Data Security Restrictions
Summary

Index

About the Author
Shivakumar R. Goniwada is an author,
inventor, chief enterprise architect, and
technology leader with over 23 years of
experience architecting cloud-native, data
analytics, and event-driven systems. He works
at Accenture and leads a highly experienced
technology enterprise and cloud architect
team. Over the years, he has led many
complex projects across industries and the
globe. He has ten software patents in cloud
computing, polyglot architecture, software
engineering, data analytics, and IoT. He authored a book on Cloud Native
Architecture and Design. He is a speaker at multiple global and in-house
conferences. Shivakumar has earned Master Technology Architecture,
Google Professional, AWS, and data science certifications. He completed
his executive MBA at the MIT Sloan School of Management.

About the Technical Reviewer
Dr. Mohan H M is a technical program
manager and research engineer (HMI, AI/
ML) at Digital Shark Technology, supporting
the research and development of new
products, promotion of existing products, and
investigation of new applications for existing
products.
Previously, he worked as a technical education evangelist and
traveled extensively across India delivering training on artificial
intelligence, embedded systems, and the Internet of Things (IoT) to research
scholars and faculty in engineering colleges under the MeitY scheme. He
has also worked as an assistant professor at the T. John Institute of
Technology. Mohan holds a master’s degree in embedded systems and
VLSI design from Visvesvaraya Technological University. He earned
his Ph.D. on the topic of non-invasive myocardial infarction prediction
using computational intelligence techniques from the same university.
He has been a peer reviewer for technical publications, including BMC
Informatics, Springer Nature, Scientific Reports, and more. His research
interests include computer vision, IoT, and biomedical signal processing.

Acknowledgments
Many thanks to my mother, S. Jayamma, and late father, G.M. Rudrapp,
who taught me the value of hard work, and to my wife, Nirmala, and
daughter, Neeharika, without whom I wouldn’t have been able to work
long hours into the night every day of the week. Last but not least, I’d like
to thank my friends, colleagues, and mentors at Mphasis, Accenture, and
other corporations who have guided me throughout my career.
Thank you also to my colleagues Mark Powers, Celestin Suresh John,
Shobana Srinivasan, and other Apress team members for allowing me to
work with you and Apress, and to all who have helped this book become
a reality. Thank you to my mentors Bert Hooyman and Abubacker
Mohamed, and thanks to my colleague Raghu Pasupuleti for providing
key inputs.

Introduction
The motivation to write this book goes back to the words of Swami
Vivekananda: “Everything is easy when you are busy, but nothing is easy
when you are lazy,” and “Take up one idea, make that one idea your life,
dream of it, think of it, live on that idea.”
Data is increasingly shaping the world in which we live. The
proliferation of digital devices, social media platforms, and the Internet
of Things (IoT) has led to an explosion in the amount of data generated
daily. This has created new opportunities and challenges for everyone
as we seek to harness the power of data to drive innovation and improve
decision making.
This book is a comprehensive guide to the world of datafication and its
development, governing process, and security. We explore fundamental
principles and patterns, analysis frameworks, techniques to implement
artificial intelligence (AI) and machine learning (ML) algorithms, models,
and regulations to govern datafication systems.
We will start by exploring the basics of datafication and how it
transforms the world, and then delve into the fundamental principles and
patterns and how data are ingested and processed with an extensive data
analysis framework. We will examine the ethics, regulations, and security
of datafication in a real scenario.
Throughout the book, we will use real-world examples and case
studies to illustrate key concepts and techniques and provide practical
guidance in sentiment and behavior analysis.
Whether you are a student, analyst, engineer, technologist, or someone
simply interested in the world of datafication, this book will provide you
with a comprehensive understanding of datafication.

CHAPTER 1

Introduction to Datafication
A comprehensive look at datafication must first begin with its definition.
This chapter provides that and details why datafication plays a significant
role in modern business and data architecture.
Datafication has profoundly impacted many aspects of society,
including business, finance, health care, politics, and education. It
has enabled companies to gain insights into consumer behavior and
preferences, health care to improve patient outcomes, finance to enhance
consumer experience and risk and compliance, and educators to
personalize learning experiences.
Datafication helps you to take facts and statistics gained from myriad
sources and give them domain-specific context, aggregating and making
them accessible for use in strategy building and decision making.
This improves sales and profits, health outcomes, and influence over
public policy.
Datafication is the process of turning data into a usable and accessible
format and involves the following:

• Collecting data from myriad sources

• Organizing and cleaning the data

• Making it available for analysis

• Analyzing the data by using artificial intelligence (AI)
and machine learning (ML) models

© Shivakumar R. Goniwada 2023
S. R. Goniwada, Introduction to Datafication,
https://doi.org/10.1007/978-1-4842-9496-3_1

Developing a deeper understanding of the datafication process and
its implications for individuals and society is essential. This requires a
multidisciplinary approach that brings together stakeholders from various
fields to explore the challenges and opportunities of datafication and to
develop ethical and effective strategies for managing and utilizing data in
the digital age.
This chapter will drill down into the particulars and explain how
datafication benefits organizations across industries. We will cover the
following topics:

• What is datafication?

• How is datafication embraced across industries?

• Why is datafication important?

• What are the elements of datafication?

What Is Datafication?
Datafication involves using digital technologies such as the cloud, data
products, and AI/ML algorithms to collect and process vast amounts of
data on human behavior, preferences, and activities.
Datafication converts various forms of information, such as texts,
images, audio recordings, comments, claps, and likes/dislikes, into a
curated format that can be easily analyzed and processed by multiple
algorithms. This involves extracting relevant data from sources such as
social media, hospitals, and Internet of Things (IoT) devices. These data
are organized into a consistent format and stored in a way that makes
them accessible for further analysis.


Everything around us, from finance, medicine, construction, and social
media to industrial equipment, is converted into data. For example,
you create data every time you post to social media platforms such
as WhatsApp, Instagram, Twitter, or Facebook, and any time you join
meetings in Zoom or Google Meet, or even when you walk past a CCTV
camera while crossing the street. The notion differs from digitization, as
datafication is much broader than digitization.
Datafication can help you to understand the world more fully than
ever before. New cloud technologies are available to ingest, store, process,
and analyze data. For example, marketing companies use Facebook and
Twitter data to determine and predict sales, and digital twins use data
on the behavior of industrial equipment to analyze how a machine performs.
Datafication also raises important questions about privacy, security,
and ethics. The collection and use of personal data can infringe on
individual rights and privacy, and there is a need for greater transparency
and accountability in how data are collected and used. Overall,
datafication represents a significant shift in how we live, work, and act.

Why Is Datafication Important?


Datafication enables organizations to transform raw data into a format
that can be analyzed and used to gain insights, make informed business
decisions, improve patients’ health, and streamline supply-chain
management. This is crucial for every industry to improve in today’s
data-driven world. By using the processed data, organizations can identify
trends, gain insight into customer behavior, and discover other key
performance indicators using analytics tools and algorithms.


Data for Datafication


Data is available everywhere, but choosing the type of data you require
for analysis is crucial, because it helps you to uncover hidden value
and challenges. Data can come from a wide range of sources, but the
specific data set will depend on the particular context and the goal of the
datafication process.
Today, data are created not only by people and their activities in the
world, but also by machines. The amount of data produced is almost out of
control.
For example:

• Social media data such as posts and comments can be
converted into structured data that can be easily
analyzed for sentiment and behavior. This involves
extracting text from the posts and comments and
identifying and categorizing any images or other media
that are part of them.

• In the medical context, datafication might involve
converting medical records and other patient
information into structured data that can be used
for analysis and research. This involves extracting
information about diagnoses, treatments, and other
medical reports.

• In the e-commerce context, datafication might
involve converting users’ statistics and other purchase
information into structured data that can be used for
analysis and recommendations.
In summary, data can come from a wide range of sources, and how it
is used will depend on the specific context and goals of the datafication
process.
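
As a minimal sketch of the social media example above, the following Python snippet turns one raw post into a structured record. The field names (`user`, `body`, `attachments`) and the `StructuredPost` type are assumptions made for illustration; real platform payloads differ and should be checked against each platform's API documentation.

```python
import re
from dataclasses import dataclass, field


@dataclass
class StructuredPost:
    """Hypothetical curated record for one social media post."""
    user: str
    text: str
    hashtags: list = field(default_factory=list)
    mentions: list = field(default_factory=list)
    media_types: list = field(default_factory=list)


def structure_post(raw: dict) -> StructuredPost:
    """Extract text, hashtags, mentions, and media categories from a raw post."""
    body = raw.get("body", "")
    hashtags = re.findall(r"#(\w+)", body)
    mentions = re.findall(r"@(\w+)", body)
    # Remove tags and mentions, then collapse leftover whitespace.
    text = re.sub(r"\s+", " ", re.sub(r"[#@]\w+", "", body)).strip()
    # Categorize attached media by its declared type.
    media = sorted({m.get("type", "unknown") for m in raw.get("attachments", [])})
    return StructuredPost(raw.get("user", "anonymous"), text, hashtags, mentions, media)


raw_post = {
    "user": "user_42",
    "body": "Loving the new phone! #gadgets @acme",
    "attachments": [{"type": "image"}, {"type": "image"}],
}
post = structure_post(raw_post)
print(post)
```

The structured record can then be passed to sentiment or behavior analysis without each algorithm having to parse the raw post again.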


Data constantly poses new challenges in terms of storage and
accessibility. The need to use all of this data is pushing us into a higher
level of technological advancement, whether we like or want it or not.
Datafication requires new forms of integration to uncover large hidden
values from extensive collections that are diverse, complex, and of a
massive scale. According to Kepios (https://kepios.com/), there were
4.80 billion social media users worldwide as of April 2023, 59.0 percent of
the world population, and approximately 227 million users join every year.
The following are a few statistics regarding major social media
applications as of the writing of this book:

• Facebook has 3.46 billion monthly visitors.

• YouTube’s potential advertising reach is 7.55 billion
people (monthly average).

• WhatsApp has at least 3 billion monthly users.

• Instagram’s potential advertising reach is
approximately 2.13 billion people.

• Twitter’s possible advertising reach is approximately
2.30 billion people.

Datafication Steps
For datafication, as defined by DAMA (Data Management Association),
you must have a clear set of data, well-defined analysis models, and
computing power. To obtain a precise collection of data, relevant models,
and required computing power, one must follow these steps:

• Data Harvesting: This step involves obtaining data in
a real-time and reliable way from various sources, such
as databases, sensors, files, etc.


• Data Curation: This step involves organizing and
cleaning the data to prepare it for analysis. You need
to ensure that the data collected are accurate by
removing errors, inconsistencies, and duplicates with a
standardized format.

• Data Transformation: This step involves converting
data into a suitable format for analysis. This step helps
you transform the data into a specific form, such as
dimensional and graph models.

• Data Storage: This step involves storing the data after
transformation in storage, such as a data lake or data
warehouse, for further analysis.

• Data Analysis: This step involves using statistical and
analytical techniques to gain insights from data and
identify trends, patterns, and correlations in the data
that help with predictions and recommendations.

• Data Dissemination: This step involves sharing the
dashboards, reports, and presentations with relevant
stakeholders.

• Cloud Computing: This step provides the necessary
infrastructure and tools for the preceding steps.
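
The steps above can be sketched as a chain of small Python functions. This is only an illustration under stated assumptions: the sensor readings are made up, a plain list stands in for the data lake, and all function names are hypothetical rather than part of any product or of the DAMA material.

```python
import json
import statistics


def harvest():
    """Stand-in for reading from databases, sensors, or files."""
    return [
        {"sensor": "s1", "temp_c": "21.5"},
        {"sensor": "s1", "temp_c": "21.5"},  # duplicate reading
        {"sensor": "s2", "temp_c": None},    # missing value
        {"sensor": "s2", "temp_c": "23.0"},
    ]


def curate(rows):
    """Drop rows with missing values and exact duplicates."""
    seen, clean = set(), []
    for row in rows:
        key = (row["sensor"], row["temp_c"])
        if row["temp_c"] is None or key in seen:
            continue
        seen.add(key)
        clean.append(row)
    return clean


def transform(rows):
    """Convert text values into numbers suitable for analysis."""
    return [{"sensor": r["sensor"], "temp_c": float(r["temp_c"])} for r in rows]


def store(rows, storage):
    """Append to an in-memory list that stands in for a data lake."""
    storage.extend(rows)
    return storage


def analyze(storage):
    """Compute a simple summary of the stored readings."""
    temps = [r["temp_c"] for r in storage]
    return {"count": len(temps), "mean_temp_c": statistics.mean(temps)}


def disseminate(report):
    """Serialize the report so it can be shared with stakeholders."""
    return json.dumps(report, sort_keys=True)


data_lake = []
report = analyze(store(transform(curate(harvest())), data_lake))
print(disseminate(report))
```

In a real system each function would be replaced by a cloud service or tool, but the order of the calls mirrors the order of the steps.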

Digitization vs. Datafication


For a better understanding of datafication, it can be helpful to contrast it
with digitization. This may help you to better visualize the datafication
process.


Digitization is a process that has taken place for decades. It entails
the conversion of information into a digital format; for example, music
to MP3/MP4, images to JPG/PNG, manual banking process to mobile
and automated web process, manual approval process to automatic BPM
workflow process, and so on.
Datafication, on the other hand, involves converting data into a usable,
accessible format. This consists of collecting data from various sources and
formats, organizing and cleansing it, and making it available for analysis.
The primary goal of datafication is to help the organization make
data-driven decisions, allowing it to gain insights and knowledge from the data.
Datafication helps monitor what each person does. It does so with
advanced technologies that can monitor and measure things individually.
In digitization, you convert many forms of information into digital forms
that are accessible to an individual computer. Similarly, in datafication,
you ingest activities and behavior and convert them into a virtual
structure that can be used within formal systems.
However, many organizations realize that simply processing data is not
enough to support business disruption. It requires quality data
and the application of suitable algorithms. Modern architecture and
methodologies must be adopted to address these challenges and create
datafication opportunities.

Types of Data in Datafication


The first type of data is content, which can be user likes, comments on
blogs and web forums, visible hyperlinks in the content, user profiles on
social networking sites, news articles on news sites, and machine data. The
data format can be structured or unstructured.
The second type of data is the behavior of objects and the runtime
operational parameters of industrial systems, buildings, and so forth.


The third type is time series data, such as stock price, weather, or
sensor data.
The fourth type of data is network-structured data, such as the
integrated networked systems in an industrial unit (for example, coolant
pipes and water flow). This data type is beneficial because it supports
analysis of a system as a whole, such as an entire media network or an
entire industrial function.
The fifth type of data is personal data, such as health, fitness, and
sleep data, chat conversations, and data from smart home and health
monitoring devices.

Elements of Datafication
As defined by DAMA, Figure 1-1 illustrates the seven critical elements of
the datafication architecture used to develop the datafication process.
Datafication will only be successful if at least one of the steps is included.

Figure 1-1. Data elements


Data Harvesting
Data harvesting is extracting data from a given source, such as social
media, IoT devices, or other various data sources.
Before harvesting any data, you need to analyze it to identify the source
and software tools needed for harvesting.
First, harvested data loses its value if it is inaccurate, biased,
confidential, or irrelevant, so these issues must be checked for. When
they are addressed, harvested information can be more objective and
reliable than familiar data sources. However, the
disadvantage is that it is difficult to know the users’ demographic and
psychological variables for social media data sources.
Second, harvesting must be automatic, real-time, and streaming, and able
to handle large-scale data sources efficiently.
Third, the data are usually fine-grained and available in real time. Text
mining techniques are used to preprocess raw text, image processing
techniques are used to preprocess photos, and video processing
techniques are used to preprocess videos for further analysis.
Fourth, the data can be ingested in real time or in batches. In real
time, each data item is imported as the source changes it. When data are
ingested in batches, the data elements are imported in discrete chunks
at periodic intervals.
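
The difference between real-time and batch ingestion can be illustrated with a short Python sketch. The event source, the sink lists, and the batch size are all assumptions for illustration; a production pipeline would read from a message broker or an API rather than a generator.

```python
from itertools import islice


def source_events():
    """Hypothetical source that emits seven events, one at a time."""
    for i in range(7):
        yield {"event_id": i}


def ingest_streaming(events, sink):
    """Real-time style: import each item as soon as the source emits it."""
    for event in events:
        sink.append([event])  # one import per event


def ingest_batches(events, sink, batch_size):
    """Batch style: import items in discrete chunks of up to batch_size."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        sink.append(batch)  # one import per chunk


stream_sink, batch_sink = [], []
ingest_streaming(source_events(), stream_sink)
ingest_batches(source_events(), batch_sink, batch_size=3)

print(len(stream_sink))              # number of real-time imports
print([len(b) for b in batch_sink])  # size of each batch import
```

Both approaches deliver the same events; they differ in how many import operations occur and how long an event waits before it becomes available.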
Various data harvesting methods can be used depending on the data
source type, as follows:

• Harvesting from IoT devices typically involves collecting
data from IoT sensors and devices using protocols such as MQTT,
CoAP, HTTP, and AMQP.

• Social media platforms such as Facebook, Twitter,
LinkedIn, Instagram, and others expose data through REST APIs,
streaming, webhooks, and GraphQL.


Data Curation
Data curation organizes and manages the data collected through
ingestion from various sources so that it is accessible and usable for
data analysis. This involves cleaning and filtering data, removing
duplicates and errors, and properly labeling and annotating data.
Data curation is essential for ensuring that the data are accurate,
consistent, and reliable, which is crucial for data analysis.
The following are a few of the steps involved in data curation:

• Data Cleaning: Once the data is harvested, it must
be cleaned to remove errors, inconsistencies, and
duplicates. This involves handling missing values,
correcting spelling errors, and standardizing data
formats.

• Data Transformation: After the data has been cleaned,
it needs to be transformed into a format suitable for
analysis. This involves aggregating data, creating new
variables, and so forth. For example, you might have a
data set of pathology reports with variables that include
such elements as patient ID, date of visit, test ID, test
description, and test results. You want to transform this
data set into a format that shows each patient’s overall
health condition. To do this, you need to alter the harvested
data with transformations such as
creating a new variable for test category, aggregating
each patient’s test data for a year, summarizing data by
group (e.g., hemoglobin), etc.


• Data Labeling: Annotating data with relevant
metadata, such as variable names and data
descriptions.

• Data Quality Test: In this step, you need to ensure the
data is accurate and reliable by using various tests, such as
statistical tests.
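
The pathology report example in the curation steps above can be sketched in Python as follows. The sample rows, the field names, and the `TEST_CATEGORY` mapping are invented for illustration and do not reflect any clinical standard.

```python
from collections import defaultdict
from statistics import mean

# Made-up harvested rows with a duplicate, inconsistent test names,
# and a missing result.
reports = [
    {"patient_id": "P1", "visit_date": "2023-01-10", "test": "Hemoglobin ", "result": "13.5"},
    {"patient_id": "P1", "visit_date": "2023-01-10", "test": "Hemoglobin ", "result": "13.5"},
    {"patient_id": "P1", "visit_date": "2023-06-02", "test": "hemoglobin", "result": "14.1"},
    {"patient_id": "P2", "visit_date": "2023-03-15", "test": "Glucose", "result": ""},
    {"patient_id": "P2", "visit_date": "2023-04-20", "test": "glucose", "result": "98"},
]

# Assumed mapping used to create the new "test category" variable.
TEST_CATEGORY = {"hemoglobin": "hematology", "glucose": "metabolic"}


def clean(rows):
    """Standardize test names, drop missing results, and remove duplicates."""
    seen, out = set(), []
    for r in rows:
        test = r["test"].strip().lower()
        if not r["result"]:
            continue
        key = (r["patient_id"], r["visit_date"], test)
        if key in seen:
            continue
        seen.add(key)
        out.append({**r, "test": test, "result": float(r["result"])})
    return out


def summarize(rows):
    """Aggregate each patient's results per year and per test."""
    grouped = defaultdict(list)
    for r in rows:
        year = r["visit_date"][:4]
        grouped[(r["patient_id"], year, r["test"])].append(r["result"])
    return {
        key: {
            "category": TEST_CATEGORY.get(key[2], "other"),
            "n": len(values),
            "mean": round(mean(values), 2),
        }
        for key, values in grouped.items()
    }


summary = summarize(clean(reports))
for key, stats in summary.items():
    print(key, stats)
```

Here the cleaning step drops rows with missing results instead of imputing them; which approach is appropriate depends on the analysis.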

The overall objective of data curation is to reduce the time it takes
to obtain insight from raw data by organizing and bringing relevant
information together for further analysis.
The steps involved in data curation are organizing and cataloging data,
ensuring data quality, preserving data, and providing access to data.

Data Storage
Data storage keeps digital data on media such as hard drives and
solid-state drives, together with related software to manage and organize
the data.
Data storage is the actual physical storage of datafication data. More
than 2.5 quintillion bytes of data are created daily, and approximately
2 MB of data is generated every second for every person. These
numbers come from users searching content on the internet, browsing
social media networks, posting blogs, photos, comments, and status updates,
watching videos, downloading images, streaming songs, etc. To make
business decisions, the data must be stored in a way that makes it easy to
manage and access, and it is essential to protect data against cyber threats.
For IoT, the data need to be collected from sensors and devices and
stored in the cloud.
Several types of database storage exist, including relational databases,
NoSQL databases, in-memory databases, and cloud databases. Each type
of database storage has advantages and disadvantages, but the best choice
for datafication is cloud databases that involve data lakes and warehouses.


Data Analysis
Data analysis refers to analyzing a large set of data to discover different
patterns and KPIs (Key Performance Indicators) of an organization. The
main goal of analytics is to help organizations make better business
decisions and future predictions. Advanced analytics techniques such
as machine learning models, text analytics, predictive analytics, data
mining, statistics, and natural language processing are used. With these
ML models, you uncover hidden patterns, unknown correlations, market
trends, customer preferences, feedback about your new FMCG (Fast
Moving Consumer Goods) products, and so on.
The following are the types of analytics that you can process using
ML models:
• Prescriptive: This type of analytics helps to decide
what action should be taken. It examines data to
answer questions such as "What should be done?"
or "What can we do to make our product attractive?" This
helps to find answers to problems such as
where to focus a treatment.
• Predictive: This type of analytics helps to predict
what might happen in the future, with an emphasis on the
business relevance of the resulting insights. Use
cases include weather prediction and forecasts based on
sales and production data.
• Diagnostic: This type of analytics helps to analyze
past situations, such as what went wrong and why it
happened. This helps to facilitate correction in the future;
for example, explaining a change in customer behavior.
• Descriptive: This type of analytics helps to summarize
past and current data, such as behavioral analysis
of users.
• Exploratory: This type of analytics involves visualization.
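
A small Python sketch can make the difference between some of these analytics types concrete. The monthly sales figures and the capacity threshold are made up, and the least-squares trend line is only one simple stand-in for the ML models discussed in this book.

```python
from statistics import mean

monthly_sales = [100.0, 110.0, 120.0, 130.0]  # made-up values for four months

# Descriptive: summarize what has already happened.
description = {
    "mean": mean(monthly_sales),
    "min": min(monthly_sales),
    "max": max(monthly_sales),
}


def fit_line(ys):
    """Ordinary least-squares line through the points (0, y0), (1, y1), ..."""
    xs = range(len(ys))
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


# Predictive: extend the fitted trend one month ahead.
slope, intercept = fit_line(monthly_sales)
forecast = intercept + slope * len(monthly_sales)

# Prescriptive: turn the forecast into a recommended action.
CAPACITY = 135.0  # assumed production capacity
action = "increase production" if forecast > CAPACITY else "keep current plan"

print(description)
print(forecast, action)
```

Descriptive analytics stops at the summary, predictive analytics adds the forecast, and prescriptive analytics adds the recommended action.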


Cloud Computing
Cloud computing is the use of computing resources delivered over the
internet and has the potential to offer substantial opportunities in various
datafication scenarios. It is a flexible delivery platform for data, computing,
and other services. It can support many architectural and development
styles, from extensive, monolithic systems to sizeable virtual machine
deployments, nimble clusters of containers, a data mesh, and large farms
of serverless functions.
The primary cloud services used for datafication are as follows:

• Data storage as a service

• Streaming services for data ingestion

• Machine learning workbench for analysis

Datafication Across Industries


Datafication is a valuable resource for businesses and organizations
seeking to gain insights into customer behavior, market trends, patient
recovery progress, and more.
Datafication is the process of converting various types of data into a
standardized format that can be used for analysis and decision making
and has become increasingly important across industries as a means of
leveraging data.
In the health-care industry, datafication is used to improve patient
outcomes and reduce costs. By collecting and analyzing patient data,
including pathology tests, medical histories, vital signs, and lab results, health-
care providers are able to optimize treatments and improve patient care.
In the finance industry, datafication is used to analyze financial
data, such as transaction history, in support of risk and fraud management,
personalized customer experience, and compliance.


In the manufacturing industry, datafication is used to analyze
production and machine data to improve the production process, build digital
twins, and so on.
In the retail industry, datafication is used to analyze customer behavior
and preferences to optimize pricing strategies and personalize the customer
experience.

Summary
Datafication is the process of converting various types of data and
information into a digital format that can easily be processed and
analyzed. With datafication, you can increase your organization’s
footprint by using data effectively for decision making. It helps to improve
operational efficiency and provides input to the manufacturing hub to
develop new products and services.
Overall, data curation is the key component of effective datafication,
as it ensures that the data is accurate, complete, and reliable, which is
essential for making decisions and gleaning meaningful insights.
In this chapter, I described datafication and discussed the types
of data involved in datafication, datafication steps, and datafication
elements. The next chapter provides more detail on the principles, patterns,
and methodologies used to realize datafication.

CHAPTER 2

Datafication Principles and Patterns
Principles are guidelines for the design and development of a system. They
reflect the level of consensus among the various elements of your system.
Without proper principles, your architecture has no compass to guide its
journey toward datafication.
Patterns are tried-and-tested solutions to common design
problems, and they can be used as a starting point for developing a
datafication architecture.
The processes involved in datafication are to collect, analyze, and
interpret the vast amount of information from a range of sources, such
as social media, Internet of Things (IoT) sensors, and other devices. The
principles and patterns underlying datafication must be understood to
ensure that it benefits all.
The patterns are reusable solutions to commonly occurring problems
in software design. These patterns provide a template for creating designs
that solve specific problems while also being flexible to adapt to different
contexts and requirements.

© Shivakumar R. Goniwada 2023 15


S. R. Goniwada, Introduction to Datafication,
https://doi.org/10.1007/978-1-4842-9496-3_2

This chapter provides you with an overview of the principles and
patterns shaping the development of datafication. It also examines the
ethical implications of these technologies for society. Using these principles
and patterns, you can develop datafication projects that are fair and
transparent and that perform well.

What Are Architecture Principles?


A principle is a law or a rule that must, or usually should, be followed
when making critical architectural decisions. The architecture and design
principles of datafication play a crucial role in guiding the software
architecture work responsible for defining the datafication direction.
While following these principles, you must also align with the
enterprise’s existing regulations, primarily those related to data and analytics.
The data and analytics architecture principles are a subset of the
overall enterprise architecture principles that pertain to the rules
surrounding your data collection, usage, management, integration, and
analytics. Ultimately, these principles keep your datafication architecture
consistent, clean, and accountable and help to improve your overall
datafication strategy.

Datafication Principles
As mentioned in the previous chapter, datafication analyzes data from
various sources, such as social media, IoT, and other digital devices. For
a successful and streamlined datafication architecture, you must define
principles related to data ingestion, data streaming, data quality, data
governance, data storage, data analysis, visualization, and metrics. These
principles ensure that data and analytics are used in a way that is aligned
with the organization’s goals and objectives.


Examples of datafication principles include the use of accurate and
up-to-date data, the use of a governance framework, the application of ethical
standards, and the application of quality rules.
The following principles help you to design the datafication
process:

Data Integration Principle


Before big data and streaming technology, data movement was simple.
Data moved linearly from static structured databases and static APIs to
data warehouses. Once you built an integration pipeline in this stagnant
world, it operated consistently because data moved like trains on a track.
Datafication has upended the traditional train track-based approach in
favor of one that resembles the traffic signals of a modern smart city.
To move data at the speed of business and unlock the flexibility of modern
data architecture, integration must be handled in a way that allows
performance to be monitored and managed continually. For modern data
integration, your data movement must do the following:1

• Be capable of doing streaming, batch, API-led, and
micro-batch processing

• Support structured, semi-structured, and
unstructured data

• Handle schema and semantic changes without
affecting the downstream analysis

• Respond to changes from sources and
application control

1
https://streamsets.com/blog/data-integration-architecture/


The following principles will help you design modern data integration.
For example, in health-care data analysis, you need to integrate various
health-care systems in the hospitals, such as electronic medical records
and insurance claims data. In financial data analysis, to generate trends
and statistics of financial performance, you need to integrate various data
systems, such as payment processors, accounting systems, and so forth.

• Design for Both Batch and Streaming: While you are
building for social media and IoT, which capitalize
on streaming and API-led data, you must account for
the fact that these data often need to be joined with
or analyzed against historical data and other batch
sources within an enterprise.

• Structured, Semi-structured, and Unstructured
Data: Data integration combines data from multiple
sources to provide a unified view. To achieve this, data
integration software must be able to support all three
kinds of data.

• Handle Schema and Semantic Changes: In
data integration, it is common for the schema and
semantics of the data to change over time as new data
sources are added or existing sources are modified.
These changes affect the downstream analysis, making
it difficult to maintain the data’s integrity and the
analysis’s accuracy. It is essential to use a flexible and
extensible data integration architecture to handle this.
You can use data lineage tools to help achieve this.

• Respond to Changes from Sources: In data integration,
responding to changes on the source side requires technical and
organizational maturity. Using CDC (Change Data
Capture) and APIs (Application Programming Interfaces)
and implementing sound change management ensures
that data integration is responsive, efficient, and effective.


• Use Low-Code No-Code Concepts: Writing custom
code to ingest data from the source into your data store
has been commonplace. Low-code and no-code tools
reduce the amount of custom code that must be written
and maintained.

• Sanitize Raw Data upon Data Harvest: Storing
raw inputs invariably leaves you holding personal
data and otherwise sensitive information, which poses
compliance risks (store raw data only when it is needed).
Sanitizing data as close to the source as possible makes
data analytics productive.

• Handle Data Drift to Ensure Consumption-Ready
Data: Data drift refers to the process of data changing
over time, often in ways that are unpredictable and
difficult to detect. This drift can occur for many
reasons, such as changes in the data source, changes in
data processing algorithms, or changes in the system’s
state. This kind of drift can impact the quality and
reliability of data and analytics. Data drift increases
costs, causes delays in time to analysis, and leads to
poor decisions based on incomplete data. To mitigate
this, you need to analyze and choose the right tools and
software to detect and react to changes in the schema
and keep data sources in sync.

• Cloud Design: Designing integration for the cloud
is fundamentally different from designing it for
on-premises environments. Enterprises often put raw data
into object stores without knowing the end analytical intent.
Legacy tools for data integration often lack the level of
customization and interoperability needed to take full
advantage of cloud services.


• Instrument Everything: Obtaining end-to-end insight
into data systems is challenging. End-to-end
instrumentation helps to manage data movements.
This instrumentation is needed for time series analysis
of a single data flow to tease out changes over time.

• Implement the DataOps Approach: Traditional
data integration was suitable for the waterfall
delivery approach but may not work for modern-day
engineering principles. Modern dataflow tools
provide an integrated development environment for
continuous use across the dataflow life cycle.
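
To illustrate the schema change and data drift principles in the list above, the following sketch compares an incoming record against an expected schema. The `EXPECTED_SCHEMA` contents and the `detect_drift` helper are hypothetical; dedicated data integration tools perform far richer checks.

```python
# Assumed schema for a health-monitoring feed: field name -> expected type.
EXPECTED_SCHEMA = {"patient_id": str, "heart_rate": int, "recorded_at": str}


def detect_drift(record, expected=EXPECTED_SCHEMA):
    """Report fields that are missing, unexpected, or of the wrong type."""
    missing = sorted(set(expected) - set(record))
    unexpected = sorted(set(record) - set(expected))
    type_changes = sorted(
        name
        for name, expected_type in expected.items()
        if name in record and not isinstance(record[name], expected_type)
    )
    return {"missing": missing, "unexpected": unexpected, "type_changes": type_changes}


# The source started sending heart_rate as text, added a field,
# and dropped the timestamp.
incoming = {"patient_id": "P1", "heart_rate": "72", "device": "watch"}
drift = detect_drift(incoming)
print(drift)

if any(drift.values()):
    print("Schema drift detected; quarantine the record or update the mapping.")
```

Running a check like this at ingestion time lets the pipeline react to a change before it silently affects downstream analysis.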

Data Quality Principle


Ensuring you have high-quality data is central to the data management
platform. The principle of data quality management is a set of fundamental
understandings, standards, rules, and values. It is the core ingredient of
a robust data architecture. Data quality is critical for building an effective
datafication architecture. Well-governed, high-quality data helps create
accurate models and robust schemas.
There are five characteristics of data quality, as follows:

• Accuracy: Is the information correct in every detail?

• Completeness: How comprehensive is the data?

• Reliability: Does the data contradict other trusted
resources?

• Relevance: Do you need this data?

• Timeliness: Is this data obsolete or up-to-date, and can
it be used for real-time analysis?


To address these characteristics, several steps can be taken to improve
data quality, as follows:

• Identify the type of source and assess its reliability.

• Check the incoming data for errors, inconsistencies,
and missing values.

• Use data cleaning techniques to fix any issues and
improve the quality.

• Use validation rules to ensure data is accurate and
complete. This could be an iterative approach.

• Monitor regularly and identify changes.

• Apply data governance and management processes.
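
Validation rules of the kind mentioned in the steps above can be expressed as small checks. The sketch below covers completeness, accuracy, and timeliness; the required fields, the rule that amounts must not be negative, and the one-day freshness limit are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_FIELDS = ("order_id", "amount", "updated_at")  # assumed schema
MAX_AGE = timedelta(days=1)                             # assumed freshness limit


def validate(record, now):
    """Return a list of data quality issues found in one record."""
    issues = []
    # Completeness: every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if record.get(name) in (None, ""):
            issues.append(f"completeness: {name} is missing")
    # Accuracy: an order amount should not be negative.
    amount = record.get("amount")
    if isinstance(amount, (int, float)) and amount < 0:
        issues.append("accuracy: amount is negative")
    # Timeliness: the record must have been updated recently.
    updated = record.get("updated_at")
    if updated and now - datetime.fromisoformat(updated) > MAX_AGE:
        issues.append("timeliness: record is stale")
    return issues


now = datetime(2023, 4, 2, 12, 0, tzinfo=timezone.utc)
good = {"order_id": "A1", "amount": 25.0, "updated_at": "2023-04-02T09:00:00+00:00"}
bad = {"order_id": "", "amount": -5, "updated_at": "2023-03-01T09:00:00+00:00"}

print(validate(good, now))
print(validate(bad, now))
```

Because the rules are plain functions, they can be extended iteratively as new quality issues are discovered during monitoring.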

Data Quality Tools


Data quality is a critical capability of datafication, as the accuracy and
reliability of data are essential for an accurate outcome. There are
various tools and techniques to address data quality. They can ensure
that data is correct, complete, and consistent and can help identify and
remediate quality issues. Here are a few examples:

• Data Cleansing tools: These help you identify and fix
errors, inconsistencies, and missing values.

• Data Validation tools: These tools help you to check
data consistency and accuracy.

• Data Profiling tools: These will provide detailed data
analysis such as data types, patterns, and trends.

• Data Cataloging tools: These tools will create a
centralized metadata repository, including data quality
metrics, data lineage, and data relationships.


• Data Monitoring and Alerting tools: These track data
quality metrics and alert the governance team when
quality issues arise.
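
To give a feel for what a data profiling tool reports, here is a minimal sketch that profiles a list of records by column. The `profile` function and the sample rows are illustrative assumptions; commercial and open source profiling tools report many more statistics.

```python
from collections import Counter


def profile(rows):
    """Summarize value types, missing values, and distinct values per column."""
    columns = sorted({name for row in rows for name in row})
    report = {}
    for name in columns:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        report[name] = {
            "types": dict(Counter(type(v).__name__ for v in present)),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return report


rows = [
    {"city": "Pune", "temp_c": 31.0},
    {"city": "Pune", "temp_c": None},
    {"city": "Oslo", "temp_c": "n/a"},
]
report = profile(rows)
for column, stats in report.items():
    print(column, stats)
```

A mixed set of types in one column, as with `temp_c` here, is a typical signal that a cleansing or validation step is needed.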

Data Governance Principles


Data are an increasingly significant asset as organizations implementing
datafication move up the digital curve and focus on big data and
analytics. Data governance helps organizations better manage data
availability, usability, integrity, and security. It involves establishing
policies and procedures for collecting, storing, and using data.
In modern architecture, especially for datafication, data are expected
to be harvested and accessed anywhere, anytime, on any device. Satisfying
these expectations can give rise to considerable security and compliance
risks, so robust data governance is needed to support the datafication process.
Data governance is about bringing data under control and keeping
it secure. Successful data governance requires understanding the data,
policies, and quality of metadata management, as well as knowing where
data resides. How did it originate? Who has access to it? And what does it
mean? Effective data governance is a prerequisite to maintaining business
compliance, regardless of whether that compliance is self-imposed by an
organization or comes from global industry practices.
Data governance includes how data delivery and access take place.
How is data integrity managed? How does data lineage take place? How
is data loss prevention (DLP) configured? How is security implemented
for data?
Data governance typically involves the following:

• Establish the data governance team and define its roles
and responsibilities.

• Develop a data governance framework that includes
policies, standards, and procedures.


• Ensure data consistency across user behavior so that the
required KPIs (key performance indicators) are generated
completely and accurately.

• Identify critical data assets and classify them according
to their importance.

• Define compliance matrices for regulations such as GDPR.

• Ensure data veracity. As fact-based decisions driven by
advanced analytics become real-time events, data governance
builds the confidence an organization needs to make
decisions in real time.

• Consider using data governance software such as
Alation, Collibra, or Informatica.

Data Is an Asset
Data is an asset that has value to organizations and must be managed
accordingly. Data is an organizational resource with real, measurable
value: it informs decisions, improves operations, and drives business
growth. Organizations manage their assets carefully, and data is as
important as any physical or digital asset. Quality data are the foundation
of the organization's decisions, so you must ensure that the data are
harvested with quality and accuracy and are available when needed.
The value of data is directly related to the accuracy of the decisions it
supports, and that accuracy depends on the quality, relevance, and
reliability of the data used in the decision-making process. Common
techniques for measuring data value are data quality assessment, data
relevance analysis, cost-benefit analysis, impact analysis, and different
forms of analytics.


Data Is Shared
Different organizational stakeholders will access the datafication data
to analyze various KPIs. Therefore, the data must be shareable with the
relevant teams across an organization. Timely access to accurate and
cleansed data is essential to improving the quality and efficiency of an
organization's decision-making ability. The speed of data collection,
creation, transfer, and assimilation is driven by the ability of an
organization's processes and technology to capture social media or IoT
sensor data.
To enable data sharing, you must develop and abide by a common
set of policies, procedures, and standards governing data management
and access in the short and long term. You need a clear blueprint for
data sharing that does not compromise the confidentiality and privacy
of data.

Data Trustee
Each data element in a datafication architecture has a trustee accountable for
its quality. As the degree of data sharing grows and business units within an
organization rely upon information, it becomes essential that only the data
trustee makes decisions about the content of the data. In this role, the data
trustee is responsible for ensuring that the data used comply with applicable
laws, regulations, and policies and are handled securely and responsibly. The
specific responsibilities of a data trustee will vary depending on the type of
data being shared and the context in which it is being used.
The trustee and steward are different roles. The trustee is responsible
for the accuracy and currency of the data, while the steward's role may be
broader and include standardization and definition tasks.


Ethical Principle
Datafication focuses on and analyzes social media, medical, and IoT data.
Handling these data must respect human dignity, which involves considering
the potential consequences of using the data and ensuring that they are used
fairly, responsibly, and transparently. This principle reflects the fundamental
ethical requirement that people be treated in a way that respects their
dignity and autonomy as human individuals. When analyzing social
media and medical data, we must remember that data also affects,
represents, and touches people. Personal data are entirely different from
any machine's raw data, and the unethical use of personal data can directly
influence people's interactions, places in the community, personal product
usage, etc. Consider the various laws across the globe to meet ethics needs
while designing your system.
There are various laws in place globally; here are a few:
GDPR Principles (Privacy): Its focus is protecting, collecting, and
managing personal data; i.e., data about individuals. It applies to all
companies and organizations in the EU and to companies outside the EU that
hold or otherwise process personal data of individuals in the EU. The
following are a few guidelines from the GDPR. For more details, refer to
https://gdpr-info.eu/:

• Fairness, Lawfulness, Transparency: Personal data
shall be processed lawfully, fairly, and transparently
in relation to the data subject.

• Purpose Limitation: Personal data must be collected
for specified, explicit, and legitimate purposes and not
further processed in a manner incompatible with those
purposes.

• Data Minimization: Personal data must be adequate,
relevant, and limited to what is necessary for the
purposes for which they are processed.

• Accuracy: Personal data must be accurate and, where
necessary, kept up to date.


• Integrity and Confidentiality: Data must be
processed with appropriate security of the personal
data, including protection against unauthorized and
unlawful processing.

• Accountability: Data controllers are responsible
for compliance and must be able to demonstrate it.
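
As one illustration of how the data minimization principle can show up in code, the sketch below keeps only the fields a stated purpose needs and replaces the direct identifier with a salted digest. Everything here is an assumption made for the example: the field names, the ALLOWED_FIELDS set, and the choice of a salted SHA-256 digest. Note that pseudonymized data of this kind generally still counts as personal data under the GDPR, so this is not a compliance recipe.

```python
import hashlib

# Hypothetical purpose: regional age statistics need only these fields.
ALLOWED_FIELDS = {"age", "region"}

def pseudonymize(identifier: str, salt: str) -> str:
    """Replace a direct identifier with a salted SHA-256 hex digest."""
    return hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()

def minimize(record: dict, salt: str) -> dict:
    """Keep only the fields the purpose needs, plus a pseudonymous key."""
    reduced = {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
    reduced["subject_key"] = pseudonymize(record["email"], salt)
    return reduced
```

The salt would need to be stored separately from the reduced records and protected like any other secret.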

PIPEDA (Personal Information Protection and Electronic
Documents Act): This Canadian law applies to organizations that collect,
use, or disclose personal information in the course of commercial
activities. The following are some of the obligations under PIPEDA; for
more information, visit https://www.priv.gc.ca/:

• Accountability: An organization is responsible for
personal information under its control and must
designate an individual accountable for compliance.

• Identifying Purpose: You must specify the purpose for
which personal information is collected.

• Consent: You must obtain the knowledge and consent
of the individual for the collection.

• Accuracy: Personal information must be accurate,
complete, and up to date.

• Safeguards: You must protect personal information.

Human Rights and Technology Act: The U.K. government proposed
this act. It would require companies to conduct due diligence to ensure
that their datafication system does not violate human rights and to report
any risk or harm associated with the technology. You can find more
information at https://www.equalityhumanrights.com/. The following
are a few guidelines:


• Human Rights Impact Assessment: Conduct a human
rights impact assessment before launching new
services.

• Transparency and Accountability: You must disclose
information about technology services, including how
you collect the data and the algorithms you use to make
decisions affecting individual rights.

Universal Guidelines for AI: This is not a law but a set of guidelines for
AI/ML, published by The Public Voice coalition in 2018. These guidelines
include transparency, accountability, and safety. You can find more
information at https://thepublicvoice.org/ai-universal-guidelines/. The
following are a few guidelines:

• Transparency: AI should be transparent in its decision-making
process, and the data and algorithms used in AI should
be open and explainable.

• Safety and Well-being: AI should be designed to ensure
the safety and well-being of individuals and society.

Each country has its own laws, and we suggest reviewing the applicable
laws and compliance requirements before processing any data for analysis.

Security by Design Principle


Security by design also implies privacy by design: it is a concept in which
security and privacy are considered fundamental aspects of the design.
This principle emphasizes the importance of keeping security and
privacy at the core of a product or system.


The following practices help with the design and development of a
datafication architecture:2

• Minimize Attack Surface Area: Restrict a user's
access to services.

• Establish Secure Defaults: Apply strong security rules to
how users register to access your services.

• The Principle of Least Privilege: The user should
have the minimum privileges needed to perform a
particular task.

• The Principle of Defense in Depth: Add multiple layers
of security validations.

• Fail Securely: Failure is unavoidable, and therefore you
want the system to fail securely.

• Don't Trust Services: Do not trust third-party services
without implementing a security mechanism.

• Separation of Duties: Prevent individuals from acting
fraudulently.

• Avoid Security by Obscurity: There should be sufficient
security controls in place to keep your application safe
without hiding core functionality or source code.

• Keep Security Simple: Avoid the use of very
sophisticated architecture when developing security
controls.

• Fix Security Issues Correctly: Developers should
carefully identify all affected systems.
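
Three of these practices (least privilege, secure defaults, and failing securely) can be sketched together in a short authorization check. The role names, permission names, and function names below are hypothetical and exist only for this example.

```python
# Hypothetical role-to-permission map; each role gets only what it needs.
ROLE_PERMISSIONS = {
    "analyst": {"read_reports"},
    "data_engineer": {"read_reports", "write_pipeline"},
}

def is_allowed(role: str, permission: str) -> bool:
    """Least privilege with a secure default: unknown roles get nothing."""
    return permission in ROLE_PERMISSIONS.get(role, set())

def authorize(role: str, permission: str, lookup=is_allowed) -> bool:
    """Fail securely: any error during the check results in a denial."""
    try:
        return bool(lookup(role, permission))
    except Exception:
        return False
```

The design choice to note is the direction of the default: when the role is unknown or the permission lookup breaks, access is denied rather than granted.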

2 Cloud Native Architecture and Design Patterns, Shivakumar Goniwada, Apress, 2021


Datafication Patterns
Datafication is the process of converting various aspects of invisible data
into digital data that can be analyzed and used for decision making. As
I explained in Chapter 1, "Introduction to Datafication," datafication has
become increasingly prevalent in recent years, as advances in technology
have made it easier to collect, store, and analyze large amounts of data.
The datafication patterns are the common approaches and techniques
used in the process of datafication. These patterns involve the use of
various technologies and methods, such as digitization, aggregation,
visualization, AI, and ML, to convert data into useful insights.
By understanding these patterns, you can effectively collect, store,
analyze, and use data to drive decision making and gain a competitive
edge. By leveraging these patterns, you can also optimize storage operations.
Each pattern states its solution so that it gives the essential field of
relationships needed to solve the problem, but in a very general and
abstract way so that you can solve the problem for yourself by adapting it
to your preferences and conditions.
Patterns have the following characteristics:

• They can be seen as building blocks of more complex
solutions.

• They function as a common language used by
technology architects and designers to describe
solutions.3

3 Cloud Native Architecture and Design Patterns, Shivakumar Goniwada, Apress, 2021


Data Partitioning Pattern


Partitioning allows a table, index, or index-organized table to be subdivided
into smaller chunks, where each chunk of such a database object is called a
partition. This is often done for reasons of efficiency, scalability, or security.
Data partitioning divides the data set and distributes the data over
multiple servers or shards. Each shard is an independent database, and
collectively the shards make up a single database. Partitioning helps
with manageability, performance, high availability, security, operational
flexibility, and scalability.
Data partitioning addresses the following scale-related issues:

• High query rates exhausting the CPU capacity of
the server

• Larger data sets exceeding the storage capacity of a
single machine

• Working set sizes larger than the system's RAM,
stressing the I/O capacity of disk drives

You can use the following strategies for database partitioning:

• Horizontal Partitioning (Sharding): Each partition is
a separate data store, but all partitions have the same
schema. Each partition is known as a shard and holds a
subset of data.

• Vertical Partitioning: Each partition holds a subset of
the fields for items in the data store. These fields are
divided according to how you access the data.

• Functional Partitioning: Data are aggregated
according to how each bounded context in the system
uses them.
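
The difference between horizontal and vertical partitioning is easiest to see on a small in-memory table. The following sketch is an illustration under stated assumptions: rows are plain dictionaries, the key column holds integers, and the function names are invented for the example.

```python
def horizontal_partition(rows: list[dict], shard_count: int, key: str) -> list[list[dict]]:
    """Split rows into shards; every shard keeps the full schema."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        # Assumes an integer key; the shard is chosen by key modulo shard count.
        shards[row[key] % shard_count].append(row)
    return shards

def vertical_partition(rows: list[dict], key: str, groups: list[list[str]]) -> list[list[dict]]:
    """Split columns into groups; each partition repeats the key so rows can be rejoined."""
    return [
        [{key: row[key], **{field: row[field] for field in fields}} for row in rows]
        for fields in groups
    ]
```

In the horizontal case each shard holds some of the rows with all of the fields; in the vertical case each partition holds all of the rows with some of the fields, for example frequently read fields in one partition and rarely read fields in another.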


You can combine multiple strategies in your application. For example,
you can apply horizontal partitioning for high availability and use a vertical
partitioning strategy to store data based on how it is accessed.
The database, either RDBMS or NoSQL, provides different criteria to
shard the database. These criteria are as follows:

• Range or interval partitioning

• List partitioning

• Round-robin partitioning

• Hash partitioning

Round-robin partitioning is a data partitioning strategy used in
distributed computing systems. In this strategy, data is divided into
equal-sized partitions or chunks and assigned to different nodes in a
round-robin fashion, which distributes the rows of a table evenly among the
nodes. For example, if there are three nodes and 150 records to be
partitioned, the records are divided into three equal chunks of 50 records
each. The first chunk is assigned to the first node, the second chunk to the
second node, and the third chunk to the third node. If there were more
chunks than nodes, the assignment would start again from the beginning,
assigning the fourth chunk to the first node, and so on.
In range, list, and hash partitioning, by contrast, an attribute called the
partitioning key must be chosen from among the table attributes, and the
table rows are partitioned based on the value of that key.
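
A minimal sketch of round-robin assignment follows. The function name and the choice to represent each node as a Python list are assumptions for illustration; no partitioning key is involved, only the position of each chunk.

```python
def round_robin(records: list, node_count: int, chunk_size: int) -> list[list]:
    """Cut records into chunks and deal the chunks to nodes in turn."""
    nodes = [[] for _ in range(node_count)]
    chunks = [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]
    for index, chunk in enumerate(chunks):
        # The modulo makes the assignment wrap around to the first node.
        nodes[index % node_count].extend(chunk)
    return nodes
```

With 150 records, three nodes, and a chunk size of 50, each node receives exactly one chunk. With 200 records, the extra chunk wraps around and lands on the first node.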
Range partitioning is a partitioning strategy where data is partitioned
based on a specific range of values. For example, suppose you have a large
data set of patient records and you want to partition the data by age
group. To do this, you first determine the minimum and maximum
ages in the data set and then divide the range of ages into equal
intervals, each representing a partition.
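
The key-based criteria listed earlier (range, list, and hash) can each be expressed as a function from a partitioning key to a partition. The sketch below is illustrative only: the function names are invented, the range example uses equal-width age intervals as described above, and CRC32 is just one convenient stable hash, not a recommendation.

```python
import bisect
import zlib

def equal_width_cut_points(min_age: int, max_age: int, partitions: int) -> list[float]:
    """Divide [min_age, max_age] into equal intervals and return the inner cut points."""
    width = (max_age - min_age) / partitions
    return [min_age + width * i for i in range(1, partitions)]

def range_partition(age: int, cut_points: list[float]) -> int:
    """Range: an age equal to a cut point falls into the higher partition."""
    return bisect.bisect_right(cut_points, age)

def hash_partition(key: str, partition_count: int) -> int:
    """Hash: use a stable hash (CRC32) rather than Python's per-process salted hash()."""
    return zlib.crc32(key.encode("utf-8")) % partition_count

def list_partition(value: str, lists: dict[str, set[str]]) -> str:
    """List: return the partition whose explicit value list contains the value."""
    for name, members in lists.items():
        if value in members:
            return name
    raise KeyError(value)
```

For ages 0 to 100 and four partitions, the cut points are 25, 50, and 75, so a 30-year-old patient lands in partition 1 and a 100-year-old in partition 3.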


Data Replication
Data replication is the process of copying data from one location to
another, generally on different servers. This kind of distribution
supports failover and fault tolerance.
Replication can serve many nonfunctional requirements, such as the
following:

• Scalability: Handling higher query throughput than
a single machine can handle

• High Availability: Keeping the system running even
when one or more nodes go down

• Disconnected Operations: Allowing an application to
continue working when there is a network problem

• Latency: Placing data geographically closer to users so
that users can interact with the data faster

In some cases, replication can provide increased read capacity, as the
client can send read operations to different servers. Maintaining copies
of data in different nodes and different availability zones can increase
the data locality and availability of the distributed application. You can
also maintain additional copies for dedicated purposes, such as disaster
recovery, reporting, or backup.
There are two types of replication:

• Leader-based or leader-follower replication

• Quorum-based replication

These two types of replication support full data replication, partial data
replication, master-slave replication, and multi-master replication.4
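
To give a feel for quorum-based replication, here is a deliberately simplified, single-process sketch. It assumes N replicas, a write quorum W, and a read quorum R with W + R > N, so that every read set overlaps the most recent successful write set. The class and parameter names are invented, and the shared version counter stands in for the coordination a real distributed system would need.

```python
class Replica:
    def __init__(self):
        self.value, self.version = None, 0

class QuorumStore:
    """N replicas; a write must reach W of them and a read consults R of them."""

    def __init__(self, n: int, w: int, r: int):
        if w + r <= n:
            raise ValueError("need W + R > N so read and write sets overlap")
        self.replicas = [Replica() for _ in range(n)]
        self.w, self.r = w, r
        self.version = 0  # simplification: one central version counter

    def write(self, value, reachable: list[int]) -> bool:
        """Apply the write to reachable replicas; succeed only with W acknowledgements."""
        if len(reachable) < self.w:
            return False
        self.version += 1
        for i in reachable:
            self.replicas[i].value, self.replicas[i].version = value, self.version
        return True

    def read(self, reachable: list[int]):
        """Consult R reachable replicas and return the value with the highest version."""
        if len(reachable) < self.r:
            raise RuntimeError("not enough replicas for a read quorum")
        consulted = [self.replicas[i] for i in reachable[: self.r]]
        return max(consulted, key=lambda rep: rep.version).value
```

With N = 3, W = 2, and R = 2, a write that reaches replicas 0 and 1 is still visible to a read that consults replicas 1 and 2, because the two sets share replica 1. In leader-based replication, by contrast, all writes go through a single leader that forwards them to its followers.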

4 Cloud Native Architecture and Design Patterns, Shivakumar Goniwada, Apress, 2021


Stream Processing
Stream processing is the real-time processing of data streams. A stream is
a continuous flow of data that is generated by a variety of sources, such as
social media, medical data, sensors, and financial transactions.
Stream processing helps consumers query continuous data streams
to detect conditions quickly, in near real time instead of in batch mode.
For example, in payment processing, an AML (Anti-Money Laundering)
system raises an alert if it finds anomalies in transactions. How a
condition is detected varies depending on the type of source and the
use case.
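
As a toy illustration of detecting a condition on a stream, the sketch below flags transaction amounts that deviate sharply from the most recent values. It is not how a production AML system works: the window size, the threshold of three standard deviations, and the function name are all assumptions chosen to keep the example short.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Iterable, Iterator

def flag_anomalies(amounts: Iterable[float], window: int = 5, threshold: float = 3.0) -> Iterator[float]:
    """Yield each amount that deviates strongly from the recent window."""
    recent = deque(maxlen=window)  # holds only the last `window` amounts
    for amount in amounts:
        if len(recent) == window:
            mu, sigma = mean(recent), pstdev(recent)
            if sigma > 0 and abs(amount - mu) > threshold * sigma:
                yield amount
        recent.append(amount)
```

Because the function is a generator, each amount is examined as it arrives, and only a fixed-size window is kept in memory, which is the property that distinguishes this from a batch job over the full history.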
There are several approaches to stream processing, including stream
processing application frameworks, application engines, and platforms.
Stream processing allows applications to exploit a limited form of parallel
processing more easily. The application that supports stream processing
can manage multiple computational units without explicitly managing
allocation, synchronization, or communication among those units. The
stream processing pattern simplifies parallel software and hardware by
restricting the parallel computations that can be performed.
Stream processing acts on data through aggregation, analytics,
transformation, enrichment, and ingestion.
As shown in Figure 2-1, for each input source, the stream processing
engine operates in real time on the data source and provides output in the
target database.

Figure 2-1. Stream processing


The output is delivered to a streaming analytics application and added
to the output streams.
The stream processing pattern addresses many challenges in the
modern architecture of real-time analytics and event-driven applications,
such as the following:

• Stream processing can handle data volumes that are
much larger than other data processing systems can handle.

• Stream processing easily models the continuous flow
of data.

• Stream processing decentralizes and decouples the
infrastructure.
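
The way stream processing models a continuous flow can be sketched with Python generators, where each stage consumes events one at a time and passes results downstream. The event layout (sensor, reading), the stage names, and the fixed-size tumbling window are assumptions made for this illustration, not the API of any stream processing engine.

```python
from typing import Iterable, Iterator

def enrich(events: Iterable[dict], sensor_sites: dict[str, str]) -> Iterator[dict]:
    """Enrichment: attach the site name for each event's sensor."""
    for event in events:
        yield {**event, "site": sensor_sites.get(event["sensor"], "unknown")}

def tumbling_average(events: Iterable[dict], size: int) -> Iterator[float]:
    """Aggregation: emit the mean reading of every `size` consecutive events."""
    window = []
    for event in events:
        window.append(event["reading"])
        if len(window) == size:
            yield sum(window) / size
            window = []  # start the next non-overlapping window
```

Stages compose by nesting, for example tumbling_average(enrich(source, sites), 10), and no stage ever needs the whole data set in memory. In this sketch a trailing partial window is simply dropped.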

The typical use cases of stream processing will be examined next.

Social Media Data Use Case


Let's consider real-time sentiment analysis. Sentiment analysis is the
process of analyzing content such as text or video to determine the
attitude it expresses. Consider an e-commerce platform that sells
smartphones. The company wants to monitor public opinion about
different smartphone brands on social media and to respond quickly
to any negative feedback or complaints. To do this, you need to set up a
stream processing system that continuously monitors social media
platforms for mentions of the various brands. The system can use natural
language processing (NLP) techniques to perform sentiment analysis on the
text of the posts and classify each mention as positive, negative, or neutral.
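
The monitoring loop described here can be sketched as follows. This is a toy: the word lists stand in for a trained NLP model, the brand names used in the usage note are made up, and the function names are assumptions for the example.

```python
import re
from typing import Iterable, Iterator

# Tiny illustrative lexicon; a real system would use a trained NLP model.
POSITIVE = {"love", "great", "excellent", "fast"}
NEGATIVE = {"hate", "broken", "terrible", "slow"}

def classify(post: str) -> str:
    """Label a post by counting positive and negative lexicon hits."""
    words = re.findall(r"[a-z']+", post.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def monitor(posts: Iterable[str], brands: set[str]) -> Iterator[tuple[str, str]]:
    """Yield (brand, sentiment) for each brand mentioned in each incoming post."""
    for post in posts:
        lowered = post.lower()
        for brand in sorted(brands):
            if brand in lowered:
                yield brand, classify(post)
```

Given a stream of posts and the hypothetical brands "acme" and "globex", monitor yields one labeled mention at a time, so negative mentions can be routed to a support team as soon as they appear.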

IoT Sensors
Stream processing is used for real-time analysis of data generated by
IoT sensors. Let’s consider a boiler machine at a chemical plant. They
have a network of IoT sensors that are used to monitor environmental

since.”
MARTYRED LINCOLN’S BLOOD.
An interesting and valuable relic, which brings vividly to the mind
the historic scene in Ford’s Theater, Washington, on the night of April
14, 1865, is owned by Colonel James S. Case, at one time a resident
of Philadelphia, but whose home is now in Brooklyn.
It is only a play bill, but upon it is a discoloration made by a tiny
drop of President Lincoln’s blood. It was picked up just after the
tragedy by John T. Ford, the manager of the theater. He found it on
the floor of the box where it had fallen from the President’s hand
when the bullet of Assassin Booth pierced his head. It lay beneath
the chair in which the citizen-hero received his death wound. There
was a tiny spot of blood, still red as it came from the great heart of
Lincoln, on the edge.
Mr. Ford carried the precious paper home, and only parted with it
at the request of the late A. K. Browne of Washington, who was a
warm personal friend of the manager. It came into Mr. Browne’s
possession while the nation was still mourning for its idol, and soon
after his assassin had met a justly merited fate at the hands of
Sergeant Boston Corbett.
The play bill is somewhat yellow from age, but otherwise in an
excellent state of preservation. The bloodstain is now a dark brown.
The program was of “Our American Cousin,” which was being given
for the benefit of Laura Keene. The bloodstain is nearly half way
down the program, opposite the names of John Dyott and Harry
Hawk, Miss Keene’s leading support.
A STRANGE COINCIDENCE IN THE LIVES OF LINCOLN AND HIS SLAYER.
When President Lincoln was assassinated on the night of April 14,
1865, while witnessing a play at Ford’s Theater in Washington, he
was removed to the Peterson house, which was directly opposite the
theater.
The late John T. Ford related that he had occasion to visit John
Wilkes Booth at the Peterson house once. The Davenport-Wallack
combination was playing “Julius Cæsar” at Ford’s Theater. Booth had
been cast to play Marc Antony and was late in coming to rehearsal.
Ford went over to the house to ask him to hurry up. He found Booth
lying in bed studying his lines. He little dreamed then that Lincoln
would so shortly die in the same house, the same room and on that
identical bed, or that Booth would turn out to be his assassin.
WHERE IS THE ORIGINAL EMANCIPATION PROCLAMATION?
When Lincoln went to Washington he had a sale of the furniture of
the Eighth street home at Springfield. Most of the articles were
bought by a well-to-do family named Tilton, who admired the
President in such a way as to make what had belonged to him things
to be treasured. When the troops passed through Springfield to the
front they visited the house “where Uncle Abe had lived,” and the
Tiltons used to confer great favor by permitting the boys in blue to
sit down in the dining room and have a glass of milk off the table
from which Mr. Lincoln had eaten many times. But the Tiltons moved
away to Chicago. They carried with them the furniture which had
been in the Lincoln house, prizing it more than ever after his death.
In 1871 came the Chicago fire, and with it went not only the Lincoln
furniture, but the original document, which, if it were in existence
now, would be preserved with the zeal that guards the Declaration
of Independence—the Proclamation of Emancipation. The draft of
the proclamation had been sent to Chicago to be exhibited for some
purpose and was burned in that fire.
MR. GRIFFITHS ON LINCOLN.
“No other public man has been subjected to such scrutiny from
the time he was born until the end of his tragic career as was
Lincoln,” said Mr. Griffiths in a lecture. “He obtained his early
education from ‘Æsop’s Fables,’ ‘Robinson Crusoe,’ the ‘Pilgrim’s
Progress’ and a copy of the Indiana statutes. This was before some
of our later legislatures had made their records or his education
might have been marred instead of made.
“When he was elected President,” Mr. Griffiths continued, “he was
a plodding country lawyer whose library consisted of twenty-two
volumes. Through his public addresses he blazed his way to the
Presidency. He believed the position of a stump speaker to be one of
sacred trust. He had none of the platform graces. His figure was
ungainly; his voice was rasping. He always made the most careful
preparation and gave his best thought to the smallest audiences. He
had a marvelous gift of expression and he knew more about the Bible
than Webster. He was not learned in the law and he despised the
legal routine. In a lawsuit he always dealt in the unexpected, which
greatly discomfited the opposing lawyer. He liked stories, but he
always told them to illustrate a point. He was a deeply religious
man.”
A FAMOUS CHICAGO LAWYER’S VIEWS.
“Into the story of the republic from 1861 to 1865 the patriot does
well to enter, there to find for instruction and example the manliest
of Americans, the highest type of Americanism, the central figure of
the century, Abraham Lincoln. The fierce partisanship which assailed
him during his short period of leadership became silent at his death,
and each succeeding year but serves to exalt his work and character.
“The judgment of time has already shown to be colossal him who
was called common—the honor that we offer to his memory is only
the spontaneous tribute of contemporary history—our enthusiasm is
but the sum of the world’s calmest thinking. For years in all lands
gifted speech has proclaimed his deeds and the pens of poets have
sketched his life. Thus does he receive his tribute from the people.
“In his mentality Lincoln shone in justice, common sense,
consistency, persistence, and knowledge of men. In his words he
was candid and frank, but accurate and concise, speaking strong
Anglo-Saxon unadorned—powerful in its simplicity. In his sentiments
he was kind, patriotic, and brave. No leader ever combined more
completely the graces of gentleness with rugged determination. In
his morals truth was his star, honesty the vital essence of his life.
“In his religion he was faithful as a saint. Providence was his stay
and he walked with God. As President his life and deeds were a
constant sermon. Love of men and faith in God were the
fundamental elements of his character. Poverty had schooled him to
pity and taught him the equality of all mankind.”—Luther Laflin Mills.
LINCOLN WAS PLAIN BUT GREAT.
Lincoln’s forefathers were independent owners of the land they
trod on, barons, not serfs. You will say, perhaps, that Lincoln had
little education. We are apt to say that of our great men. Lincoln
knew how to speak, read and write. What more do we teach our
boys to-day? He knew the Bible, which cannot be said of everybody
in Boston. He read Burns, and this with the Bible gave him his
inspiration and sentiment. Æsop and “Pilgrim’s Progress” taught him
aptness and pregnant illustration.
The incidents of his life were few but notable. He was a resident
of three states before he was 21, and made a river trip to New
Orleans, longer than Thomas Jefferson had taken at his age. At New
Orleans he saw for the first time the auction and whipping of slaves,
which made so deep an impression on him that it may be said to be
the birth of his anti-slavery sentiment.
The choice of Mr. Lincoln for President was not a strained one. He
was the logical selection. Lincoln’s qualities, that sympathy with the
common people, that homely sincerity, have given him a place in the
people’s hearts a little closer, a little dearer, than is held by any other
public man. He had faults, but they were small compared with his
virtues. He had not Washington’s grandeur, the mental alertness of
Hamilton, or the intellectual force of Webster. His greatness was
made up of natural qualities, as of a hillside towering o’er a plain,
yet a part of it. Lincoln was surpassed in certain qualities by others of
our historically great men, but there are none, we feel sure, who
would have filled the place he filled as well as he.—Secretary of the
Navy Long.
LINCOLN’S SPECIFIC LIFE WORK.
One often thinks of his life as cut off, but no great man since
Cæsar has seen his life work ended as did Lincoln. Napoleon died
upon a desert rock, but not until Austerlitz and Wagram had become
memories, and the dust of the empire even as all dust. Cromwell
knew that England had not at heart materially altered. Washington
did not know that he had created one of the great, perhaps the
greatest, empires to be known to man. But Lincoln had a specific
task to do—to save his country and to make it free—and on that
fateful 14th of April he knew that he had accomplished both things.
There are those who would say that chance put this man where
he was to do this work. To the thoughtful mind it was not chance,
however, but design, and that the design of which all greatness is a
part. War is indeed the crucible of the nations. It is the student of a
century hence who shall properly place the civil war in American
history. But, whatever that place be, there can be no doubt of the
position in it of the war President. Like William the Silent, his
domination of all about him was a matter not of personal desire, but
of absolute and constant growth. There are few more interesting
characters in history than Lincoln. There is none who in quite the
same manner fits himself so absolutely into his circumstances. It is
the highest form of genius that so produces as to make production
seem effortless, and it is perhaps the greatest of all tributes to
Lincoln that what he did seems sometimes only what the average
man would have done in his place.
THE PROPOSED PURCHASE OF THE SLAVES.
The discussion on the question of whether or not Abraham Lincoln
suggested at the conference with the southern commissioners at the
so-called Fortress Monroe meeting, that he was prepared to pay
$400,000,000 for the slaves in the Southern States provided peace
with union could be obtained, is hardly likely to lead to any definite
conclusion, for the reason that the few who should have known
definitely about it are distinctly divided in their opinions. We are
inclined to believe that, if the proposition was made, Mr. Lincoln,
notwithstanding the immense influence that he then possessed,
would have found it exceedingly difficult to convince Congress and a
majority of the people of the North of the wisdom of the suggestion.
As a business proposition, entirely apart from sentiment, it might
have been, even at that late day, a wise plan to adopt. But the war
had then been going on for years, and the hard feelings engendered
would apparently have made the scheme a less tenable one then
than at an earlier day. It will, we imagine, appear to future historians
that, in spite of the example which had been set by England in the
West Indies, those representing both the North and the South
showed themselves, just prior to the war, wanting in the true
elements of statesmanship in not realizing that it was better to
peaceably adjust their differences than have recourse to physical
force. It is now well understood, and might have been well
understood at the time, that the main issue was the slave issue, and
that, once it was out of the way, all other sources of division were
insignificant. We could have well afforded to vote, if need be, several
thousands of millions of dollars to purchase the freedom of the
slaves if by that means the civil war with all of its wastes and
sufferings could have been avoided; and if not now, a generation or
two hence, we feel convinced that the people, both of the North and
the South, will be of the opinion that such an outcome of the
contention would have been possible if we had had on both sides of
the quarrel, statesmen of the caliber of Washington, Jefferson,
Franklin, John Quincy Adams and other eminent Americans who
have made their mark in our national history.