Google Cloud Platform for Data Science
A Crash Course on Big Data, Machine Learning, and Data Analytics Services

Dr. Shitalkumar R. Sukhdeve
Sandika S. Sukhdeve
Google Cloud Platform for Data Science: A Crash Course on Big Data,
Machine Learning, and Data Analytics Services
Dr. Shitalkumar R. Sukhdeve, Gondia, India
Sandika S. Sukhdeve, Gondia, Maharashtra, India

ISBN-13 (pbk): 978-1-4842-9687-5
ISBN-13 (electronic): 978-1-4842-9688-2
https://doi.org/10.1007/978-1-4842-9688-2

Copyright © 2023 by Shitalkumar R. Sukhdeve and Sandika S. Sukhdeve


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or
part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way,
and transmission or information storage and retrieval, electronic adaptation, computer software,
or by similar or dissimilar methodology now known or hereafter developed.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark
symbol with every occurrence of a trademarked name, logo, or image we use the names, logos,
and images only in an editorial fashion and to the benefit of the trademark owner, with no
intention of infringement of the trademark.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if
they are not identified as such, is not to be taken as an expression of opinion as to whether or not
they are subject to proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal
responsibility for any errors or omissions that may be made. The publisher makes no warranty,
express or implied, with respect to the material contained herein.
Managing Director, Apress Media LLC: Welmoed Spahr
Acquisitions Editor: Susan McDermott
Development Editor: Laura Berendson
Coordinating Editor: Jessica Vakili
Cover designed by eStudioCalamar
Cover image from www.freepik.com
Distributed to the book trade worldwide by Apress Media, LLC, 1 New York Plaza, New York, NY
10004, U.S.A. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com,
or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member
(owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance
Inc is a Delaware corporation.
For information on translations, please e-mail booktranslations@springernature.com; for
reprint, paperback, or audio rights, please e-mail bookpermissions@springernature.com.
Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook
versions and licenses are also available for most titles. For more information, reference our Print
and eBook Bulk Sales web page at http://www.apress.com/bulk-sales.
Any source code or other supplementary material referenced by the author in this book is
available to readers on GitHub (https://github.com/Apress). For more detailed information,
please visit https://www.apress.com/gp/services/source-code.
Paper in this product is recyclable
To our parents, family, and friends.
Table of Contents
About the Authors
About the Technical Reviewer
Acknowledgments
Preface
Introduction

Chapter 1: Introduction to GCP
  Overview of GCP and Its Data Science Services
    Setting Up a GCP Account and Project
  Summary

Chapter 2: Google Colaboratory
  Features of Colab
  Creating and Running Jupyter Notebooks on Colaboratory
    Hands-On Example
    Importing Libraries
  Working with Data
  Visualize Data
  Running Machine Learning Models on Colaboratory
    Deploying the Model on Production
  Accessing GCP Services and Data from Colaboratory
  Summary

Chapter 3: Big Data and Machine Learning
  BigQuery
  Running SQL Queries on BigQuery Data
  BigQuery ML
  Google Cloud AI Platform and Its Capabilities
  Using Vertex AI for Training and Deploying Machine Learning Models
    Train a Model Using Vertex AI and the Python SDK
  Introduction to Google Cloud Dataproc and Its Use Cases for Big Data Processing
    How to Create and Update a Dataproc Cluster by Using the Google Cloud Console
  TensorFlow
  Summary

Chapter 4: Data Visualization and Business Intelligence
  Looker Studio and Its Features
    Creating and Sharing Data Visualizations and Reports with Looker Studio
    BigQuery and Looker
    Building a Dashboard
  Data Visualization on Colab
  Summary

Chapter 5: Data Processing and Transformation
  Introduction to Google Cloud Dataflow and Its Use Cases for Batch and Stream Data Processing
    Running Data Processing Pipelines on Cloud Dataflow
  Introduction to Google Cloud Dataprep and Its Use Cases for Data Preparation
  Summary

Chapter 6: Data Analytics and Storage
  Introduction to Google Cloud Storage and Its Use Cases for Data Storage
    Key Features
    Storage Options
    Storage Locations
    Creating a Data Lake for Analytics with Google Cloud Storage
  Introduction to Google Cloud SQL and Its Use Cases for Relational Databases
    Create a MySQL Instance by Using Cloud SQL
    Connect to Your MySQL Instance
    Create a Database and Upload Data in SQL
  Introduction to Google Cloud Pub/Sub and Its Use Cases for Real-Time Data Streaming
  Setting Up and Consuming Data Streams with Cloud Pub/Sub
  Summary

Chapter 7: Advanced Topics
  Securing and Managing GCP Resources with IAM
    Using the Resource Manager API, Grant and Remove IAM Roles
  Using Google Cloud Source Repositories for Version Control
  Dataplex
  Cloud Data Fusion
    Enable or Disable Cloud Data Fusion
    Create a Data Pipeline
  Summary

Bibliography
Index

About the Authors
Dr. Shitalkumar R. Sukhdeve is an
experienced senior data scientist with a strong
track record of developing and deploying
transformative data science and machine
learning solutions to solve complex business
problems in the telecom industry. He has
notable achievements in developing a machine
learning–driven customer churn prediction
and root cause exploration solution, a
customer credit scoring system, and a product
recommendation engine.
Shitalkumar is skilled in enterprise data science and research
ecosystem development, dedicated to optimizing key business indicators
and adding revenue streams for companies. He is pursuing a doctorate
in business administration from SSBM, Switzerland, and an MTech in
computer science and engineering from VNIT Nagpur.
Shitalkumar has authored a book titled Step Up for Leadership
in Enterprise Data Science and Artificial Intelligence with Big Data:
Illustrations with R and Python and co-authored a book titled Web
Application Development with R Using Shiny, Third Edition. He is a
speaker at various technology and business events such as World AI Show
Jakarta 2021, 2022, and 2023, NXT CX Jakarta 2022, Global Cloud-Native
and Open Source Summit 2022, Cyber Security Summit 2022, and ASEAN
Conversational Automation Webinar. You can find him on LinkedIn at
www.linkedin.com/in/shitalkumars/.

Sandika S. Sukhdeve is an expert in data
visualization and Google-certified project
management. She previously served as
Assistant Professor in the Mechanical
Engineering Department and has authored
Amazon bestseller titles across diverse markets
such as the United States, Germany, Canada,
and more. She has a background in human
resources and a wealth of experience in
branding.
As Assistant Professor, Sandika successfully guided more than 2,000
students, delivered 1,000+ lectures, and mentored numerous projects
(including Computational Fluid Dynamics). She excels in managing both
people and multiple projects, ensuring timely completion. Her areas of
specialization encompass thermodynamics, applied thermodynamics,
industrial engineering, product design and development, theory of
machines, numerical methods and optimization, and fluid mechanics.
She holds a master's degree in technology (with a specialization in heat
power), and she possesses exceptional skills in visualizing, analyzing,
and constructing classification and prediction models using R and
MATLAB. You can find her on LinkedIn at www.linkedin.com/in/sandika-awale/.

About the Technical Reviewer
Sachin G. Narkhede is a highly skilled data
scientist and software engineer with over
12 years of experience in Python and R
programming for data analytics and machine
learning. He has a strong background in
building machine learning models using scikit-
learn, Pandas, Seaborn, and NLTK, as well as
developing question-answering machines and
chatbots using Python and IBM Watson.
Sachin's expertise also extends to data visualization using Microsoft
BI and the data analytics tool RapidMiner. With a master's degree in
information technology, he has a proven track record of delivering
successful projects, including transaction monitoring, trade-based money
laundering detection, and chatbot development for banking solutions. He
has worked on GCP (Google Cloud Platform).
Sachin's passion for research is evident in his published papers on
brain tumor detection using symmetry and mathematical analysis. His
dedication to learning is demonstrated through various certifications and
workshop participation. Sachin's combination of technical prowess and
innovative thinking makes him a valuable asset in the field of data science.

Acknowledgments
We extend our sincerest thanks to all those who have supported us
throughout the writing process of this book. Your encouragement,
guidance, and unwavering belief in our abilities have contributed to
bringing this project to fruition.
Above all, we express our deepest gratitude to our parents, whose
unconditional love, unwavering support, and sacrifices have allowed us
to pursue our passions. Your unwavering belief in us has been the driving
force behind our motivation.
We are grateful to our family for their understanding and patience
during our countless hours researching, writing, and editing this book.
Your love and encouragement have served as a constant source of
inspiration.
A special thank you goes to our friends for their words of
encouragement, motivation, and continuous support throughout this
journey. Your belief in our abilities and willingness to lend an ear during
moments of doubt have been invaluable.
We would also like to express our appreciation to our mentors
and colleagues who generously shared their knowledge and expertise,
providing valuable insights and feedback that have enriched the content of
this book.
Lastly, we want to express our deepest gratitude to the readers of this
book. Your interest and engagement in the subject matter make all our
efforts worthwhile. We sincerely hope this book proves to be a valuable
resource for your journey in understanding and harnessing the power of
technology.

Once again, thank you for your unwavering support, love, and
encouragement. This book would not have been possible without each
and every one of you.
Sincerely,
Shitalkumar and Sandika

Preface
The business landscape is being transformed by the integration of data science
and machine learning, and cloud computing platforms have become
indispensable for handling and examining vast datasets. Google Cloud
Platform (GCP) stands out as a top-tier cloud computing platform, offering
extensive services for data science and machine learning.
This book is a comprehensive guide to learning GCP for data science,
using only the free-tier services offered by the platform. Regardless of
your professional background as a data analyst, data scientist, software
engineer, or student, this book offers a comprehensive and progressive
approach to mastering GCP's data science services. It presents a step-by-
step guide covering everything from fundamental concepts to advanced
topics, enabling you to gain expertise in utilizing GCP for data science.
The book begins with an introduction to GCP and its data science
services, including BigQuery, Cloud AI Platform, Cloud Dataflow, Cloud
Storage, and more. You will learn how to set up a GCP account and
project and use Google Colaboratory to create and run Jupyter notebooks,
including machine learning models.
The book then covers big data and machine learning, including
BigQuery ML, Google Cloud AI Platform, and TensorFlow. Within this
learning journey, you will acquire the skills to leverage Vertex AI for
training and deploying machine learning models and harness the power of
Google Cloud Dataproc for the efficient processing of large-scale datasets.
The book then delves into data visualization and business intelligence,
encompassing Looker Studio and Google Colaboratory. You will gain
proficiency in generating and distributing data visualizations and reports
using Looker Studio and acquire the knowledge to construct interactive
dashboards.

The book then covers data processing and transformation, including
Google Cloud Dataflow and Google Cloud Dataprep. You will learn how
to run data processing pipelines on Cloud Dataflow and how to use Cloud
Dataprep for data preparation.
The book also covers data analytics and storage, including Google
Cloud Storage, Google Cloud SQL, and Google Cloud Pub/Sub. You will
learn how to use Cloud Pub/Sub for real-time data streaming and how to
set up and consume data streams.
Finally, the book covers advanced topics, including securing and
managing GCP resources with Identity and Access Management (IAM),
using Google Cloud Source Repositories for version control, Dataplex, and
Cloud Data Fusion.
Overall, this book provides a comprehensive guide to learning GCP
for data science, using only the free-tier services offered by the platform.
It covers the basics of the platform and advanced topics for individuals
interested in taking their skills to the next level.

Introduction
Welcome to Google Cloud Platform for Data Science: A Crash Course on
Big Data, Machine Learning, and Data Analytics Services. In this book, we
embark on an exciting journey into the world of Google Cloud Platform
(GCP) for data science. GCP is a cutting-edge cloud computing platform
that has revolutionized how we handle and analyze data, making it an
indispensable tool for businesses seeking to unlock valuable insights and
drive innovation in the modern digital landscape.
As a widely recognized leader in cloud computing, GCP offers a
comprehensive suite of services specifically tailored for data science
and machine learning tasks. This book provides a progressive and
comprehensive approach to mastering GCP's data science services,
utilizing only the free-tier services offered by the platform. Whether you’re
a seasoned data analyst, a budding data scientist, a software engineer, or
a student, this book equips you with the skills and knowledge needed to
leverage GCP for data science purposes.
Chapter 1: “Introduction to GCP”
This chapter explores the transformative shift that data science and
machine learning have brought about in the business landscape. We highlight
cloud computing platforms' crucial role in handling and analyzing vast
datasets. We then introduce GCP as a leading cloud computing platform
renowned for its comprehensive suite of services designed specifically for
data science and machine learning tasks.

Chapter 2: “Google Colaboratory”
Google Colaboratory, or Colab, is a robust cloud-based platform for
data science. In this chapter, we delve into the features and capabilities of
Colab. You will learn how to create and run Jupyter notebooks, including
machine learning models, leveraging Colab's seamless integration with
GCP services. We also discuss the benefits of using Colab for collaborative
data analysis and experimentation.
Chapter 3: “Big Data and Machine Learning”
This chapter explores the world of big data and machine learning
on GCP. We delve into BigQuery, a scalable data warehouse, and its
practical use cases. Next, we focus on BigQuery ML, which enables you to
build machine learning models directly within BigQuery. We then turn
to Google Cloud AI Platform, where you will learn to train and deploy
machine learning models. Additionally, we introduce TensorFlow, a
popular framework for deep learning on GCP. Lastly, we explore Google
Cloud Dataproc, which facilitates the efficient processing of large-scale
datasets.
Chapter 4: “Data Visualization and Business Intelligence”
Effective data visualization and business intelligence are crucial for
communicating insights and driving informed decisions. In this chapter,
we dive into Looker Studio, a powerful tool for creating and sharing data
visualizations and reports. You will learn how to construct interactive
dashboards that captivate your audience and facilitate data-driven
decision-making. We also explore data visualization within Colab,
enabling you to create compelling visual representations of your data.
Chapter 5: “Data Processing and Transformation”
This chapter emphasizes the importance of data processing and
transformation in the data science workflow. We introduce Google Cloud
Dataflow, a batch and stream data processing service. You will learn to
design and execute data processing pipelines using Cloud Dataflow.
Additionally, we cover Google Cloud Dataprep, a tool for data preparation
and cleansing, ensuring the quality and integrity of your data.

Chapter 6: “Data Analytics and Storage”
Data analytics and storage play a critical role in data science. This
chapter delves into Google Cloud Storage and its use cases for storing and
managing data. Next, we explore Google Cloud SQL, a relational database
service that enables efficient querying and analysis of structured data.
Finally, we introduce Google Cloud Pub/Sub, a real-time data streaming
service, and guide you through setting up and consuming data streams.
Chapter 7: “Advanced Topics”
In the final chapter, we tackle advanced topics to elevate your expertise
in GCP for data science. We explore securing and managing GCP resources
with Identity and Access Management (IAM), ensuring the confidentiality
and integrity of your data. We also introduce Google Cloud Source
Repositories, a version control system for your code. Lastly, we touch upon
innovative technologies like Dataplex and Cloud Data Fusion, expanding
your data science horizons.
By the end of this book, you will have gained comprehensive
knowledge and practical skills in utilizing GCP for data science. You will
be equipped with the tools and techniques to extract valuable insights
from vast datasets, train and deploy machine learning models, and create
impactful data visualizations and reports.
Moreover, this newfound expertise in GCP for data science will open
up numerous opportunities for career advancement. Whether you're a
data analyst, data scientist, software engineer, or student, the skills you
acquire through this book will position you as a valuable asset in today's
data-driven world. You can leverage GCP's robust services to develop
end-to-end data solutions, seamlessly integrating them into the tech ecosystem
of any organization.
So get ready to embark on this transformative journey and unlock
the full potential of Google Cloud Platform for data science. Let's dive in
together and shape the future of data-driven innovation!

CHAPTER 1

Introduction to GCP
Over the past few years, the way organizations manage data has undergone
a remarkable transformation. The rise of big data and machine learning has
required businesses to store, process, and analyze vast quantities of data.
As a result, there has been a surge in the demand for cloud-based data
science platforms like Google Cloud Platform (GCP).
According to a report by IDC, the worldwide public cloud services
market was expected to grow by 18.3% in 2021, reaching $304.9 billion.
GCP has gained significant traction in this market, becoming the third-
largest cloud service provider with a market share of 9.5% (IDC, 2021). This
growth can be attributed to GCP’s ability to provide robust infrastructure,
data analytics, and machine learning services.
GCP offers various data science services, including data storage,
processing, analytics, and machine learning. It also provides tools for building
and deploying applications, managing databases, and securing resources.
Let’s look at a few business cases that shifted to GCP and achieved
remarkable results:

1. The Home Depot: The Home Depot, a leading home
improvement retailer, wanted to improve their online
search experience for customers. They shifted their
search engine to GCP and saw a 50% improvement
in the speed of their search results. This led to a 10%
increase in customer satisfaction and a 20% increase in
online sales (Google Cloud, The Home Depot, n.d.).
© Shitalkumar R. Sukhdeve and Sandika S. Sukhdeve 2023
S. R. Sukhdeve and S. S. Sukhdeve, Google Cloud Platform for Data Science,
https://doi.org/10.1007/978-1-4842-9688-2_1

2. Spotify: Spotify, a popular music streaming service,
migrated its data infrastructure to GCP to handle its
growing user base. By doing so, Spotify reduced its
infrastructure costs by 75% and handled more user
requests per second (Google Cloud, Spotify, n.d.).

3. Nielsen: Nielsen, a leading market research firm,
wanted to improve their data processing speed and
accuracy. They shifted their data processing to GCP
and achieved a 60% reduction in processing time,
resulting in faster insights and better decision-making
(Google Workspace, n.d.).

Apart from the preceding examples, other organizations have
benefited from GCP’s data science services, including Coca-Cola, PayPal,
HSBC, and Verizon.
With GCP, data scientists can focus on solving complex business
problems instead of worrying about managing infrastructure. For instance,
let’s consider the example of a retail company that wants to analyze
customer behavior to improve its marketing strategy. The company has
millions of customers and collects vast data from various sources such
as social media, online purchases, and surveys. To analyze this data,
the company needs a powerful cloud-based platform that can store
and process the data efficiently. GCP can help the company achieve
this by providing scalable data storage with Google Cloud Storage, data
processing with Google Cloud Dataflow, and data analytics with Google
BigQuery. The company can also leverage machine learning with Google
Cloud AI Platform to build models that can predict customer behavior and
Google Cloud AutoML to automate the machine learning process.


GCP’s data science services offer valuable support to individuals
pursuing careers in the field of data science. As the demand for data
scientists continues to increase, possessing expertise in cloud-based
platforms like GCP has become crucial. According to Glassdoor, data
scientists in the United States earn an average salary of $113,309 per year.
Additionally, a recent report by Burning Glass Technologies highlights
a notable 67% increase in job postings that specifically require GCP
skills within the past year. Consequently, acquiring knowledge in GCP
can provide significant advantages for those seeking employment
opportunities in the data science field.
This chapter provides an introduction to GCP along with its data
science offerings, and it will walk you through the process of creating your
GCP account and project.

Overview of GCP and Its Data Science Services
The architecture of Google Cloud Platform (GCP) is designed to provide
users with a highly available, scalable, and reliable cloud computing
environment. It is based on a distributed infrastructure that spans multiple
geographic regions and zones, allowing for redundancy and failover in
case of system failures or disasters (Sukhdeve, 2020).
At a high level, the architecture of GCP consists of the following layers:

1. Infrastructure layer: This layer includes the
physical hardware and network infrastructure
comprising the GCP data centers. It has servers,
storage devices, networking equipment, and other
components required to run and manage the cloud
services.

2. Compute layer: This layer includes the virtual
machines (VMs) and container instances that
run user applications and services. It has services
like Compute Engine, Kubernetes Engine, and
App Engine.

3. Storage layer: This includes the various storage
services GCP provides, such as Cloud Storage, Cloud
SQL, and Cloud Bigtable. It offers scalable and
durable storage options for multiple types of data.

4. Networking layer: This layer includes the various
networking services provided by GCP, such as
Virtual Private Cloud (VPC), Load Balancing, and
Cloud CDN. It offers secure and reliable networking
options for user applications and services.

5. Management and security layer: This includes the
various management and security services GCP
provides, such as Identity and Access Management
(IAM), Security Command Center, and Stackdriver.
It provides tools for managing and securing user
applications and services running on GCP.
In this way, the architecture of GCP is designed to provide a flexible
and customizable cloud computing environment that can meet the needs
of a wide range of users, from small startups to large enterprises.
GCP is a suite of cloud computing services that runs on the same
infrastructure that Google uses for its end user products, such as Google
Search and YouTube (Google Cloud, 2023).


For data science, it offers several services, including

• BigQuery: A fully managed, serverless data
warehousing solution

• Cloud AI Platform: A suite of machine learning tools,
including TensorFlow, scikit-learn, and XGBoost

• Cloud Dataflow: A fully managed service for
transforming and analyzing data

• Cloud DataLab: An interactive development
environment for machine learning and data science

• Cloud Dataproc: A managed Apache Hadoop and
Apache Spark service

• Cloud Storage: A scalable, fully managed object
storage service

• Cloud Vision API: A pre-trained image analysis API

These services can be used together to build complete data science
solutions, from data ingestion to model training to deployment. Many of
the GCP data science services mentioned in the syllabus have a free tier,
which provides limited access to the services for free. The following is a list
of the services and their respective free-tier offerings:

• Google Colaboratory: Completely free.

• BigQuery: 1 TB of data processed per month for free.

• Cloud AI Platform: Free access to AI Platform
Notebooks, which includes Colaboratory.

• Looker Studio: Completely free.

• Cloud Dataflow: Two free job hours per day.

• Cloud Dataprep: Free trial of two million records
per month.

• Cloud Storage: 5 GB of standard storage per month
for free.

• Cloud SQL: Free trial of second-generation instances,
up to 125 MB of storage.

• Cloud Pub/Sub: The free tier includes one million
operations per month.

Note The free-tier offerings may be subject to change, and usage
beyond the free tier will incur charges. It is recommended to check
the GCP pricing page for the latest information on the free-tier
offerings.
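To make the note concrete, here is a minimal sketch that compares one month of usage against some of the free-tier limits quoted in the list above. The limits are copied from this chapter and may no longer be current. The dictionary keys and the over_free_tier helper are names made up for this illustration; they are not part of any GCP API.

```python
# Free-tier limits as quoted in this chapter (they may have changed since).
FREE_TIER_LIMITS = {
    "bigquery_tb_processed": 1.0,    # 1 TB of data processed per month
    "cloud_storage_gb": 5.0,         # 5 GB of standard storage per month
    "pubsub_operations": 1_000_000,  # one million operations per month
}


def over_free_tier(usage):
    """Return {service: amount above the free limit} for exceeded services.

    `usage` maps the hypothetical keys above to one month's usage figures.
    """
    excess = {}
    for service, used in usage.items():
        limit = FREE_TIER_LIMITS.get(service)
        if limit is not None and used > limit:
            excess[service] = used - limit
    return excess


monthly_usage = {
    "bigquery_tb_processed": 1.5,
    "cloud_storage_gb": 2.0,
    "pubsub_operations": 250_000,
}
print(over_free_tier(monthly_usage))
```

Only the services that exceed their limit appear in the result, so an empty dictionary means the month stayed within the quoted free tier.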

The various services of GCP can be categorized as follows:


1. Compute: This category includes services for
running virtual machines (VMs) and containers,
such as Compute Engine, Kubernetes Engine, and
App Engine.
2. Storage: This category includes services for storing
and managing different types of data, such as Cloud
Storage, Cloud SQL, and Cloud Datastore.
3. Networking: This category includes services for
managing networking resources, such as Virtual
Private Cloud (VPC), Cloud Load Balancing, and
Cloud DNS.
4. Big data: This category includes services for
processing and analyzing large datasets, such as
BigQuery, Cloud Dataflow, and Cloud Dataproc.


5. Machine learning: This category includes services
for building and deploying machine learning
models, such as Cloud AI Platform, AutoML, and AI
Building Blocks.
6. Security: This category includes services for
managing and securing GCP resources, such as
Identity and Access Management (IAM), Cloud Key
Management Service (KMS), and Cloud Security
Command Center.
7. Management tools: This category includes services
for managing and monitoring GCP resources,
such as Stackdriver, Cloud Logging, and Cloud
Monitoring.
8. Developer tools: This category includes services for
building and deploying applications on GCP, such as
Cloud Build, Cloud Source Repositories, and Firebase.
9. Internet of Things (IoT): This category includes
services for managing and analyzing IoT data, such
as Cloud IoT Core and Cloud Pub/Sub.

Setting Up a GCP Account and Project


Here are the steps to set up a Google Cloud Platform (GCP) account and
project:

• Go to the GCP website: https://cloud.google.com/.

• Click the Try it free button.

• Log in using your existing Google account, or create a
new account if you do not have one.

• Fill out the required information for your GCP account.

• Once your account is set up, click the Console button to
access the GCP Console.

• In the GCP Console, click the Projects drop-down
menu in the top navigation bar.

• Click the New Project button.

• Enter a name and ID for your project, and select a
billing account if you have multiple accounts.

• Click the Create button.

Your project is now set up, and you can start using GCP services.

Note If you are using the free tier, make sure to monitor your usage
to avoid charges, as some services have limitations. Also, you may
need to enable specific services for your project to use them.
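When choosing a project ID in the steps above, it can help to check the ID before typing it into the console. The sketch below assumes the commonly documented format rules for GCP project IDs (6 to 30 characters; lowercase letters, digits, and hyphens; starting with a letter; not ending with a hyphen). Check the current GCP documentation before relying on it. The helper name looks_like_valid_project_id is made up for this example.

```python
import re

# Assumed rule: 6-30 characters, starts with a lowercase letter, continues
# with lowercase letters, digits, or hyphens, and does not end with a hyphen.
_PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def looks_like_valid_project_id(project_id):
    """Return True if `project_id` matches the assumed format rule."""
    return _PROJECT_ID_RE.fullmatch(project_id) is not None


for candidate in ["my-ds-project-01", "Short", "ends-with-hyphen-"]:
    print(candidate, looks_like_valid_project_id(candidate))
```

A format check like this cannot tell whether an ID is already taken, since project IDs must also be unique; the console reports that when you click Create.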

Summary
Google Cloud Platform (GCP) offers a comprehensive suite of cloud
computing services that leverage the same robust infrastructure used by
Google’s products. This chapter introduced GCP, highlighting its essential
services and their relevance to data science.
We explored several essential GCP services for data science, including
BigQuery, Cloud AI Platform, Cloud Dataflow, Cloud DataLab, Cloud
Dataproc, Cloud Storage, and Cloud Vision API. Each of these services
serves a specific purpose in the data science workflow, ranging from data
storage and processing to machine learning model development and
deployment.


Furthermore, we discussed the availability of free-tier offerings for
various GCP data science services, allowing users to get started and
explore these capabilities at no cost. It is important to stay up to date
on free-tier offerings and pricing by referring to the GCP pricing page.
Following the outlined steps, users can set up their GCP account and
create projects, providing access to a powerful cloud computing platform
for their data science initiatives.
GCP’s robust infrastructure, extensive range of services, and
integration with popular data science tools make it a compelling choice
for organizations and individuals looking to leverage the cloud for their
data-driven projects. With GCP, users can harness the scalability,
reliability, and performance of Google’s infrastructure to tackle complex
data challenges and unlock valuable insights.
As we progress through this guide, we will delve deeper into specific
GCP services and explore how they can be effectively utilized for various
data science tasks.

CHAPTER 2

Google Colaboratory
Google Colaboratory is a free, cloud-based Jupyter Notebook environment
provided by Google. It allows individuals to write and run code in Python
and other programming languages and perform data analysis, data
visualization, and machine learning tasks. The platform is designed to be
accessible, easy to use, and collaboration-friendly, making it a popular tool
for data scientists, software engineers, and students.
This chapter will guide you through the process of getting started
with Colab, from accessing the platform to understanding its features and
leveraging its capabilities effectively. We will cover how to create and run
Jupyter notebooks, run machine learning models, and access GCP services
and data from Colab.

Features of Colab
Cloud-based environment: Colaboratory runs on Google’s servers,
eliminating the need for users to install software on their devices.
Easy to use: Colaboratory provides a user-friendly interface for
working with Jupyter notebooks, making it accessible for individuals with
limited programming experience.
Access to GCP services: Colaboratory integrates with Google
Cloud Platform (GCP) services, allowing users to access and use GCP
resources, such as BigQuery and Cloud Storage, from within the notebook
environment.

© Shitalkumar R. Sukhdeve and Sandika S. Sukhdeve 2023 11


S. R. Sukhdeve and S. S. Sukhdeve, Google Cloud Platform for Data Science,
https://doi.org/10.1007/978-1-4842-9688-2_2
Chapter 2 Google Colaboratory

Sharing and collaboration: Colaboratory allows users to share
notebooks and collaborate on projects with others, making it an excellent
tool for team-based work.
GPU and TPU support: Colaboratory provides access to GPUs and
TPUs for running computationally intensive tasks, such as deep learning
and high-performance computing.
Google Colaboratory provides a powerful and flexible platform for data
science, machine learning, and other technical tasks, making it a popular
choice for individuals and teams looking to work with cloud-based tools.

Creating and Running Jupyter Notebooks on Colaboratory
To begin your journey with Colab, follow these steps to access the
platform:

i. Open a web browser: Launch your preferred web
browser on your computer or mobile device.

ii. Sign up for a Google account: Before using
Colaboratory, you must have a Google account.
If you don’t have one, sign up for a free Google
account (Figure 2-1).


Figure 2-1. Sign up for a Google account

iii. Navigate to Colaboratory: Visit the Google
Colaboratory website (https://colab.research.google.com/)
and log in with your Google account.
Upon logging in, you should see a dialog like the one
shown in Figure 2-2.


Figure 2-2. Colaboratory after login

iv. Colab interface overview:

Menu bar: The top section of the Colab interface
houses the menu bar. It contains various options for file
operations, runtime configuration, and more. Take a
moment to explore the available menu options.

Toolbar: The toolbar is located directly below the menu
bar. It provides quick access to common actions such as
running cells, inserting new cells, and changing cell types.

Code cells: The main area of the Colab interface
is composed of code cells. These cells allow you to
write and execute Python code. By default, a new
notebook starts with a single code cell. You can add
additional cells as needed.

Text cells (Markdown cells): Besides code cells, Colab
supports text cells, also known as Markdown cells. Text
cells allow you to add explanatory text, headings, bullet
points, and other formatted content to your notebook.
They help provide documentation, explanations, and
context to your code.

v. Create a new notebook: To create a new
Colaboratory notebook, click the New notebook
button in the top-right or bottom-right corner as
shown in Figure 2-2.

vi. Select the runtime type: When creating a new
notebook, you’ll need to specify the runtime
type. Select Python 3 to work with Python in
Colaboratory.

vii. Start coding: You’re now ready to start coding in
Python. You can change the notebook’s name by
clicking the top-left corner where “Untitled1.ipynb”
is written (Figure 2-3). Click the first code cell in
your notebook to select it. Type a simple Python
statement, such as print("Hello, Colab!"), into the
code cell. Press Shift+Enter to execute the code. The
output of the code will appear below the cell.

Figure 2-3. New notebook


Hands-On Example
Insert text in the notebook to describe the code by clicking the +
Text button.

Figure 2-4. Insert text in the notebook

Perform a sum of two numbers.

Figure 2-5. Arithmetic example

As can be seen from the preceding image, a = 2, b = 3, and sum = a + b,
giving an output of 5.
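Since the code appears only in the figure, here is a text version you can paste into a code cell. It is a sketch of the same arithmetic, with one deliberate change: the result is stored in a variable called total rather than sum, because sum is also the name of a built-in Python function and reusing it would hide that function for the rest of the notebook.

```python
a = 2
b = 3
total = a + b  # `total` instead of `sum`, to avoid shadowing the built-in sum()
print(total)
```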


Here is an example in Python to generate random numbers using the
random library and visualize the data using Matplotlib.

Figure 2-6. Data science example

The following code demonstrates how to generate random numbers,
create a histogram, and visualize the data using the Python libraries
random and matplotlib.pyplot:

import random
import matplotlib.pyplot as plt

# Generate 100 random numbers between 0 and 100
random_numbers = [random.randint(0,100) for i in range(100)]

# Plot the data as a histogram
plt.hist(random_numbers, bins=10)
plt.xlabel('Number')
plt.ylabel('Frequency')
plt.title('Random Number Distribution')
plt.show()


Here’s a breakdown of the code:

import random and import matplotlib.pyplot as plt: These lines
import the necessary libraries for generating random numbers and
creating visualizations using Matplotlib.

random_numbers = [random.randint(0,100) for i in range(100)]:
This line generates a list called random_numbers containing 100
random integers between 0 and 100, with both endpoints included.
The random.randint() function from the random library is used
within a list comprehension to generate these random numbers.

plt.hist(random_numbers, bins=10): This line
creates a histogram using the hist() function from
matplotlib.pyplot. It takes the random_numbers
list as input and specifies the number of bins (10)
for the histogram. The hist() function calculates the
frequency of values within each bin and plots the
histogram.

plt.xlabel('Number'), plt.ylabel('Frequency'),
plt.title('Random Number Distribution'): These lines
set the labels for the x-axis and y-axis and the title of
the plot, respectively.

plt.show(): This line displays the plot on the screen.
The show() function from matplotlib.pyplot is called
to render the histogram plot.

By executing this code in a Colaboratory notebook or a Python
environment, you will see a histogram visualization like Figure 2-7 that
shows the distribution of the randomly generated numbers. The x-axis
represents the number range (0–100), and the y-axis represents how many
of the generated numbers fall into each bin.


Figure 2-7. Output random number distribution

This example showcases the capability of Python and libraries like
random and Matplotlib to generate and visualize data, providing a basic
understanding of how to work with random numbers and histograms.
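To see what a histogram with ten bins is counting, the following sketch does the binning by hand with plain Python and prints a text version of the chart. It is an illustration under stated assumptions, not a copy of Matplotlib's internals: the bins here are fixed to the range 0 to 100, whereas plt.hist() by default spans the smallest to the largest value in the data. The seed value 42 is an arbitrary choice that makes reruns repeatable.

```python
import random

random.seed(42)  # arbitrary fixed seed so reruns give the same numbers
random_numbers = [random.randint(0, 100) for _ in range(100)]

n_bins = 10
low, high = 0, 100               # fixed range assumed for this sketch
width = (high - low) / n_bins    # each bin covers 10 units

counts = [0] * n_bins
for value in random_numbers:
    # The last bin is closed on the right, so the value 100 lands in bin 9.
    index = min(int((value - low) / width), n_bins - 1)
    counts[index] += 1

for i, count in enumerate(counts):
    start = low + i * width
    end = low + (i + 1) * width
    print(f"{start:5.0f} to {end:3.0f}: {'#' * count}")
```

Each row of # characters corresponds to the height of one bar in the plotted histogram, and the counts always add up to the number of generated values.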

Importing Libraries
The following is an example code to import libraries into your notebook. If
the library is not already installed, use the “!pip install” command followed
by the library’s name to install:

# Import the Pandas library
import pandas as pd

# Import the NumPy library
import numpy as np

# Import the Matplotlib library
import matplotlib.pyplot as plt

# Import the Seaborn library
import seaborn as sns


The Pandas, NumPy, Matplotlib, and Seaborn libraries are imported
in this example. The import statement is used to import a library, and the
as keyword is used to give the library an alias for easy access. For example,
Pandas is given the alias pd, and NumPy is given the alias np. This allows
you to access the library’s functions and methods using the alias, which
is shorter and easier to type. You can run this code in a Colaboratory
notebook and use the imported libraries in your code.
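Before deciding whether a "!pip install" is needed, you can ask Python whether a library is importable. The sketch below uses importlib.util.find_spec from the standard library; the helper name is_installed is made up for this example, and the list of library names simply mirrors the imports above.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


for name in ["pandas", "numpy", "matplotlib", "seaborn"]:
    if is_installed(name):
        print(f"{name}: available")
    else:
        print(f"{name}: missing (try: !pip install {name})")
```

Note that the name used with pip is not always the same as the import name (scikit-learn, for instance, is imported as sklearn), so the suggested command is only a starting point.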

Working with Data


Colaboratory integrates with Google Drive and Google Cloud Storage,
allowing you to easily import and work with data from those services.
You can store or access data stored in Google Drive from Colaboratory
by using the google.colab library. For example, to mount your Google
Drive to Colaboratory, you can run the following code:

from google.colab import drive

drive.mount('/content/drive')

Once you execute the code, a prompt will appear, asking you to
sign into your Google account and grant Colaboratory the necessary
permissions to access your Google Drive.
Once you’ve authorized Colaboratory, you can access your Google
Drive data by navigating to /content/drive in the Colaboratory file explorer.
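The google.colab library exists only inside Colaboratory, so a notebook that mounts Drive fails with an import error when run elsewhere. The following sketch shows one way to keep the same notebook usable in both places. The helper name get_data_dir and the local fallback folder are assumptions made for this illustration; the Drive folder matches the path used in this chapter.

```python
import os


def get_data_dir(local_fallback="."):
    """Return a Drive folder inside Colab, or `local_fallback` elsewhere."""
    try:
        from google.colab import drive  # importable only inside Colab
    except ImportError:
        return local_fallback

    drive.mount('/content/drive')
    return '/content/drive/My Drive/Colab Notebooks'


data_dir = get_data_dir()
print("Reading and writing data under:", os.path.abspath(data_dir))
```

File paths in the rest of the notebook can then be built with os.path.join(data_dir, ...) instead of hard-coding the Drive location.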
To write data to Google Drive, you can use Python’s built-in open
function or a library’s own writer. For example, to write a Pandas
DataFrame to a CSV file in Google Drive, you can use the DataFrame’s
to_csv method:

import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df.to_csv('/content/drive/My Drive/Colab Notebooks/data.csv', index=False)


This code creates a new CSV file in the My Drive/Colab Notebooks/
folder in Google Drive with the data from the Pandas DataFrame.
Note that you can also use other libraries, such as gspread to write
data to Google Sheets or google.cloud to write data to other GCP services.
The approach you select will vary based on the desired data format and
destination for writing.
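For comparison, here is a sketch of writing the same small table with Python's built-in open function and the standard csv module, without Pandas. A temporary folder stands in for the mounted Drive folder so the example runs anywhere; inside Colab you would use a path under /content/drive instead. The file name data.csv follows the example above.

```python
import csv
import os
import tempfile

rows = [{"A": 1, "B": 4}, {"A": 2, "B": 5}, {"A": 3, "B": 6}]

# A temporary folder stands in for the Drive folder in this sketch.
target_dir = tempfile.mkdtemp()
csv_path = os.path.join(target_dir, "data.csv")

# newline="" lets the csv module control line endings itself.
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["A", "B"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to confirm what was written.
with open(csv_path, newline="") as f:
    print(f.read())
```

The to_csv method is shorter when the data is already in a DataFrame; the built-in approach is useful when Pandas is not needed for anything else.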
You can use the data in your Google Drive just as you would any
other data in Colaboratory. For example, you can read a CSV file
stored in Google Drive and load it into a Pandas DataFrame using the
following code:

import pandas as pd
df = pd.read_csv('/content/drive/My Drive/Colab Notebooks/data.csv')

Figure 2-8. Screenshot of Python on Colab

Visualize Data
To visualize data in Colaboratory, you can use libraries such as Matplotlib
or Seaborn to create plots and charts.


Create a data frame in Python using the Pandas library with the
following code:

import pandas as pd
import numpy as np

df = pd.DataFrame({'age': np.random.randint(20, 80, 100),
                   'weight': np.random.randint(50, 100, 100)})

This will create a data frame with 100 rows and 2 columns named “age”
and “weight”, populated with random integer values from 20 to 79 for
age and from 50 to 99 for weight (np.random.randint excludes the upper
bound).
Visualize the data in the Pandas data frame using the Matplotlib
library in Python. Here’s an example to plot a scatter plot of the age and
weight columns:

import matplotlib.pyplot as plt

plt.scatter(df['age'], df['weight'])
plt.xlabel('Age')
plt.ylabel('Weight')
plt.show()

The provided code uses the Matplotlib library in Python to create a
scatter plot. Let’s go through the code line by line:

import matplotlib.pyplot as plt: This line imports
the Matplotlib library, specifically the pyplot
module. The plt is an alias or shorthand that allows
us to refer to the module when calling its functions.

plt.scatter(df['age'], df['weight']): This line
creates a scatter plot using the scatter() function
from Matplotlib. It takes two arguments: df['age']
and df['weight']. Assuming that df is a Pandas
DataFrame, this code is plotting the values from the
“age” column on the x-axis and the values from the
“weight” column on the y-axis.

plt.xlabel('Age'): This line sets the label for the
x-axis of the scatter plot. In this case, the label is set
to “Age.”

plt.ylabel('Weight'): This line sets the label for the
y-axis of the scatter plot. Here, the label is set to
“Weight.”

plt.show(): This line displays the scatter plot on the
screen. In a standalone Python script, this call is
needed to open the plot window. In a notebook
environment such as Colaboratory, the plot is usually
rendered inline below the cell either way, but calling
plt.show() makes the intent explicit.

This will create a scatter plot with the age values on the x-axis and
weight values on the y-axis. You can also use other types of plots, like
histograms, line plots, bar plots, etc., to visualize the data in a Pandas
data frame.

Figure 2-9. Resultant graph age-weight distribution


Running Machine Learning Models on Colaboratory
Here is an example of building a machine learning model using Python
and the scikit-learn library. First, generate the data using the following
code if you don’t already have a dataset. The code generates random
features and a random target array and then combines them into a Pandas
DataFrame for further analysis or modeling:

import pandas as pd
import numpy as np

# Generate random features
np.random.seed(0)
n_samples = 1000
n_features = 5
X = np.random.randn(n_samples, n_features)

# Generate random target
np.random.seed(1)
y = np.random.randint(0, 2, n_samples)

# Combine the features and target into a data frame
df = pd.DataFrame(np.hstack((X, y[:, np.newaxis])),
                  columns=["feature_1", "feature_2", "feature_3",
                           "feature_4", "feature_5", "target"])

We will now examine the code step-by-step, analyzing each line:

1. import pandas as pd and import numpy as np:
These lines import the Pandas and NumPy libraries,
which are commonly used for data manipulation
and analysis in Python.

2. np.random.seed(0): This line sets the random
seed for NumPy to ensure that the random
numbers generated are reproducible. By setting a
specific seed value (in this case, 0), the same set of
random numbers will be generated each time the
code is run.

3. n_samples = 1000: This line assigns the number of
samples to be generated as 1000.

4. n_features = 5: This line assigns the number of
features to be generated as 5.

5. X = np.random.randn(n_samples, n_features):
This line generates a random array of shape
(n_samples, n_features) using the np.random.randn()
function from NumPy. Each element in the array is
drawn from a standard normal distribution (mean =
0, standard deviation = 1).

6. np.random.seed(1): This line resets the random
seed to a different value (1) before the target values
are generated. This makes the target values
reproducible on their own, regardless of how many
random numbers were drawn for the features.

7. y = np.random.randint(0, 2, n_samples): This line
generates an array of n_samples random integers
using the np.random.randint() function. The lower
bound (0) is inclusive and the upper bound (2) is
exclusive, so each target value is either 0 or 1, as
needed for binary classification.

8. df = pd.DataFrame(np.hstack((X, y[:,
np.newaxis])), columns=["feature_1", "feature_2",
"feature_3", "feature_4", "feature_5", "target"]):
This line combines the features (X) and target (y)
arrays into a Pandas DataFrame. The expression
y[:, np.newaxis] reshapes y into a single column so
that the np.hstack() function can stack it horizontally
next to X, and the resulting combined array is passed
to the pd.DataFrame() function to create a DataFrame.
Because the stacked array has a single data type, the
target values are stored as floats (0.0 and 1.0). The
columns parameter is used to assign column names
to the DataFrame, specifying the names of the
features and the target.
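The next listing reads a file called data.csv, which the code above does not write. The following sketch is one way to bridge the two steps, under the assumption that data.csv is meant to hold the data just generated. It also builds the DataFrame slightly differently from the listing above: the target is added as its own column, so it keeps its integer type instead of being converted to floats by np.hstack().

```python
import numpy as np
import pandas as pd

np.random.seed(0)
X = np.random.randn(1000, 5)
np.random.seed(1)
y = np.random.randint(0, 2, 1000)

feature_names = [f"feature_{i}" for i in range(1, 6)]
df = pd.DataFrame(X, columns=feature_names)
df["target"] = y  # added as its own column, so it keeps its integer dtype

print(df.shape)                       # (rows, columns)
print(sorted(df["target"].unique()))  # label values actually present

# File name assumed from the next listing, which calls pd.read_csv("data.csv").
df.to_csv("data.csv", index=False)
```

Passing index=False keeps the row index out of the file, so reading it back gives exactly the five feature columns and the target column.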

Now, we can build a machine learning model using the following
code. The code loads a dataset from a CSV file (here data.csv, which must
already exist; for example, the DataFrame built above can be saved with
df.to_csv("data.csv", index=False)), splits it into training and
testing sets, trains a random forest classifier, makes predictions on the test
set, and evaluates the model’s performance by calculating the accuracy:

# pd was imported earlier; repeated here so the listing runs on its own
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Load the dataset
df = pd.read_csv("data.csv")

# Split the data into features (X) and target (y)
X = df.drop("target", axis=1)
y = df["target"]

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Train a Random Forest classifier
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)

# Make predictions on the test set
y_pred = clf.predict(X_test)

# Evaluate the model's performance
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Let’s go through the code line by line:

1. from sklearn.model_selection import train_test_split:
This line imports the train_test_split
function from the model_selection module in the
scikit-learn library. This function is used to split the
dataset into training and testing subsets.

2. from sklearn.ensemble import RandomForestClassifier:
This line imports the
RandomForestClassifier class from the ensemble
module in scikit-learn. This class represents a
random forest classifier, an ensemble model based
on decision trees.

3. from sklearn.metrics import accuracy_score: This
line imports the accuracy_score function from the
metrics module in scikit-learn. This function is used
to calculate the accuracy of a classification model.

4. df = pd.read_csv("data.csv"): This line reads a CSV
file named “data.csv” using the read_csv function
from the Pandas library. The contents of the CSV file
are loaded into a Pandas DataFrame called df.

5. X = df.drop("target", axis=1): This line creates a
new DataFrame X by dropping the “target” column
from the original DataFrame df. This DataFrame
contains the features used for training the model.

6. y = df["target"]: This line creates a Series y by
selecting the “target” column from the original
DataFrame df. This Series represents the target
variable or the labels corresponding to the features.

7. X_train, X_test, y_train, y_test = train_test_split(X,
y, test_size=0.2): This line splits the data into
training and testing sets using the train_test_split
function. The features X and the target y are passed
as arguments, along with the test_size parameter,
which specifies the proportion of the data to be used
for testing (in this case, 20%).

8. clf = RandomForestClassifier(n_estimators=100):
This line creates an instance of the
RandomForestClassifier class with 100 decision
trees. The n_estimators parameter determines the
number of trees in the random forest.

9. clf.fit(X_train, y_train): This line trains the random
forest classifier (clf) on the training data (X_train
and y_train). The classifier learns patterns and
relationships in the data to make predictions.

10. y_pred = clf.predict(X_test): This line uses the
trained classifier (clf) to make predictions on the
test data (X_test). The predicted labels are stored in
the y_pred variable.

11. accuracy = accuracy_score(y_test, y_pred):
This line calculates the accuracy of the model by
comparing the predicted labels (y_pred) with the
true labels from the test set (y_test). The accuracy
score is stored in the accuracy variable.

12. print("Accuracy:", accuracy): This line prints the
accuracy score on the console, indicating how well
the model performed on the test data.
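Two practical points are worth illustrating with a short sketch. First, the split and the forest are both randomized, so the accuracy changes from run to run unless a random_state is fixed; the value 42 below is an arbitrary choice. Second, the example data has labels that are generated independently of the features, so there is nothing for the model to learn, and the accuracy should land near that of a trivial baseline. The data here is regenerated in memory rather than read from data.csv so that the sketch runs on its own.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.randn(1000, 5)
y = rng.randint(0, 2, 1000)  # labels are independent of the features

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
accuracy = accuracy_score(y_test, clf.predict(X_test))

# Baseline: always predict the most frequent class in the training labels.
majority_class = np.bincount(y_train).argmax()
baseline = accuracy_score(y_test, np.full_like(y_test, majority_class))

print("Model accuracy:   ", accuracy)
print("Baseline accuracy:", baseline)
```

Comparing against a baseline like this is a quick sanity check on any dataset: a model that does not clearly beat it has not learned a useful relationship between the features and the target.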

Deploying the Model on Production


If you want to deploy a scikit-learn model, you can use the joblib library to
save the model and then serve the model using a REST API. The following
code installs the joblib and Flask libraries, saves a trained model to a file
using joblib, and sets up a Flask application to serve the model predictions.
The “/predict” endpoint is defined, and when a request is made to this
endpoint, the model is loaded, predictions are made, and the results are
returned as a JSON response.
You can use a cloud service such as Google Cloud Platform or Amazon
Web Services to host the REST API and make the predictions available to
your applications. Here’s an example of how to do this:

1. Install the joblib library:

!pip install joblib

2. Save the model:

import joblib

model = clf  # Your trained model

# Save the model
joblib.dump(model, '/content/drive/My Drive/Colab Notebooks/model.joblib')
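The serving step described at the start of this section can be sketched as follows. This is an illustration under stated assumptions, not the original listing: it trains a small stand-in model and saves it to a temporary folder so the example is self-contained, the JSON request format with a "features" key is made up for this sketch, and the model is loaded once at startup rather than on every request. It assumes Flask is installed (for example, with "!pip install flask").

```python
import os
import tempfile

import joblib
import numpy as np
from flask import Flask, jsonify, request
from sklearn.ensemble import RandomForestClassifier

# Train and save a small stand-in model so the sketch runs on its own.
rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = rng.randint(0, 2, 200)
model_path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(
    RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y),
    model_path)

app = Flask(__name__)
model = joblib.load(model_path)  # loaded once at startup, not per request


@app.route("/predict", methods=["POST"])
def predict():
    # Assumed request body: {"features": [[f1, f2, f3, f4, f5], ...]}
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")
    if features is None:
        return jsonify({"error": "missing 'features'"}), 400
    predictions = model.predict(np.asarray(features, dtype=float))
    return jsonify({"predictions": predictions.tolist()})


# Exercise the endpoint in-process with Flask's test client (no server needed).
with app.test_client() as client:
    response = client.post("/predict", json={"features": X[:3].tolist()})
    print(response.get_json())
```

To serve real traffic, the app would be started behind a production web server on the hosting platform of your choice; Flask's built-in development server is intended for local testing only.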

After the painting belonging to Yale College. Cf. photograph in
Kingsley's Yale College, i. 102; engravings in Hollister's Connecticut,
i. 234, and Amer. Quart. Reg., viii. 31, 193; and memoir in Sparks's
Amer. Biog., xvi. 3, by J. L. Kingsley.

To these may be added various diaries and orderly-books, which


are of little distinctive value.[555] There are other accounts, written
at a later period, in which personal recollections are assisted by
study of the recitals of others, and chief among them are the
narrative in Thacher's Military Journal (Boston, 1823), where the
account is entered as of July, 1775, and chapter xix. of General
James Wilkinson's Memoirs (1816), embodying what he learned in
going over the field in March, 1776, with Stark and Reed. Col. John
Trumbull saw the smoke of the fight from the Roxbury lines, and
gave an outline narrative in his Autobiography (1841).[556] The
account in General Heath's Memoirs (Boston, 1798) is short.[557] A
few of the earlier general histories of the war were written by those
on the American side who had some advantages by reason of
friendly or other relations with the actors.[558] Of the still later
accounts, Frothingham and Dawson have already been referred to
for their bibliographical accompaniments. The diversity of
evidence[559] respecting almost all cardinal points of the battle's
history has necessarily entailed more or less of the controversial
spirit in all who have written upon it, but for thoroughness of
research and a fair discrimination combined, the labors of
Frothingham must be conceded to be foremost. Dawson is elaborate,
and he reveals more than Frothingham the processes of his
collations, but his spirit is not so tempered by discretion, and an air
of flippant controversy often pervades his narrative. Of the more
recent general historians it is only necessary to mention Bancroft[560]
and Carrington. The former gave to it three chapters in his original
edition, in 1858, which, by a little condensation, make a single one in
his final revision, but without material change.[561] The account in
Carrington[562] is intended to be distinctively a military criticism.[563]
The troops of Connecticut[564] and New Hampshire[565] were the
only ones engaged beside those of Massachusetts.
The question of who commanded during the day has been the
subject of continued controversy, arising from the too large claims of
partisans. Though there is much conflict of contemporary evidence,
it seems well established that Col. William Prescott commanded at
the redoubt, and no one questioned his right. He also sent out the
party which in the beginning protected his flank towards the Mystick;
but when Stark, with his New Hampshire men, came up to
strengthen that party, his authority seems to have been generally
recognized, and he held the rail fence there as long as he could to
cover the retreat of Prescott's men from the redoubt. Putnam, the
ranking officer on the field, Warren disclaiming all right to command,
withdrew men with entrenching tools from Prescott, and planned to
throw up earthworks on the higher eminence, now known as Bunker
Hill proper, and near the end of the retreat he assumed a general
command, and directed the fortifying of Prospect Hill. It is not
apparent, then, that any officer, previous to this last stage of the
fight, can be said to have had general command in all parts of the
field. The discussion of the claims of Putnam and Prescott has
resulted in a large number of monographs, and has formed a
particular feature in many of the general accounts of the battle, the
mention of some of which has for this reason been deferred till they
could be placed in the appended note.[566]
A list of officers in the battle, not named in Frothingham's Siege,
is given in the N. E. Hist. and Geneal. Reg., April, 1873; and an
English list of the Yankee officers in the force about Boston in June,
1775, is in Ibid., July, 1874. The Lives of participants and observers
add occasionally some items to the story.[567]
This follows the reproduction of an engraving in J. C. Smith's Brit.
Mezzotint Portraits, p. 1716, which is inscribed: Israel Putnam, Esq.,
Major-General of the Connecticut forces, and Commander-in-chief
at the engagement on Buncker's-Hill, near Boston, 17 June, 1775.
Published by C. Shepherd, 9 Sepr 1775. J. Wilkinson pinxt. (Cf.
Mass. Hist. Soc. Proc., xix. 102.) There is a French engraving,
representing him in cocked hat, looking down and aside, and
subscribed "Israel Putnam, Eqre., major général des Troupes de
Connecticut. Il commandait en chef à l'affaire de Bunckes hill près
Boston, le 17 Juin, 1775." Col. J. Trumbull made a sketch of
Putnam, which has been engraved by W. Humphreys (National
Portrait Gallery, N. Y., 1834) and by Thomas Gimbrede.
Cf. portraits in Murray's Impartial Hist. (1778), i. 334; Hollister's
Connecticut; Irving's Washington, illus. ed., i. 413; and Geschichte
der Kriege in und ausser Europa (Nürnberg, 1778).
For lives of Putnam, see Sabin, xvi. no. 66,804, etc. For his
birthplace, see Appleton's Journal, xi. 321; Miss Larned's Windham
County, Conn. Cf. B. J. Lossing in Harper's Monthly, xii. 577;
Evelyns in America, 273; R. H. Stoddard in Nat. Mag., xii. 97.

JOSEPH WARREN.
After a copperplate by J. Norman in An Impartial Hist. of the War in
America (Boston, 1781), vol. ii. p. 210. The best known picture of
Warren is a small canvas by Copley, belonging to Dr. John Collins
Warren, of Boston, which has been often engraved, and is given in
mezzotint by H. W. Smith in Frothingham's Life of Warren. The
picture in Faneuil Hall is painted after this, and Thomas Illman has
engraved that copy. A larger canvas by Copley, painted not long
before that artist left Boston for England, is owned by Dr.
Buckminster Brown, of Boston, and was engraved for the first time
in the Mem. Hist. of Boston, iii. 60, where will be found accounts of
various contemporary prints and memorials of Warren (pp. 59, 61,
142, 143), including his house at Roxbury, the manuscript of his
Massacre Oration, etc. Cf. Frothingham's Warren, p. 546; Hist.
Mag., Dec., 1857; Loring's Hundred Boston Orators, p. 67; Mrs. J.
B. Brown's Stories of General Warren; Life of Dr. John Warren; the
Warren Genealogy; Mass. Hist. Soc. Proc., Sept., 1866. The earliest
eulogy was that by Perez Morton in 1776 (Loring's Hundred Boston
Orators, 327; Niles's Principles and Acts, 1876, p. 30), and the
earliest memoir of any extent was that by A. H. Everett, in Sparks's
Amer. Biography (vol. x.). There are reminiscences in the N. E. Hist.
and Geneal. Reg., xii. 113, 234, which were based by Gen. William
H. Sumner on some letters published by him in 1825 in the Boston
Patriot, when, as adjutant-general of the State, he arranged for the
appearance of the Bunker Hill veterans in the celebration of that
year, and derived some reminiscences from them respecting
Warren's appearance and action during the fight. All other accounts
of Warren, however, have been eclipsed by Frothingham's Life of
Warren (Boston, 1865). In the Boston Medical and Surgical Journal
(June 17, 1875), Dr. John Jeffries (son of the surgeon of the British
army who saw Warren's body on the field) published a paper on his
death. Cf. also R. J. Speirr in Potter's Amer. Monthly, v. 571;
Frothingham's Warren, pp. 519, 523; Barry's Massachusetts, i. 37,
and references.
The grateful intentions expressed by the Massachusetts House of
Representatives (April 4, 1776), by the Continental Congress (April
8, 1777; Sept. 6, 1778; July 1, 1780,—see Journals of Congress),
and by the Congress of the United States (Jan. 30, 1846,—Mass.
Hist. Soc. Proc., ii. 337), have never been carried out. Benedict
Arnold manifested a special interest in the welfare of Warren's
children (N. E. Hist. and Geneal. Reg., April, 1857, p. 122). The
Freemasons erected a pillar to his memory on the battlefield in
1794, which disappeared when the present obelisk was begun in
1825. There is a view of the pillar in the Analectic Mag., March,
1818, and in Snow's Boston, 309. Cf. Mass. Hist. Soc. Proc., xiv. 65.
A statue of Warren, by Henry Dexter, was placed in a pavilion near
the obelisk in 1857. Cf. G. W. Warren's Hist. of the Bunker Hill
Monument Association; Frothingham's Warren, p. 547.

Among the anniversary discourses upon the battle, a few will
bear reading. The earliest was by Josiah Bartlett in 1794, published
by B. Edes, in Boston, the next year. Daniel Webster made a famous
address at the laying of the corner-stone of the monument in 1825,
which can be found in his Works, i. 59. (Cf. Analectic Mag., vol. xi.;
A. Levasseur's Lafayette en Amérique, Paris, 1829.) The same orator,
at the completion of the monument in 1843, embodied little of
historical interest in his Address. (Works, i. 89.[568]) Alexander H.
Everett's Address in 1836 was subsequently inwoven in his Life of
Warren. The Rev. George E. Ellis began his conspicuous labors in this
field in his discourse in 1841. Edward Everett spoke in 1850
(Orations, etc., iii. p. 3), and Gen. Charles Devens, at the Centennial
in 1875, delivered an oration, which was published by the city of
Boston. The most noteworthy address since that time was that of
Robert C. Winthrop at the unveiling of the statue of Colonel William
Prescott, June 17, 1881.[569] This statue, of which an engraving will
be found in the Mem. Hist. of Boston (iv. 410), stands near the base
of the monument.[570]

We turn now to the accounts on the British side. The orderly-books
of General Howe are preserved among Lord Dorchester's
(Carleton's) Papers in the Royal Institution, London. Sparks made
extracts from them, now in no. xlv. of the Sparks MSS. in Harvard
College library. Extracts relating to the dispositions for the day of the
battle, and for subsequent days, are given by Ellis (1843) p. 88.[571]
Cf. Mag. of Amer. Hist., 1885, p. 214. The more immediate English
notes and comments on the battle can be best grouped in a note.[572]
During 1775 there were two English accounts, aiming at
something like historical perspective. One of these was, very likely,
by Edmund Burke, and was in the Annual Register (p. 133, etc.). The
other was An Impartial and Authentic Narrative of the Battle fought
on the 17th of June, 1775, between his Britannic Majesty's Troops
and the American Provincial Army on Bunker's Hill near Charles Town
in New England. The author was John Clark, a first lieutenant of
marines. He gives a speech of Howe to his men, representing that it
was delivered just as he advanced to the attack, but this and much
else in the book are considered of doubtful authenticity.[573]
In 1780 there appeared in the London Chronicle some letters by
Israel Mauduit, which were republished the same year as Three
letters to Lord Viscount Howe: added, Remarks on the battle of
Bunker's Hill (London, 1780), which in a second edition (1781) reads
additionally in the title, To which is added a comparative view of the
Conduct of Lord Cornwallis and General Howe. There was among the
Chalmers' MSS. (Thorpe's Supplemental Catal., 1843, no. 660) a
writing entitled Some particulars of the battle of Bunker's Hill, the
situation of the ground, etc. (8 pp., 1784), which Chalmers calls a
"most curious paper in the handwriting of Israel Mauduit, found
among his pamphlets, Jan. 23, 1789."
In 1784 William Carter's Genuine Detail of the Royal and
American Armies appeared in London. Carter was a lieutenant in the
Fortieth Foot, and his book was seemingly reissued in 1785, with a
new title-page. (Brinley, no. 1,789; Stevens, Bibl. Amer., 1885, nos.
80, 81; Harvard Coll. lib., 6351.16.)
Note.—The fac-simile on this page is of a handbill, printed in
Boston, giving the tory side of the fight at Bunker Hill,—after an
original in the library of the Mass. Hist. Society.

Note.—This sketch of Bunker Hill Battle, made for Lord Rawdon,
follows a tracing of the original belonging to Dr. Emmet of New
York, furnished to me by Mr. Benson J. Lossing. A finished drawing
from this sketch is given in the Mem. Hist. of Boston, vol. iii. Cf.
Harper's Mag. xlvii., p. 18. The spire in the foreground is that of the
West Church, which stood where Dr. Bartol's church, in Cambridge
Street, Boston, now stands, showing that the sketcher was on
Beacon hill, 138 feet above the water. The smoke from the frigate
to the right of the spire rises against the higher hill where Putnam
endeavored to rally the retreating provincials. This hill is 110 feet
above the water, and about one mile and a half distant from the
spectator. One hundred and thirty rods to the right of this summit is
the crown of the lower or Breed's Hill, where the redoubt was,
which is 62 feet above the sea. Dr. Emmet secured this picture and
another of the slope of the hill, taken after the battle, and showing
the broken fences (Mem. Hist. of Boston, iii. 88), at the sale of the
effects of the Marquis of Hastings, who was a descendant of Lord
Rawdon, then on Gage's staff (Harper's Monthly Mag., 1875). The
earliest engraved picture of the battle is one cut by Romans, which
was published the same year, and appeared also in Sept., 1775, on
a reduced scale, in the Pennsylvania Magazine. It has been
reproduced in Frothingham's Centennial: Battle of Bunker Hill
(1875), in Moore's Ballad History, and in other of the Centennial
memorials. In 1781 a poem by George Cockings, The American War
(London), had a somewhat extraordinary picture, which has been
reproduced in Gay's Pop. Hist. U. S., iii. 401, by S. A. Drake, and
others. In 1786 Col. John Trumbull painted his well-known picture
of the battle, which has been often engraved. (Cf. Trumbull's
Autobiography; N. E. Hist. and Geneal. Reg., xv.; Tuckerman's Book
of the Artists; Harper's Magazine, Nov., 1879.) Trumbull claimed
that the following figures in his picture were portraits: Warren,
Putnam, Howe, Clinton, Small, and the two Pitcairns.
In the Mass. Magazine, Sept., 1789, there is a view of Charlestown,
showing Bunker's and Breed's hills, with their original contours. It is
reproduced in Mem. Hist. Boston, iii. 554, with a note upon other
early views. Frothingham (Siege, p. 121) gives one from an early
manuscript which closely resembles the topography of the Rawdon
sketch; and again (Centennial, etc.) another which is in fact the
perspective sketch of the town at the edge of Price's view of Boston
(1743), converted into a panoramic picture (Mem. Hist. of Boston,
ii. 329).
The Gentleman's Mag., Feb., 1790, has a view of Charlestown, with
the tents of the British army on the hill, taken after the battle, and
from Copp's Hill. It shows the wharves and ruins of the town. (Cf.
note in Mem. Hist. Boston, iii. 88.)
The account of the loyalist Jones (N. Y. during the Rev., i. 52) has
his usual twist of vision, though he is severe on Gage for "taking the
bull by the horns" in making an attack in front.

CHARLESTOWN PENINSULA, 1775.
Sketched from a plan by Montresor, showing the redoubt erected
by the British, after June 17, on the higher eminence of Bunker Hill.
The original is in the library of Congress, where is a plan on a large
scale of this principal redoubt.

The long list of general histories on the British side, detailing the
events of the battle, begins with Murray's Impartial Hist. of the War
(London, 1778; Newcastle, 1782), and is made up during the rest of
that century by the Hist. of the War published at Dublin (1779-85);
Hall's Civil War in America (1780); The Detail and Conduct of the
Amer. War (1780); Andrews's Hist. of the War (1785, vol. i. 301,—
quoted at length by Ryerson, Loyalists, i. 461); Stedman, Hist. Amer.
War (London, 1794, vol. i. 125). The best of the later historians is
Mahon (Hist. of England, vi.), who was forced to admit, when
pressed upon the question, that the American claims of victory,
which he says they have always held, appear only in the reports of
later British tourists (vol. vi., App. xxix.). Lecky, in his brief account
(England in the Eighteenth Century, iii. 463), makes an intention of
Gage to fortify the Charlestown and not the Dorchester heights the
incentive to the American occupation of the former. Edw. Bernard's
History of England (London) has a curious "View of the Attack on
Bunker's Hill, with the burning of Charlestown."
Something confirmatory, rather than of original value, can be
gained from the histories of various regiments which took part in the
battle, as detailed in the series of Historical Records of such
regiments.[574]

The battle almost immediately found commemoration in British
ballads (Hist. Mag., ii. 58; v. 251; Hale's Hundred Years Ago, p. 7),
and the slain were commemorated in elegiac verses, as in M. M.
Robinson's To a young lady, on the death of her brother, slain in the
late engagement at Boston (London, 1776). The same year there
appeared at Philadelphia The Battle of Bunker's Hill, a dramatic piece
in five acts, in heroic measure, by a gentleman of Maryland.

Note.—The references in the corner of this cut, too fine to be easily
read in this reduced fac-simile, are as follows:—
"A A. First position, where the troops remained until reinforcements
arrived.
B B. Second position.
C C C. Ground on which the different regiments marched to form
the line.
D D. Direction in which the attack was made upon the redoubt and
breastwork.
E E. Position of a part of the 47th and marines, to silence the fire of
a barn at E.
F. First position of the cannon.
G. Second position of the cannon in advancing with the grenadiers,
but stopped by the marsh.
H. Breastwork formed of pickets, hay, stones, etc., with the pieces
of cannon.
I I. Light infantry advancing along the shore to force the right of
the breastwork H.
L L. The "Lively" and "Falcon" hauled close to shore, to rake the low
grounds before the troops advanced.
M M. Gondolas that fired on the rebels in their retreat.
N. Battery of cannon, howitzers, and mortars on Copp's Hill, that
battered the redoubt and set fire to Charlestown.
O O O. The rebels behind all the stone walls, trees, and brush-
wood, and their numbers uncertain, having constantly large
columns to reinforce them during the action.
P. Place from whence the grenadiers received a very heavy fire.
Q. Place of the fifty-second regiment on the night of the 17th.
R. Forty-seventh regiment, in Charlestown, on the night of the
17th.
S. Detachments in the mill and two storehouses.
T. Breastwork thrown up by the remainder of the troops on the
night of the 17th.
Note. The distance from Boston to Charlestown is about 550
yards."

The author of The Battle of Bunker's Hill is said to be Hugh Henry
Brackenridge, and the frontispiece, "The Death of Warren", by Norman, is held to be the
earliest engraving in British America by a native artist (Hunnewell, p.
13; Brinley, no. 1,787; Sabin, ii. 7,184; xiv. 58,640). In 1779 there
was printed at Danvers, America Invincible, an heroic poem, in two
books: a Battle at Bunker Hill, by an officer of rank in the Continental
army (Hunnewell, p. 13). In 1781 an anonymous poem was
published in London, known later to be the production of George
Cockings, and called The American War, in which the names of the
officers who have distinguished themselves during the war are
introduced (Brinley, no. 1,788; Hunnewell, p. 14). Of later use of the
battle in fiction, it is only necessary to name Cooper's novel of Lionel
Lincoln and O. W. Holmes's Grandmother's Story of Bunker Hill Battle
(Mass. Hist. Soc. Proc., 1875, p. 33).

The chief enumerations which have been heretofore made of the
plans of the battle of Bunker Hill are by Frothingham, in Mass. Hist.
Soc. Proc., xiv. 53; by Hunnewell in his Bibliog. of Charlestown, p.
17; and by Winsor in the Mem. Hist. of Boston, iii. (introduction).
The earliest rude sketches are by Stiles in his diary (Dawson, p.
393), and one formed by printer's rules in Rivington's Gazetteer,
Aug. 3, 1775 (Frothingham's Siege, p. 397, and Dawson, p. 390).
Montresor, of the British engineers, very soon made a survey of the
field, and this was used by Lieutenant Page in drawing a plan of the
action, which he carried to England with him when, on account of
wounds received while acting as an aid to Howe, he was given leave
of absence (Mass. Hist. Soc. Proc., June, 1875, p. 56). In the Faden
collection (nos. 25-30) of maps in the library of Congress there are
Page's rough and finished plans, drawn before the British works on
the hill were begun, and also plans by Montresor and R. W., of the
Welsh Fusiliers. Page's plan, as engraved, was issued in London in
1776, and called A Plan of the Action at Bunker's Hill.[575]
Page's, however, was not the first engraved. One "by an officer
on the spot" was published in London, Nov. 27, 1775, called Plan of
the battle on Bunker's Hill. Fought on the 17th of June, which was
issued as a broadside, with Burgoyne's letter to Lord Stanley on the
same sheet. The central position of the Americans is called "Warren's
redoubt." This is reproduced in F. Moore's Ballad History of the
Revolution.
Another contemporary British plan—discovered probably "in the
baggage of a British officer", after the royal troops left Boston in
March, 1776, but not brought to light till forty years later, when it
was mentioned in a newspaper in Wilkesbarre, Penn., as having been
found in an old drawer—was one made by Henry de Bernière, of the
Tenth Royal Infantry, on nearly the same scale as Page's, but less
accurately.
It was engraved in 1818 in the Analectic Magazine (Philad., p.
150), and a fac-simile of that engraving is annexed. The text
accompanying it states that its general accuracy had been vouched
for by Governor Brooks, General Dearborn, Dr. A. Dexter, Deacon
Thos. Miller, John Kettell, Dr. Bartlett, the Hon. James Winthrop, and
Mr. [Judge] Prescott. General Dearborn and Deacon Miller thought
the rail fence too far in the rear of the redoubt, having been really
nearly in the line of it. Judge Winthrop and Dr. Bartlett thought the
map in this particular correct. There was the same division of belief
regarding the cannon behind the fence, Dearborn and Miller
believing there were none there, Brooks and Winthrop holding the
contrary. Other witnesses represented to the editor of the Magazine
that there was no interval between the breastwork and the fence,
but that an imperfect line of defence connected the redoubt with the
Mystick shore, as represented in Stedman's (Page's) map.[576]
In the Portfolio (March, 1818) General Dearborn criticised the
plan (Dawson, p. 406), and, using the same plate in his separate
issue of his comments, he imposed in red his ideas of the position of
the works, and this was in turn criticised by Governor Brooks.[577]
Mr. G. G. Smith made a (plan) Sketch of the Battle of Bunker Hill by
a British Officer (Boston, 1843), which grew out of the plan and the
comments on it. Bernière's plan was also used by Colonel Swett as
the basis of the one which he published in his History of the Battle of
Bunker Hill (1818, 1826, 1827), which has been frequently copied
(Ellis, Lossing, etc.). The latest attempt to map the phases of the
action critically is by Carrington in his Battles of the Revolution (p.
112), who gives an eclectic plan. Plans adopting the features of
earlier ones are in the English translation of Botta's War of
Independence, Grant's British Battles (ii. 144). A plan of the present
condition of the ground, by Thomas W. Davis, superposing the line of
the American works, is given in the Bunker Hill Monument
Association's Proceedings (1876). A map of Charlestown in 1775 with
a plan of the battle was prepared and published in 1875 by James E.
Stone. A plan of the works as reconstructed by the British, and
deserted by them in March, 1776, is given in Carter's Genuine detail,
etc. (London, 1784), which is reproduced in Frothingham's Siege, p.
330. Other MS. plans of their works on both hills are in the Faden
maps in the library of Congress.
Before the war closed a plan was engraved by Norman, a Boston
engraver, which is the earliest to appear near the scene itself. This
was a Plan of the town of Boston with the attack on Bunker's Hill, in
the peninsula of Charlestown, on June 17, 1775 (measuring 11-1/2 ×
7 inches), which is, however, of no topographical value as respects
the action. It appeared in Murray's Impartial History (1778), i. p.
430, and in An Impartial History of the War in America (Boston,
1781, vol. i.), and a reduced fac-simile of it is annexed.[578]

C. The American Camp.—A variety of journals and diaries have been
preserved, the best known of which is that of Dr. Thacher, a surgeon
on Prospect Hill.[579]
The daily life of the Cambridge camp is best seen in the letters
sent from it, and foremost in interest among such are those of
Washington.[580] From the Roxbury camp there are letters of General
Thomas in the Thomas Papers, where is one of Dr. John Morgan, the
medical director. Several from Jedediah Huntington are preserved in
the Trumbull Papers, and are printed in the Mass. Hist. Soc. Coll.,
xlix.[581] The principal letters from the Winter Hill camp are those of
General Sullivan,[582] and a few written at the Prospect Hill camp
have been printed.[583]
Something of the spirit prevailing in Watertown, where the
Provincial Congress was sitting, can be seen in the letters of James
Warren and Samuel Cooper.[584]
There are in the library of the Amer. Antiq. Soc. at Worcester
several orderly-books of the siege,[585] and others are preserved
elsewhere.[586]

D. The British Camp.—The condition of Boston during the siege
must be learned from various sources. The Boston News-Letter was
still published, but numbers of it are very scarce for this period, and
no other of the Boston newspapers continued to be published in the
town.[587] It was a convenient vehicle for the British generals, and
any morsel of news likely to be distasteful to the patriots, like the
intercepted correspondence of Washington and John Adams, was
pretty sure to reach the American lines through its columns. The
correspondence of the generals is preserved in the British Archives
and in the papers at the Royal Institution (London), and occasionally
some few letters, like those of Percy in the Boston Public Library,
have been found elsewhere. It is charged that Gage's papers were
stolen in Boston.[588] Some new glimpses were got when
Fonblanque published his Life of Burgoyne.[589] The best accounts of
the succession of events in the town and the daily life are found in
Dr. Ellis's "Chronicles of the Siege",[590] and in Mr. Horace E.
Scudder's "Life in Boston during the Siege", a chapter in the
Memorial Hist. of Boston, vol. iii.,[591] which may be consulted (p.
154) for various sources respecting the details of the privations and
amusements of the people and the garrison, and of the vicissitudes
of its buildings and landmarks.[592] An account of the British works
in Boston is given in Frothingham's Siege of Boston, and the Mem.
Hist. Boston, iii. 79. The current record of the outposts, etc., is noted
in Moore's Diary of the Rev., 109, etc. Carrington (Battles, 154)
refers to a MS. narrative of experiences in the town by one Edw.
Stow. Some of the correspondence of the Boston selectmen with
Thomas, at Roxbury, is in the Thomas Papers. It is, however, to the
diaries, letters, and orderly-books which have been preserved that
we must go for the details of life in the beleaguered town.[593]

E. Boston Evacuated.—The letters of Washington[594] best enable
us to follow the movements, but they may be supplemented by other
contemporary accounts.[595]
Howe's despatch to Dartmouth, dated Nantasket Roads, is in
Dawson, i. 94.[596] His conduct of the siege is criticised in A view of
the evidence relative to the Conduct of the American War (1779).
Contemporary dissatisfaction was expressed in an ironical
congratulatory poem published in London (Sabin, iv., 15,476).
One Crean Brush,[597] acting under orders of Howe, endeavored
to carry off the merchandise from the stores of the town, so far as
he could, on a vessel put at his disposal. Howe's proclamation in his
favor is in fac-simile in the Mem. Hist. of Boston, iii. 97. Brush's
vessel was later captured by Manly (Evacuation Memorial, 166).
Similar experience in trying to escape with his merchandise was
suffered by Jolley Allen, as portrayed in his Account of a part of his
sufferings and losses, ed. by C. C. Smith, given in Mass. Hist. Soc.
Proc., Feb., 1878, and separately. Allen's narrative was reprinted in
the spelling of the original MS. in An Account of a part of the
sufferings and losses of Jolley Allen, a native of London, with a
preface and Notes by Mrs. Frances Mary Stoddard (Boston, 1883).
An inventory of the stores left by the British is in the Siege of Boston,
406.[598] In the cabinet of this society is a handbill adopted by the
freeholders of Boston, Nov. 18 [1776?], calling upon all who had
suffered in property in Boston since March, 1775, to report the same
to a committee.[599]
Washington's instructions (April 4, 1776) to Ward are in the
printed Heath Papers, p. 4. The Mass. legislature, April 30, 1776,
ordered beacons to be set at Cape Ann, Marblehead, and Blue Hill,
ready to be fired in case of the enemy's reappearing, which was for a
long time dreaded. Ward writes to Washington of his measures in
progress.[600]
The correspondence of John Adams and John Winthrop (Mass.
Hist. Coll., xlv.) shows constant anxiety lest the defences should not
be prepared in case of need.[601]
SIEGE OF BOSTON, 1776.
The westerly half of the map in the octavo atlas of Marshall's
Washington, which is a reduction of the map in the earlier quarto
atlas (1804). It is reproduced in the French translations of Marshall
and of Botta.

The cut on the title of the present volume represents one side of
the medal given by Congress to Washington, to commemorate his
raising the siege of Boston.[602]

F. Maps of the Siege of Boston.—Plans of Boston and its
neighborhood, including its harbor, for the illustration of the siege of
Boston, are numerous, and the account of them given in the Mem.
Hist. of Boston (iii., introd.) is in the main followed in the present
enumeration, which divides them into those of American, English,
French, and German origin, and adheres as far as possible to the
order of publication in each group.
The earliest American is the 1769 (or last) edition of what is
known as Price's edition of Bonner's map of Boston, which had done
service since 1722 by successive changes in the plate, this last issue
showing Hancock's Wharf, and "Esqr. Hancock's seat" on Beacon
Street.[603] This map sufficed for local use till the events of 1775
induced new interest in the topography, when the earliest response
came from Philadelphia, where C. Lownes engraved A new plan of
Boston Harbour from an actual survey, for the Pennsylvania
Magazine. It presented a reminder of the great event of the year in
its "N. B. Charlestown burnt, June 17, 1775, by the Regulars." There
is another Draught of the Harbour of Boston and the adjacent towns
and roads, a manuscript, dated 1775, among the Belknap Papers, i.
84, in the cabinet of the Mass. Hist. Society. The same Pennsylvania
Magazine, the next month (July, 1775), gave as engraved by Aitkins
A new and correct plan of the town of Boston and Provincial Camp.
The town seems to be taken from a plan which had appeared in the
Gentleman's Magazine (London) the previous January; but in one
corner was added a plan of the circumvallating lines of the besieging
army.[604] Later in the season two other plans were made, showing
the American lines, which were not published, however, till long after.
One is given in Force's American Archives, 4th series, vol. iii.,[605]
and the other was made by Col. John Trumbull, in Sept., 1775, which
was published in his Autobiography in 1841.[606] Of about the same
time is another very small Plan of Boston and its environs, showing
the circumvallating lines, which is in one corner of a large Map of the
Seat of Civil War in America, engraved by B. Romans, and dedicated
to Hancock. There is also, in the library of the Mass. Hist. Society, a
rude plan of the harbor and vicinity, showing the positions of the
provincials, which are reckoned at 20,000, while the royal forces are
put at 8,000. I find no other American plan till Norman's, in 1781,
reproduced on another page; and not another till The Seat of the
late War at Boston appeared in the Universal Asylum and Columbian
Magazine, July, 1789, p. 444, but this is a rather scant map of the
country as far inland as Worcester. Gordon had the year before this
given a map in his American Revolution (London, 1788) based on
English sources; but it has been the foundation of most of the
eclectic maps since published in this country.[607]
In 1822 a Mr. Finch printed in Silliman's Journal an account of the
traces then remaining of the earthworks of the siege, both American
and British.[608] There is an enumeration of the different sections of
the lines, within and without Boston, in the Mem. Hist. Boston (vol.
iii. 104).[609]

The earliest English plan of this period is one called A plan of
Boston and Charlestown from a drawing made in 1771, which
occupies the margin of a larger map, engraved for The Town and
Country Magazine in 1776, later to be mentioned. The Catalogue of
the King's Maps (British Museum) shows a colored plan of Boston
and vicinity (1773) in the centre of a large sheet, with marginal
views (later to be described).
In 1774 a Plan of the town of Boston made part of a Chart of the
Coast of New England, which appeared in the London Magazine,
April, 1774, and in The American Atlas, issued by Thomas Jefferys in
London, in 1776. This map seems to be the model of a New and
accurate Plan of the town of Boston, which is engraved in the corner
of A Map of the most inhabited part of New England, by Thomas
Jefferys, Nov. 29, 1774, usually also found in The American Atlas
(1776, nos. 15 and 16). This map is found with the date 1755, even
after changes of a later date had been made in the plate.[610] The
original map has also a marginal plan of Boston harbor (Mass. Hist.
Soc. Proc., September, 1864).
The earliest English map of 1775 is one which appeared in the
Gentleman's Magazine (January, 1775), though it is dated Feb. 1,
1775. It shows the town and harbor.[611]
In the June number of the Gentleman's Magazine is a "map of
the country one hundred miles round Boston, in order to show the
situation and march of the troops, as well provincial as regulars,
which are now within sight of each other, and are hourly expected to
engage."
In June, 1775, was also made a not very accurate map of the
town and its environs, which was published in London, Aug. 28, to
satisfy the eagerness for a map of the region to which the news of
the battle of Bunker Hill had turned all eyes. It is to be found in the
first volume of Almon's Remembrancer, and is reproduced herewith.
A few weeks after the fight at Charlestown there was probably made
in Boston the MS. plan of Boston and circumjacent Country, showing
the present situation of the king's troops and the rebel
intrenchments. It is dated July 25, 1775, and is owned by Dr. Charles
Deane.[612]
The largest chart which we have of Boston harbor of this period
is dated August 5, 1775, and was the work of Samuel Holland, the
surveyor-general of the Northern colonies, who was for some years
employed on a coast survey.[613] It takes in Nahant, Nantasket, and
Cambridge, and was based principally on the surveys of George
Callendar (1769).[614] When Des Barres included it in his Atlantic
Neptune (part iii., no. 6, 1780-1783), he marked in the besieging
lines, and dated it Dec. 1, 1781, and in this state Des Barres also
used it in his Coast and Harbors of New England.[615]
A map showing thirty miles round Boston, and bearing date Aug.
14, 1775, is in the king's library (British Museum), and is signed by
M. Armstrong. It has marginal statistical tables, and in the upper
right-hand corner is a plan of the "action near Charlestown, 17 June,
1775."[616] There is among the Force maps in the library of Congress
the MS. original of the map (sketched herewith as Boston and
Charlestown, 1775), which is called A Draught of the Towns of
Boston and Charlestown and the circumjacent country, shewing the
works thrown up by his Majesty's Troops, and also those by the
Rebels during the campaign of 1775. N. B. The rebel entrenchments
are expressed as they appear from Beacon Hill.
On August 28th the British town-major in Boston, James
Urquhart, licensed Henry Pelham to make a Plan of Boston with its
environs. It was engraved in aquatints in London, on two sheets, and
not published till June 2, 1777. Dr. Belknap, who was much troubled
to find a correct plan of the town for this period, thought Pelham's
was the best.[617]