Learn PySpark
Build Python-based Machine Learning and Deep Learning Models
Pramod Singh
Learn PySpark: Build Python-based Machine Learning and Deep
Learning Models
Pramod Singh
Bangalore, Karnataka, India
I dedicate this book to my wife, Neha, my son, Ziaan, and
my parents. Without you, this book wouldn’t have
been possible. You complete my world and are the
source of my strength.
Table of Contents
About the Author .......................................................... xi
About the Technical Reviewer .............................................. xiii
Acknowledgments ........................................................... xv
Introduction .............................................................. xvii
Chapter 4: Airflow ........................................................ 67
    Workflows ............................................................. 67
    Graph Overview ........................................................ 69
        Undirected Graphs ................................................. 69
        Directed Graphs ................................................... 70
    DAG Overview .......................................................... 71
        Operators ......................................................... 73
    Installing Airflow .................................................... 74
        Airflow Using Docker .............................................. 74
    Creating Your First DAG ............................................... 76
        Step 1: Importing the Required Libraries ......................... 78
        Step 2: Defining the Default Arguments ........................... 78
        Step 3: Creating a DAG ........................................... 79
        Step 4: Declaring Tasks .......................................... 79
        Step 5: Mentioning Dependencies .................................. 80
    Conclusion ............................................................ 84
    Normalizer ............................................................ 98
    Standard Scaling ...................................................... 100
    Min-Max Scaling ....................................................... 101
    MaxAbsScaler .......................................................... 103
    Binning ............................................................... 104
    Building a Classification Model ....................................... 107
        Step 1: Load the Dataset ......................................... 107
        Step 2: Explore the Dataframe .................................... 108
        Step 3: Data Transformation ...................................... 110
        Step 4: Splitting into Train and Test Data ....................... 112
        Step 5: Model Training ........................................... 112
        Step 6: Hyperparameter Tuning .................................... 113
        Step 7: Best Model ............................................... 115
    Conclusion ............................................................ 115
Index ..................................................................... 205
About the Author
Pramod Singh has more than 11 years of
hands-on experience in data engineering
and sciences and is currently a manager
(data science) at Publicis Sapient in India,
where he drives strategic initiatives that
deal with machine learning and artificial
intelligence (AI). Pramod has worked with
multiple clients, in areas such as retail,
telecom, and automobile and consumer
goods, and is the author of Machine Learning with PySpark. He also speaks
at major forums, such as Strata Data, and at AI conferences.
Pramod received a bachelor’s degree in electrical and electronics
engineering from Mumbai University and an MBA (operations and
finance) from Symbiosis International University, in addition to data
analytics certification from IIM–Calcutta.
Pramod lives in Bangalore with his wife and three-year-old son. In his
spare time, he enjoys playing guitar, coding, reading, and watching soccer.
About the Technical Reviewer
Manoj Patil has worked in the software
industry for 19 years. He received an
engineering degree from COEP, Pune (India),
and has been enjoying his exciting IT journey
ever since.
As a principal architect at TatvaSoft, Manoj
has taken many initiatives in the organization,
ranging from training and mentoring teams,
leading data science and ML practice, to
successfully designing client solutions from
different functional domains.
He began his career as a Java programmer but is fortunate to have
worked on multiple frameworks with multiple languages and can claim
to be a full stack developer. In the last five years, Manoj has worked
extensively in the field of BI, big data, and machine learning, using such
technologies as Hitachi Vantara (Pentaho), the Hadoop ecosystem,
TensorFlow, Python-based libraries, and more.
He is passionate about learning new technologies, trends, and
reviewing books. When he’s not working, he’s either exercising or reading/
listening to infinitheism literature.
Acknowledgments
This is my second book on Spark, and along the way, I have come to
realize my love for handling big data and performing machine learning as
well. Going forward, I intend to write many more books, but first, let me
thank a few people who have helped me along this journey. First, I must
thank the most important person in my life, my beloved wife, Neha, who
selflessly supported me throughout and sacrificed so much to ensure that I
completed this book.
I must thank Celestin Suresh John, who believed in me and extended
the opportunity to write this book. Aditee Mirashi is one of the best editors
in India. This is my second book with her, and it was even more exciting
to work with her this time. As usual, she was extremely supportive and
always there to accommodate my requests. I especially would like to thank
Jim Markham, who dedicated his time to reading every single chapter and
offered so many useful suggestions. Thanks, Jim, I really appreciate your
input. I also want to thank Manoj Patil, who had the patience to review
every line of code and check the appropriateness of each example. Thank
you for your feedback and encouragement. It really made a difference to
me and the book.
I also want to thank the many mentors who have constantly forced
me to pursue my dreams. Thank you Sebastian Keupers, Dr. Vijay
Agneeswaran, Sreenivas Venkatraman, Shoaib Ahmed, and Abhishek
Kumar, for your time. Finally, I am infinitely grateful to my son, Ziaan,
and my parents, for their endless love and support, irrespective of
circumstances. You all make my world beautiful.
Introduction
The idea of writing this book had already been seeded while I was working
on my first book, and there was a strong reason for that. The earlier book
was more focused on machine learning using big data and essentially did
not deep-dive sufficiently into supporting aspects, but this book goes a
little deeper into the internals of Spark’s machine learning library, as well
as the analysis of streaming data. It is a good reference point for someone
who wants to learn more about how to automate different workflows and
build pipelines to handle real-time data.
This book is divided into three main sections. The first provides an
introduction to Spark and data analysis on big data; the middle section
discusses using Airflow for executing different jobs, in addition to data
analysis on streaming data, using the structured streaming features of
Spark. The final section covers translation of a business problem into
machine learning and solving it, using Spark’s machine learning library,
with a deep dive into deep learning as well.
This book might also be useful to data analysts and data engineers, as
it covers the steps of big data processing using PySpark. Readers who want
to make a transition to the data science and machine learning fields will
also find this book a good starting point and can gradually tackle more
complicated areas later. The case studies and examples given in the book
make it really easy to follow and understand the related fundamental
concepts. Moreover, there are very few books available on PySpark, and
this book certainly adds value to readers’ knowledge. The strength of this
book lies in its simplicity and in its application of machine learning to
meaningful datasets.
I have tried my best to put all my experience and knowledge into this
book, and I feel it is particularly relevant to what businesses are seeking
in order to solve real challenges. I hope that it will provide you with some
useful takeaways.
CHAPTER 1
Introduction to Spark
As this book is about Spark, it makes perfect sense to start the first chapter
by looking into some of Spark’s history and its different components.
This introductory chapter is divided into three sections. In the first, I go
over the evolution of data and how it got as far as it has, in terms of size.
I’ll touch on three key aspects of data. In the second section, I delve into
the internals of Spark and go over the details of its different components,
including its architecture and modus operandi. The third and final section
of this chapter focuses on how to use Spark in a cloud environment.
History
The birth of the Spark project occurred at the Algorithms, Machine, and
People (AMP) Lab at the University of California, Berkeley. The project
was initiated to address the potential issues in the Hadoop MapReduce
framework. Although Hadoop MapReduce was a groundbreaking
framework to handle big data processing, in reality, it still had a lot
of limitations in terms of speed. Spark was new and capable of doing
in-memory computations, which made it almost 100 times faster than
any other big data processing framework. Since then, there has been a
continuous increase in adoption of Spark across the globe for big data
applications. But before jumping into the specifics of Spark, let’s consider a
few aspects of data itself.
Data can be viewed from three different angles: the way it is collected,
stored, and processed, as shown in Figure 1-1.
Data Collection
A huge shift in the manner in which data is collected has occurred over
the last few years. From buying an apple at a grocery store to deleting an
app on your mobile phone, every data point is now captured in the back
end and collected through various built-in applications. Different Internet
of things (IoT) devices capture a wide range of visual and sensory signals
every millisecond. It has become relatively convenient for businesses
to collect that data from various sources and use it later for improved
decision making.
Data Storage
In previous years, no one ever imagined that data would reside at some
remote location, or that the cost to store data would be as cheap as it is.
Businesses have embraced cloud storage and started to see its benefits
over on-premise approaches. However, some businesses still opt for on-
premise storage, for various reasons. It’s known that data storage began
by making use of magnetic tapes. Then the breakthrough introduction
of floppy discs made it possible to move data from one place to another.
However, the size of the data was still a huge limitation. Flash drives and
hard discs made it even easier to store and transfer large amounts of data
at a reduced cost. (See Figure 1-2.) The latest trend in the advancement of
storage devices has resulted in flash drives capable of storing data up to
2TBs, at a throwaway price.
This trend clearly indicates that the cost to store data has been
reduced significantly over the years and continues to decline. As a result,
businesses don’t shy away from storing huge amounts of data, irrespective
of its kind. From logs to financial and operational transactions to simple
employee feedback, everything gets stored.
Data Processing
The final aspect of data is using stored data and processing it for some
analysis or to run an application. We have witnessed how efficient
computers have become in the last 20 years. What used to take five
minutes to execute probably takes less than a second using today's machines.
Spark Architecture
There are five core components that make Spark so powerful and easy
to use. The core architecture of Spark consists of the following layers, as
shown in Figure 1-3:
• Storage
• Resource management
• Engine
• Ecosystem
• APIs
Storage
Before using Spark, data must be made available in order to process it. This
data can reside in any kind of database. Spark offers multiple options to
use different categories of data sources, to be able to process it on a large
scale. Spark allows you to use traditional relational databases as well as
NoSQL, such as Cassandra and MongoDB.
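As a quick, hedged illustration (the paths, connection details, and table names below are placeholders, and a JDBC source also needs the appropriate driver on the classpath), pointing Spark at two different storage backends looks like this:
[In]:
# Assumes an active SparkSession available as `spark`
# Parquet files on a distributed file system (path is a placeholder)
sales_df = spark.read.parquet("hdfs:///data/sales")

# A traditional relational database over JDBC (connection details are placeholders)
orders_df = (spark.read.format("jdbc")
             .option("url", "jdbc:postgresql://db-host:5432/shop")
             .option("dbtable", "orders")
             .option("user", "spark_user")
             .option("password", "secret")
             .load())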
Resource Management
The next layer consists of a resource manager. As Spark works on a set of
machines (it also can work on a single machine with multiple cores), it
is known as a Spark cluster. Typically, there is a resource manager in any
cluster that efficiently handles the workload between these resources.
The two most widely used resource managers are YARN and Mesos. The
resource manager has two main components internally:
1. Cluster manager
2. Worker
The main role of the cluster manager is to manage the worker nodes
and assign them tasks, based on the availability and capacity of the worker
node. On the other hand, a worker node is only responsible for executing
the task it’s given by the cluster manager, as shown in Figure 1-4.
The tasks that are given to the worker nodes are generally the
individual pieces of the overall Spark application. The Spark application
contains two parts:
1. Task
2. Spark driver
The task is the data processing logic written in either PySpark or
Spark R code. It can range from something as simple as a total word-frequency
count to a very complex set of instructions on an unstructured
dataset. The second component is the Spark driver, the main controller of a
Spark application, which constantly interacts with the cluster manager to
find out which worker nodes can be used to execute the request.
of the Spark driver is to request the cluster manager to initiate the Spark
executor for every worker node.
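To make this concrete, here is a minimal sketch (the application name and configuration values are illustrative assumptions) of how a PySpark application's driver process is created and told which cluster manager to negotiate with:
[In]:
from pyspark.sql import SparkSession

# The SparkSession (and the SparkContext inside it) lives in the driver process.
# The master URL decides which cluster manager the driver negotiates with:
# "local[4]" uses four local cores; "yarn" would hand resource requests to YARN.
spark = (SparkSession.builder
         .master("local[4]")
         .appName("spark_driver_demo")
         .config("spark.executor.memory", "2g")   # per-executor memory request
         .getOrCreate())

print(spark.sparkContext.master)   # confirms which cluster manager is in use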
Spark SQL
Because SQL is used by most ETL practitioners across the globe, it was a
logical choice to include in Spark's offerings. It allows Spark users to perform
structured data processing by running SQL queries. Under the hood, Spark SQL
leverages the Catalyst optimizer to optimize the queries during execution.
Another advantage of using Spark SQL is that it can easily deal
with multiple database files and storage systems such as SQL, NoSQL,
Parquet, etc.
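As a brief, hedged illustration (the file and column names are borrowed from the dataset used later in Chapter 2), a dataframe can be registered as a temporary view and then queried with plain SQL, with Catalyst planning the execution:
[In]: df = spark.read.csv("customer_data.csv", header=True, inferSchema=True)
[In]: df.createOrReplaceTempView("customers")   # expose the dataframe to Spark SQL
[In]: spark.sql("SELECT Customer_main_type, avg(Avg_Salary) AS mean_salary FROM customers GROUP BY Customer_main_type").show()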
MLlib
Training machine learning models on big datasets was starting to become
a huge challenge, until Spark’s MLlib (Machine Learning library) came into
existence. MLlib gives you the ability to train machine learning models on
huge datasets, using Spark clusters. It allows you to build supervised and
unsupervised models, recommender systems, NLP-based models, and deep
learning models, all within the Spark ML library.
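A minimal sketch of the typical MLlib workflow, under the assumption of a tiny made-up dataset (all column names and values here are placeholders): assemble the inputs into a feature vector and fit an estimator on the cluster.
[In]:
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

# A tiny illustrative dataset (columns and values are made up for this sketch)
raw_df = spark.createDataFrame([(25, 40000.0, 0.0), (38, 90000.0, 1.0), (45, 120000.0, 1.0)],
                               ["age", "income", "label"])

# Assemble the input columns into a single feature vector, as MLlib estimators expect
assembler = VectorAssembler(inputCols=["age", "income"], outputCol="features")
train_df = assembler.transform(raw_df).select("features", "label")

# Fit a logistic regression model; the computation is distributed across the cluster
lr = LogisticRegression(featuresCol="features", labelCol="label")
model = lr.fit(train_df)
print(model.coefficients)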
Structured Streaming
The Spark Streaming library provides the functionality to read and process
real-time streaming data. The incoming data can be batch data or near
real-time data from different sources. Structured Streaming is capable of
ingesting real-time data from such sources as Flume, Kafka, Twitter, etc.
There is a dedicated chapter on this component later in this book (see
Chapter 3).
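As a small, hedged example (the host and port are assumptions), a text stream can be read from a socket and aggregated continuously, with the running counts written to the console:
[In]:
# Read a text stream from a local socket (host and port are illustrative assumptions)
lines = (spark.readStream
         .format("socket")
         .option("host", "localhost")
         .option("port", 9999)
         .load())

# Maintain a running count of the incoming lines and print it to the console
query = (lines.groupBy("value").count()
         .writeStream
         .outputMode("complete")
         .format("console")
         .start())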
Graph X
This is a library that sits on top of the Spark core and allows users to
process specific types of data (graph dataframes), which consist of nodes
and edges. A typical graph is used to model the relationship between the
different objects involved. The nodes represent the object, and the edge
between the nodes represents the relationship between them. Graph
dataframes are mainly used in network analysis, and Graph X makes it
possible to have distributed processing of such graph dataframes.
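A short sketch of working with graph dataframes from PySpark, assuming the separate graphframes package is installed (it is not bundled with Spark, so this is an add-on rather than part of the core library):
[In]:
from graphframes import GraphFrame   # separate package, installed in addition to PySpark

# Vertices need an "id" column; edges need "src" and "dst" columns
vertices = spark.createDataFrame([("A", "user"), ("B", "user"), ("C", "page")], ["id", "type"])
edges = spark.createDataFrame([("A", "B", "follows"), ("A", "C", "visits")],
                              ["src", "dst", "relationship"])

g = GraphFrame(vertices, edges)
g.inDegrees.show()   # distributed computation of incoming edges per node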
Spark can be set up and used in the following ways:
• Local setup
• Dockers
• Databricks
Local Setup
It is relatively easy to install and use Spark on a local system, but doing so
defeats the core purpose of Spark itself if it's never used on a cluster. Spark's
core offering is distributed data processing, which will always be limited by a
local system's capacity when it's run on a single machine, whereas one benefits
far more by using Spark on a group of machines. Even so, it is always good
practice to have Spark installed locally, to test code on sample data.
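For example, one straightforward route (not necessarily the exact installation steps intended here) is to install the pyspark package from PyPI and start a local session:
[In]: pip install pyspark
[In]:
from pyspark.sql import SparkSession

# Start a local session that uses all available cores on this machine
spark = SparkSession.builder.master("local[*]").appName("local_test").getOrCreate()
spark.range(10).show()   # quick sanity check on sample data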
Dockers
Another way of using Spark locally is through containerization with
Docker. This allows users to wrap all the dependencies and Spark files
into a single image, which can be run on any system. We can kill the
container after the task is finished and rerun it, if required. To use Docker
for running Spark, we must install Docker on the system first and then
simply run the following command:
[In]: docker run -it -p 8888:8888 jupyter/pyspark-notebook
Cloud Environments
As discussed earlier in this chapter, for various reasons, local setups are not
of much help when it comes to big data, and that’s where cloud-based
environments make it possible to ingest and process huge datasets in a
short period. The real power of Spark can be seen easily while dealing with
large datasets (in excess of 100TB). Most of the cloud-based infra-providers
allow you to install Spark, which sometimes comes preconfigured as well.
One can easily spin up the clusters with required specifications, according
to need. One of the cloud-based environments is Databricks.
Databricks
Databricks is a company founded by the creators of Spark, in order to
provide the enterprise version of Spark to businesses, in addition to
full-fledged support. To increase Spark’s adoption among the community
and other users, Databricks also provides a free community edition of
Spark, with a 6GB cluster (single node). You can increase the size of the
cluster by moving to one of the paid plans.
Overall, since 2010, when Spark became an open source platform, its
users have risen in number consistently, and the community continues to
grow every day. It’s no surprise that the number of contributors to Spark
has outpaced that of Hadoop. Some of the reasons for Spark’s popularity
were noted in a survey, the results of which are shown in Figure 1-12.
Conclusion
This chapter provided a brief history of Spark, its core components, and
the process of accessing it in a cloud environment. In upcoming chapters,
I will delve deeper into the various aspects of Spark and how to build
different applications with it.
CHAPTER 2
Data Processing
This chapter covers different steps to preprocess and handle data in
PySpark. Preprocessing techniques can certainly vary from case to case,
and many different methods can be used to massage the data into desired
form. The idea of this chapter is to expose some of the common techniques
for dealing with big data in Spark. In this chapter, we are going to go over
different steps involved in preprocessing data, such as handling missing
values, merging datasets, applying functions, aggregations, and sorting.
One major part of data preprocessing is the transformation of numerical
columns into categorical ones and vice versa, which we will look at in the
coming chapters on machine learning. The dataset
that we are going to make use of in this chapter is inspired by a primary
research dataset and contains a few attributes from the original dataset,
with additional columns containing fabricated data points.
Creating Dataframes
In the following example, we are creating a new dataframe with five
columns of certain datatypes (string and integer). As you can see, when
we call show on the new dataframe, it is created with three rows and five
columns containing the values passed by us.
[In]:
# Assumes an active SparkSession available as `spark`
from pyspark.sql.types import StructType

schema = (StructType().add("user_id", "string").add("country", "string")
          .add("browser", "string").add("OS", "string").add("age", "integer"))
[In]:
df = spark.createDataFrame([("A203", "India", "Chrome", "WIN", 33),
                            ("A201", "China", "Safari", "MacOS", 35),
                            ("A205", "UK", "Mozilla", "Linux", 25)], schema=schema)
[In]: df.printSchema()
[Out]:
[In]: df.show()
[Out]:
Null Values
It is very common to have null values as part of the overall data. Therefore,
it becomes critical to add a step to the data processing pipeline, to handle
the null values. In Spark, we can deal with null values by either replacing
them with some specific value or dropping the rows/columns containing
null values.
First, we create a new dataframe (df_na) that contains null values in
two of its columns (the schema is the same as in the earlier dataframe).
By the first approach to deal with null values, we fill all null values in the
present dataframe with a value of 0, which offers a quick fix. We use the
fillna function to replace all the null values in the dataframe with 0.
By the second approach, we replace the null values in specific columns
(country, browser) with 'USA' and 'Safari', respectively.
[In]:
df_na = spark.createDataFrame([("A203", None, "Chrome", "WIN", 33),
                               ("A201", "China", None, "MacOS", 35),
                               ("A205", "UK", "Mozilla", "Linux", 25)], schema=schema)
[In]: df_na.show()
[Out]:
[In]: df_na.fillna('0').show()
[Out]:
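The column-specific replacement described above can be expressed by passing a dictionary of column-to-value mappings to fillna; a minimal sketch:
[In]: df_na.fillna({'country': 'USA', 'browser': 'Safari'}).show()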
In order to drop the rows with any null values, we can simply use the
na.drop functionality in PySpark. If this needs to be done only for
specific columns, we can pass the set of column names as well, as shown
in the following example:
[In]: df_na.na.drop().show()
[Out]:
[In]: df_na.na.drop(subset='country').show()
[Out]:
[In]: df_na.drop('user_id').show()
[Out]:
We now load a sample dataset (customer_data.csv) into a new dataframe and
inspect its size, schema, and summary statistics:
[In]: df = spark.read.csv("customer_data.csv", header=True, inferSchema=True)
[In]: df.count()
[Out]: 2000
[In]: len(df.columns)
[Out]: 7
[In]: df.printSchema()
[Out]:
[In]: df.show(3)
[Out]:
[In]: df.summary().show()
[Out]:
Most of the time, we won’t use all the columns present in the
dataframe, as some might be redundant and carry very little value in terms
of providing useful information. Therefore, subsetting the dataframe
becomes critical for having proper data in place for analysis. I’ll cover this
in the next section.
Subset of a Dataframe
A subset of a dataframe can be created, based on multiple conditions in
which we either select a few rows, columns, or data with certain filters in
place. In the following examples, you will see how we can create a subset
of the original dataframe, based on certain conditions, to demonstrate the
process of filtering records.
• Select
• Filter
• Where
Select
In this example, we take one of the dataframe columns, 'Avg_Salary', and
create a subset of the original dataframe, using select. We can pass any
number of columns that must be present in the subset. We then apply a
filter on the dataframe, to extract the records, based on a certain threshold
(Avg_Salary > 1000000). Once filtered, we can either take the total count
of records present in the subset or take it for further processing.
[In]: df.select(['Customer_subtype','Avg_Salary']).show()
[Out]:
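A sketch of the threshold filter and record count mentioned above (the exact chained form is an assumption):
[In]: df.filter(df['Avg_Salary'] > 1000000).count()
[In]: df.filter(df['Avg_Salary'] > 1000000).select(['Customer_subtype', 'Avg_Salary']).show()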
Filter
We can also apply more than one filter on the dataframe, by including
more conditions, as shown in the following examples. This can be done in
two ways: first, by applying consecutive filters, and second, by combining
the conditions with the (&, |) operators inside a where statement.
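As a sketch of the consecutive-filter form, reusing the conditions from the where example that follows:
[In]: df.filter(df['Avg_Salary'] > 500000).filter(df['Number_of_houses'] > 2).show()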
Where
[In]: df.where((df['Avg_Salary'] > 500000) & (df['Number_of_houses'] > 2)).show()
[Out]:
Aggregations
Any kind of aggregation can be broken simply into three stages, in the
following order:
• Split
• Apply
• Combine
Since all of the columns in this dataframe are categorical, except for one
(Avg_Salary), we can iterate over each categorical column and apply
aggregation, as in the following example:
[In]: df.groupBy('Customer_subtype').count().show()
[Out]:
[In]:
for col in df.columns:
    if col != 'Avg_Salary':
        print(f" Aggregation for {col}")
        df.groupBy(col).count().orderBy('count', ascending=False).show(truncate=False)
[Out]:
We can also apply a range of other aggregation functions to the numerical
column (Avg_Salary) for each group, such as the following:
• Mean
• Max
• Min
• Sum
[In]: from pyspark.sql import functions as F
[In]: df.groupBy('Customer_main_type').agg(F.mean('Avg_Salary')).show()
[Out]:
[In]: df.groupBy('Customer_main_type').agg(F.max('Avg_Salary')).show()
[Out]:
[In]: df.groupBy('Customer_main_type').agg(F.min('Avg_Salary')).show()
[Out]:
[In]: df.groupBy('Customer_main_type').agg(F.sum('Avg_Salary')).show()
[Out]:
[In]:
df.groupBy('Customer_subtype').agg(F.avg('Avg_Salary').alias('mean_salary')) \
  .orderBy('mean_salary', ascending=False).show(50, False)
[Out]:
[In]:
df.groupBy('Customer_subtype').agg(F.max('Avg_Salary').alias('max_salary')) \
  .orderBy('max_salary', ascending=False).show()
[Out]:
In some cases, we must also collect the list of values for particular
groups or for individual categories. For example, let’s say a customer goes
to an online store and accesses different pages on the store’s web site. If we
have to collect all the customer’s activities in a list, we can use the collect
functionality in PySpark. We can collect values in two different ways:
• Collect List
• Collect Set
Collect
Collect list provides all the values in the original order of occurrence (they
can be reversed as well), and collect set provides only the unique values,
as shown in the following example. We consider grouping on Customer_subtype
and collecting the Number_of_houses values in a new column, using set and
list separately.
[In]: df.groupby("Customer_subtype").agg(F.collect_set("Number_
of_houses")).show()
[Out]:
[In]:
df.groupby("Customer_subtype").agg(F.collect_list("Number_of_houses")).show()
[Out]:
The need to create a new column with a constant value can be very
common. Therefore, we can do that in PySpark, using the 'lit' function.
In the following example, we create a new column with a constant value:
[In]: df=df.withColumn('constant',F.lit('finance'))
[In]: df.select('Customer_subtype','constant').show()