
Learn Data Analysis with Python: Lessons in Coding, First Edition, by A.J. Henley

The document is a guide for learning data analysis using Python, specifically through the book 'Learn Data Analysis with Python: Lessons in Coding' by A.J. Henley and Dave Wolf. It covers various topics including installation of Jupyter Notebook, data import/export methods, data cleaning, and visualization techniques. The book is structured as a workbook with practical exercises and is suitable for both beginners and those with some Python experience in data analysis.



A. J. Henley and Dave Wolf

Learn Data Analysis with Python


Lessons in Coding
A. J. Henley
Washington, D.C., District of Columbia, USA

Dave Wolf
Sterling Business Advantage, LLC, Adamstown, Maryland, USA

Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub via the book's product page, located at www.apress.com/9781484234853. For more detailed information, please visit http://www.apress.com/source-code.

ISBN 978-1-4842-3485-3 e-ISBN 978-1-4842-3486-0


https://doi.org/10.1007/978-1-4842-3486-0

Library of Congress Control Number: 2018933537

© A.J. Henley and Dave Wolf 2018

This work is subject to copyright. All rights are reserved by the


Publisher, whether the whole or part of the material is concerned,
specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other
physical way, and transmission or information storage and retrieval,
electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.

Trademarked names, logos, and images may appear in this book.


Rather than use a trademark symbol with every occurrence of a
trademarked name, logo, or image we use the names, logos, and
images only in an editorial fashion and to the benefit of the
trademark owner, with no intention of infringement of the
trademark. The use in this publication of trade names, trademarks,
service marks, and similar terms, even if they are not identified as
such, is not to be taken as an expression of opinion as to whether or
not they are subject to proprietary rights.

While the advice and information in this book are believed to be true
and accurate at the date of publication, neither the authors nor the
editors nor the publisher can accept any legal responsibility for any
errors or omissions that may be made. The publisher makes no
warranty, express or implied, with respect to the material contained
herein.

Printed on acid-free paper

Distributed to the book trade worldwide by Springer


Science+Business Media New York, 233 Spring Street, 6th Floor,
New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505,
email orders-ny@springer-sbm.com, or visit
www.springeronline.com. Apress Media, LLC is a California LLC and
the sole member (owner) is Springer Science + Business Media
Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware
corporation.
Table of Contents

Chapter 1: How to Use This Book
Installing Jupyter Notebook
What Is Jupyter Notebook?
What Is Anaconda?
Getting Started
Getting the Datasets for the Workbook's Exercises

Chapter 2: Getting Data Into and Out of Python
Loading Data from CSV Files
Your Turn
Saving Data to CSV
Your Turn
Loading Data from Excel Files
Your Turn
Saving Data to Excel Files
Your Turn
Combining Data from Multiple Excel Files
Your Turn
Loading Data from SQL
Your Turn
Saving Data to SQL
Your Turn
Random Numbers and Creating Random Data
Your Turn

Chapter 3: Preparing Data Is Half the Battle
Cleaning Data
Calculating and Removing Outliers
Missing Data in Pandas Dataframes
Filtering Inappropriate Values
Finding Duplicate Rows
Removing Punctuation from Column Contents
Removing Whitespace from Column Contents
Standardizing Dates
Standardizing Text like SSNs, Phone Numbers, and Zip Codes
Creating New Variables
Binning Data
Applying Functions to Groups, Bins, and Columns
Ranking Rows of Data
Create a Column Based on a Conditional
Making New Columns Using Functions
Converting String Categories to Numeric Variables
Organizing the Data
Removing and Adding Columns
Selecting Columns
Change Column Name
Setting Column Names to Lower Case
Finding Matching Rows
Filter Rows Based on Conditions
Selecting Rows Based on Conditions
Random Sampling Dataframe

Chapter 4: Finding the Meaning
Computing Aggregate Statistics
Your Turn
Computing Aggregate Statistics on Matching Rows
Your Turn
Sorting Data
Your Turn
Correlation
Your Turn
Regression
Your Turn
Regression without Intercept
Your Turn
Basic Pivot Table
Your Turn

Chapter 5: Visualizing Data
Data Quality Report
Your Turn
Graph a Dataset: Line Plot
Your Turn
Graph a Dataset: Bar Plot
Your Turn
Graph a Dataset: Box Plot
Your Turn
Graph a Dataset: Histogram
Your Turn
Graph a Dataset: Pie Chart
Your Turn
Graph a Dataset: Scatter Plot
Your Turn

Chapter 6: Practice Problems
Analysis Exercise 1
Analysis Exercise 2
Analysis Exercise 3
Analysis Exercise 4
Analysis Project
Required Deliverables

Index
About the Authors and About the Technical Reviewer
About the Authors
A. J. Henley

is a technology educator with over 20 years’


experience as a developer, designer, and
systems engineer. He is an instructor at both
Howard University and Montgomery College.

Dave Wolf

is a certified Project Management


Professional (PMP) with over 20 years’
experience as a software developer, analyst,
and trainer. His latest projects include
collaboratively developing training materials
and programming bootcamps for Java and
Python.
About the Technical Reviewer
Michael Thomas

has worked in software development for


more than 20 years as an individual
contributor, team lead, program manager,
and vice president of engineering. Michael
has more than ten years of experience
working with mobile devices. His current
focus is in the medical sector, using mobile
devices to accelerate information transfer
between patients and health-care providers.
© A.J. Henley and Dave Wolf 2018
A.J. Henley and Dave Wolf, Learn Data Analysis with Python,
https://doi.org/10.1007/978-1-4842-3486-0_1

1. How to Use This Book


A. J. Henley1 and Dave Wolf2
(1) Washington, D.C., District of Columbia, USA
(2) Sterling Business Advantage, LLC, Adamstown, Maryland, USA

If you are already using Python for data analysis, just browse this
book’s table of contents. You will probably find a bunch of things
that you wish you knew how to do in Python. If so, feel free to turn
directly to that chapter and get to work. Each lesson is, as much as
possible, self-contained.

Be warned! This book is more a workbook than a textbook.

If you aren’t using Python for data analysis, begin at the beginning. If you work your way through the whole workbook, you should have a better idea of how to use Python for data analysis when you are done.
If you know nothing at all about data analysis, this workbook
might not be the place to start. However, give it a try and see how it
works for you.

Installing Jupyter Notebook


The fastest way to install and use Python is to do what you already
know how to do, and you know how to use your browser. Why not
use Jupyter Notebook?

What Is Jupyter Notebook?


Jupyter Notebook is an interactive Python shell that runs in your
browser. When installed through Anaconda, it is easy to quickly set
up a Python development environment. Since it’s easy to set up and
easy to run, it will be easy to learn Python.
Jupyter Notebook turns your browser into a Python development
environment. The only thing you have to install is Anaconda. In
essence, it allows you to enter a few lines of Python code, press
CTRL+Enter, and execute the code. You enter the code in cells and
then run the currently selected cell. There are also options to run all
the cells in your notebook. This is useful if you are developing a
larger program.

What Is Anaconda?
Anaconda is the easiest way to ensure that you don’t spend all day
installing Jupyter. Simply download the Anaconda package and run
the installer. The Anaconda software package contains everything
you need to create a Python development environment. Anaconda
comes in two versions—one for Python 2.7 and one for Python 3.x.
For the purposes of this guide, install the one for Python 2.7.
Anaconda is an open source data-science platform. It contains
over 100 packages for use with Python, R, and Scala. You can
download and install Anaconda quickly with minimal effort. Once
installed, you can update the packages or Python version or create
environments for different projects.

Getting Started
1. Download and install Anaconda at
https://www.anaconda.com/download .

2. Once you’ve installed Anaconda, you’re ready to create your first


notebook. Run the Jupyter Notebook application that was
installed as part of Anaconda.

3. Your browser will open to the following address:


http://localhost:8888. If you’re running Internet
Explorer, close it. Use Firefox or Chrome for best results. From
there, browse to http://localhost:8888.

4. Start a new notebook. On the right-hand side of the browser,


click the drop-down button that says "New" and select Python or
Python 2.

5. This will open a new iPython notebook in another browser tab.


You can have many notebooks open in many tabs.

6. Jupyter Notebook contains cells. You can type Python code in


each cell. To get started (for Python 2.7), type print "Hello,
World!" in the first cell and hit CTRL+Enter. If you’re using
Python 3.5, then the command is print("Hello, World!").

Getting the Datasets for the Workbook’s Exercises


1. Download the dataset files from
http://www.ajhenley.com/dwnld .

2. Upload the file datasets.zip to Anaconda in the same folder


as your notebook.

3. Run the Python code in Listing 1-1 to unzip the datasets.

import zipfile

path_to_zip_file = "datasets.zip"
directory_to_extract_to = ""
zip_ref = zipfile.ZipFile(path_to_zip_file, 'r')
zip_ref.extractall(directory_to_extract_to)
zip_ref.close()

Listing 1-1 Unzipping datasets.zip


© A.J. Henley and Dave Wolf 2018
A.J. Henley and Dave Wolf, Learn Data Analysis with Python,
https://doi.org/10.1007/978-1-4842-3486-0_2

2. Getting Data Into and Out of Python


A. J. Henley1 and Dave Wolf2
(1) Washington, D.C., District of Columbia, USA
(2) Sterling Business Advantage, LLC, Adamstown, Maryland, USA

The first stage of data analysis is getting the data. Moving your data
from where you have it stored into your analytical tools and back out
again can be a difficult task if you don't know what you are doing.
Python and its libraries try to make it as easy as possible.
With just a few lines of code, you will be able to import and
export data in the following formats:
CSV
Excel
SQL

Loading Data from CSV Files


Normally, data will come to us as files or database links. See Listing
2-1 to learn how to load data from a CSV file.

import pandas as pd
Location = "datasets/smallgradesh.csv"
df = pd.read_csv(Location, header=None)

Listing 2-1 Loading Data from CSV File


Now, let's take a look at what our data looks like (Listing 2-2):

df.head()

Listing 2-2 Display First Five Lines of Data

As you can see, our dataframe lacks column headers. Or, rather,
there are headers, but they weren't loaded as headers; they were
loaded as row one of your data. To load data that includes headers,
you can use the code shown in Listing 2-3.

import pandas as pd
Location = "datasets/gradedata.csv"
df = pd.read_csv(Location)

Listing 2-3 Loading Data from CSV File with Headers

Then, as before, we take a look at what the data looks like by


running the code shown in Listing 2-4.

df.head()

Listing 2-4 Display First Five Lines of Data

If you have a dataset that doesn't include headers, you can add
them afterward. To add them, we can use one of the options shown
in Listing 2-5.

import pandas as pd
Location = "datasets/smallgrades.csv"
# To add headers as we load the data...
df = pd.read_csv(Location, names=['Names','Grades'])
# To add headers to a dataframe
df.columns = ['Names','Grades']

Listing 2-5 Loading Data from CSV File and Adding Headers
Your Turn
Can you make a dataframe from a file you have uploaded and
imported on your own? Let's find out. Go to the following website,
which contains U.S. Census data (
http://census.ire.org/data/bulkdata.html ), and
download the CSV datafile for a state. Now, try to import that data
into Python.

Saving Data to CSV


Maybe you want to save your progress when analyzing data. Maybe
you are just using Python to massage some data for later analysis in
another tool. Or maybe you have some other reason to export your
dataframe to a CSV file. The code shown in Listing 2-6 is an example
of how to do this.

import pandas as pd
names = ['Bob','Jessica','Mary','John','Mel']
grades = [76,95,77,78,99]
GradeList = zip(names,grades)
df = pd.DataFrame(data = GradeList, columns=
['Names','Grades'])

df.to_csv('studentgrades.csv', index=False, header=False)

Listing 2-6 Exporting a Dataset to CSV

Lines 1 to 6 are the lines that create the dataframe. Line 7 is the
code to export the dataframe df to a CSV file called
studentgrades.csv.
The only parameters we use are index and header. Setting
these parameters to false will prevent the index and header names
from being exported. Change the values of these parameters to get
a better understanding of their use.
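As a quick illustration of those two parameters (a sketch on a throwaway dataframe, writing to an in-memory buffer instead of a file so the results are easy to inspect):

```python
import io
import pandas as pd

df = pd.DataFrame({'Names': ['Bob', 'Mary'], 'Grades': [76, 77]})

# Defaults: a header row of column names, plus the index as the first column.
with_both = io.StringIO()
df.to_csv(with_both, index=True, header=True)

# index=False, header=False: only the raw data rows are written.
data_only = io.StringIO()
df.to_csv(data_only, index=False, header=False)

print(with_both.getvalue())
print(data_only.getvalue())
```

The first buffer starts with ",Names,Grades"; the second contains just "Bob,76" and "Mary,77".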
If you want in-depth information about the to_csv method, you
can, of course, use the code shown in Listing 2-7.

df.to_csv?

Listing 2-7 Getting Help on to_csv

Your Turn
Can you export the dataframe created by the code in Listing 2-8 to
CSV?

import pandas as pd
names = ['Bob','Jessica','Mary','John','Mel']
grades = [76,95,77,78,99]
bsdegrees = [1,1,0,0,1]
msdegrees = [2,1,0,0,0]
phddegrees = [0,1,0,0,0]
Degrees = zip(names,grades,bsdegrees,msdegrees,phddegrees)
columns = ['Names','Grades','BS','MS','PhD']
df = pd.DataFrame(data = Degrees, columns=columns)
df

Listing 2-8 Creating a Dataset for the Exercise

Loading Data from Excel Files


Normally, data will come to us as files or database links. Let's see
how to load data from an Excel file (Listing 2-9).

import pandas as pd
Location = "datasets/gradedata.xlsx"
df = pd.read_excel(Location)

Listing 2-9 Loading Data from Excel File


Now, let's take a look at what our data looks like (Listing 2-10).

df.head()

Listing 2-10 Display First Five Lines of Data

If you wish to change or simplify your column names, you can


run the code shown in Listing 2-11.

df.columns = ['first','last','sex','age','exer','hrs','grd','addr']
df.head()

Listing 2-11 Changing Column Names

Your Turn
Can you make a dataframe from a file you have uploaded and
imported on your own? Let's find out. Go to
https://www.census.gov/support/USACdataDownloads.html and
download one of the Excel datafiles at the bottom of the page.
Now, try to import that data into Python.

Saving Data to Excel Files


The code shown in Listing 2-12 is an example of how to do this.

import pandas as pd
names = ['Bob','Jessica','Mary','John','Mel']
grades = [76,95,77,78,99]
GradeList = zip(names,grades)
df = pd.DataFrame(data = GradeList, columns=['Names','Grades'])
writer = pd.ExcelWriter('dataframe.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
Listing 2-12 Exporting a Dataframe to Excel
If you wish, you can save different dataframes to different
sheets, and with one .save() you will create an Excel file with
multiple worksheets (see Listing 2-13).

writer = pd.ExcelWriter('dataframe.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
df2.to_excel(writer, sheet_name='Sheet2')
writer.save()

Listing 2-13 Exporting Multiple Dataframes to Excel

Note This assumes that you have another dataframe already


loaded into the df2 variable.

Your Turn
Can you export the dataframe created by the code shown in Listing
2-14 to Excel?

import pandas as pd
names = ['Nike','Adidas','New Balance','Puma','Reebok']
prices = [176,59,47,38,99]
PriceList = zip(names,prices)
df = pd.DataFrame(data = PriceList, columns=['Names','Prices'])

Listing 2-14 Creating a Dataset for the Exercise

Combining Data from Multiple Excel Files


In earlier lessons, we opened single files and put their data into
individual dataframes. Sometimes we will need to combine the data
from several Excel files into the same dataframe.
We can do this either the long way or the short way. First, let's
see the long way (Listing 2-15).

import pandas as pd
import numpy as np

all_data = pd.DataFrame()

df = pd.read_excel("datasets/data1.xlsx")
all_data = all_data.append(df,ignore_index=True)

df = pd.read_excel("datasets/data2.xlsx")
all_data = all_data.append(df,ignore_index=True)

df = pd.read_excel("datasets/data3.xlsx")
all_data = all_data.append(df,ignore_index=True)
all_data.describe()

Listing 2-15 Long Way

Line 4: First, let's set all_data to an empty dataframe.


Line 6: Load the first Excel file into the dataframe df.
Line 7: Append the contents of df to the dataframe all_data.
Lines 9 & 10: Basically the same as lines 6 & 7, but for the next
Excel file.
Why do we call this the long way? Because if we were loading a
hundred files instead of three, it would take hundreds of lines of
code to do it this way. In the words of my friends in the startup
community, it doesn't scale well. The short way, however, does scale.
Now, let's see the short way (Listing 2-16).

import pandas as pd
import numpy as np
import glob

all_data = pd.DataFrame()
for f in glob.glob("datasets/data*.xlsx"):
    df = pd.read_excel(f)
    all_data = all_data.append(df, ignore_index=True)
all_data.describe()

Listing 2-16 Short Way

Line 3: Import the glob library.


Line 5: Let's set all_data to an empty dataframe.
Line 6: This line will loop through all files that match the
pattern.
Line 7: Load the Excel file in f into the dataframe df.
Line 8: Append the contents of df to the dataframe all_data.
Since we only have three datafiles, the difference in code isn't
that noticeable. However, if we were loading a hundred files, the
difference in the amount of code would be huge. This code will load
all the Excel files whose names begin with data that are in the
datasets directory no matter how many there are.
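One caveat: in recent versions of pandas, DataFrame.append has been removed, and this accumulate-in-a-loop pattern is now usually written with pd.concat instead. A minimal sketch of the concat form, on toy in-memory dataframes standing in for the per-file dataframes (not the book's data files):

```python
import pandas as pd

# Toy stand-ins for the dataframes that read_excel would return per file.
df1 = pd.DataFrame({'Names': ['Bob'], 'Grades': [76]})
df2 = pd.DataFrame({'Names': ['Mary'], 'Grades': [77]})

# Concatenate all frames at once, renumbering the index as append did.
all_data = pd.concat([df1, df2], ignore_index=True)
print(len(all_data))
```

In the short-way loop, you would collect each file's dataframe into a list and call pd.concat once at the end.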

Your Turn
In the datasets/weekly_call_data folder, there are 104 files of
weekly call data for two years. Your task is to try to load all of that
data into one dataframe.

Loading Data from SQL


Normally, our data will come to us as files or database links. Let's
learn how to load our data from a sqlite database file (Listing 2-17).

import pandas as pd
from sqlalchemy import create_engine
# Connect to sqlite db
db_file = r'datasets/gradedata.db'
engine = create_engine(r"sqlite:///{}".format(db_file))
sql = 'SELECT * from test where Grades in (76,77,78)'
sales_data_df = pd.read_sql(sql, engine)
sales_data_df

Listing 2-17 Load Data from sqlite


This code creates a link to the database file called
gradedata.db and runs a query against it. It then loads the data
resulting from that query into the dataframe called
sales_data_df. If you don't know the names of the tables in a
sqlite database, you can find out by changing the SQL statement to
that shown in Listing 2-18.

sql = "select name from sqlite_master where type = 'table';"

Listing 2-18 Finding the Table Names

Once you know the name of a table you wish to view (let's say it
was test), if you want to know the names of the fields in that
table, you can change your SQL statement to that shown in Listing
2-19.

sql = "select * from test;"

Listing 2-19 A Basic Query

Then, once you run sales_data_df.head() on the


dataframe, you will be able to see the fields as headers at the top of
each column.
As always, if you need more information about the command,
you can run the code shown in Listing 2-20.

pd.read_sql?

Listing 2-20 Get Help on read_sql


Your Turn
Can you load data from the datasets/salesdata.db database?

Saving Data to SQL


See Listing 2-21 for an example of how to do this.

import pandas as pd
names = ['Bob','Jessica','Mary','John','Mel']
grades = [76,95,77,78,99]
GradeList = zip(names,grades)
df = pd.DataFrame(data = GradeList,
columns=['Names', 'Grades'])
df

Listing 2-21 Create Dataset to Save

To export it to SQL, we can use the code shown in Listing 2-22.

import os
import sqlite3 as lite
db_filename = r'mydb.db'
con = lite.connect(db_filename)
df.to_sql('mytable',
con,
flavor='sqlite',
schema=None,
if_exists='replace',
index=True,
index_label=None,
chunksize=None,
dtype=None)
con.close()

Listing 2-22 Export Dataframe to sqlite


In Listing 2-22, mydb.db is the path and name of the sqlite
database you wish to use, and mytable is the name of the table in
the database.
As always, if you need more information about the command,
you can run the code shown in Listing 2-23.

df.to_sql?

Listing 2-23 Get Help on to_sql

Your Turn
This might be a little tricky, but can you create a sqlite table that
contains the data found in datasets/gradedata.csv?

Random Numbers and Creating Random


Data
Normally, you will use the techniques in this guide with datasets of
real data. However, sometimes you will need to create random
values.
Say we wanted to make a random list of baby names. We could
get started as shown in Listing 2-24.

import pandas as pd
from numpy import random
from numpy.random import randint
names = ['Bob','Jessica','Mary','John','Mel']

Listing 2-24 Getting Started

First, we import our libraries as usual. In the last line, we create


a list of the names we will randomly select from.
Next, we add the code shown in Listing 2-25.

random.seed(500)
Listing 2-25 Seeding Random Generator
This seeds the random number generator. If you use the same
seed, you will get the same "random" numbers.
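The effect of seeding is easy to check: reseeding with the same value reproduces the identical draws. A quick sketch using the same numpy functions the book imports:

```python
from numpy import random
from numpy.random import randint

random.seed(500)
first = [randint(low=0, high=5) for _ in range(3)]

random.seed(500)  # reseed with the same value...
second = [randint(low=0, high=5) for _ in range(3)]

print(first == second)  # ...and the "random" sequence repeats
```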
What we will try to do is this:
1. randint(low=0,high=len(names))
Generates a random integer between zero and the length of
the list names.

2. names[n]
Selects the name where its index is equal to n.

3. for i in range(n)
Loops until i is equal to n, i.e., 1,2,3,….n.

4. randnames.append(name)
Appends the randomly selected name to the list, doing this n
times.

We will do all of this in the code shown in Listing 2-26.

randnames = []
for i in range(1000):
    name = names[randint(low=0, high=len(names))]
    randnames.append(name)

Listing 2-26 Selecting 1000 Random Names

Now we have a list of 1000 random names saved in our
randnames variable. Let's create a list of 1000 random numbers
from 0 to 1000 (Listing 2-27).

births = []
for i in range(1000):
    births.append(randint(low=0, high=1000))
Listing 2-27 Selecting 1000 Random Numbers
And, finally, zip the two lists together and create the dataframe
(Listing 2-28).

BabyDataSet2 = list(zip(randnames,births))
df = pd.DataFrame(data = BabyDataSet2,
columns=['Names', 'Births'])
df

Listing 2-28 Creating Dataset from the Lists of Random Names and Numbers

Your Turn
Create a dataframe called parkingtickets with 250 rows
containing a name and a number between 1 and 25.
© A.J. Henley and Dave Wolf 2018
A.J. Henley and Dave Wolf, Learn Data Analysis with Python,
https://doi.org/10.1007/978-1-4842-3486-0_3

3. Preparing Data Is Half the Battle


A. J. Henley1 and Dave Wolf2
(1) Washington, D.C., District of Columbia, USA
(2) Sterling Business Advantage, LLC, Adamstown, Maryland, USA

The second step of data analysis is cleaning the data. Getting data
ready for analytical tools can be a difficult task. Python and its
libraries try to make it as easy as possible.
With just a few lines of code, you will be able to get your data
ready for analysis. You will be able to
clean the data;
create new variables; and
organize the data.

Cleaning Data
To be useful for most analytical tasks, data must be clean. This
means it should be consistent, relevant, and standardized. In this
chapter, you will learn how to
remove outliers;
remove inappropriate values;
remove duplicates;
remove punctuation;
remove whitespace;
standardize dates; and
standardize text.

Calculating and Removing Outliers


Assume you are collecting data on the people you went to high
school with. What if you went to high school with Bill Gates? Now,
even though the person with the second-highest net worth is only
worth $1.5 million, the average of your entire class is pushed up by
the billionaire at the top. Finding the outliers allows you to remove
the values that are so high or so low that they skew the overall view
of the data.
We cover two main ways of detecting outliers:
1. Standard Deviations: If the data is normally distributed, then
95 percent of the data is within 1.96 standard deviations of the
mean. So we can drop the values either above or below that
range.

2. Interquartile Range (IQR) : The IQR is the difference


between the 25 percent quantile and the 75 percent quantile.
Any values that are either lower than Q1 - 1.5 x IQR or greater
than Q3 + 1.5 x IQR are treated as outliers and removed.
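Before applying this to the workbook's dataset, the IQR rule can be checked on a handful of made-up grades (illustrative values, not from the book's data):

```python
import pandas as pd

# Ten made-up grades; 150 is the obvious outlier.
s = pd.Series([55, 60, 62, 65, 68, 70, 72, 75, 78, 150])
q1, q3 = s.quantile(.25), s.quantile(.75)
iqr = q3 - q1

# Keep only values inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
kept = s[(s >= q1 - 1.5 * iqr) & (s <= q3 + 1.5 * iqr)]
print(kept.tolist())
```

The 150 falls above Q3 + 1.5 x IQR and is dropped; the other nine values survive.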

Let's see what these look like (Listings 3-1 and 3-2).

import pandas as pd
Location = "datasets/gradedata.csv"
df = pd.read_csv(Location)
meangrade = df['grade'].mean()
stdgrade = df['grade'].std()
toprange = meangrade + stdgrade * 1.96
botrange = meangrade - stdgrade * 1.96
copydf = df
copydf = copydf.drop(copydf[copydf['grade'] > toprange].index)
copydf = copydf.drop(copydf[copydf['grade'] < botrange].index)
copydf

Listing 3-1 Method 1: Standard Deviation

Line 6: Here we calculate the upper range equal to 1.96 times


the standard deviation plus the mean.
Line 7: Here we calculate the lower range equal to 1.96 times
the standard deviation subtracted from the mean.
Line 9: Here we drop the rows where the grade is higher than
the toprange.
Line 11: Here we drop the rows where the grade is lower than
the botrange.

import pandas as pd
Location = "datasets/gradedata.csv"
df = pd.read_csv(Location)
q1 = df['grade'].quantile(.25)
q3 = df['grade'].quantile(.75)
iqr = q3-q1
toprange = q3 + iqr * 1.5
botrange = q1 - iqr * 1.5
copydf = df
copydf = copydf.drop(copydf[copydf['grade'] > toprange].index)
copydf = copydf.drop(copydf[copydf['grade'] < botrange].index)
copydf

Listing 3-2 Method 2: Interquartile Range

Line 9: Here we calculate the upper boundary = the third


quartile + 1.5 * the IQR.
Line 10: Here we calculate the lower boundary = the first
quartile - 1.5 * the IQR.
Line 13: Here we drop the rows where the grade is higher than
the toprange.
Line 14: Here we drop the rows where the grade is lower than
the botrange.

Your Turn
Load the dataset datasets/outlierdata.csv. Can you remove
the outliers? Try it with both methods.

Missing Data in Pandas Dataframes


One of the most annoying things about working with large datasets
is finding the missing datum. It can make it impossible or
unpredictable to compute most aggregate statistics or to generate
pivot tables. If you look for missing data points in a 50-row dataset
it is fairly easy. However, if you try to find a missing data point in a
500,000-row dataset it can be much tougher.
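One quick way to take stock of missing data at any scale, a pandas idiom not shown in the book, is isnull().sum(), which counts the missing values in each column. A sketch on a toy dataframe:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'age':    [21, np.nan, 30],
                   'gender': ['f', 'm', np.nan]})

# One missing value per column in this toy frame.
missing_per_column = df.isnull().sum()
print(missing_per_column)
```

This works just as well on 500,000 rows as on 3.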
Python's pandas library has functions to help you find, delete, or
change missing data (Listing 3-3).

import pandas as pd
df = pd.read_csv("datasets/gradedatamissing.csv")
df.head()

Listing 3-3 Creating Dataframe with Missing Data

The preceding code loads a legitimate dataset that includes rows


with missing data. We can use the resulting dataframe to practice
dealing with missing data.
To drop all the rows with missing (NaN) data, use the code
shown in Listing 3-4.

df_no_missing = df.dropna()
df_no_missing

Listing 3-4 Drop Rows with Missing Data


To add a column filled with empty values, use the code in Listing
3-5.

import numpy as np
df['newcol'] = np.nan
df.head()

Listing 3-5 Add a Column with Empty Values

To drop any columns that contain nothing but empty values, see
Listing 3-6.

df.dropna(axis=1, how="all")

Listing 3-6 Drop Completely Empty Columns

To replace all empty values with zero, see Listing 3-7.

df.fillna(0)

Listing 3-7 Replace Empty Cells with 0

To fill in missing grades with the mean value of grade, see Listing
3-8.

df["grade"].fillna(df["grade"].mean(), inplace=True)

Listing 3-8 Replace Empty Cells with Average of Column

Note, inplace=True means that the changes are saved to the
dataframe right away.
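The same fill can also be written as a plain assignment, which avoids inplace=True entirely (newer pandas versions discourage calling fillna with inplace=True on a single column). A minimal sketch with made-up grades:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'grade': [80.0, np.nan, 90.0]})

# Assignment form: compute the filled column, then store it back
df['grade'] = df['grade'].fillna(df['grade'].mean())
```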
To fill in missing grades with each gender's mean value of grade,
see Listing 3-9.

df["grade"].fillna(df.groupby("gender")
["grade"].transform("mean"), inplace=True)

Listing 3-9 It's Complicated


We can also select some rows but ignore the ones with missing
data points. To select the rows of df where age is not NaN and
gender is not NaN, see Listing 3-10.

df[df['age'].notnull() & df['gender'].notnull()]

Listing 3-10 Selecting Rows with No Missing Age or Gender

Your Turn
Load the dataset datasets/missinggrade.csv. Your mission, if
you choose to accept it, is to delete rows with missing grades and to
replace the missing values in hours of exercise by the mean value for
that gender.

Filtering Inappropriate Values


Sometimes, if you are working with data you didn't collect yourself,
you need to worry about whether the data is accurate. Heck,
sometimes you need to worry about that even if you did collect it
yourself! It can be difficult to check the veracity of each and every
data point, but it is quite easy to check if the data is appropriate.
Python's pandas library has the ability to filter out the bad
values (see Listing 3-11).

import pandas as pd

names = ['Bob','Jessica','Mary','John','Mel']
grades = [76,-2,77,78,101]

GradeList = zip(names,grades)
df = pd.DataFrame(data = GradeList,
                  columns=['Names', 'Grades'])
df

Listing 3-11 Creating Dataset


To eliminate all the rows where the grades are too high, see
Listing 3-12.

df.loc[df['Grades'] <= 100]

Listing 3-12 Filtering Out Impossible Grades

To change the out-of-bound values to the maximum or minimum


allowed value, we can use the code seen in Listing 3-13.

df.loc[df['Grades'] > 100, 'Grades'] = 100

Listing 3-13 Changing Impossible Grades

Your Turn
Using the dataset from this section, can you replace all the subzero
grades with a grade of zero?

Finding Duplicate Rows


Another thing you need to worry about if you are using someone
else’s data is whether any data is duplicated. (Did the same data get
reported twice, or recorded twice, or just copied and pasted?) Heck,
sometimes you need to worry about that even if you did collect it
yourself! It can be difficult to check the veracity of each and every
data point, but it is quite easy to check if the data is duplicated.
Python's pandas library has a function for finding not only
duplicated rows, but also the unique rows (Listing 3-14).

import pandas as pd
names = ['Jan','John','Bob','Jan','Mary','Jon','Mel','Mel']
grades = [95,78,76,95,77,78,99,100]
GradeList = zip(names,grades)
df = pd.DataFrame(data = GradeList,
                  columns=['Names', 'Grades'])
df

Listing 3-14 Creating Dataset with Duplicates

To indicate the duplicate rows, we can simply run the code seen
in Listing 3-15.

df.duplicated()

Listing 3-15 Displaying Only Duplicates in the Dataframe

To show the dataset without duplicates, we can run the code


seen in Listing 3-16.

df.drop_duplicates()

Listing 3-16 Displaying Dataset without Duplicates

You might be asking, "What if the entire row isn't duplicated, but
I still know it's a duplicate?" This can happen if someone takes your
survey or an exam twice, so the name is the same, but the
observation is different. In this case, where we know that a duplicate
name means a duplicate entry, we can use the code seen in Listing
3-17.

df.drop_duplicates(['Names'], keep="last")

Listing 3-17 Drop Rows with Duplicate Names, Keeping the Last Observation

Your Turn
Load the dataset datasets/dupedata.csv. We figure people
with the same address are duplicates. Can you drop the duplicated
rows while keeping the first?

Removing Punctuation from Column Contents


Whether in a phone number or an address, you will often find
unwanted punctuation in your data. Let's load some data to see how
to address that (Listing 3-18).

import pandas as pd
Location = "datasets/gradedata.csv"
df = pd.read_csv(Location)
df.head()

Listing 3-18 Loading Dataframe with Data from CSV File


To remove the unwanted punctuation, we create a function that
returns all characters that aren't punctuation, and then we apply
that function to our dataframe (Listing 3-19).

import string
exclude = set(string.punctuation)

def remove_punctuation(x):
    try:
        x = ''.join(ch for ch in x if ch not in exclude)
    except:
        pass
    return x

df.address = df.address.apply(remove_punctuation)
df

Listing 3-19 Stripping Punctuation from the Address Column

Removing Whitespace from Column Contents


import pandas as pd
Location = "datasets/gradedata.csv"
df = pd.read_csv(Location)
df.head()

Listing 3-20 Loading Dataframe with Data from CSV File

To remove the whitespace, we create a function that strips it by
splitting and rejoining the string, and then we apply that function
to our dataframe (Listing 3-21).
def remove_whitespace(x):
    try:
        x = ''.join(x.split())
    except:
        pass
    return x

df.address = df.address.apply(remove_whitespace)
df

Listing 3-21 Stripping Whitespace from the Address Column

Standardizing Dates
One of the problems with consolidating data from different sources
is that different people and different systems can record dates
differently. Maybe they use 01/03/1980, or 01/03/80, or even
1980/01/03. Even though they all refer to January 3, 1980, analysis
tools may not recognize them all as dates if you switch back and
forth between the different formats in the same column (Listing 3-22).

import pandas as pd
names = ['Bob','Jessica','Mary','John','Mel']
grades = [76,95,77,78,99]
bsdegrees = [1,1,0,0,1]
msdegrees = [2,1,0,0,0]
phddegrees = [0,1,0,0,0]
bdates = ['1/1/1945','10/21/76','3/3/90',
          '04/30/1901','1963-09-01']
GradeList = zip(names,grades,bsdegrees,msdegrees,
                phddegrees,bdates)
columns = ['Names','Grades','BS','MS','PhD','bdates']
df = pd.DataFrame(data = GradeList, columns=columns)
df

Listing 3-22 Creating Dataframe with Different Date Formats

Listing 3-23 shows a function that standardizes dates to a single
format.

from datetime import datetime

def standardize_date(thedate):
    formatted_date = ""
    thedate = str(thedate)
    if (not thedate or thedate.lower() == "missing"
            or thedate == "nan"):
        formatted_date = "MISSING"
    if thedate.lower().find('x') != -1:
        formatted_date = "Incomplete"
    if thedate[0:2] == "00":
        formatted_date = thedate.replace("00", "19")
    try:
        formatted_date = str(datetime.strptime(
            thedate, '%m/%d/%y').strftime('%m/%d/%y'))
    except:
        pass
    try:
        formatted_date = str(datetime.strptime(
            thedate, '%m/%d/%Y').strftime('%m/%d/%y'))
    except:
        pass
    try:
        if int(thedate[0:4]) < 1900:
            formatted_date = "Incomplete"
        else:
            formatted_date = str(datetime.strptime(
                thedate, '%Y-%m-%d').strftime('%m/%d/%y'))
    except:
        pass
    return formatted_date

Listing 3-23 Function to Standardize Dates


Now that we have this function, we can apply it to the
birthdates column on our dataframe (Listing 3-24).

df.bdates = df.bdates.apply(standardize_date)
df

Listing 3-24 Applying Date Standardization to Birthdate Column
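As an aside not covered by the listings here, pandas can also parse many date formats directly with pd.to_datetime; with errors='coerce', values it cannot parse become NaT rather than raising an error. A sketch using a few of the dates above plus an invalid one:

```python
import pandas as pd

bdates = ['1/1/1945', '10/21/76', '04/30/1901', 'not a date']

# Parse each value individually; unparseable entries become NaT
parsed = pd.Series(bdates).apply(
    lambda d: pd.to_datetime(d, errors='coerce'))
```

This trades the custom formatting rules of standardize_date for real datetime objects, which sort and subtract correctly.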

Standardizing Text like SSNs, Phone Numbers, and Zip Codes
One of the problems with consolidating data from different sources
is that different people and different systems can record certain data
like Social Security numbers, phone numbers, and zip codes
differently. Maybe they use hyphens in those numbers, and maybe
they don't. This section quickly covers how to standardize how these
types of data are stored (see Listing 3-25).

import pandas as pd
names = ['Bob','Jessica','Mary','John','Mel']
grades = [76,95,77,78,99]
bsdegrees = [1,1,0,0,1]
msdegrees = [2,1,0,0,0]
phddegrees = [0,1,0,0,0]
ssns = ['867-53-0909','333-22-4444','123-12-1234',
        '777-93-9311','123-12-1423']
GradeList = zip(names,grades,bsdegrees,msdegrees,
                phddegrees,ssns)
columns = ['Names','Grades','BS','MS','PhD','ssn']
df = pd.DataFrame(data = GradeList, columns=columns)
df

Listing 3-25 Creating Dataframe with SSNs


The code in Listing 3-26 creates a function that standardizes the
SSNs and applies it to our ssn column.

def right(s, amount):
    return s[-amount:]

def standardize_ssn(ssn):
    try:
        ssn = ssn.replace("-", "")
        ssn = "".join(ssn.split())
        if len(ssn) < 9 and ssn != 'Missing':
            ssn = "000000000" + ssn
            ssn = right(ssn, 9)
    except:
        pass
    return ssn

df.ssn = df.ssn.apply(standardize_ssn)
df

Listing 3-26 Remove Hyphens from SSNs and Add Leading Zeros if Necessary

Creating New Variables


Once the data is free of errors, you need to set up the variables that
will directly answer your questions. It's a rare dataset in which every
question you need answered is directly addressed by a variable. So,
you may need to do a lot of recoding and computing of variables to
get exactly the dataset that you need.
Examples include the following:
Creating bins (like converting numeric grades to letter grades or
ranges of dates into Q1, Q2, etc.)
Creating a column that ranks the values in another column
Creating a column to indicate that another value has reached a
threshold (passing or failing, Dean's list, etc.)
Converting string categories to numbers (for regression or
correlation)
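The last item on that list, converting string categories to numbers, has no listing in this chapter; a minimal sketch with a hypothetical gender column might look like this:

```python
import pandas as pd

df = pd.DataFrame({'gender': ['female', 'male', 'male', 'female']})

# Map each category to an integer code (alphabetical order:
# 'female' -> 0, 'male' -> 1) for use in regression or correlation
df['gender_code'] = df['gender'].astype('category').cat.codes
```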

Binning Data
Sometimes, you will have numeric data that you need to group into
bins. (Think: converting numeric grades to letter grades.) In this
lesson, we will learn about binning (Listing 3-27).

import pandas as pd
Location = "datasets/gradedata.csv"
df = pd.read_csv(Location)
df.head()

Listing 3-27 Loading the Dataset from CSV

Now that the data is loaded, we need to define the bins and
group names (Listing 3-28).

# Create the bin dividers


bins = [0, 60, 70, 80, 90, 100]
# Create names for the five groups
group_names = ['F', 'D', 'C', 'B', 'A']

Listing 3-28 Define Bins as 0 to 60, 60 to 70, 70 to 80, 80 to 90, 90 to 100

Notice that there is one more bin value than there are
group_names. This is because there needs to be a top and bottom
limit for each bin.

df['lettergrade'] = pd.cut(df['grade'], bins,
                           labels=group_names)
df

Listing 3-29 Cut Grades


Listing 3-29 categorizes the column grade based on the bins
list and labels the values using the group_names list.
And if we want to count the number of observations for each
category, we can do that too (Listing 3-30).

pd.value_counts(df['lettergrade'])

Listing 3-30 Count Number of Observations

Your Turn
Recreate the dataframe from this section and create a column
classifying the row as pass or fail. This is for a master's program that
requires a grade of 80 or above for a student to pass.

Applying Functions to Groups, Bins, and Columns


The number one reason I use Python to analyze data is to handle
datasets larger than a million rows. The number two reason is the
ease of applying functions to my data.
To see this, first we need to load up some data (Listing 3-31).

import pandas as pd
Location = "datasets/gradedata.csv"
df = pd.read_csv(Location)
df.head()

Listing 3-31 Loading a Dataframe from a CSV File

Then, we use binning to divide the data into letter grades (Listing
3-32).

# Create the bin dividers


bins = [0, 60, 70, 80, 90, 100]
# Create names for the five groups
group_names = ['F', 'D', 'C', 'B', 'A']
df['letterGrades'] = pd.cut(df['grade'],
bins, labels=group_names)
df.head()

Listing 3-32 Using Bins

To find the average hours of study by letter grade, we apply our
functions to the binned column (Listing 3-33).

df.groupby('letterGrades')['hours'].mean()

Listing 3-33 Applying Function to Newly Created Bin

Applying a function to a column looks like Listing 3-34.

# Applying the integer function to the grade column
df['grade'] = df['grade'].apply(lambda x: int(x))
df.head()

Listing 3-34 Applying a Function to a Column

Line 1: Let's get an integer value for each grade in the dataframe.
Applying a function to a group can be seen in Listing 3-35.

gender_preScore = df['grade'].groupby(df['gender'])
gender_preScore.mean()

Listing 3-35 Applying a Function to a Group

Line 1: Create a grouping object. In other words, create an
object that represents that particular grouping. In this case, we
group grades by gender.
Line 2: Display the mean grade for each gender.

Your Turn
Import the datasets/gradedata.csv file and create a new
binned column of the 'status' as either passing (> 70) or failing
(<=70). Then, compute the mean hours of exercise of the female
students with a 'status' of passing.

Ranking Rows of Data


It is relatively easy to find the row with the maximum value or the
minimum value, but sometimes you want to find the rows with the
50 highest or the 100 lowest values for a particular column. This is
when you need ranking (Listing 3-36).

import pandas as pd
Location = "datasets/gradedata.csv"
df = pd.read_csv(Location)
df.head()

Listing 3-36 Load Data from CSV

If we want to find the rows with the lowest grades, we will need
to rank all rows in ascending order by grade. Listing 3-37 shows the
code to create a new column that is the rank of the value of grade in
ascending order.

df['graderanked'] = df['grade'].rank(ascending=1)
df.tail()

Listing 3-37 Create Column with Ranking by Grade

So, if we just wanted to see the students with the 20 lowest
grades, we would use the code in Listing 3-38.

df[df['graderanked'] < 21]

Listing 3-38 Bottom 20 Students

And, to see them in order, we need to use the code in Listing 3-39.
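Listing 3-39 itself is not reproduced in this excerpt, but sorting the filtered rows by the ranking column is one plausible form; a sketch with a hypothetical small grade column standing in for the book's dataset:

```python
import pandas as pd

# Hypothetical grades; the book uses datasets/gradedata.csv
df = pd.DataFrame({'grade': [95, 60, 75, 88, 52]})
df['graderanked'] = df['grade'].rank(ascending=1)

# Lowest two grades, displayed in rank order
bottom = df[df['graderanked'] < 3].sort_values('graderanked')
```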
Discovering Diverse Content Through
Random Scribd Documents
Col. Culberson was congressman from the First Congressional
District for twenty-two years. He was one of the leading lawyers of
the State, and was prominent in the famous Abe Rothchild case. He
was the father of C. A. Culberson, who was born and reared in 32
Jefferson and started his political career as County Attorney of
Marion County and was later Attorney General of the State, Governor
of Texas, and was elected to the Senate of the United States. He
was known as Senior Senator for a number of years—until his death.

Rev. D. B. Culberson, the father of Col. Culberson was one of the


early pastors of the First Baptist Church of Jefferson.

W. L. Crawford
A leading criminal lawyer of Texas. Upon leaving Jefferson he moved
to Dallas.

Hector McKay
Hector McKay, born in Tennessee, came to Texas with his mother
and family when very young, settled near Elysian Fields, where the
family remained many years. The old McKay burying ground is there.
He was a member of Ector’s Brigade during the Civil War, enlisting at
Marshall. He attained the rank of Captain. After the war, he practiced
law in Marshall where he was a law partner of Judge Mabry and later
of W. T. Armistead. Captain McKay was one of the prominent lawyers
of early days of Jefferson.

Captain Moss
Captain Moss, the grandfather of Mrs. Will Sims, of Jefferson, in
1836 operated and owned one of the finest steamboats on the river
—The Hempstead. He assisted Captain Shreve in blowing out the
rafts to make Cypress Bayou navigable to Jefferson and during the
Mexican war he transported soldiers across the river into Texas.

Mr. T. L. Lyon
Mr. T. L. Lyon, with his family, came to Jefferson during the summer
of 1867. For many years Captain Lyon was a member of the firm,
Mooring and Lyon, buying cotton and doing general mercantile
business on Dallas Street. They commanded a wide scope of
business in the palmy days of the city.

Later in life business reverses came and he accepted a clerkship in


the “Lessie 13” a small freight packet, which burned. After which
Captain Lyon was clerk on the Alpha. Both boats were commanded
by the late Captain Ben Bonham.

Capt. Lyon continued his service on the Alpha, for a number of


years. He was a good citizen, a devout Christian and an active
member of the Methodist Church until his death Nov. 28th, 1908,
leaving many friends to mourn his loss.

A daughter, Mrs. G. M. Jones, occupied the old home which was


bought at the time of her father’s coming to Jefferson sixty-nine
years ago, until her death recently.

33

Royal A. Ferris
Mr. Ferris was a leading lawyer of Dallas, Texas, and he too was
reared in Jefferson.
Nelson Phillips
Chief Justice of the Supreme Court of Texas, and a leading lawyer of
the State, made his home in Dallas and was a product of Marion
County, living near Jefferson.

W. B. Harrison
A leading business man and banker of Ft. Worth, Texas for many
years started his business career in Jefferson during the palmy days.

The Bateman Family


(King, Andy, and Quincy)
This prominent family in the business and social life of the early days
of Jefferson, branched out into the business world in Jefferson and
when Jefferson began to lose navigation, along with it, many of her
population, the Bateman family moved west and were helpful in
building Ft. Worth, Texas, in business and banking lines.

In fact Jefferson furnished many of the leading business and


professional men, who went west in the early days and built the
State of Texas.

Jefferson is truly the “mother city” of the State of Texas as it was the
largest and most noted in the early ’70’s and gave to the balance of
the State leading business and professional men to make “The Lone
Star State” great.

The old families, who were not so famous, but were the real stamina
of the town down through the ages, when prosperity had passed on
to other fields and living was hard—yet they lived on and kept the
home fires burning until today Jefferson seems doomed to again
come to be, and is known the state over as a promising oil center—
with prosperity again in view. These with their children may be
numbered by the hundreds. Among them we find:

Dr. B. J. Terry
Dr. T. H. Stallcup
W. B. Stallcup
Ward Taylor
Dick Terry
J. H. Rowell
S. W. Moseley
Shep Haywood
T. L. Torrans
W. P. Schluter
Louis Schluter
R. B. Walker
W. B. Kennon
J. B. Zachery
W. B. Sims
D. C. Wise
A. Urquhart
S. A. Spellings
Capt. Lyon
A. Stutz
T. J. Rogers
W. J. Sedberry
Sam Moseley
W. B. Ward
Sam Ward
J. M. DeWare
B. J. Benefield
J. C. Preston
P. Eldridge
I. Goldberg
M. Bower
J. J. Rives
H. Rives

These with many others have done much for Jefferson and 34
Texas—So “Come to Texas” and be sure you come to
Jefferson.

Benj. H. Epperson
Benj. H. Epperson was born in Mississippi in about 1828. He was
educated in North Carolina and at Princeton University, New Jersey.
He came to Texas and settled at Clarksville sometime in the ’40’s. He
studied law, was admitted to the bar and practiced with marked
ability and success. He was active Whig politician before and after
the war and was the candidate of his party for governor in 1851 at a
time when he was below the constitutional age. In 1852 he was at
the head of the Texas delegation to the Whig National convention.
He served in the Legislature practically from 1858 until his death. He
was a personal friend of Sam Houston and was consulted by
Houston on numerous affairs of state.

In the controversy over secession Epperson was a Union man,


standing substantially with General Houston on that question. After
Texas seceded he cast his allegiance with the Confederacy and did a
great deal for the cause, giving very liberally of his time and money.
He was a member of the first Confederate Congress.

In 1866 he was elected to the U. S. Congress and went to


Washington, but as Texas was not recognized as a State, he was not
permitted to take his seat.

In the early ’70’s he moved to Jefferson, Texas where he lived until


his death in 1878.
Because of his wide personal acquaintance and unusual ability he
exercised a wide political influence throughout the State. He was one
of the first presidents of East-Line Railroad, and was highly
instrumental in the railroad development of Texas.

In 1931 a collection of papers and letters that had belonged to B. H.


Epperson were sent to the University of Texas by his son. Among
them were letters dealing with affairs in Texas during the
Confederate war and Reconstruction periods, also Indian papers
saved from the time that Mr. Epperson had represented the Indians
in Washington before, or in the early ’50’s, besides many papers
pertaining to Railway matters, etc. The Archivist, Mrs. Mattie Austin
Hatcher has written that “they are very valuable.” He says that these
things are used by historians and also by students in getting material
for their thesis.

35

W. P. Torrans
W. P. Torrans, born in Mobile, Alabama, in 1849 moved to Houston,
Texas and in 1850 came to Jefferson. He established a general
mercantile business on Austin Street, in the building next to the
present Goldberg Feed Store.

In 1862 he was Tax Collector, but maintained his business also. In


1872 built the first brick block on Polk Street, where the Torrans
Manufacturing Company is now located.

The W. P. Torrans home at one time stood in the middle of this


block, made into an office building and used by Dr. A. C. Clopton. Mr.
Torrans bought a home on the corner of Lafayette and Market
Streets which is standing and in good repair.

The Torrans business has run continuously all these years and is
known as the Torrans Manufacturing Company, a very flourishing
business, owned and operated by T. L. Torrans, who is one of
Jefferson’s most prominent and active citizens, and a son of W. P.
Torrans. Mr. T. L. Torrans married Miss Elizabeth Schluter, daughter
of the late Mr. and Mrs. Louis Schluter, who was a very prominent
lawyer in this city.

Tom Lee Torrans, Jr., a son of Mr. Torrans, is now active manager of
the store. Mrs. Kelly Spearman and Louis Torrans are also children of
Mr. and Mrs. Tom Torrans.

Story From Past Recalls Worth of Two


Jefferson Men
Captain DeWare, Col. H. McKay, Have Sons
Prominent Here Now
“Before the final adjournment of district court and during a short
recess Saturday evening, at Jefferson, the friends of Sheriff J. M.
DeWare of Marion County presented him with a Smith and Wesson
forty-four caliber pistol and belt. The weapon was elaborately
carved, pearl handled, inlaid with gold, and bore this inscription: ‘J.
M. DeWare, Sheriff, Jefferson, Texas. From his friends, Jan. 1, 1887.’
A graceful and appropriate presentation speech was delivered by
Col. McKay.”

Taken from the “Dallas and Texas fifty years ago” column in the
Dallas News on Jan. 25, the story above was sent The Jefferson
Journal by Ollie B. Webb, Texas and Pacific official, with the notation
that “it seems to me will be of interest to many in Jefferson.”

Mr. Webb was right. As he goes on to explain, Sheriff J. M. DeWare,


or Captain DeWare, as he was more generally known, was the father
of J. M. DeWare, present local agent for the Texas and Pacific in
Jefferson, and the Col. H. McKay mentioned, was the father of Arch
McKay, now Tax Assessor-Collector.

Both were men who stood out as leaders in their time. Both they
and their descendants have many friends in Jefferson.

James Jackson Rives


James Jackson Rives came to Jefferson from Caddo Parish, La.,
before the Civil War, and established a cotton and hide business
after returning from the war. When his son, Herbert Rives, 36
returned from Sewanee Military Institute he joined the
business of J. J. Rives and Son, which continued until the warehouse
was destroyed by fire about 1902.

R. Ballauf, Merchant and Banker


Rudolph Ballauf was born in Hamburg, Germany, June 30, 1832. At
the age of 16 years he sought his fortune in America. Arriving at
New Orleans, La., he obtained employment. Later he obtained a
position with the Mallory Steamship Lines and gradually worked up
to the position as interpreter. He was serving in this capacity when
the war between the states was declared and he joined the
Confederate Army.

He was married in 1866 to Miss Mary Louise Hottinger of New


Orleans. To this union seven children were born—Lula (Mrs. D. P.
Alvarez), Julia (Mrs. Asa E. Ramsey), Mamie (Mrs. I. L. Goldberg),
Corine, George Henry, Emma (Mrs. Eugene Meyer), and Fred W.

Mr. Ballauf came to Jefferson in 1867, he and his wife making the
trip by boat.
He opened a general merchandise on the corner of Marshall and
Austin streets, later moving to Walnut and Lafayette and later to
Austin street. The G. A. Kelly foundry of Kellyville was purchased by
Mr. Ballauf and the material used for all manufactured articles was
secured in Marion county. Mr. Ballauf operated the foundry until
1895. His mercantile business was later devoted entirely to hardware
and mill machinery.

Along with Mr. Ballauf’s mercantile business a private bank was


opened in 1885 and operated as “R. Ballauf and Co.” The bank was
operated in the office of the store by his three daughters, Lula, Julia
and Mamie.

Mr. Ballauf sold his business to his son Fred and his son-in-law
Eugene Meyer in 1897, the banking business was discontinued and
Mr. Ballauf retired from business having spent thirty years without a
failure, assignment or compromise with creditors.

He was an active member of the General Dick Taylor Confederate


Camp.

Mr. Ballauf died in 1910. The business established by him has


continued through these 69 years and is today successfully operated
by his grandchildren under the name of Eugene Meyer and Son.

Robert Potter
March 3, 1843, Senator Robert Potter, a signer of the Texas
Declaration of Independence and first secretary of the Navy of the
Republic, was murdered at his home on Caddo Lake.

He was born in Gainesville, North Carolina, in 1800. Served in the U.


S. Navy from 1815 to 1817, then he returned home and studied law
and in 1826 he moved to Halifax and practiced law. He served in the
legislature in North Carolina; was elected to the House of 37
Representatives of the 21st United States Congress as a
Jackson Democrat. His course was brilliant and improving.

His brilliancy, connected with the fact that he had been a


midshipman, led to his appointment in the Cabinet of President
Burnett as the first secretary of the Navy of the Republic. He was
expelled from the House of Representatives of the Legislature of
North Carolina for cheating at cards.

Potter later moved to a place twenty-five miles northeast of


Jefferson, Texas, now known as Potter’s Point.

A feud arose between Potter and a Captain William Pickney Rose,


who was known as the “Lion of the Lakes.” The feud arising from the
claims that Potter had prevailed upon President Lamar to offer a
reward for Rose.

The widow of Rose’s brother settled on a league of land claimed by


Potter. This was intensified when Rose espoused the candidacy of
John B. Denton, who was defeated for a seat in the Senate by
Potter.

Potter, who lived on a bluff overlooking Caddo Lake organized a


posse of about twenty-five men, surrounded the home of Rose with
the intention of capturing, chastising and probably killing Rose.

Rose was near by with some slaves clearing a woodland and when
he saw Potter’s men, he lay upon the ground while one of his slaves,
“Uncle Jerry,” piled brush over him and effectually concealed him
from view.

Foiled in their purpose, the posse returned and were followed, at a


safe distance, by Preston Rose, a son of Captain Rose, who saw
them disband; most of them going to Smithland, while nine went
with Potter to his home. That night Rose secured “warrant for
trespass” against Potter. This was placed in the hands of a
Constable, who summoned a posse, consisting of Rose, Preston
Rose, J. W. Scott and thirteen others to execute the warrant, as if a
warrant for trespass required “the body to be taken.” They reached
Potter’s home at midnight and surrounded it. At daybreak the
bodyguard of Potter began to reconnoiter the premises, when
Hesekiah George came suddenly upon Captain Rose. Upon being
commanded to surrender he turned for flight and gave the alarm.
Rose fired both barrels of his shotgun at him and although he
survived the wounds he was ever afterwards known as “Old Rose’s
Lead Mine.” Potter became alarmed and ran about a hundred yards
to the lake. Being an excellent diver, he plunged into the water and
disappeared from sight, but when he came up for air, John W. Scott
killed him. He was buried on Potter’s Point.

Rogers National Bank


Captain T. J. Rogers, founder of the Rogers National Bank of
Jefferson, and one of Jefferson’s oldest citizens, was born in 1832, in
Hinds County, Mississippi. In 1849 he came to Texas with his 38
father and family. In 1856 in Gilmer, Texas, he married Emily
Mayberry and they moved to Jefferson, living in what is now known
as the Brewer home, with the family of Dr. B. J. Terry. During the
Civil War, he served in the Confederate Army in General Ochiltree’s
regiment (the 18th) in General Waul’s division, as a lieutenant, later
being made captain, after Captain John Cocke, brother-in-law, was
killed, in the battle of Mansfield. After the war he returned to
Jefferson, again engaging in the mercantile business. In 1868 he
went into business for himself. He was identified with the material,
civic and religious interests of Jefferson. He was one of the
promoters of the East Line and Red River railroad (later a branch of
the M. K. and T. railroad of Texas.) He was secretary and treasurer of
this railroad until it was sold to the M. K. and T. He was also principal
owner of the Jefferson Cotton Oil Mill, later selling his interests to
the Jefferson Cotton Oil Company, which operated until it was
burned in 1903.

In 1896 the banking business, T. J. Rogers & Son, was founded in


connection with the mercantile business which was now under the
name T. J. Rogers & Son, (Ben Rogers).

In 1904 T. J. Rogers & Son, bankers, was nationalized, becoming the


Rogers National Bank of Jefferson, with T. J. Rogers president and B.
F. Rogers active vice-president. In 1904 Herbert A. Spellings was
elected cashier, which position he held until 1918 when he
succeeded to the presidency by reason of the death of Capt. T. J.
Rogers, in the meantime B. F. Rogers, vice president, had withdrawn
active participation in the bank’s management.

Shortly after Mr. Spellings became president the bank became one of
the honor banks of the United States and maintained this position to
the present time, and throughout the most depressing period the
banks have ever faced the Rogers National Bank of Jefferson under
Mr. Spellings’ guidance maintained more than its legal reserve,
willingly met the demands made upon his bank and was never
embarrassed to the least extent.

When the national moratorium was declared and conservators were


being appointed for the safety management of national banks. It
was freely stated that the Rogers National Bank had had a
conservator for many years in the person of Mr. Spellings, therefore
the government would not be called upon to appoint one for that
bank, and this bank was one of the first in the United States to re-
open without a special examination. Mr. Spellings remained as
president until the summer of 1935 when he was removed by death
and was succeeded by Mr. Rogers Rainey as president. Mr. Rainey
being a grandson of Capt. T. J. Rogers, and nephew of Ben F.
Rogers, the founder of the bank, which is the only bank in Marion
County, and an outstanding one in the State of Texas.
ONLY ONE BANK IN FIVE CAN QUALIFY FOR
THIS HONOR
What is a “Roll of Honor” bank, and what does it mean to you as a
depositor, or as a possible depositor, that this institution has 39
been given that rating in the banking “hall of fame?”

A “Roll of Honor” bank is a bank that has voluntarily provided double


protection for its depositors by building up its surplus and undivided
profits account to a point where this reserve fund is equal to, or
greater than the capital of the bank.

The laws, either National or State, do not require any bank to


provide this “extra measure of safety.” As a matter of fact, the
soundest banking practice and the legal requirements of some states
fix 20 per cent of the bank’s capital as a sufficient reserve fund to
maintain for the safety of its depositors.

But before a bank can become known as a “Roll of Honor” bank, it
must voluntarily build up its surplus reserve fund to an amount at
least five times the usual requirements. So severe are the
requirements that only one bank in five in the entire country can
qualify as a “Roll of Honor” institution.

The fact that this bank has achieved this distinction stamps it as one
of the strongest institutions for its size in the whole United States.

You can see, therefore, that it does mean a great deal to you as a
depositor or as a possible depositor, that this is a “Roll of Honor”
bank. In addition to giving you “more than the law requires” in
protection, we are only striving to give you a “double measure” of
courteous and friendly service.

David Browning Culberson


David Browning Culberson was born in Troup County, Georgia,
Sept. 24th, 1830; was educated at Brownwood, La Grange, Ga.,
and studied law under Chief Justice Chilton of Alabama.

He was married to Miss Eugenia Kimball, a lady of sterling character
and brilliant mind. It was to her influence and encouragement that
he owed a large measure of his success. To this union three children
were born, Charles A., the oldest, was one year old when the family
moved to Texas in 1856. Robert Owen and a daughter, Anna, were
born in Texas. Robert Owen is the only surviving one. He now
resides in Houston, Texas.

The Culberson family—Jim Culberson, a brother, with his family, and
Dr. R. L. Rowell and family, and others, with a large company of
slaves—came to Texas in covered wagons.

Dr. Rowell located in Jefferson but the Culberson brothers moved to
Gilmer, Texas, where they practiced law for two years, then came to
Jefferson to make their home.

Col. D. B. Culberson was elected to the State Legislature in 1859,
was elected again in 1864. He was then elected to the Forty-fourth
Congress and served continuously until his death.

From a private in the Confederate Army he was promoted to
the rank of Colonel of the 18th Texas Infantry, was assigned to
duty in 1864 as Adjutant-General, with rank of Colonel.

Although a nationally known lawyer of shrewd and brilliant mind, he
remained always unassuming, almost careless in his dress. In manner
he was shy and retiring.

He spoke so seldom in Congress that he was known as “The Silent
Member.”

His brilliant mind and sterling qualities of character won for him the
title of “Honest Dave.” He was one of the lawyers in the famous
Diamond Bessie-Rothchild case.

The Rev. D. B. Culberson, Sr., the father of Col. Culberson, was one
of the early pastors of the First Baptist Church of Jefferson.

CLUBS

The 1881 Club


The 1881 Club was organized in Jefferson, Texas in October 1881 at
the home of Mrs. W. B. Ward, where a room full of enthusiastic
members organized a chautauqua circle. Among the charter
members were: Mrs. J. H. Bemis, Mrs. J. P. Russell, Mrs. Sallie
Dickson and Miss Sarah Terhune. The circle was composed of both
men and women and met at night. Captain J. P. Russell was the first
president with Ben Epperson as Secretary. At the end of four years
diplomas and credits were given.

Without a break in the meetings the chautauqua circle was merged
into a woman’s club, called the Review Club, lessons being taken
from current magazines; it was then known as the Shakespearean
Club for several years.

Finally in 1882 when it became a member of the State and Third
District Federation the name was changed to “The 1881 Club” in
honor of the year of its organization. Many have been the courses of
study by the enthusiastic members. In fifty-five years of its existence
the club meetings have continued each week, only disbanding from
the second Saturday in May to the first Saturday in October. One of the
charter members, Mrs. Sarah Terhune Taylor, is now an honorary
member. One of the active members, Mrs. D. C. Wise dates her
membership to 1896. This club has the distinction of being the
oldest club in the State of Texas.

THE WEDNESDAY MUSIC CLUB


The Wednesday Music Club is one of the oldest music clubs in the
state, having been organized in 1909 by the late Mrs. W. H. Mason,
for the purpose of study and to assist the 1881 Club in putting on a
program when the 1881 Club entertained the Third District
Federation of Women’s Clubs in 1910.

The Music Club requested membership in the Federation at
this meeting and was a member of the Federation of Women’s
Clubs until the Music Club began a separate organization.

The members of the club who are now living are:

Mrs. Addie Terry Nance, Lebanon, Oregon.
Mrs. G. T. Haggard, Jefferson, Texas.
Mrs. J. M. DeWare, Jefferson, Texas.
Mrs. Mattie Vines, Jefferson, Texas.
Mrs. S. S. Minor, Jefferson, Texas.
Miss May Belle Hale, Jefferson, Texas.
Miss Eva Eberstadt, Jefferson, Texas.
Mrs. Murph Smith DeWare, Jefferson, Texas.
Mrs. J. A. Nance was the first president of the club.
Mrs. Mattie Vines, first director.
Miss Ethel Leaf (Mrs. J. M. DeWare) first accompanist.

The first choruses “Carmen” and “In Old Madrid” were presented at
the District meeting of Federated Women’s Clubs.

The Wednesday Music Club was a member of the East Texas Music
Festival during its seven years of existence and with the May Belle
Hale Symphony Orchestra, as Co-Hostess, entertained the Festival in
May 1924.

Mrs. H. A. Spellings was president of the East Texas Festival at this
time, also president of The Wednesday Music Club, serving the club
in this capacity for fourteen years.

Mrs. G. T. Haggard was the Club president last year and she and
Mrs. Murph Smith DeWare are the only members who have served
the club, uninterruptedly from its beginning.

The Club had the honor and pleasure of taking part in the wedding
of one of its members, Miss Ethel Leaf to Mr. J. M. DeWare by
rendering “Lohengrin’s Wedding March.” This really was a “few”
years back but a happy memory to those present on such a joyous
occasion.

MARION COUNTY
Possibly few counties have the distinction of having been a part of so
many other counties as has Marion County, so no wonder she is so
“tiny” in size after having been sliced and served to six different
others.

The records at Austin, Texas, tell us that Marion County was first a
part of Red River County, later a part of Shelby, Bowie, Titus, Cass
and Harrison. Cass County was for ten years known as Davis County,
then again took the name of Cass, so really another “slice” may
have been taken off Marion.

However it is up to Harrison for being the “big hearted” county.

Years ago a negro representative was sent to the Legislature from
Harrison County and during his term of office Marion County
acquired a nice acreage of Harrison County. When the
Negro Representative returned to Marshall he was asked, “Why in the
mischief did you allow anything like that to happen?” He replied:
“Well Sir, Senator Culberson just talked me right out of that little
piece of land.”

Marion County today has an abundant supply of high grade iron ore;
saw mills, a chair factory, and an abundant supply of the purest and
best artesian water to be found anywhere. The county is well adapted
to the raising of hogs and cattle, and produces the most delicious
sweet potatoes, fruit and berries of all kinds. Mayhaws grow wild and
from these is made a most palatable and beautiful jelly; in fact
almost anything will do well in Marion County. There are many kinds
of clover growing wild.

STERN MEMORIAL FOUNTAIN


Stern Memorial Fountain was given to the City of Jefferson by the
children (Eva, Leopold, Alfred and Fred) of Jacob and Ernestine
Stern in 1913.

In the gift of this splendid piece of work lay the lifetime love of
Jefferson, the devotion of a little immigrant girl grown to womanhood,
and the gratitude of her children to a little city that had given Mother
and Father happiness.

The fountain is entirely of purest bronze and is 13½ feet high, with
bowls 7½ feet broad, and has a statue six feet tall representing
“L’Education,” the total cost being $4,000.

Engraved on the fountain is: “Dedicated in honor of Jacob &
Ernestine Stern, who lived in Jefferson for many years. Presented to
the City of Jefferson by their children as an expression of affection
for their native town.”

More than seventy-six years ago, Mr. and Mrs. Stern came to
Jefferson from Houston, in a two horse wagon. Mr. Stern was buried
in Jefferson in 1872 and later the family went to New York to live.

The fountain is still used, as was originally intended, for the good of
man, stock and dogs, and the pure water that flows through it was
given to the ladies of Jefferson by the late W. B. Ward in appreciation
of work done in the prohibition election many years ago.

As the people of Jefferson appreciated the noble qualities of the
Stern family, they too appreciate the gift of love from the children.

In connection with the foregoing article a little book has been
written by Mr. Stern’s sister-in-law, Mrs. Eva Stern, a most beautiful
token of the noble lives of Mr. and Mrs. Stern.

In the book is printed a bill-of-sale for a negro woman slave. When
Mr. Stern gave the bill-of-sale to his wife he said, “I felt like a mean
creature when I paid the money for that girl, but I knew that we
needed a nurse girl ... so what was to be done ... Where I was born,
on the Rhine, no one would believe for a moment that I would
buy a human being. They would hate me, as I hate myself, for
bartering in human flesh.”

The exact bill-of-sale for Sarah read as follows:

“Received from Jacob Stern two thousand dollars for a negro
woman, by name Sarah, about thirty-four years of age, copper
colored. Said woman I promise to deliver to Jacob Stern, in course
of six days. I hereby guarantee the woman, Sarah, to be sound in
body and mind. I also guarantee said woman, Sarah, to be a good
house woman. If not, I promise to take her back and refund to said
Jacob Stern $1000.”

Just before Mr. Stern’s death their old servant “Aunt Caroline” and he
were talking and he told her that he thanked God he had set the
colored people free, and she replied, “But thanks be to him mos’en
fer giben me my good marsar and misses, who gib me my close, my
vittles and my medicine.”

WALNUT GROVE
Five miles south of Linden there stands today an immense walnut
grove. Planted on both sides of the old dirt road, one hundred or
more of these trees are all that are left of the 320 planted by Mr. Jim
Lockett, more than 60 years ago. The trees make a dense shade and
a beautiful lane.

The story is that Mr. Lockett, in a reminiscent mood, thought that
the country some day would run out of split rails with which to
make fences. Realizing that wire would some day be used for
making fences, he knew that fence posts would be needed, so he
ordered his farm hands to plant in every other corner of the rail
fence a slim seedling walnut tree to be used for future fence posts.

They are standing today waiting for the wire, and we are told that
when the new highway was built it was moved over 200 feet to
keep from injuring the roots of Mr. Lockett’s trees.

Mr. Lockett passed away more than 20 years ago, but his Walnut line
is still a joy to the many who pass that way and many people gather
the Walnuts by the bushel each fall.

Another interesting thing that Mr. Lockett had on his farm was his
water gin. One of the neighbors said that in its day “it could really
go after the cotton.”

The water was brought to the gin through a series of ditches and
water troughs a mile and a quarter long.
From overhead and controlled by a gate, the water fell onto the top
of a large wooden wheel 36 feet in diameter. Around the wheel were
attached buckets holding 15 gallons of water each, and when
enough buckets were filled with water the wheel began turning and
the gin ran. “She would launch out five bales a day, if you got going
by daylight.”


MURDER ALLEY
“Murder Alley” may be reached by taking the left where Line Street
divides, going south to the river, leaving the Barbee home on the
right.

The name “Murder Alley” was derived from the fact that one and
often two dead bodies would be found each morning in this alley.

Col. Lowery is said to have edited a paper in the Barbee home
during those trying days.

It may be of interest to many Jeffersonians to know that the original
courthouse was, according to the Allen Urquhart plan, located just in
front of the P. C. Henderson home.

THE STORY OF DIAMOND BESSIE


This sketch of a famous murder case in Jefferson is mostly from
the pen of W. H. Ward, who lived at that time and later moved to
Texarkana, where he was editor of “The Twentieth Century”; it is
taken from the December issue, along with a few other legends
from other sources.
The recent mysterious murder of a woman in Jefferson, Texas,
recalls the death of Bessie Moore or “Diamond Bessie,” who was
believed to have been slain by her husband and erstwhile paramour,
Abe Rothchild, within rifle shot of where the murder was committed
more than twenty years ago.

The murder of Bessie Moore, properly Bessie Rothchild, a young and
beautiful woman who had won the sobriquet of Diamond Bessie by
the number and splendor of her jewels, was one of the most
startling and sensational crimes in the criminal history of Texas. The
scene of the crime was visited by thousands of curious spectators
and the entire press of the Southwest teemed with gruesome
incidents of the awful crime. Crowds came from afar to view the spot
where the young mother had been hurled into eternity without
warning, carrying with her the half formed life of an unborn infant.

On the shiny slope of a Southern hillside, almost within call of the
then thriving populous city of Jefferson, Texas, in the calm of a
Sabbath afternoon, the cruel and cowardly crime was committed, for
which the husband was twice sentenced to hang but escaped justice
by a technicality of the law. The murder of Bessie Rothchild, by the
man who was first her betrayer and then her husband, was a crime
so weird and terrible that the hand of a master might make it
immortal, without for one instant diverging from the strict line of
truth into the realm of romance.

Twenty-four years ago, Bessie Moore, the daughter of respectable
parents of moderate circumstances, was decoyed from her home in
the country by the son of a wealthy Cincinnati family. With the
inexperience of youth, and that blind faith which makes a woman
follow the man she loves to the utmost ends of the earth,
Bessie Moore followed Abe Rothchild to Cincinnati. There for
one year the young girl was plunged into that maelstrom of sin
which whirls and eddies about a great city. Her companions were
those of the half-world, the submerged half. Rothchild was rich and
he showered his wealth upon the girl from whom he had taken all
that life holds dear, home, family and friends.

The motley population of Jefferson added color to the restless
movement of the town. The streets were crowded with men of many
kinds of dress, cowboys in chaps and spurs, gentlemen in morning
coats with canes, farmers in dingy overalls, ladies in elaborate
flowing gowns, old slavery negroes, self-confident northern negroes,
carpet-baggers; and into the crowd came Bessie Moore, sparkling
with diamonds, accompanied by the dark and tall Abe Rothchild. And
did she create a sensation? She was part of this restless life the three
short days that she was among them, diamonds sparkled in her ears
as she shook her head and laughed, diamonds so large on her
fingers that it seemed they must tire her small hands. This poor
return for her sacrifice satisfied the girl only for a time, then the
glittering jewels, silken raiment, which gave her the sobriquet of
“Diamond Bessie” and which were purchased with a woman’s
shame, began to pall upon her. Bessie Moore was to become a
mother.

Amid all the dissipation into which her betrayer had thrust her, the
woman had remained true and steadfast to the man she loved, for
whom she had given up her innocence and home. Through all this
time she had relied upon Rothchild’s promise to make her his wife
and she prayed that the promise might be fulfilled.

Finding her prayers of no avail she demanded a fulfillment of the
pledge. There was a scene, of course, and other scenes followed but
Rothchild had now to deal, not with a silly trusting girl, but with a
wronged, outraged and desperate woman, who battled not only for
her rights but for her child, yet unborn. In a fit of desperation she
threatened to lay the shameful story of her betrayal before
Rothchild’s father, a wealthy and influential citizen of Cincinnati.
Then Rothchild is alleged to have conceived and proceeded to carry
out a crime so dark, so despicable and so diabolical that Satan
himself must have blushed at its conception. He promised the young
girl to make her his wife, but told her that it would not do for them
to be married in Cincinnati, where both themselves and their
intimacy were so well known. Instead he would take her on one of
his western trips (Rothchild was a traveling salesman representing a
jewelry house in which his father was financially interested, he
himself being slated for partnership), and they would be married in
some out of the way place out west; by changing one figure in the
marriage certificate, it would be made to appear that they had been
married immediately upon the young girl leaving home, which would
have given legitimate birth to the child of which Bessie Moore was
about to become a mother.

The girl believed him and blessed him and they left Cincinnati
together, traveling westward and passing through Texarkana.
From the moment Rothchild promised to make Bessie Moore his wife
he had been planning the woman’s murder. They left the Texas and
Pacific railway at Kildare, Rothchild telling the woman that they
would go through Linden, the County seat of Cass County, to be
married, choosing that spot, he said, because it was so obscure that
news of the marriage would not be heard outside the little town in
which the ceremony was to be performed. His real intention was to
murder the woman on the road. He was thwarted in this by being
compelled to make the trip on a public coach, there being no such
thing as private conveyances in Kildare.

Once at Linden, Rothchild was compelled to make good his promise
and Bessie Moore, the wronged and betrayed girl, became Bessie
Rothchild, the wife of her betrayer. From Linden they came to
Jefferson, Texas, from which point it was agreed that Mrs. Rothchild
should return to Cincinnati and have her marriage certificate
recorded changing the date, as agreed upon, after which she was to
return to her husband. The poor girl looked forward with eagerness
and hunger to the day she would return to her home bearing the
honored name of wife and be clasped once more in her mother’s
arms. Alas! The poor girl lies in an obscure corner of the Jefferson
Cemetery, her body long since dust and food for worms. They
