Clojure Data Analysis
Cookbook
Second Edition
Eric Rochester
BIRMINGHAM - MUMBAI
Clojure Data Analysis Cookbook
Second Edition
All rights reserved. No part of this book may be reproduced, stored in a retrieval system,
or transmitted in any form or by any means, without the prior written permission of the
publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the
information presented. However, the information contained in this book is sold without
warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers
and distributors will be held liable for any damages caused or alleged to be caused directly
or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the
companies and products mentioned in this book by the appropriate use of capitals.
However, Packt Publishing cannot guarantee the accuracy of this information.
ISBN 978-1-78439-029-7
www.packtpub.com
Credits
Reviewers
Vitomir Kovanovic
Muktabh Mayank Srivastava
Federico Tomassetti

Proofreaders
Ameesha Green
Joel T. Johnson
Samantha Lyon
Eric Rochester enjoys reading, writing, and spending time with his wife and kids. When
he’s not doing these things, he programs in a variety of languages and platforms, including
websites and systems in Python, and libraries for linguistics and statistics in C#. Currently,
he is exploring functional programming languages, including Clojure and Haskell. He works
at Scholars’ Lab in the library at the University of Virginia, helping humanities professors and
graduate students realize their digitally informed research agendas. He is also the author of
Mastering Clojure Data Analysis, Packt Publishing.
A special thanks to Jackie, Melina, and Micah. They’ve been patient and
supportive while I worked on this project. It is, in every way, for them.
About the Reviewers
His new venture is ParallelDots. It is a tool that allows any content archive to be presented
in a story using advanced techniques of NLP and machine learning. For publishers and
bloggers, it automatically creates a timeline of any event using their archive and presents
it in an interactive, intuitive, and easy-to-navigate interface on their webpage. You can find
him on LinkedIn at http://in.linkedin.com/in/muktabh/ and on Twitter at
@muktabh / @ParallelDots.
Federico Tomassetti has been programming since he was a child and has a PhD
in software engineering. He works as a consultant on model-driven development and
domain-specific languages, writes technical articles, teaches programming, and works as
a full-stack software engineer.
He has experience working in Italy, Germany, and Ireland, and he is currently working
at Groupon International.
Preface
Welcome to the second edition of Clojure Data Analysis Cookbook! It seems that books
become obsolete almost as quickly as software does, so here we have the opportunity to
keep things up-to-date and useful.
Moreover, the state of the art of data analysis is also still evolving and changing. The
techniques and technologies are being refined and improved. Hopefully, this book will capture
some of that. I've also added a new chapter on how to work with unstructured textual data.
In spite of these changes, some things have stayed the same. Clojure has further proven
itself to be an excellent environment to work with data. As a member of the Lisp family of
languages, it inherits a flexibility and power that is hard to match. The concurrency and
parallelization features have further proven themselves as great tools for developing
software and analyzing data.
Clojure's usefulness for data analysis is further improved by a number of strong libraries.
Incanter provides a practical environment to work with data and perform statistical analysis.
Cascalog is an easy-to-use wrapper over Hadoop and Cascading. Finally, when you're ready
to publish your results, ClojureScript, an implementation of Clojure that generates JavaScript,
can help you to visualize your data in an effective and persuasive way.
Moreover, Clojure runs on the Java Virtual Machine (JVM), so any libraries written for Java are
available too. This gives Clojure an incredible amount of breadth and power.
I hope that this book will give you the tools and techniques you need to get answers from
your data.
Chapter 4, Improving Performance with Parallel Programming, covers how to use Clojure's
parallel processing capabilities to speed up the processing of data.
Chapter 5, Distributed Data Processing with Cascalog, covers how to use Cascalog as a
wrapper over Hadoop and the Cascading library to process large amounts of data distributed
over multiple computers.
Chapter 6, Working with Incanter Datasets, covers the basics of working with Incanter
datasets. Datasets are the core data structures used by Incanter, and understanding them is
necessary in order to use Incanter effectively.
Chapter 7, Statistical Data Analysis with Incanter, covers a variety of statistical processes and
tests used in data analysis. Some of these are quite simple, such as generating summary
statistics. Others are more complex, such as performing linear regressions and auditing data
with Benford's Law.
Chapter 8, Working with Mathematica and R, talks about how to set up Clojure in order to talk
to Mathematica or R. These are powerful data analysis systems, and we might want to use
them sometimes. This chapter will show you how to get these systems to work together, as
well as some tasks that you can perform once they are communicating.
Chapter 9, Clustering, Classifying, and Working with Weka, covers more advanced machine
learning techniques. In this chapter, we'll primarily use the Weka machine learning library.
Some recipes will discuss how to use it and the data structures it's built on, while other recipes
will demonstrate machine learning algorithms.
Chapter 10, Working with Unstructured and Textual Data, looks at tools and techniques used
to extract information from the reams of unstructured, textual data.
Chapter 11, Graphing in Incanter, shows you how to generate graphs and other visualizations
in Incanter. These can be important for exploring and learning about your data and also for
publishing and presenting your results.
Chapter 12, Creating Charts for the Web, shows you how to set up a simple web application in
order to present findings from data analysis. It will include a number of recipes that leverage
the powerful D3 visualization library.
The other major piece of software that you'll need is Leiningen 2, which you can download
and install from http://leiningen.org/. Leiningen 2 is a tool used to manage Clojure
projects and their dependencies. It has become the de facto standard project tool in the
Clojure community.
Throughout this book, we'll use a number of other Clojure and Java libraries, including Clojure
itself. Leiningen will take care of downloading these for us as we need them.
You'll also need a text editor or Integrated Development Environment (IDE). If you already have
a text editor of your choice, you can probably use it. See http://clojure.org/getting_
started for tips and plugins for using your particular favorite environment. If you don't have a
preference, I'd suggest that you take a look at using Eclipse with Counterclockwise. There are
instructions for setting this up at https://code.google.com/p/counterclockwise/.
That is all that's required. However, at various places throughout the book, some recipes will
access other software. The recipes in Chapter 8, Working with Mathematica and R, that are
related to Mathematica will require Mathematica, obviously, and those that are related to R
will require that. However, these programs won't be used in the rest of the book, and whether
you're interested in those recipes might depend on whether you already have this software.
Likewise, you don't have to be an expert on data analysis, although you should probably be
familiar with its tasks, processes, and techniques. You might be able to glean enough from
these recipes to get started, but to use them effectively, you'll want a more thorough
introduction to this field.
Conventions
In this book, you will find a number of styles of text that distinguish between different kinds of
information. Here are some examples of these styles, and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions,
pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Now, there
will be a new subdirectory named getting-data."
When we wish to draw your attention to a particular part of a code block, the relevant lines or
items are set in bold:
(defn watch-debugging
  [input-files]
  (let [reader (agent
                 (seque
                   (mapcat
                     lazy-read-csv
                     input-files)))
        caster (agent nil)
        sink (agent [])
        counter (ref 0)
        done (ref false)]
    (add-watch caster :counter
               (partial watch-caster counter))
    (add-watch caster :debug debug-watch)
    (send reader read-row caster sink done)
    (wait-for-it 250 done)
    {:results @sink
     :count-watcher @counter}))
New terms and important words are shown in bold. Words that you see on the screen,
in menus or dialog boxes for example, appear in the text like this: "Take a look at the
Hadoop website for the Getting Started documentation of your version. Get a single
node setup working".
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this
book—what you liked or may have disliked. Reader feedback is important for us to develop
titles that you really get the most out of.
If there is a topic that you have expertise in and you are interested in either writing or
contributing to a book, see our author guide on www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you to
get the most from your purchase.
Errata
Although we have taken every care to ensure the accuracy of our content, mistakes do happen.
If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be
grateful if you could report this to us. By doing so, you can save other readers from frustration
and help us improve subsequent versions of this book. If you find any errata, please report them
by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on
the Errata Submission Form link, and entering the details of your errata. Once your errata are
verified, your submission will be accepted and the errata will be uploaded to our website or
added to any list of existing errata under the Errata section of that title.
Piracy
Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt,
we take the protection of our copyright and licenses very seriously. If you come across any
illegal copies of our works, in any form, on the Internet, please provide us with the location
address or website name immediately so that we can pursue a remedy.
We appreciate your help in protecting our authors, and our ability to bring you
valuable content.
Questions
You can contact us at questions@packtpub.com if you are having a problem with any
aspect of the book, and we will do our best to address it.
1
Importing Data for Analysis
In this chapter, we will cover the following recipes:
Introduction
There's not much data analysis that can be done without data, so the first step in any project
is to evaluate the data we have and the data that we need. Once we have some idea of what
we'll need, we have to figure out how to get it.
Many of the recipes in this chapter and in this book use Incanter (http://incanter.org/)
to import the data and target Incanter datasets. Incanter is a library for statistical
analysis and graphics in Clojure, similar to R, an open source language for statistical
computing (http://www.r-project.org/). Incanter might not be suitable for every task
(for example, we'll use the Weka library for machine learning later) but it is still an important
part of our toolkit for doing data analysis in Clojure. This chapter has a collection of recipes
that can be used to gather data and make it accessible to Clojure.
For the very first recipe, we'll take a look at how to start a new project. We'll start with very
simple formats such as comma-separated values (CSV) and move into reading data from
relational databases using JDBC. We'll examine more complicated data sources, such as
web scraping and linked data (RDF).
We'll use Leiningen for this (http://leiningen.org/). This has become a standard
package automation and management system.
Getting ready
Visit the Leiningen site and download the lein script. This will download the Leiningen JAR
file when it's needed. The instructions are clear, and it's a simple process.
How to do it...
To generate a new project, use the lein new command, passing the name of the project
to it:
$ lein new getting-data
Generating a project called getting-data based on the default template.
To see other templates (app, lein plugin, etc), try lein help new.
There will be a new subdirectory named getting-data. It will contain files with stubs for the
getting-data.core namespace and for tests.
How it works...
The new project directory also contains a file named project.clj. This file contains
metadata about the project, such as its name, version, license, and more. It also contains
a list of the dependencies that our code will use, as shown in the following snippet. The
specifications that this file uses allow it to search Maven repositories and directories of
Clojure libraries (Clojars, https://clojars.org/) in order to download the project's
dependencies. Thus, it integrates well with Java's own packaging system as developed with
Maven (http://maven.apache.org/).
(defproject getting-data "0.1.0-SNAPSHOT"
:description "FIXME: write description"
:url "http://example.com/FIXME"
:license {:name "Eclipse Public License"
:url "http://www.eclipse.org/legal/epl-v10.html"}
:dependencies [[org.clojure/clojure "1.6.0"]])
In the Getting ready section of each recipe, we'll see the libraries that we need to list in the
:dependencies section of this file. Then, when you run any lein command, it will download
the dependencies first.
Getting ready
First, let's make sure that we have the correct libraries loaded. Here's how the Leiningen
(https://github.com/technomancy/leiningen) project.clj file should
look (although you might be able to use more up-to-date versions of the dependencies):
(defproject getting-data "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.6.0"]
[incanter "1.5.5"]])
Finally, I downloaded a list of rest area locations from POI Factory at
http://www.poi-factory.com/node/6643. The data is in a file named
data/RestAreasCombined(Ver.BN).csv. The version designation might be different, though, as
the file is updated. You'll also need to register on the site in order to download the data.
The file contains the location and description of the rest stops along the highway:
-67.834062,46.141129,"REST AREA-FOLLOW SIGNS SB I-95 MM305","RR, PT, Pets, HF"
-67.845906,46.138084,"REST AREA-FOLLOW SIGNS NB I-95 MM305","RR, PT, Pets, HF"
-68.498471,45.659781,"TURNOUT NB I-95 MM249","Scenic Vista-NO FACILITIES"
-68.534061,45.598464,"REST AREA SB I-95 MM240","RR, PT, Pets, HF"
In the project directory, we have to create a subdirectory named data and place the file in
this subdirectory.
I also created a copy of this file with a row listing the names of the columns and named it
RestAreasCombined(Ver.BN)-headers.csv.
How to do it…
1. Now, use the incanter.io/read-dataset function in your REPL:
user=> (read-dataset "data/RestAreasCombined(Ver.BN).csv")
2. If we have a header row in the CSV file, then we include :header true in the call to
read-dataset:
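A call along the following lines should do it. This is only a sketch: it assumes the project.clj shown earlier (so that Incanter is on the classpath), the aliases i and iio are my own choice, and read-rest-areas is a hypothetical wrapper name rather than anything from the recipe.

```clojure
;; Assumption: Incanter 1.5.5 is available, as listed in project.clj.
(require '[incanter.core :as i]
         '[incanter.io :as iio])

(defn read-rest-areas
  "Reads a CSV file whose first row holds the column names.
  Hypothetical helper, used here only for illustration."
  [path]
  ;; :header true tells read-dataset to take column names from the first row.
  (iio/read-dataset path :header true))

;; For the copy of the data file that has a header row:
;; (read-rest-areas "data/RestAreasCombined(Ver.BN)-headers.csv")
```

Without :header true, the first row would be treated as data and Incanter would generate the column names itself.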
How it works…
Together, Clojure and Incanter make a lot of common tasks easy, which is shown in the How to
do it section of this recipe.
We've taken some external data, in this case from a CSV file, and loaded it into an Incanter
dataset. In Incanter, a dataset is a table, similar to a sheet in a spreadsheet or a database
table. Each column has one field of data, and each row has an observation of data. Some
columns will contain string data (all of the columns in this example did), some will contain
dates, and some will contain numeric data. Incanter tries to automatically detect when a
column contains numeric data and converts it to a Java int or double. Incanter takes away a
lot of the effort involved with importing data.
There's more…
For more information about Incanter datasets, see Chapter 6, Working with Incanter Datasets.
Because JSON is a much richer data model than CSV, we might need to transform the
data. In that case, we can just pull out the information we're interested in and flatten the
nested maps before we pass it to Incanter. In this recipe, however, we'll just work with fairly
simple data structures.
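For data that does need flattening, one possible approach is sketched below. The function name flatten-map and the choice of joining nested key names with a dot are assumptions made for this illustration; the recipe itself doesn't prescribe either. The sketch only descends into nested maps and leaves vectors and other values untouched.

```clojure
(defn flatten-map
  "Flattens nested maps into a single level. Keys of nested maps are
  prefixed with the parent key and a dot (an arbitrary convention)."
  ([m] (flatten-map nil m))
  ([prefix m]
   (reduce-kv
     (fn [acc k v]
       (let [full-key (if prefix
                        (keyword (str (name prefix) "." (name k)))
                        k)]
         (if (map? v)
           ;; Recurse into nested maps, carrying the prefixed key along.
           (merge acc (flatten-map full-key v))
           (assoc acc full-key v))))
     {}
     m)))

(def flat-record
  (flatten-map {:author "mccarrd4"
                :title_detail {:language nil
                               :type "text/plain"}}))
```

Each flattened map then has one level of keys, which is the shape a tabular dataset expects.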
Getting ready
First, here are the contents of the Leiningen project.clj file:
(defproject getting-data "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.6.0"]
[incanter "1.5.5"]
[org.clojure/data.json "0.2.5"]])
Moreover, you need some data. For this, I have a file named delicious-rss-214k.json,
which I placed in the folder named data. It contains a number of top-level JSON objects.
For example, the first one starts like this:
{
  "guidislink": false,
  "link": "http://designreviver.com/tips/a-collection-of-wordpress-tutorials-tips-and-themes/",
  "title_detail": {
    "base": "http://feeds.delicious.com/v2/rss/recent?min=1&count=100",
    "value": "A Collection of Wordpress Tutorials, Tips and Themes | Design Reviver",
    "language": null,
    "type": "text/plain"
  },
  "author": "mccarrd4",
  …
How to do it…
Once everything's in place, we'll need a couple of functions to make it easier to handle the
multiple JSON objects at the top level of the file:
2. Now, we'll build on this to repeatedly parse a JSON document from an instance of
java.io.Reader. We do this by repeatedly calling test-eof until eof or until it
returns nil, accumulating the returned values as we go:
(defn read-all-json [reader]
(loop [accum []]
(if-let [record (test-eof reader json/read)]
(recur (conj accum record))
accum)))
3. Finally, we'll perform the previously mentioned two steps to read the data from
the file:
(def d (i/to-dataset
(with-open
[r (io/reader
"data/delicious-rss-214k.json")]
(read-all-json r))))
This binds d to a new dataset that contains the information read in from the JSON documents.
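The read-all-json function in step 2 relies on a test-eof helper whose definition does not appear above. A minimal sketch of what such a helper could look like follows. It assumes that the reading function signals the end of input by throwing java.io.EOFException, which I believe is how clojure.data.json's read behaves by default. To keep the sketch free of external dependencies, the demonstration uses a JDK DataInputStream instead of a JSON reader, and read-all is a hypothetical generalization of read-all-json.

```clojure
(import '[java.io EOFException ByteArrayInputStream DataInputStream])

(defn test-eof
  "Calls (f reader) and returns its result, or nil if f hits end of input.
  Assumes f reports end of input by throwing EOFException."
  [reader f]
  (try
    (f reader)
    (catch EOFException _ nil)))

(defn read-all
  "Same loop as read-all-json, but with the reading function passed in."
  [reader f]
  (loop [accum []]
    (if-let [record (test-eof reader f)]
      (recur (conj accum record))
      accum)))

;; Demonstration: readByte throws EOFException once the stream is exhausted.
(def bytes-read
  (read-all (DataInputStream. (ByteArrayInputStream. (byte-array [1 2 3])))
            #(.readByte ^DataInputStream %)))
```

Note that because the loop uses if-let, a record that is itself nil or false would also end the loop; for a file of JSON objects, that is not a concern.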
How it works…
Like all Lisps (Lisp is short for List Processing), Clojure is usually read from the inside
out and from right to left. Let's break it down. clojure.java.io/reader opens the file for reading.
read-all-json parses all of the JSON documents in the file into a sequence. In this case, it
returns a vector of the maps. incanter.core/to-dataset takes a sequence of maps and
returns an Incanter dataset. This dataset will use the keys in the maps as column names, and
it will convert the data values into a matrix. Actually, to-dataset can accept many different
data structures. Try (doc to-dataset) in the REPL (doc shows the documentation string
attached to the function), or see the Incanter documentation at
http://data-sorcery.org/contents/ for more information.
Getting ready
First, make sure that your Leiningen project.clj file contains the right dependencies:
(defproject getting-data "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.6.0"]
[incanter "1.5.5"]])
Also, make sure that you've loaded those packages into the REPL or script:
(use 'incanter.core
'incanter.excel)
Find the Excel spreadsheet you want to work on. The file name of my spreadsheet is
data/small-sample-header.xls, as shown in the following screenshot. You can download this
from http://www.ericrochester.com/clj-data-analysis/data/small-sample-header.xls.
How to do it…
Now, all you need to do is call incanter.excel/read-xls:
user=> (read-xls "data/small-sample-header.xls")
How it works…
This can read standard Excel files (.xls) and the XML-based file format introduced in
Excel 2007 (.xlsx).
Fortunately, there's a Clojure-contributed package that sits on top of JDBC (the Java Database
Connectivity API, http://www.oracle.com/technetwork/java/javase/jdbc/index.html)
and makes working with databases much easier. In this example, we'll load a table
from an SQLite database (http://www.sqlite.org/), which stores the database in a
single file.
Getting ready
First, list the dependencies in your Leiningen project.clj file. We will also need to include
the database driver library. For this example, it is org.xerial/sqlite-jdbc:
(defproject getting-data "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.6.0"]
[incanter "1.5.5"]
[org.clojure/java.jdbc "0.3.3"]
[org.xerial/sqlite-jdbc "3.7.15-M1"]])
Finally, get the database connection information. I have my data in an SQLite database file
named data/small-sample.sqlite, as shown in the following screenshot. You can
download this from http://www.ericrochester.com/clj-data-analysis/data/small-sample.sqlite.
How to do it…
Loading the data is not complicated, but we'll make it easier with a wrapper function:
1. We'll create a function that takes a database connection map and a table name and
returns a dataset created from this table:
(defn load-table-data
"This loads the data from a database table."
[db table-name]
(i/to-dataset
(j/query db (str "SELECT * FROM " table-name ";"))))
2. Next, we define a database map with the connection parameters suitable for
our database:
(def db {:subprotocol "sqlite"
:subname "data/small-sample.sqlite"
:classname "org.sqlite.JDBC"})
How it works…
The load-table-data function passes the database connection information directly
through to clojure.java.jdbc/query. It creates an SQL query that returns all
of the fields in the table that is passed in. The result is a sequence of hashes,
each mapping column names to data values. This sequence is wrapped in a dataset by
incanter.core/to-dataset.
See also
Connecting to different database systems using JDBC isn't necessarily a difficult task,
but it's dependent on which database you wish to connect to. Oracle has a tutorial for
how to work with JDBC at http://docs.oracle.com/javase/tutorial/jdbc/basics,
and the documentation for the clojure.java.jdbc library has some good
information too (http://clojure.github.com/java.jdbc/). If you're trying to find
out what the connection string looks like for a database system, there are lists available
online. The list at
http://www.java2s.com/Tutorial/Java/0340__Database/AListofJDBCDriversconnectionstringdrivername.htm
includes the major drivers.
Getting ready
First, include these dependencies in your Leiningen project.clj file:
(defproject getting-data "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.6.0"]
[incanter "1.5.5"]])
Then, find a data file. I visited the website for the Open Data Catalog for Washington, D.C.
(http://data.octo.dc.gov/), and downloaded the data for the 2013 crime incidents.
I moved this file to data/crime_incidents_2013_plain.xml. This is how the contents
of the file look:
<?xml version="1.0" encoding="iso-8859-1"?>
<dcst:ReportedCrimes
xmlns:dcst="http://dc.gov/dcstat/types/1.0/">
<dcst:ReportedCrime
xmlns:dcst="http://dc.gov/dcstat/types/1.0/">
<dcst:ccn><![CDATA[04104147]]></dcst:ccn>
<dcst:reportdatetime>
2013-04-16T00:00:00-04:00
</dcst:reportdatetime>
…
How to do it…
Now, let's see how to load this file into an Incanter dataset:
1. The solution for this recipe is a little more complicated, so we'll wrap it into a function:
(require '[incanter.core :as i]
         '[clojure.xml :as xml]
         '[clojure.zip :as zip])

(defn load-xml-data [xml-file first-data next-data]
  (let [data-map (fn [node]
                   [(:tag node) (first (:content node))])]
    (->>
      (xml/parse xml-file)
      zip/xml-zip
      first-data
      (iterate next-data)
      (take-while #(not (nil? %)))
      (map zip/children)
      (map #(mapcat data-map %))
      (map #(apply array-map %))
      i/to-dataset)))
2. We can call the function like this. Because there are so many columns, we'll just
verify the data that is loaded by looking at the column names and the row count:
user=> (def d
(load-xml-data "data/crime_incidents_2013_plain.xml"
zip/down zip/right))
user=> (i/col-names d)
[:dcst:ccn :dcst:reportdatetime :dcst:shift :dcst:offense
:dcst:method :dcst:lastmodifieddate :dcst:blocksiteaddress
:dcst:blockxcoord :dcst:blockycoord :dcst:ward :dcst:anc
:dcst:district :dcst:psa :dcst:neighborhoodcluster
:dcst:businessimprovementdistrict :dcst:block_group :dcst:census_tract
:dcst:voting_precinct :dcst:start_date :dcst:end_date]
user=> (i/nrow d)
35826
This looks good. This gives you the number of crimes reported in the dataset.
How it works…
This recipe follows a typical pipeline for working with XML:
First, the function parses the XML file and wraps it in a zipper (we'll talk more about zippers in
the next section). Then, it uses the two functions that are passed in to extract all of the data
nodes as a sequence. For each data node, the function retrieves that node's child nodes and
converts them into a series of tag name / content pairs. The pairs for each data node are
converted into a map, and the sequence of maps is converted into an Incanter dataset.
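The same pipeline can be tried on a small in-memory document without Incanter or the crime data file. The following is a sketch only: the XML string, the name xml->maps, and the use of a ByteArrayInputStream are all made up for illustration, and the last step stops at the sequence of maps instead of building a dataset.

```clojure
(require '[clojure.xml :as xml]
         '[clojure.zip :as zip])
(import '[java.io ByteArrayInputStream])

;; Hypothetical sample data: two data nodes, each with two child elements.
(def sample-xml
  "<rows><row><a>1</a><b>x</b></row><row><a>2</a><b>y</b></row></rows>")

(defn xml->maps
  "Parses an XML string and returns one map per data node."
  [xml-string]
  (let [data-map (fn [node] [(:tag node) (first (:content node))])]
    (->> (ByteArrayInputStream. (.getBytes xml-string "UTF-8"))
         xml/parse                    ; parse into nested element maps
         zip/xml-zip                  ; wrap the root in a zipper
         zip/down                     ; move to the first data node
         (iterate zip/right)          ; walk across its siblings
         (take-while some?)           ; stop when there are no more siblings
         (map zip/children)           ; child elements of each data node
         (map #(mapcat data-map %))   ; tag name / content pairs
         (map #(apply array-map %))))) ; pairs into a map

(def sample-rows (xml->maps sample-xml))
```

Here zip/down and zip/right play the roles of first-data and next-data in load-xml-data.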
There's more…
We used a couple of interesting data structures or constructs in this recipe. Both are common
in functional programming or Lisp, but neither has made its way into more mainstream
programming. We should spend a minute with them.
Zippers are very useful and interesting, and understanding them can help you understand
and work better with immutable data structures. For more information on zippers, the
Clojure-doc page is helpful
(http://clojure-doc.org/articles/tutorials/parsing_xml_with_zippers.html).
However, if you would rather dive into the deep end, see Gerard Huet's paper, The Zipper
(http://www.st.cs.uni-saarland.de/edu/seminare/2005/advanced-fp/docs/huet-zipper.pdf).
Processing in a pipeline
We used the ->> macro to express our process as a pipeline. For deeply nested function calls,
this macro lets you read it from the left-hand side to the right-hand side, and this makes the
process's data flow and series of transformations much more clear.
We can do this in Clojure because of its macro system. ->> simply rewrites the calls into
Clojure's native, nested format when the form is macro-expanded. The first parameter of the
macro is inserted into the next expression as the last parameter. This structure is inserted
into the third expression as the last parameter, and so on, until the end of the form. Let's
trace this through a few steps. Say, we start off with the expression
(->> x first (map length) (apply +)). As Clojure builds the final expression, here's each
intermediate step (the elements to be combined are highlighted at each stage):
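The rewriting can also be observed directly in the REPL. The sketch below uses clojure.walk/macroexpand-all to show the final nested form; count stands in for the length used in the text, since length is not a clojure.core function, and the sample value of x is made up for illustration.

```clojure
(require '[clojure.walk :as walk])

;; The threaded form, quoted so that it is treated as data.
(def threaded '(->> x first (map count) (apply +)))

;; Expand the macro without evaluating anything. Step by step this is:
;;   (first x)
;;   (map count (first x))
;;   (apply + (map count (first x)))
(def expanded (walk/macroexpand-all threaded))

;; Both spellings compute the same value for a concrete x.
(def x [["ab" "cde"] ["z"]])
(def threaded-result (->> x first (map count) (apply +)))
(def nested-result (apply + (map count (first x))))
```

Both threaded-result and nested-result sum the lengths of the strings in the first element of x.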
When we're dealing with these formats in Clojure, the biggest difference is that JSON is
converted directly to native Clojure data structures that mirror the data, such as maps and
vectors. Meanwhile, XML is read into record types that reflect the structure of XML, not the
structure of the data.
In other words, the keys of the maps for JSON will come from the domain: first_name or
age, for instance. However, the keys of the maps for XML will come from the data format, such
as tag, attribute, or children, and the tag and attribute names will come from the domain.
This extra level of abstraction makes XML more unwieldy.
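A small comparison makes the difference concrete. This is an illustrative sketch: the person record is invented, and the JSON side is written as the literal map that a JSON parser configured for keyword keys would typically produce, so that the example only depends on clojure.xml. In clojure.xml, the format-level keys are named :tag, :attrs, and :content.

```clojure
(require '[clojure.xml :as xml])
(import '[java.io ByteArrayInputStream])

;; JSON side: {"first_name": "Ada", "age": 36} as a Clojure map.
;; The keys come straight from the domain.
(def from-json {:first_name "Ada" :age 36})

;; XML side: the same record, parsed with clojure.xml.
(def from-xml
  (xml/parse
    (ByteArrayInputStream.
      (.getBytes "<person><first_name>Ada</first_name><age>36</age></person>"
                 "UTF-8"))))

;; The keys of the parsed element describe the format, not the domain.
(def xml-keys (set (keys from-xml)))

;; The domain names only appear as the values of :tag on the child elements.
(def xml-tags (map :tag (:content from-xml)))
```

Getting at the first name is (:first_name from-json) on the JSON side, while the XML side requires searching :content for the element whose :tag is :first_name.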
Getting ready
First, you have to add Enlive to the dependencies in the project.clj file:
(defproject getting-data "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.6.0"]
[incanter "1.5.5"]
[enlive "1.1.5"]])
Finally, identify the file to scrape the data from. I've put up a file at
http://www.ericrochester.com/clj-data-analysis/data/small-sample-table.html,
which looks like this:
It's intentionally stripped down, and it makes use of tables for layout (hence the comment
about 1999).
How to do it…
1. Since this task is a little complicated, let's pull out the steps into several functions:
(require '[clojure.string :as string]
         '[net.cgrand.enlive-html :as html]
         '[incanter.core :as i])
(import '[java.net URL])

(defn to-keyword
  "This takes a string and returns a normalized keyword."
  [input]
  (-> input
      string/lower-case
      (string/replace \space \-)
      keyword))

(defn load-data
  "This loads the data from a table at a URL."
  [url]
  (let [page (html/html-resource (URL. url))
        table (html/select page [:table#data])
        headers (->>
                  (html/select table [:tr :th])
                  (map html/text)
                  (map to-keyword)
                  vec)
        rows (->> (html/select table [:tr])
                  (map #(html/select % [:td]))
                  (map #(map html/text %))
                  (filter seq))]
    (i/dataset headers rows)))
2. Now, call load-data with the URL you want to load data from:
user=> (load-data (str "http://www.ericrochester.com/"
"clj-data-analysis/data/small-sample-table.html"))
| :given-name | :surname | :relation |
|-------------+----------+-------------|
| Gomez | Addams | father |
| Morticia | Addams | mother |
| Pugsley | Addams | brother |
| Wednesday | Addams | sister |
…
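The to-keyword helper from step 1 is easy to check on its own, since it only needs clojure.string. The sketch below repeats its definition so that it can be pasted into a fresh REPL; the header strings are guesses at what the table's header cells contain, based on the column names in the output above.

```clojure
(require '[clojure.string :as string])

(defn to-keyword
  "This takes a string and returns a normalized keyword."
  [input]
  (-> input
      string/lower-case           ; "Given Name" -> "given name"
      (string/replace \space \-)  ; "given name" -> "given-name"
      keyword))                   ; "given-name" -> :given-name

;; Assumed header text, for illustration only.
(def sample-headers
  (mapv to-keyword ["Given Name" "Surname" "Relation"]))
```

Note that -> threads its argument in as the first parameter of each form, which is why it is used here instead of ->>.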
How it works…
The let bindings in load-data tell the story here. Let's talk about them one by one.
The first binding has Enlive download the resource and parse it into Enlive's internal
representation:
(let [page (html/html-resource (URL. url))
The next binding selects the table with the data ID:
table (html/select page [:table#data])
Now, select all of the header cells from the table, extract the text from them, convert each to a
keyword, and then convert the entire sequence into a vector. This gives headers for the dataset:
headers (->>
(html/select table [:tr :th])
(map html/text)
(map to-keyword)
vec)
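To see what the header pipeline produces without fetching a page, here is a small sketch that runs to-keyword over a few hand-written strings. The header-texts vector is a hypothetical stand-in for what html/text would return for the th cells.

```clojure
(require '[clojure.string :as string])

(defn to-keyword
  "Takes a string and returns a normalized keyword."
  [input]
  (-> input
      string/lower-case
      (string/replace \space \-)
      keyword))

;; Hypothetical header texts, standing in for the output of html/text.
(def header-texts ["Given Name" "Surname" "Relation"])

;; Same shape as the headers binding: map, then convert to a vector.
(def headers (vec (map to-keyword header-texts)))
```

Under these assumptions, headers holds the keywords :given-name, :surname, and :relation, in that order.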
First, select each row individually. The next two steps are wrapped in map so that the cells in
each row stay grouped together. In these steps, select the data cells in each row and extract
the text from each. Last, use filter seq, which removes any rows with no data, such as the
header row:
rows (->> (html/select table [:tr])
(map #(html/select % [:td]))
(map #(map html/text %))
(filter seq))]
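The filter seq step relies on seq returning nil for an empty collection. The following sketch shows the effect on hand-written data. The cell-texts value is hypothetical and only imitates what the two map steps would yield for a header row followed by two data rows.

```clojure
;; Hypothetical cell texts per row, as they might look after the two map steps.
;; The header row holds only th cells, so selecting td from it yields nothing.
(def cell-texts
  [()                                  ; header row: no td cells
   '("Gomez" "Addams" "father")
   '("Morticia" "Addams" "mother")])

;; seq returns nil for an empty collection, so filter drops the empty row.
(def rows (filter seq cell-texts))
```

Only the two data rows remain in rows, which is why the header row never shows up in the dataset.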
Here's another view of this data. In this image, you can see some of the code from this web
page. The variable names and select expressions are placed beside the HTML structures that
they match. Hopefully, this makes it more clear how the select expressions correspond to the
HTML elements:
It's important to realize that the code, as presented here, is the result of a lot of trial and error.
Screen scraping usually is. Generally, I download the page and save it, so I don't have to keep
requesting it from the web server. Next, I start the REPL and parse the web page there. Then,
I can take a look at the web page and HTML with the browser's view source function, and I can
examine the data from the web page interactively in the REPL. While working, I copy and paste
the code back and forth between the REPL and my text editor, as it's convenient. This workflow
and environment (sometimes called REPL-driven-development) makes screen scraping
(a fiddly, difficult task at the best of times) almost enjoyable.
See also
• The next recipe, Scraping textual data from web pages, has a more involved example
  of data scraping on an HTML page
• The Aggregating data from different formats recipe has a practical, real-life example
  of data scraping in a table
Scraping textual data from web pages
Getting ready
First, we'll use the same dependencies and the require statements as we did in the last
recipe, Scraping data from tables in web pages.
Next, we'll identify the file to scrape the data from. I've put up a file at
http://www.ericrochester.com/clj-data-analysis/data/small-sample-list.html.
This is a much more modern example of a web page. Instead of using tables, it marks up the
text with the section and article tags and other features from HTML5, which help convey
what the text means, not just how it should look.
As the screenshot shows, this page contains a list of sections, and each section contains a list
of characters:
How to do it…
1. Since this is more complicated, we'll break the task down into a set of
smaller functions:
(defn get-family
  "This takes an article element and returns the family
  name."
  [article]
  (string/join
    (map html/text (html/select article [:header :h2]))))

(defn get-person
  "This takes a list item and returns a map of the person's
  name and relationship."
  [li]
  (let [[{pnames :content} rel] (:content li)]
    {:name (apply str pnames)
     :relationship (string/trim rel)}))

(defn get-rows
  "This takes an article and returns the person mappings,
  with the family name added."
  [article]
  (let [family (get-family article)]
    (map #(assoc % :family family)
         (map get-person
              (html/select article [:ul :li])))))

(defn load-data
  "This downloads the HTML page and pulls the data out of
  it."
  [html-url]
  (let [html (html/html-resource (URL. html-url))
        articles (html/select html [:article])]
    (i/to-dataset (mapcat get-rows articles))))
2. Now that these functions are defined, we just call load-data with the URL that we
want to scrape:
user=> (load-data (str "http://www.ericrochester.com/"
"clj-data-analysis/data/"
"small-sample-list.html"))
| :family | :name | :relationship |
|----------------+-----------------+---------------|
| Addam's Family | Gomez Addams | — father |
| Addam's Family | Morticia Addams | — mother |
| Addam's Family | Pugsley Addams | — brother |
…
How it works…
Examining the web page shows that each family is wrapped in an article tag that contains a
header with an h2 tag. get-family pulls that tag out and returns its text.
get-person processes each person. The people in each family are in an unordered list
(ul), and each person is in an li tag. The person's name itself is in an em tag. let gets the
contents of the li tag and decomposes it in order to pull out the name and relationship
strings. get-person puts both pieces of information into a map and returns it.
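The destructuring in get-person is easier to follow on a concrete value. Enlive represents parsed elements as maps with :tag, :attrs, and :content keys, so the sketch below builds one li node by hand and passes it through the same function. The sample-li value is hypothetical and is not taken from the actual page.

```clojure
(require '[clojure.string :as string])

(defn get-person
  "Takes a list item node and returns a map of the person's
  name and relationship."
  [li]
  ;; The first child is the em element; the second is the trailing text.
  (let [[{pnames :content} rel] (:content li)]
    {:name (apply str pnames)
     :relationship (string/trim rel)}))

;; Hand-built stand-in for a parsed li element (hypothetical data).
(def sample-li
  {:tag :li
   :attrs nil
   :content [{:tag :em :attrs nil :content ["Gomez Addams"]}
             " father "]})

(def person (get-person sample-li))
```

Here pnames binds to the content of the em element and rel binds to the text that follows it, so person maps :name to "Gomez Addams" and :relationship to "father".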
get-rows processes each article tag. It calls get-family to get that information from
the header, gets the list item for each person, calls get-person on the list item, and adds
the family to each person's mapping.
Here's how the HTML structures correspond to the functions that process them. Each function
name is mentioned beside the elements it parses:
Finally, load-data ties the process together by downloading and parsing the HTML file and
pulling the article tags from it. It then calls get-rows to create the data mappings and
converts the output to a dataset.
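The way get-rows and load-data combine can be sketched with plain data. In the example below, families and rows-for are hypothetical names: families imitates the per-article results, and rows-for plays the role of get-rows by adding the family name to each person map. mapcat then flattens the per-family sequences into one sequence of rows, which is the shape that gets converted to a dataset.

```clojure
;; Hypothetical per-article data: a family name paired with person maps
;; shaped like the output of get-person.
(def families
  [["Family A" [{:name "Person One" :relationship "father"}
                {:name "Person Two" :relationship "mother"}]]
   ["Family B" [{:name "Person Three" :relationship "sister"}]]])

(defn rows-for
  "Adds the family name to every person map, as get-rows does."
  [[family people]]
  (map #(assoc % :family family) people))

;; mapcat concatenates the per-family sequences into one flat sequence.
(def all-rows (mapcat rows-for families))
```

Each element of all-rows is a single map with :family, :name, and :relationship keys, one per person across all families.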
Linked data represents entities as consistent URLs and includes links to other databases
of linked data. In a sense, it's the computer equivalent of human-readable web
pages. Often, these formats are used for open data, such as the data published by some
governments, in the UK and elsewhere.
Linked data adds a lot of flexibility and power, but it also introduces more complexity.
Often, to work effectively with linked data, we need to start a triple store of some kind.
In this recipe and the next three, we'll use Sesame (http://rdf4j.org/) and the
kr Clojure library (https://github.com/drlivingston/kr).
Getting ready
First, we need to make sure that the dependencies are listed in our Leiningen
project.clj file:
We'll load these packages into our script or REPL:
(use 'incanter.core
'edu.ucdenver.ccp.kr.kb
'edu.ucdenver.ccp.kr.rdf
'edu.ucdenver.ccp.kr.sparql
'edu.ucdenver.ccp.kr.sesame.kb
'clojure.set)
(import [java.io File])
For this example, we'll get data from the Telegraphis Linked Data assets. We'll pull down the
database of currencies at http://telegraphis.net/data/currencies/currencies.ttl.
Just to be safe, I've downloaded that file and saved it as data/currencies.ttl, and
we'll access it from there.
We'll store the data, at least temporarily, in a Sesame data store
(http://notes.3kbo.com/sesame) that allows us to easily store and query linked data.
How to do it…
The longest part of this process will be to define the data. The libraries we're using do all of
the heavy lifting, as shown in the steps given below:
1. First, we will create the triple store and register the namespaces that the data uses.
We'll bind this triple store to the name tstore:
(defn kb-memstore
  "This creates a Sesame triple store in memory."
  []
  (kb :sesame-mem))

(defn init-kb [kb-store]
  (register-namespaces
    kb-store
    '(("geographis"
       "http://telegraphis.net/ontology/geography/geography#")
      ("code"
       "http://telegraphis.net/ontology/measurement/code#")
      ("money"
       "http://telegraphis.net/ontology/money/money#")
      ("owl"
       "http://www.w3.org/2002/07/owl#")
      ("rdf"
       "http://www.w3.org/1999/02/22-rdf-syntax-ns#")
      ("xsd"
       "http://www.w3.org/2001/XMLSchema#")
      ("currency"
       "http://telegraphis.net/data/currencies/")
      ("dbpedia" "http://dbpedia.org/resource/")
      ("dbpedia-ont" "http://dbpedia.org/ontology/")
      ("dbpedia-prop" "http://dbpedia.org/property/")
      ("err" "http://ericrochester.com/"))))
Emily to Mrs. Shaen.
Octavia is starting on a three days’ tour among the lakes with Miss
Harris. On Monday evening we had a concert and reading for the
tenants; and a letter from Octavia was read to them, which they all
responded to most beautifully; one of the men made a most touching
little speech in reply. Many of them said they had never enjoyed an
evening so much in their lives; and I have been so much touched and
delighted at several little acts of kindness and consideration towards
me; their silent answer to Octavia’s appeal that they would try to
make the work easier for those who are carrying it on for her.
HOME-SICKNESS
Heatherside, Wellington College,
Woking,
September 28th, 1867.
Thank you so much for the accounts; how beautifully you have
managed them....
It is dreadfully tempting to be so near you all. I long to be amongst
you, if it were only just to feel myself with you for an hour or two.
You seem to me such a blessed company gathered round that dear
old home of ours. But the time will not now seem long before I really
see you, and am once more at work. Remember me to all the tenants.
Tell Mrs. Moirey she must remember I don’t mean to lose sight of
her and hers....
My love to Mrs. Hughes and the children. It is such a comfort to
think of your being back to help them at home, and I like to think of
your cheering them by your uniform brightness as you used to cheer
me.
Emily to Octavia.
We had a very nice work class. Andy reads while I attend to the
people. They were anxious to hear about you, and were touched at
your sending the little presents to the children.... I think I have got a
much better set of tenants. M. is very anxious to pay; still I feel it so
uncertain with his health in that state. You know he was a year
without work; and when he got it, he was too weak and gave himself
an internal strain....
It was so nice to see how the pupils had thought of the poor
children, and had brought little presents. Mary had brought an ivy
root and fern roots, and clothes for the work class. Louisa had cut out
and made entirely a lovely dress and jacket for Alice P., and had
dressed some very pretty dolls, and brought some splendid flowers.
Harriet had gathered blackberries and made them into jam. It is so
nice that they remembered and cared for these things in their
holidays.
To Emily.
EXPERIENCES IN
FLORENCE
October 18th, 1867.
Miranda to Octavia.
Florence,
November 2nd, 1867.
To Emily.
HOME-SICKNESS
Florence.
To Emily.
Via de Serragli,
January 24th, 1868.
To Emily.
Emily to Octavia.
To Emily.
To Miss Mayo.
The time of my leaving here draws sadly near and I have done so
little—mostly weeding I think, and that is so interesting, it keeps me
out of doors, not standing or walking and yet gives me something to
do. It is quiet and nice and I like the smell of the earth and the
soothing monotony of the movement and thought. We have not been
reading anything of any depth or weight; usually we do here, but
somehow this time we have read scraps of things, and what I should
call decidedly light reading. “Scenes in Clerical Life,” part of Chaucer,
the “Story of Doom” (I am delighted with Laurence), a good deal of
Browning, and a little of Thackeray.
To Miss Mayo.
MORRIS’S
“JASON”
14, Nottingham Place,
October 4th, 1868.
To a Friend.
To Miss Baumgartner.
My dearest Mary,
I have had a most delightful week. The crowning day was last
Sunday, when I dined at Ruskin’s. It was exceedingly interesting. I
had been determined to ask him a little about Greek mythology,
literature and art; and how, without knowledge of Greek, one might
enter into some comprehension of all these; for I have lived long
enough to remember the passionate revolt of our then young
thinkers against the dead formal worship of all that had its origin in
Greece; and now I am interested to notice the men, leading from
weight of earnestness, tho’ educated in all the Gothic and Teuton
sympathies, turning back to Greek thought, and even imagery, as if it
contained nobler symbols of abiding truth than our northern
legends. Yes, even to feel the influence of the Grecian wave myself.
So we got into interesting talk. He told me that there was little
translation of Greek which he knew or cared for; that he had done a
little himself, which will be published with next Tuesday’s lecture;
that Homer, even translated by Pope, taught one a good deal; that
some tales by Cox (do you know them?) were intensely good; but (as
I was pleased to know that I had instinctively felt), Morris’s Jason
was the most helpful almost of all. He sketched for me most
beautifully, a kind of plan of Greek mythology, saying that the deities
who governed the elements were the primary ones; the earth the
sustainer of man; the water governing the ebb and flow of his
fortunes, the two fiery deities earthly and heavenly; and the goddess
of the air the inspirer. He quoted curious parallel thoughts from the
Bible; “the wind bloweth where it listeth.” He told me some strange
things, too, about Minerva giving men strength from winged beings,
and once, when enduing Menelaus with courage to fight Paris, giving
it from a mosquito; whereas most gods give them strength from
quadrupeds that are strong. Round these central deities are grouped
many minor ones; Mercury, the cloud-compeller, often represented
as a shepherd, guides the footsteps of men in life and death.
RUSKIN ON GREEK MYTHOLOGY
I asked him how far Virgil was too Roman to be trusted. He seemed very much pleased to
find that I could read Virgil, and was fond of
him; it seems that he is very fond. He said moreover that Latin was
untranslateable—being so magnificent a language; whereas Greek,
mainly depending for its interest on thought, could be perfectly well
translated. I found that he and I agreed in liking the 2nd and 7th
books best, he rather inclining to the Infernal Regions and the Fall of
Troy. He told me that the exquisite tenderness between fathers and
sons delighted him above all things in Virgil, and led one to the root
of the main source of Roman greatness in its noblest time.
You will be sorry to hear that Miss Cons can only, at present, give
one day and a half weekly to the work; and that Miss Sterling is so
much interested in what she calls “linking my little affairs to
whatever has life,” that she will not work except near us—nor of
course could Miss Cons do more than this in so short a time.
On Wednesday we are to have our play.[64] We are actually to have
an audience of 200 poor people. Everyone is very kind about it; we
have a splendid room, and all promises well.
Oh, Mary! life and its many interests is a great and blessed
possession. I love it so much.... And yet it seems such a simple, quiet
thing to slip out of it presently; and for other and better people to
take up their work, and carry it on for their day too.
... The trees are of course very small; but the creepers helped us,
and the playground never looked so pretty. Our new swings were put
up; and three people were entirely occupied with superintending
them the whole time. Each child had a definite time allowed; and all
others were kept out of the way; no easy matter with children so
eager and so unaccustomed to control. The little band acquitted itself
admirably, considering how young it is yet; it is an acquisition. We
had numbers of games of course. The see-saw was crowded all the
time. Two people took charge of it; and it seemed about as much as
they could manage. It was very touching to see the children, when
they first saw me open the gate. Our tenants were to come in first;
and I had to pick them out from the dense mass of eager faces. Such
impatience! as if a few minutes were hours! Such a break of light
came over the face as I caught the eye of a tenant; the “Mary, you
may come,” or “Dickey, you next,” was entirely unnecessary to the
child addressed, but was the signal for others to make way; and thro’
such tiny avenues, or from under bigger girls’ skirts, the tiny
creatures emerged to the wonderful place of flowers and the many
welcoming friends. I was rather proud to see that I was usually
guided by a neater dress or cleaner face to a tenant. Then followed
the admission of a few children coming to classes, or members of the
band or drill classes, but not tenants. And then the mass of children
from the neighbourhood. Oh such a troop! The grown up people
crowded on any place from which they could see. I wished our wall
had been moved, and the rails up, both for the extra space, and that
more people might have seen. All children had flowers, cake, and an
orange on leaving. My conclusion is, the place is really getting into
order.
I had the report from a surveyor on the houses for which we are in
treaty. He says very naïvely, “It seems to me the houses are much out
of repair, tho’ considered by the landlord in excellent condition for
the class of inmates.” He says, too, the property in the
neighbourhood is in excellent condition, and will let well.... Will you
send me a copy of papers respecting boarding out? I should much
like to send them to Mrs. N. Senior. I believe the chances are better
in the country, and the plan more likely to be tried there.
I am glad you think it is best to wait and see the June list for
Macmillan. It will be very odd if the thing ever is published. I am
looking forward so eagerly to throwing the burden of the playground
expenses, at least partially, on our new buildings; they are such a
perpetual worry to me, and for so small a sum it seems a pity to be
annoyed. If the surplus profit of the rooms will, as I hope, pay for the
superintendence, it will make a great difference to me. We hope to
finish the building this week. I feel so ungrateful when I complain of
anything, when all has prospered in this wonderful way. Perhaps I
am a little tired to-night.
14, Nottingham Place,
April 13th, 1869.
To Mrs. W. Shaen.
I cannot tell you what my people are to me. We are such thorough
friends. Sometimes small actions of theirs go straight to one’s heart,
making me feel how nice our relation to one another is. The other
day I went down the court, once so savage and desolate. I saw two or
three of the worst boys in the neighbourhood looking very happy and
smiling. “Have you seen Mrs. Mayne?” they clamoured eagerly. Mrs.
Mayne is our superintendent there. “She’s got something taking care
of for you.” I found that the boys had walked twelve miles, doubtless
delighted with the expedition, but specially to bring me back a great
quantity of “palm.” And, as I came out carrying it, “Will you have
some more?” “Wait a bit and have some more,” they cried. When I
remembered that these same boys had been our greatest trouble,
defying authority, climbing walls, breaking windows, throwing
stones, with their hands against us in all things, I could not but feel
that we had got on a little, however the houses may fall short in
external perfection of what one longs for them to be. I have hardly
any of the teaching at home; dear Andy and Minnie having thrown
their strength fully into it; so Flo and I only take special classes; but
the bright young life round one is very refreshing; and I grow much
attached to some of the girls;—not the old sense of being any longer
their head; this, you will understand, I am not sorry to resign,
however precious the position was. Meantime, I have my little
sanctum here and go out among my ever-increasing circle of real
friends. My work now is mainly teaching drawing, which I enjoy
much.
IMPROVEMENT
IN THE COURTS
June 7th, 1869.
... We are having a large meeting in the parish this week to try to
organise the relief given; very opposite creeds will be represented—
Archbishop Manning, Mr. Davies, Mr. Fremantle, Eardley-Wilmot,
and others. I must go myself. I shall try to get Rose to go too....
Lady Ducie writes that she is perfectly engrossed in your book, and
tells me she must get it. She is quite appalled at the state of things in
the workhouse; it seems quite to be weighing on her mind.
To the Same.
... I daresay one is apt to overrate one’s own work; but one is the
more anxious to have it fairly weighed, and receive all advice from
other people; and I do want to have it fairly considered, and get the
authorities to recognise it. Mr. F., the rector of our district, and the
main mover in the matter, is to call on me to-day. May some power
inspire me with intellect and speech! I have hardly a hope that they
will place me on the Committee. I shall try boldly; but I think no
ladies will be admitted. Mr. F. is happily a friend of Lady Ducie’s.
P.S.—Mr. F. has just been, and will propose my name at the
Committee.
Ben Rhydding,
September 10th, 1869.
To Emily.
... Life here has been a great success every way. It is odd, in a place
like this, to get on so well; but energy and enjoyment are such a
delight to people, they forgive much, where they can secure them and
have these. A large picnic party went to Fountains yesterday. They
begged me to go. I could not, and said, “I will ask all the people, and,
when you are started, you really won’t want me.” “Oh,” said a young,
buoyant Quaker youth, “but we do want you to talk.” ... In pity also
give me some more teaching; it is the only anchor I have, and I shall
be destroyed by dissipation if you don’t preserve me. Oh dear, I have
been writing three hours; and I did so want to do my miniature; for
you don’t know how much I want to finish it.
ALARM AT HER
OWN FAME
6, Clifton Villas,
Bradford,
September 17th, 1869.
To Emily.
The period from 1870–1875, if it contains less of what may be called new
departures in Octavia’s life than the period which preceded it, or that which
followed it, yet can show phases of struggle, constructive work, and the discipline
of trial and opposition, as remarkable as at any time of her life; and it also includes
an important change in her circumstances, which much affected all her subsequent
career.
It may be said, perhaps, that the distinctive characteristic of this period was that
it brought her greater publicity than her previous efforts had produced, and so
answered her question to Ruskin, “Who will ever hear of what I do?”
First of all: the time was one in which a variety of circumstances had been
compelling many, who had not hitherto shown much interest in the poor, to turn
their attention in that direction; while many others, who had been anxious to do
their duty to the poor, had begun to realise that the hap-hazard methods of relief
hitherto in vogue had broken down.
The failure of large Mansion House Funds, which had been raised in the ’sixties
to meet special distress, had brought home to many workers among the poor the
need of substituting closer co-operation for their isolated efforts. Some of those,
who had realised this need, also perceived that it was necessary to make enquiry
into the conditions of the applicants for relief, before they could discover the best
means of assisting them.
The great variety of characters and ideals and experiences which marked the
people, who were thus temporarily drawn together, naturally tended to produce
considerable collisions; and, in order to understand Octavia’s attitude to the
Charity Organisation Society, one must remember the different difficulties with
which she had to deal. There were, of course, those who had rushed into the
movement, as they would have taken up any other new fashion in dress or mode of
life or locomotion, and who wished to do nothing that would unduly offend
fashionable feeling. These were backed in many cases by people of a higher stamp,
—tender-hearted men and women, who were impressed by the misery of the poor,
and who merely looked to the Society as a newer, and more efficient, relief agency.
At the other extreme were those who thought that organisation and rules could do
everything. Then again the attempts at organisation of charity had led to the
discovery that many so-called charitable societies were utterly corrupt in their
objects, and that many more were unwise and careless in their methods of relief.
This raised a furious desire for radical reform, which at one time threatened to
substitute destruction for organisation. Along with this iconoclastic zeal was a
violent anti-clerical feeling, founded on the belief that the clergy were the authors
and chief abettors of the old irregular system of relief. Into this vortex of
controversy Octavia was unavoidably dragged.
EARLY DAYS OF THE C.O.S.
It will have been seen (and it will have to be reiterated in various forms) that she believed in
personal and sympathetic intercourse with the poor, as
far more important than any organisation; and that, where co-operation and
organisation were necessary, she preferred small local efforts to great centralised
schemes. At the same time, she felt that the giving of money, when dissociated, as
it too often is, from real sympathy, does infinite harm, and should be checked by
reformers of charity.
Both points were emphasised by Octavia in the paper which she read before the
Social Science Association in 1869 on the “Importance of aiding the poor without
alms-giving.”
“Alleviation of distress,” she says, “may be systematically arranged by a society;
but I am satisfied that, without strong personal influence, no radical cure of those
who have fallen low can be effected. Gifts may be pretty fairly distributed by a
Committee, though they lose half their graciousness; but, if we are to place our
people in permanently self-supporting positions, it will depend on the various
courses of action suitable to various people and circumstances, the ground of
which can be perceived only by sweet subtle human sympathy, and power of
human love.”
And again:—
“By knowledge of character more is meant than whether a man is a drunkard or
a woman is dishonest; it means knowledge of the passions, hopes, and history of
people; where the temptation will touch them, what is the little scheme they have
made of their own lives, or would make, if they had encouragement; what training
long past phases of their lives may have afforded; how to move, touch, teach them.
Our memories and our hopes are more truly factors of our lives than we often
remember.”
With regard to her relations to the clergy, I may mention that, while the Charity
Organisation Society was still in its infancy, she began an experiment in a
Marylebone district which was entirely under the guidance of Rev. W. Fremantle,
the Vicar of the parish, now Dean of Ripon. So much was Mr. Fremantle impressed
by the usefulness of this work, that he persuaded Octavia to send in an account of it
to the Local Government Board.
It was also through this work that she became acquainted with Rev. Samuel
Barnett, then curate to Mr. Fremantle, and since so widely known as the promoter
of various good works, and especially as the Founder of Toynbee Hall. It was in
connection with this Committee that Octavia insisted most on the desirability of
substituting employment for relief whenever possible; and out of this plan also
arose the scheme of Charity Organisation pensions, which has since formed so
important a part of the work of the Society.
It may seem strange that, with her preference for individual effort, and for small
local organisations, she should have consented to become a member of the Central
Council of the Charity Organisation Society. But there was much in that position
which chimed in with her aspirations. The Society was, after all, a federation of
local Committees, acting in sympathy with each other, but quite independent of
each other in many of their arrangements. Then, in theory at least, the Committees
acted on the principle that every case was to be dealt with on its own merits; a
principle which, if fully carried out, would have been a great protection against
mere officialism. The Central Council too was a debating Society, for the exchange
of ideas on specially pressing difficulties, rather than a regular governing body.
And, in spite of what I have said of the mixed elements in the Council, it must be
remembered that the membership of that body brought Octavia into touch with
many eminent workers in the reform of charity, amongst whom I would specially
mention the courteous and tactful Secretary, Mr. C. P. B. Bosanquet, whose
services in the stormy birth time of the Society are too often forgotten.
DEFECTS IN THE C.O.S.
Nevertheless there were some reforms in the spirit and methods of the Society to which Octavia found it
necessary to give attention; and, as I often went with
her to the Council meetings, I may claim to know the points which interested her.
Thus she soon began to be alarmed at that iconoclastic zeal of which I have spoken;
particularly as in some who then influenced the Society’s action this zeal had
produced a positive delight in attacking for attack’s sake. A long struggle, in which
Octavia took part, ended in changes which at least modified this unfortunate state
of mind.
Another and marked defect in the organisation of the Council led Octavia to
abandon, for a time, one of her special beliefs in order to enforce another, which
seemed to her of more importance. The Committees of the Society, through which
direct relief work has always been carried on, were divided according to the chief
London districts; and thus some Committees of the richer parishes were much
more able to raise funds in their own neighbourhood than could the Eastern and
Southern Committees. The consequence was that the Central Society was obliged
to supply funds to supplement the needs of the poorer districts; and, in return,
claimed to exercise a control over the distribution of those funds, which could not
be claimed over the richer Committees.
Thus the poorer Committees were deprived of the independence secured by the
richer ones.
In order to equalise these arrangements, it was proposed to centralise all the
funds of the Society in Buckingham Street. Octavia advocated the change; but the
majority of the Council felt that such a change would destroy that local interest in
the work, on which the strength of the Society depended; and subsequent
modifications in the arrangements of the Committees, aided, perhaps, by a
considerable change in the personnel of the Council, did, to some extent, reform
the defect which I have referred to. It may be said generally that, as the aims of the
Society became more coherent and definite, and the chief workers grew more alike
in their fundamental principles, Octavia’s sympathies with the Society increased;
and when Mr. Loch succeeded Mr. Bosanquet as secretary of the Council, her
friendship for the new secretary still further strengthened her approval of the
action of the Society.
Her sympathies with the enquiry traditions of the Society, and with the
restrictions on reckless relief, often startled and repelled some of the more
impulsive philanthropists; but one of the most earnest of them wrote, “I remember
taking to her a typical case for advice, and she gave me what I thought stern advice,
and I demurred. But she was right; and I often thought of it afterwards.”
During this very period, her attention was painfully drawn to the difficulties of
her local and more individual work, and to the dangers of that purely official view
of charitable movements, against which she was always on her guard.
She had published in a magazine an account of the courts which had been placed
under her care; and of course, in explaining the object of her undertaking, she was
obliged to describe the condition of the houses when first she undertook the
management of them. Unfortunately, some fussy person took the article to the
medical officer, with the question, “If these things were so, what were you doing?”
The medical officer was at once seized with a panic, and ordered the destruction of
all the houses in that court. Octavia thereupon went to remonstrate with him; and,
after hearing her explanations, he withdrew the order. But he had to report to the
Vestry, so the matter could not end with that withdrawal. The majority of the
Vestry took the side of their officer; and one zealous vestryman exclaimed that he
hoped they would hear no more of Miss Hill and her houses. The bitterness was so
keen that Octavia feared that the tenants of the court would be affected by the local
opinion. Mr. Bond, however, who took an active interest in the workmen’s club,
which had been formed in the court, explained the circumstances to the men; and
the general feeling of the tenants was drawn to Octavia’s side. Mr. Ernest Hart
undertook to discuss the matter with the medical officer; and gradually the official
feeling changed, or at least was greatly modified. But three incidents bearing on
the affair should be mentioned.
DIFFICULTIES WITH VESTRY OFFICIALS
During the controversy, Octavia’s attention was called to the dangers which
would come to the court from a public house built close to it. Her first idea was to
secure some kind of disinterested management which should prevent the evils of
the ordinary public house; but, finding that, for the time, this was impracticable,
she addressed herself to the work of defeating the licence. This she succeeded in
doing, but one of the J.P.’s, who had specially championed the publican, was so
furious that he addressed insulting remarks to her in reference to her management
of the houses.
On the other hand she was much cheered by a letter from Ruskin, received
during this crisis. Not long after the first houses were bought he had begun a little
to cool towards the work, partly from not understanding Octavia’s attitude towards
alms-giving; and partly from that horror of London ugliness which led him to think
that any London scheme must fail. But his personal regard for Octavia remained
untouched; and, visiting Carlyle during the crisis, he spoke of Octavia’s work, and
received such a warm expression of admiration from the “Sage of Chelsea,” that he
noted down the words and promptly sent them to Octavia, greatly to her delight.
The third incident refers to the attitude of her friends on the Charity
Organisation Council. Some of them thought that her management of the courts
should be considered as affecting their movement, and that a friendly enquiry into
her methods would strengthen their hands. She disliked the thought of greater
publicity, but reluctantly consented to submit her books and papers to the Special
Committee appointed for this enquiry. Though they were friendly in tone, Octavia
greatly disliked the visits of these gentlemen; and, when they wished to examine
the tenants of the courts to find out the moral effects produced on them by the
changes, Octavia put her foot down, and declined to allow this interference
between herself and her “friends.”
I have given what some may think an undue prominence to this attack on her by
the Marylebone officials; but I have two grounds for that course. One is that it was
the first important exhibition of that officialism which increased in Octavia her
strong dislike of State or Municipal management. The other is that the intensity of
her feeling on the matter brings out a point in her character of which many were
unaware. I remember well that when Mrs. Nassau Senior was smarting under the
attacks on her report on Workhouse Reform, two men remarked that “Miss
Octavia Hill would not have felt such attacks, as Mrs. Senior did.” Both were
intelligent men, and both had some personal acquaintance with Octavia. But both
were mistaken.
It was in the middle of these difficulties and struggles that her attention was
partly diverted from her own work by her interest in the affairs of a friend; and, for
what I believe to have been the only time in her life, she took an active share in an
attempt to return a Member to Parliament. This was in 1874 when Mr. Thomas
Hughes came forward as a candidate for Marylebone. Her personal admiration for
him, dating from the old Christian Socialist days, and strengthened by her
experiences as teacher to his children, decided her to abandon her general
indifference to Parliamentary work; and she declared with her usual vehemence
that they would return him. Canvassers went out from 14 Nottingham Place with
electioneering circulars; and all friends whom Octavia could influence were
pressed into the service. Unfortunately, for reasons which do not concern this
biography, the effort failed; and, by a curious combination of circumstances,
several people were led to attribute Octavia’s zeal to an interest in the cause of
Female Suffrage.
ATTITUDE TOWARDS FEMALE SUFFRAGE
This mistaken idea seems to make this a proper place for a short word of
explanation of her attitude on this question. The fact is that Octavia never felt
the keen interest in the public questions of the day which
animated Miranda; and, since she had discovered that she could do a definite piece
of work for the good of the poor, she had begun to feel a positive dislike for
Parliamentary life, and party politics, as tending to draw people away from
“cultivating their own garden,” into taking part in wider, but less immediately
useful, work. This opinion she felt it specially necessary to emphasise in reference
to women.
First; it was with women that she specially co-operated in her work among the
poor; and her discovery of a new outlet for their energies, and her warm
appreciation of their possible capacity, led her to look on the Female Suffrage
movement as a sort of red herring drawn across the path of her fellow workers,
which hindered them from taking an adequate interest in those subjects with
which she considered them specially fitted to deal. Secondly, even in that pacific
phase of the Female Suffrage movement, there were champions of this cause who
thought it more important to call attention to what women could accomplish than
to undertake regular work. Thus they seemed to promote that intense love of
advertising which Octavia abhorred. Lastly, there were always people who assumed
that one, who had done so much efficient work, must be in favour of a change,
which would enable so many other women less well provided with powers of work
to accomplish more than they could now succeed in doing. And this mistake was
strengthened by the constant confusion between Octavia and her friend Miss
Davenport Hill.
Although she acknowledged in a letter (written from Tortworth and published in
this book) that this indifference to these larger issues deprived her of some
valuable information, and put her at a disadvantage, she always continued, to the
end of her life, to act in these matters rather (as in the Marylebone election) from
motives of personal sympathy with some special adviser than from those carefully
considered reasons which guided her in the work identified with her name.
Of course in the biography of any original thinker or actor one must record
apparent contradictions; and it is rather curious that this same period, which
contains her one interference in a Parliamentary election, is marked also by her
one active attempt to assist in the framing of an Act of Parliament, the Artizans’
Dwellings Bill, which brought her into some opposition to more extreme
individualists than herself. The main part of her action in this matter will be best
brought out by the letters which follow; but there is one point which may be
overlooked, and on which I should specially like to insist. In the very period when
she was enduring such harsh treatment from the medical officer and the Vestry,
she helped in promoting a measure which increased the power both of medical
officers and of local councils, in dealing with houses like those under her charge.
Thus she made it clear that she could see the general advantage of machinery,
which had been, and might be, turned against herself.
It must be remembered that all this trying work was carried on while she was
still engaged in teaching the pupils at the Nottingham Place School; and many of
her friends had felt, for some time, that the effort, needed for the two kinds of
work, was too great a strain on her strength. An offer of pecuniary assistance by a
friend a few years before this time had been gently, but firmly, refused; but, under
Mr. William Shaen’s guidance, a number of wealthy friends succeeded, without her
knowledge, in organising a fund which should make her free for the further
development of the housing-reform schemes. As this plan had been brought into a
definite form before Octavia was aware of it, and as she remembered that one
breakdown in her health had recently occurred, she felt bound to accept the offer,
under the limitations mentioned in the letter to Mr. Shaen given in this chapter.
And thus she was placed for the rest of her life in a position which raised her out of
the struggles, which had hampered her early years.
In 1875 Miss Louisa Schuyler, President of the State Charity Aid Association,
collected five of Octavia’s Magazine articles, and brought them out in America
under the title of “Homes of the London Poor.” This book was afterwards
published in England, and later on translated into German by H.R.H. Princess
Alice.
DESCRIPTION OF A FELLOW-WORKER
About 1870.
I am just back from an evening with the men. I can’t help writing
to tell you of their talk. They were all of one mind in approving of
your system. “It is charity, and it is not charity,” said one man. “It is
charity because it is human kindness; and it is not charity because it
does not make people cringing.” Another said, “We had heard that
none but your supporters spoke; for every complaint brought out
more clearly what you had done.” A third thought that they ought to
get up a testimonial to you.
A JOKING LETTER
January 3rd, 1871.
To Miranda.
Your humble servant, the writer, is in good health and spirits, but
is growing so deeply devoted to the delights of her own sweet society,
that she is somewhat alarmed, and fears that on your return she may
be found to have lost the power of speech.
To such tremendous reactions does Nature at times lead us!!! The
circle of interest grows also narrower daily, (barring Walmer Street).
She cares for nothing and nobody beyond her reach, while she sits in
her beloved arm-chair.[65]
Entre nous, however, I think there is still somewhere some little
tenderness in her heart for her respected and absent relatives.
I don’t know how to sign myself, my persons being hopelessly now
in a jumble; so beg you with your ordinary penetration to discover
The Writer.
I was so much touched and delighted with your letter. Words, such
as those from such as you, do so much to help on our way those of us
who are struggling, somewhat alone, to meet and master the
difficulties that beset us. What I am trying to do is simply in my eyes
a bit of adult education or reformatory work, among a few people
corrupted by gifts. It seems to me that, if we will give them a little