Apache Flume: Distributed
Log Collection for Hadoop
Second Edition

Design and implement a series of Flume agents to send
streamed data into Hadoop

Steve Hoffman

BIRMINGHAM - MUMBAI
Apache Flume: Distributed Log Collection for Hadoop
Second Edition

Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval
system, or transmitted in any form or by any means, without the prior written
permission of the publisher, except in the case of brief quotations embedded in
critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy
of the information presented. However, the information contained in this book is
sold without warranty, either express or implied. Neither the author, nor Packt
Publishing, and its dealers and distributors will be held liable for any damages
caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the
companies and products mentioned in this book by the appropriate use of capitals.
However, Packt Publishing cannot guarantee the accuracy of this information.

First published: July 2013

Second edition: February 2015

Production reference: 1190215

Published by Packt Publishing Ltd.


Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78439-217-8
www.packtpub.com
Credits

Author
Steve Hoffman

Reviewers
Sachin Handiekar
Michael Keane
Stefan Will

Commissioning Editor
Dipika Gaonkar

Acquisition Editor
Reshma Raman

Content Development Editor
Neetu Ann Mathew

Technical Editor
Menza Mathew

Copy Editors
Vikrant Phadke
Stuti Srivastava

Project Coordinator
Mary Alex

Proofreader
Simran Bhogal
Safis Editing

Indexer
Rekha Nair

Graphics
Sheetal Aute
Abhinash Sahu

Production Coordinator
Komal Ramchandani

Cover Work
Komal Ramchandani

About the Author

Steve Hoffman has 32 years of experience in software development, ranging from
embedded software development to the design and implementation of large-scale,
service-oriented, object-oriented systems. For the last 5 years, he has focused on
infrastructure as code, including automated Hadoop and HBase implementations
and data ingestion using Apache Flume. Steve holds a BS in computer engineering
from the University of Illinois at Urbana-Champaign and an MS in computer
science from DePaul University. He is currently a senior principal engineer at
Orbitz Worldwide (http://orbitz.com/).

More information on Steve can be found at http://bit.ly/bacoboy and on
Twitter at @bacoboy.

This is the first update to Steve's first book, Apache Flume: Distributed Log Collection
for Hadoop, Packt Publishing.

I'd again like to dedicate this updated book to my loving and
supportive wife, Tracy. She puts up with a lot, and that is very much
appreciated. I couldn't ask for a better friend daily by my side.
My terrific children, Rachel and Noah, are a constant reminder that
hard work does pay off and that great things can come from chaos.
I also want to give a big thanks to my parents, Alan and Karen, for
molding me into the somewhat satisfactory human I've become.
Their dedication to family and education above all else guides me
daily as I attempt to help my own children find their happiness in
the world.
About the Reviewers

Sachin Handiekar is a senior software developer with over 5 years of experience
in Java EE development. He graduated in computer science from the University of
Greenwich, London, and currently works for a global consulting company, developing
enterprise applications using various open source technologies, such as Apache
Camel, ServiceMix, ActiveMQ, and ZooKeeper.

Sachin has a lot of interest in open source projects. He has contributed code to Apache
Camel and developed plugins for Spring Social, which can be found at GitHub
(https://github.com/sachin-handiekar).

He also actively writes about enterprise application development on his blog
(http://sachinhandiekar.com).

Michael Keane has a BS in computer science from the University of Illinois
at Urbana-Champaign. He has worked as a software engineer, coding almost
exclusively in Java since JDK 1.1. He has also worked in the mission-critical medical
device software, e-commerce, transportation, navigation, and advertising domains.
He is currently a development leader for Conversant, where he maintains Flume
flows of nearly 100 billion log lines per day.

Michael is a father of three, and besides work, he spends most of his time with his
family and coaching youth softball.

Stefan Will is a computer scientist with a degree in machine learning and pattern
recognition from the University of Bonn, Germany. For over a decade, he has worked
for several start-ups in Silicon Valley and Raleigh, North Carolina, in the area of search
and analytics. Presently, he leads the development of the search backend and real-time
analytics platform at Zendesk, a provider of customer service software.
www.PacktPub.com

Support files, eBooks, discount offers, and more


For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF
and ePub files available? You can upgrade to the eBook version at www.PacktPub.com
and, as a print book customer, you are entitled to a discount on the eBook copy.
Get in touch with us at service@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles,
sign up for a range of free newsletters and receive exclusive discounts and offers
on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital
book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?
• Fully searchable across every book published by Packt
• Copy and paste, print, and bookmark content
• On demand and accessible via a web browser

Free access for Packt account holders


If you have an account with Packt at www.PacktPub.com, you can use this to access
PacktLib today and view nine entirely free books. Simply use your login credentials
for immediate access.
Table of Contents

Preface

Chapter 1: Overview and Architecture
Flume 0.9
Flume 1.X (Flume-NG)
The problem with HDFS and streaming data/logs
Sources, channels, and sinks
Flume events
Interceptors, channel selectors, and sink processors
Tiered data collection (multiple flows and/or agents)
The Kite SDK
Summary

Chapter 2: A Quick Start Guide to Flume
Downloading Flume
Flume in Hadoop distributions
An overview of the Flume configuration file
Starting up with "Hello, World!"
Summary

Chapter 3: Channels
The memory channel
The file channel
Spillable Memory Channel
Summary

Chapter 4: Sinks and Sink Processors
HDFS sink
Path and filename
File rotation
Compression codecs
Event Serializers
Text output
Text with headers
Apache Avro
User-provided Avro schema
File type
SequenceFile
DataStream
CompressedStream
Timeouts and workers
Sink groups
Load balancing
Failover
MorphlineSolrSink
Morphline configuration files
Typical SolrSink configuration
Sink configuration
ElasticSearchSink
LogStash Serializer
Dynamic Serializer
Summary

Chapter 5: Sources and Channel Selectors
The problem with using tail
The Exec source
Spooling Directory Source
Syslog sources
The syslog UDP source
The syslog TCP source
The multiport syslog TCP source
JMS source
Channel selectors
Replicating
Multiplexing
Summary

Chapter 6: Interceptors, ETL, and Routing
Interceptors
Timestamp
Host
Static
Regular expression filtering
Regular expression extractor
Morphline interceptor
Custom interceptors
The plugins directory
Tiering flows
The Avro source/sink
Compressing Avro
SSL Avro flows
The Thrift source/sink
Using command-line Avro
The Log4J appender
The Log4J load-balancing appender
The embedded agent
Configuration and startup
Sending data
Shutdown
Routing
Summary

Chapter 7: Putting It All Together
Web logs to searchable UI
Setting up the web server
Configuring log rotation to the spool directory
Setting up the target – Elasticsearch
Setting up Flume on collector/relay
Setting up Flume on the client
Creating more search fields with an interceptor
Setting up a better user interface – Kibana
Archiving to HDFS
Summary

Chapter 8: Monitoring Flume
Monitoring the agent process
Monit
Nagios
Monitoring performance metrics
Ganglia
Internal HTTP server
Custom monitoring hooks
Summary

Chapter 9: There Is No Spoon – the Realities of Real-time Distributed Data Collection
Transport time versus log time
Time zones are evil
Capacity planning
Considerations for multiple data centers
Compliance and data expiry
Summary

Index

Preface
Hadoop is a great open source tool for sifting tons of unstructured data into
something manageable so that your business can gain better insight into your
customers' needs. It's cheap (mostly free), scales horizontally as long as you have
space and power in your datacenter, and can handle problems that would crush your
traditional data warehouse. That said, a little-known secret is that your Hadoop cluster
requires you to feed it data. Otherwise, you just have a very expensive heat generator!
You will quickly realize (once you get past the "playing around" phase with Hadoop)
that you will need a tool to automatically feed data into your cluster. In the past, you
had to come up with a solution for this problem, but no more! Flume was started as a
project out of Cloudera, when its integration engineers had to keep writing tools over
and over again for their customers to automatically import data. Today, the project
lives with the Apache Software Foundation, is under active development, and boasts users
who have been using it in their production environments for years.

In this book, I hope to get you up and running quickly with an architectural overview
of Flume and a quick-start guide. After that, we'll dive deep into the details of many of
the more useful Flume components, including the very important file channel for the
persistence of in-flight data records and the HDFS Sink for buffering and writing
data into HDFS (the Hadoop Distributed File System). Since Flume comes with a wide variety
of modules, chances are that the only tool you'll need to get started is a text editor
for the configuration file.

By the time you reach the end of this book, you should know enough to build a highly
available, fault-tolerant, streaming data pipeline that feeds your Hadoop cluster.

What this book covers


Chapter 1, Overview and Architecture, introduces Flume and the problem space that
it's trying to address (specifically with regard to Hadoop). An architectural
overview of the various components to be covered in later chapters is given.

Chapter 2, A Quick Start Guide to Flume, serves to get you up and running quickly. It
includes downloading Flume, creating a "Hello, World!" configuration, and running it.

Chapter 3, Channels, covers the two major channels most people will use and the
configuration options available for each of them.

Chapter 4, Sinks and Sink Processors, goes into great detail on using the HDFS Flume
output, including compression options and options for formatting the data. Failover
options are also covered so that you can create a more robust data pipeline.

Chapter 5, Sources and Channel Selectors, introduces several of the Flume input
mechanisms and their configuration options. Also covered is switching between
different channels based on data content, which allows the creation of complex
data flows.

Chapter 6, Interceptors, ETL, and Routing, explains how to transform data in-flight
as well as extract information from the payload to use with Channel Selectors to
make routing decisions. Then this chapter covers tiering Flume agents using Avro
serialization, as well as using the Flume command line as a standalone Avro client
for testing and importing data manually.

Chapter 7, Putting It All Together, walks you through the details of an end-to-end use
case from the web server logs to a searchable UI, backed by Elasticsearch as well as
archival storage in HDFS.

Chapter 8, Monitoring Flume, discusses various options available for monitoring Flume
both internally and externally, including Monit, Nagios, Ganglia, and custom hooks.

Chapter 9, There Is No Spoon – the Realities of Real-time Distributed Data Collection, is
a collection of miscellaneous things to consider that are outside the scope of just
configuring and using Flume.

What you need for this book


You'll need a computer with a Java Virtual Machine installed, since Flume is
written in Java. If you don't have Java on your computer, you can download it
from http://java.com/.

You will also need an Internet connection so that you can download Flume to run
the Quick Start example.

This book covers Apache Flume 1.5.2.

Who this book is for


This book is for people responsible for implementing the automatic movement
of data from various systems to a Hadoop cluster. If it is your job to load data into
Hadoop on a regular basis, this book should help you to code yourself out of manual
monkey work or from writing a custom tool you'll be supporting for as long as
you work at your company.

Only basic knowledge of Hadoop and HDFS is required. Some custom
implementations are covered, should your needs necessitate them. For this
level of implementation, you will need to know how to program in Java.

Finally, you'll need your favorite text editor, since most of this book covers how
to configure various Flume components via an agent's text configuration file.

Conventions
In this book, you will find a number of styles of text that distinguish between
different kinds of information. Here are some examples of these styles, and
explanations of their meanings.

Code words in text are shown as follows: "If you want to use this feature, you set
the useDualCheckpoints property to true and specify a location for that second
checkpoint directory with the backupCheckpointDir property."

A block of code is set as follows:


agent.sinks.k1.hdfs.path=/logs/apache/access
agent.sinks.k1.hdfs.filePrefix=access
agent.sinks.k1.hdfs.fileSuffix=.log

When we wish to draw your attention to a particular part of a code block,
the relevant lines or items are set in bold:
agent.sources.s1.command=uptime
agent.sources.s1.restart=true
agent.sources.s1.restartThrottle=60000

Any command-line input or output is written as follows:


$ tar -zxf apache-flume-1.5.2.tar.gz
$ cd apache-flume-1.5.2

New terms and important words are shown in bold. Words that you see on the
screen, in menus or dialog boxes for example, appear in the text like this: "Flume was
first introduced in Cloudera's CDH3 distribution in 2011."

Warnings or important notes appear in a box like this.

Tips and tricks appear like this.

Reader feedback
Feedback from our readers is always welcome. Let us know what you think about
this book—what you liked or may have disliked. Reader feedback is important for
us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to feedback@packtpub.com,
and mention the book title via the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing
or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support
Now that you are the proud owner of a Packt book, we have a number of things to
help you to get the most from your purchase.

Downloading the example code


You can download the example code files for all Packt books you have purchased
from your account at http://www.packtpub.com. If you purchased this book
elsewhere, you can visit http://www.packtpub.com/support and register to
have the files e-mailed directly to you.

Errata
Although we have taken every care to ensure the accuracy of our content, mistakes do
happen. If you find a mistake in one of our books—maybe a mistake in the text
or the code—we would be grateful if you would report this to us. By doing so, you can
save other readers from frustration and help us improve subsequent versions of this
book. If you find any errata, please report them by visiting
http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link,
and entering the details of your errata. Once your errata are verified, your submission
will be accepted and the errata will be uploaded on our website, or added to any list of
existing errata, under the Errata section of that title. Any existing errata can be viewed
by selecting your title from http://www.packtpub.com/support.

Piracy
Piracy of copyright material on the Internet is an ongoing problem across all media.
At Packt, we take the protection of our copyright and licenses very seriously. If you
come across any illegal copies of our works, in any form, on the Internet, please
provide us with the location address or website name immediately so that we
can pursue a remedy.

Please contact us at copyright@packtpub.com with a link to the suspected
pirated material.

We appreciate your help in protecting our authors, and our ability to bring
you valuable content.

Questions
You can contact us at questions@packtpub.com if you are having a problem
with any aspect of the book, and we will do our best to address it.

Overview and Architecture
If you are reading this book, chances are you are swimming in oceans of data. Creating
mountains of data has become very easy, thanks to Facebook, Twitter, Amazon, digital
cameras and camera phones, YouTube, Google, and just about anything else you can
think of being connected to the Internet. As a provider of a website, 10 years ago, your
application logs were only used to help you troubleshoot your website. Today, this
same data can provide a valuable insight into your business and customers if you
know how to pan gold out of your river of data.

Furthermore, as you are reading this book, you are also aware that Hadoop was
created to solve (partially) the problem of sifting through mountains of data. Of
course, this only works if you can reliably load your Hadoop cluster with data for
your data scientists to pick apart.

Getting data into and out of Hadoop (in this case, the Hadoop Distributed File System, or
HDFS) isn't hard; it is just a simple command, such as:
% hadoop fs -put data.csv .

This works great when you have all your data neatly packaged and ready to upload.

However, your website is creating data all the time. How often should you batch
load data to HDFS? Daily? Hourly? Whatever processing period you choose,
eventually somebody always asks "can you get me the data sooner?" What you
really need is a solution that can deal with streaming logs/data.

Turns out you aren't alone in this need. Cloudera, a provider of professional services
for Hadoop as well as their own distribution of Hadoop, saw this need over and over
when working with their customers. Flume was created to fill this need and create a
standard, simple, robust, flexible, and extensible tool for data ingestion into Hadoop.

Flume 0.9
Flume was first introduced in Cloudera's CDH3 distribution in 2011. It consisted
of a federation of worker daemons (agents) configured from a centralized master
(or masters) via ZooKeeper (a federated configuration and coordination system).
From the master, you could check the agent status in a web UI as well as push
out configuration centrally from the UI or via a command-line shell (both really
communicating via ZooKeeper to the worker agents).

Data could be sent in one of three modes: Best Effort (BE), Disk Failover (DFO), and
End-to-End (E2E). The masters were used for the E2E mode acknowledgements, and
multimaster configuration never really matured, so you usually only had one master,
making it a central point of failure for E2E data flows. The BE mode is just what it
sounds like: the agent would try to send the data, but if it couldn't, the data would
be discarded. This mode is good for things such as metrics, where gaps can easily be
tolerated, as new data is just a second away. The DFO mode stored undeliverable data
on the local disk (or sometimes, a local database) and kept retrying until the
data could be delivered to the next recipient in your data flow. This is handy for those
planned (or unplanned) outages, as long as you have sufficient local disk space to
buffer the load.

In June 2011, Cloudera moved control of the Flume project to the Apache Software Foundation.
It came out of incubator status a year later, in 2012. During the incubation year,
work had already begun to refactor Flume under the Star Trek-themed tag, Flume-NG
(Flume the Next Generation).

Flume 1.X (Flume-NG)


There were many reasons why Flume was refactored. If you are interested in
the details, you can read about them at
https://issues.apache.org/jira/browse/FLUME-728. What started as a refactoring
branch eventually became the main line of development as Flume 1.X.

The most obvious change in Flume 1.X is that the centralized configuration master(s)
and ZooKeeper are gone. The configuration in Flume 0.9 was overly verbose, and
mistakes were easy to make. Furthermore, centralized configuration was really outside
the scope of Flume's goals. Centralized configuration was replaced with a simple
on-disk configuration file (although the configuration provider is pluggable so that it
can be replaced). These configuration files are easily distributed using tools such as
CFEngine, Chef, and Puppet. If you are using a Cloudera distribution, take a look at
Cloudera Manager to manage your configurations. About two years ago, they created
a free version with no node limit, so it may be an attractive option for you. Just be
sure you don't manage these configurations by hand, or you'll be editing these files
manually forever.

Another major difference in Flume 1.X is that the reading of input data and the
writing of output data are now handled by different worker threads (called
Runners). In Flume 0.9, the input thread also did the writing to the output (except
for failover retries). If the output writer was slow (rather than just failing outright),
it would block Flume's ability to ingest data. This new asynchronous design leaves
the input thread blissfully unaware of any downstream problem.

The first edition of this book covered all the versions of Flume up to Version 1.3.1.
This second edition covers versions up to 1.5.2 (the current version at the time of
writing this).

The problem with HDFS and streaming data/logs
HDFS isn't a real filesystem, at least not in the traditional sense, and many of
the things we take for granted with normal filesystems don't apply here, such
as being able to mount it. This makes getting your streaming data into Hadoop
a little more complicated.

In a regular POSIX-style filesystem, if you open a file and write data, it still exists
on the disk before the file is closed. That is, if another program opens the same
file and starts reading, it will get the data already flushed by the writer to the disk.
Furthermore, if this writing process is interrupted, any portion that made it to disk
is usable (it may be incomplete, but it exists).

In HDFS, the file exists only as a directory entry; it shows zero length until the file
is closed. This means that if data is written to a file for an extended period without
closing it, a network disconnect with the client will leave you with nothing but an
empty file for all your efforts. This may lead you to the conclusion that it would be
wise to write small files so that you can close them as soon as possible.

The problem is that Hadoop doesn't like lots of tiny files. As the HDFS filesystem
metadata is kept in memory on the NameNode, the more files you create, the more
RAM you'll need to use. From a MapReduce perspective, tiny files lead to poor
efficiency. Usually, each Mapper is assigned a single block of a file as the input
(unless you have used certain compression codecs). If you have lots of tiny files,
the cost of starting the worker processes can be disproportionately high compared
to the data they are processing. This kind of block fragmentation also results in more
Mapper tasks, increasing the overall job run times.

These factors need to be weighed when determining the rotation period to use when
writing to HDFS. If the plan is to keep the data around for a short time, then you can
lean toward the smaller file size. However, if you plan on keeping the data for a very
long time, you can either target larger files or do some periodic cleanup to compact
smaller files into fewer, larger files to make them more MapReduce friendly. After
all, you only ingest the data once, but you might run a MapReduce job on that data
hundreds or thousands of times.
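
As a rough sketch of where this trade-off shows up in practice, the HDFS sink has roll settings that decide when the current file is closed and a new one is started. The agent name agent, the sink name k1, and the five-minute interval are illustrative assumptions, not recommendations; the fragment also omits the source and channel that a complete agent would need.

```properties
# Hypothetical agent "agent" with an HDFS sink named "k1"
agent.sinks.k1.type = hdfs
agent.sinks.k1.hdfs.path = /logs/apache/access
# Close the current file and start a new one every 300 seconds
agent.sinks.k1.hdfs.rollInterval = 300
# A value of 0 turns off rolling by file size (bytes) and by event count,
# leaving the time interval as the only trigger
agent.sinks.k1.hdfs.rollSize = 0
agent.sinks.k1.hdfs.rollCount = 0
```

A longer interval produces fewer, larger files at the cost of data staying in an open file for longer.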

Sources, channels, and sinks


The Flume agent's architecture can be viewed in this simple diagram. Inputs are
called sources and outputs are called sinks. Channels provide the glue between
sources and sinks. All of these run inside a daemon called an agent.

Keep in mind:
• A source writes events to one or more channels.
• A channel is the holding area as events are passed from a
source to a sink.
• A sink receives events from one channel only.
• An agent can have many channels.
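
The rules above can be sketched as a minimal agent configuration. This is only an illustration under assumed names: the agent is called agent, the components are s1, c1, and k1, and the port number is arbitrary.

```properties
# One agent named "agent" with one source, one channel, and one sink
agent.sources = s1
agent.channels = c1
agent.sinks = k1

# Netcat source: each line of text received on the port becomes an event
agent.sources.s1.type = netcat
agent.sources.s1.bind = 0.0.0.0
agent.sources.s1.port = 12345
# A source may write to several channels, hence the plural property name
agent.sources.s1.channels = c1

# In-memory channel that holds events between the source and the sink
agent.channels.c1.type = memory

# Logger sink: a sink reads from exactly one channel, hence the singular name
agent.sinks.k1.type = logger
agent.sinks.k1.channel = c1
```

The plural channels on the source and the singular channel on the sink mirror the first and third bullets.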

Flume events
The basic payload of data transported by Flume is called an event. An event is
composed of zero or more headers and a body.

The headers are key/value pairs that can be used to make routing decisions or
carry other structured information (such as the timestamp of the event or the
hostname of the server from which the event originated). You can think of it as
serving the same function as HTTP headers—a way to pass additional information
that is distinct from the body.

The body is an array of bytes that contains the actual payload. If your input is
comprised of tailed log files, the array is most likely a UTF-8-encoded string
containing a line of text.

Flume may add additional headers automatically (for example, when a source adds the
hostname where the data originated or an event's creation timestamp), but the
body is mostly untouched unless you edit it en route using interceptors.

Interceptors, channel selectors, and sink processors
An interceptor is a point in your data flow where you can inspect and alter Flume
events. You can chain zero or more interceptors after a source creates an event. If
you are familiar with AOP in the Spring Framework, think MethodInterceptor. In
Java Servlets, it's similar to a servlet Filter. Here's an example of what using four
chained interceptors on a source might look like:
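
In configuration terms, a chain of four interceptors could be sketched as follows. The interceptor names i1 to i4, the static key and value, and the regular expression are assumptions chosen for illustration; the four types shown are interceptors that ship with Flume and are covered in a later chapter.

```properties
# Interceptors run in the order they are listed on the source
agent.sources.s1.interceptors = i1 i2 i3 i4

# i1: adds a "timestamp" header holding the time the event was processed
agent.sources.s1.interceptors.i1.type = timestamp

# i2: adds a "host" header identifying the machine running the agent
agent.sources.s1.interceptors.i2.type = host

# i3: adds a fixed header to every event (key and value are hypothetical)
agent.sources.s1.interceptors.i3.type = static
agent.sources.s1.interceptors.i3.key = datacenter
agent.sources.s1.interceptors.i3.value = dc1

# i4: drops events whose body matches the expression (here, DEBUG lines)
agent.sources.s1.interceptors.i4.type = regex_filter
agent.sources.s1.interceptors.i4.regex = .*DEBUG.*
agent.sources.s1.interceptors.i4.excludeEvents = true
```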

Channel selectors are responsible for how data moves from a source to one or more
channels. Flume comes packaged with two channel selectors that cover most use
cases you might have, although you can write your own if need be. A replicating
channel selector (the default) simply puts a copy of the event into each channel,
assuming you have configured more than one. In contrast, a multiplexing channel
selector can write to different channels depending on some header information.
Combined with some interceptor logic, this duo forms the foundation for routing
input to different channels.
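
The difference between the two selectors can be sketched in a few lines of Python. This is an illustrative model only: the function names, the dictionary-based events, and the `shape` header are assumptions, and real selectors are set up through Flume's agent configuration rather than written like this.

```python
def replicating_selector(event, channel_names):
    """Select every configured channel, so each receives a copy of the event."""
    return list(channel_names)


def multiplexing_selector(event, header, mapping, default):
    """Select channels by looking up one header value; fall back to a default."""
    return mapping.get(event["headers"].get(header), default)


def deliver(event, selected, channels):
    for name in selected:
        channels[name].append(event)


events = [
    {"headers": {"shape": "square"}, "body": b"s1"},
    {"headers": {"shape": "triangle"}, "body": b"t1"},
    {"headers": {}, "body": b"unlabeled"},
]

# Replicating: both channels see all three events.
replicated = {"c1": [], "c2": []}
for e in events:
    deliver(e, replicating_selector(e, replicated), replicated)

# Multiplexing: route on the hypothetical "shape" header.
multiplexed = {"squares": [], "triangles": [], "fallback": []}
mapping = {"square": ["squares"], "triangle": ["triangles"]}
for e in events:
    selected = multiplexing_selector(e, "shape", mapping, ["fallback"])
    deliver(e, selected, multiplexed)

print({name: len(q) for name, q in replicated.items()})
print({name: [e["body"] for e in q] for name, q in multiplexed.items()})
```

An interceptor would typically be what sets the routing header in the first place, which is why the two are described as a duo.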

Finally, a sink processor is the mechanism by which you can create failover paths for
your sinks or load balance events across multiple sinks from a channel.
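
Both behaviors can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not Flume's implementation: the `Sink` class, the `failover` and `round_robin` helpers, and the use of `ConnectionError` to signal an unavailable sink are all hypothetical.

```python
from itertools import cycle


class Sink:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.received = []

    def process(self, event):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        self.received.append(event)


def failover(sinks_by_priority, event):
    """Try sinks in priority order and return the name of the one that accepted."""
    for sink in sinks_by_priority:
        try:
            sink.process(event)
            return sink.name
        except ConnectionError:
            continue  # fall through to the next sink
    raise RuntimeError("no sink could accept the event")


def round_robin(sinks):
    """Return a function that spreads events across sinks in turn."""
    ring = cycle(sinks)

    def process(event):
        sink = next(ring)
        sink.process(event)
        return sink.name

    return process


# Failover: the primary is down, so the backup takes the event.
primary, backup = Sink("primary", healthy=False), Sink("backup")
used = failover([primary, backup], "event-1")

# Load balancing: four events alternate between two healthy sinks.
a, b = Sink("a"), Sink("b")
balance = round_robin([a, b])
order = [balance(f"event-{i}") for i in range(4)]
print(used, order)
```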

Tiered data collection (multiple flows and/or agents)
You can chain your Flume agents depending on your particular use case. For
example, you may want to insert an agent in a tiered fashion to limit the number
of clients trying to connect directly to your Hadoop cluster. More likely, your
source machines don't have sufficient disk space to deal with a prolonged outage
or maintenance window, so you create a tier with lots of disk space between your
sources and your Hadoop cluster.

In the following diagram, you can see that there are two places where data is
created (on the left-hand side) and two final destinations for the data (the HDFS
and ElasticSearch cloud bubbles on the right-hand side). To make things more
interesting, let's say one of the machines generates two kinds of data (let's call
them square and triangle data). You can see that in the lower-left agent, we use a
multiplexing channel selector to split the two kinds of data into different channels.
The square channel is then routed to the agent in the upper-right corner (along
with the data coming from the upper-left agent). The combined volume of events
is written together in HDFS in Datacenter 1. Meanwhile, the triangle data is sent
to the agent that writes to ElasticSearch in Datacenter 2. Keep in mind that data
transformations can occur after any source. How all of these components can be
used to build complicated data workflows will become clear as we proceed.
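
The buffering role of an intermediate tier can be sketched as follows. This Python model is only an assumption-laden illustration: the `Agent` class, its `available` flag, and the in-memory deque standing in for a large disk-backed channel are all hypothetical, and real tiers are connected over the network.

```python
from collections import deque


class Agent:
    """Toy agent: buffers events in a channel and forwards them downstream."""

    def __init__(self, name, downstream=None):
        self.name = name
        self.channel = deque()      # stands in for a large, disk-backed channel
        self.downstream = downstream
        self.available = True       # False simulates an outage
        self.stored = []            # final destination, used by the last tier only

    def receive(self, event):
        if not self.available:
            raise ConnectionError(f"{self.name} is down")
        self.channel.append(event)

    def flush(self):
        """Forward buffered events; keep the remainder if downstream fails."""
        while self.channel:
            event = self.channel[0]
            if self.downstream is None:
                self.stored.append(event)
            else:
                try:
                    self.downstream.receive(event)
                except ConnectionError:
                    return  # leave the event buffered and retry later
            self.channel.popleft()


writer = Agent("hdfs-writer")
collector = Agent("collector", downstream=writer)

# During an outage the collector tier holds on to the events.
writer.available = False
collector.receive("e1")
collector.receive("e2")
collector.flush()
buffered_during_outage = len(collector.channel)

# Once the writer is back, the backlog drains in order.
writer.available = True
collector.flush()
writer.flush()
print(buffered_during_outage, writer.stored)
```

An event is removed from the collector's channel only after the downstream agent has accepted it, which is the property that lets a tier ride out a maintenance window.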

The Kite SDK


One of the new technologies incorporated in Flume, starting with Version 1.4,
is something called a Morphline. You can think of a Morphline as a series of
commands chained together to form a data transformation pipe.

If you are a fan of pipelining Unix commands, this will be very familiar to you.
The commands themselves are intended to be small, single-purpose functions that
when chained together create powerful logic. In many ways, using a Morphline
command chain can be identical in functionality to the interceptor paradigm just
mentioned. There is a Morphline interceptor we will cover in Chapter 6, Interceptors,
ETL, and Routing, which you can use instead of, or in addition to, the included
Java-based interceptors.

To get an idea of how useful these commands can be, take a look
at the handy grok command and its included extensible regular
expression library at
https://github.com/kite-sdk/kite/blob/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/grok-patterns

Many of the custom Java interceptors that I've written in the past were to modify
the body (data) and can easily be replaced with an out-of-the-box Morphline
command chain. You can get familiar with the Morphline commands by checking
out their reference guide at
http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html

Flume Version 1.4 also includes a Morphline-backed sink used primarily to feed
data into Solr. We'll see more of this in the Morphline Solr Search Sink section
of Chapter 4, Sinks and Sink Processors.

Morphlines are just one component of the KiteSDK included in Flume. Starting
with Version 1.5, Flume has added experimental support for KiteData, which is an
effort to create a standard library for datasets in Hadoop. It looks very promising,
but it is outside the scope of this book.

Please see the project home page for more information, as it will
certainly become more prominent in the Hadoop ecosystem as
the technology matures. You can read all about the KiteSDK at
http://kitesdk.org.

Summary
In this chapter, we discussed the problem that Flume is attempting to solve: getting
data into your Hadoop cluster for data processing in an easily configured, reliable
way. We also discussed the Flume agent and its logical components, including
events, sources, channel selectors, channels, sink processors, and sinks. Finally, we
briefly discussed Morphlines as a powerful new ETL (Extract, Transform, Load)
library, starting with Version 1.4 of Flume.

The next chapter will cover these in more detail, specifically, the most commonly
used implementations of each. Like all good open source projects, almost all of these
components are extensible if the bundled ones don't do what you need them to do.

Exploring the Variety of Random
Documents with Different Content
having something about me that interests most people at first sight in my favour?

In a letter to Mrs. Thrale, Johnson once wrote: “It has become so much
the fashion to publish letters that in order to avoid it, I put as little into mine
as I can.” Boswell was not afraid of publication. His fear, as he said, was
that letters, like sermons, would not continue to attract public curiosity, so
he spiced his highly. Did he do or say a foolish thing, he at once sat down
and told Temple all about it, usually adding that in the near future he
intended to amend. His comment on his contemporaries is characteristic.
“Hume,” he says, “told me that he would give me half-a-crown for every
page of Johnson’s Dictionary in which he could not find an absurdity, if I
would give him half-a-crown for every page in which he could find one.”
He announces Adam Smith’s election to membership in the famous
literary club by saying: “Smith is now of our club—it has lost its select
merit.” Of Gibbon he says: “I hear nothing of the publication of his second
volume. He is an ugly, affected, disgusting fellow, and poisons our literary
club to me.”
As he grows older and considers how unsuccessful his life has been, how
he had failed at the bar both in Scotland and in London, he begins to
complain. He can get no clients; he fears that, even were he entrusted with
cases, he would fail utterly.
I am afraid [he says], that, were I to be tried, I should be found so deficient in the
forms, the quirks and the quiddities, which early habit acquires, that I should expose
myself. Yet the delusion of Westminster Hall, of brilliant reputation and splendid fortune as
a barrister, still weighs upon my imagination. I must be seen in the Courts, and must hope
for some happy openings in causes of importance. The Chancellor, as you observe, has not
done as I expected; but why did I expect it? I am going to put him to the test. Could I be
satisfied with being Baron of Auchinleck, with a good income for a gentleman in Scotland,
I might, no doubt, be independent. What can be done to deaden the ambition which has
ever raged in my veins like a fever?

But the highest spirits will sometimes flag. Boswell, the friendly,
obliging, generous roué, was getting old. He begins to speak of the past.
Do you remember when you and I sat up all night at Cambridge, and read Gray with a
noble enthusiasm; when we first used to read Mason’s “Elfrida,” and when we talked of
that elegant knot of worthies, Gray, Mason and Walpole?

“Elfrida” calls itself on the title-page, “A Dramatic Poem written on the
model of the Ancient Greek Tragedy.” I happen to own and value highly the
very copy of this once famous poem, which Boswell and Temple read
together; on the fly leaf, under Boswell’s signature, is a characteristic note
in his bold, clear hand: “A present from my worthy friend Temple.”
He becomes more than ever before the butt of
his acquaintance. He tells his old friend of a trick
which has been played on him—only one of
many. He was staying at a great house crowded
with guests.
I and two other gentlemen were laid in one room. On
Thursday morning my wig was missing; a strict search was
made, all in vain. I was obliged to go all day in my
nightcap, and absent myself from a party of ladies and
gentlemen who went and dined with an Earl on the banks
of the lake, a piece of amusement which I was glad to
shun, as well as a dance which they had at night. But I was
in a ludicrous situation. I suspect a wanton trick, which
some people think witty; but I thought it very ill-timed to
one in my situation.

When his father dies and he comes into his
estates, he is deeply in debt; he hates Scotland,
he longs to be in London, to enjoy the Club, to
see Johnson, to whom he writes of his
difficulties, asking his advice. Johnson gives him just such advice as might
be expected.
To come hither with such expectations at the expense of borrowed money, which I find
you know not where to borrow, can hardly be considered prudent. I am sorry to find, what
your solicitations seem to imply, that you have already gone the length of your credit. This
is to set the quiet of your whole life at hazard. If you anticipate your inheritance, you can at
last inherit nothing; all that you receive must pay for the past. You must get a place, or pine
in penury, with the empty name of a great estate. Poverty, my dear friend, is so great an
evil, that I cannot but earnestly enjoin you to avoid it. Live on what you have; live, if you
can, on less; do not borrow either for vanity or pleasure; the vanity will end in shame, and
the pleasure in regret; stay therefore at home till you have saved money for your journey
hither.

His wife dies and Johnson dies. One by one the props are pulled from
under him; he drinks, constantly gets drunk; is, in this condition, knocked
down in the streets and robbed, and thinks with horror of giving up his soul,
intoxicated, to his Maker. “Oh, Temple, Temple!” he writes, “is this
realizing any of the towering hopes which have so often been the subject of
our conversation and letters?” At last he begins a letter which he is never to
finish. “I would fain write you in my own hand but really cannot.” These
were the last words poor Boswell ever wrote.

But Boswell’s life is chiefly interesting where it impinges upon that of
his great friend. A few months after the famous meeting in Davies’s
book-shop, he started for the Continent, with the idea, following the fashion of
the time, of studying law at Utrecht, Johnson accompanying him on his way
as far as Harwich.
After a short time at the University, during which he could have learned
nothing, we find him wandering about Europe in search of celebrities,—big
game,—the hunting of which was to be the chief interest of his life. He
succeeded in bagging Voltaire and Rousseau,—there was none bigger,—and
after a short stay in Rome he turned North, sailing from Leghorn to Corsica,
where he met Paoli, the patriot, and finally returned home, escorting
Thérèse Levasseur, Rousseau’s mistress, as far as London. Hume at this
time speaks of him as “a friend of mine, very good-humored, very
agreeable and very mad.”
Meanwhile his father, Lord Auchinleck, who had borne with admirable
patience such stories as had reached him of his son’s wild ways, insisted
that it was time for him to settle down; but Boswell was too full of his
adventures in the island of Corsica and his meeting with Paoli, to begin
drudgery at the law. His accounts of his travels made him a welcome guest
at London dinner-parties, and he had finally decided to write a book of his
experiences.
At last the father, by a threat to cut off supplies, secured his son’s return;
but his desire to publish a book had not abated, and while he finally was
admitted to the Scotch bar, we find him corresponding with his friend Mr.
Dilly, the publisher, in regard to the book upon which he was busily
employed. From an unpublished letter, which I was fortunate enough to
secure quite recently from a book-seller in New York, Gabriel Wells, we
may follow Boswell in his negotiations.
Edinburgh, 6 August, 1767.
Sir
I have received your letter agreeing to pay me One Hundred Guineas for the Copy-
Right of my Account of Corsica, &c., the money to be due three months after the
publication of the work in London, and also agreeing that the first Edition shall be printed
in Scotland, under my direction, and a map of Corsica be engraved for the work at your
Expence.
In return to which, I do hereby agree that you shall have the sole Property of the said
work. Our Bargain therefore is now concluded and I heartily wish that it may be of
advantage to you.
I am Sir
Your most humble Servant
James Boswell.
To Mr. Dilly, Bookseller, London.

[Figure: Copy of James Boswell’s agreement with Mr. Dilly, reciting the terms agreed on for the publication of “Corsica”]

Through the kindness of my fellow collector and generous friend, Judge
Patterson of Philadelphia, I own an interesting fragment of a brief in
Boswell’s hand, written at about this period. It appears therefrom that
Boswell had been retained to secure the return of a stocking-frame of the
value of a few shillings, which had been forcibly carried off. The outcome
of the litigation is not known, but the paper bears the interesting
indorsement, “This was the first Paper drawn by me as an Advocate. James
Boswell.”
But I am allowing my collector’s passion to
carry me too far afield. The preface of Boswell’s
“Account of Corsica” closes with an interesting
bit of self-revelation. He says, characteristically,

For my part I should be proud to be known as an
author; I have an ardent ambition for literary fame; for of
all possessions I should imagine literary fame to be the
most valuable. A man who has been able to furnish a book
which has been approved by the world has established
himself as a respectable character in distant society,
without any danger of having that character lessened by the
observation of his weaknesses. To preserve a uniform
dignity among those who see us every day is hardly
possible; and to aim at it must put us under the fetters of a
perpetual restraint. The author of an approved book may
allow his natural disposition an easy play, and yet indulge
the pride of superior genius, when he considers that by
those who know him only as an author he never ceases to
be respected. Such an author in his hours of gloom and
discontent may have the consolation to think that his writings are at that very time giving
pleasure to numbers, and such an author may cherish the hope of being remembered after
death, which has been a great object of the noblest minds in all ages.

A brief contemporary criticism sums up the merits of “Corsica” in a
paragraph. “There is a deal about the Island and its dimensions that one
doesn’t care a straw about, but that part which relates to Paoli is amusing
and interesting. The author has a rage for knowing anybody that was ever
talked of.”
Boswell thought that he was the first, but he proved to be the second
Englishman (the first was an Englishwoman) who had ever set foot upon
the island. He visited Paoli, and his accounts of his reception by the great
patriot and his conversation with the people are amusing in the extreme. To
his great satisfaction it was generally believed that he was on a public
mission.
The more I disclaimed any such thing, the more they persevered in affirming it; and I
was considered as a very close young man. I therefore just allowed them to make a
minister of me, till time should undeceive them.... The Ambasciadore Inglese—as the good
peasants and soldiers used to call me—became a great favorite among them. I got a
Corsican dress made, in which I walked about with an air of true satisfaction.

On another occasion:—
When I rode out I was mounted on Paoli’s own horse, with rich furniture of crimson
velvet, with broad gold lace, and had my guard marching along with me. I allowed myself
to indulge a momentary pride in this parade, as I was curious to experience what should
really be the pleasure of state and distinction with which mankind are so strangely
intoxicated.

The success of this publication led Boswell into some absurd
extravagances which he thought were necessary to support his position as a
distinguished English author. Praise for his work he skillfully extracted
from most of his friends, but Johnson proved obdurate. He had expressed a
qualified approval of the book when it appeared; but when Boswell in a
letter sought more than this, the old Doctor charged him to empty his head
of “Corsica,” which he said he thought had filled it rather too long.
Boswell wrote at least two of what we should to-day call press notices of
himself. One is reminded of the story of the man in a hired dress-suit at a
charity ball rushing about inquiring the whereabouts of the man who puts
your name in the paper. To such an one Boswell presented this brief account
of himself on the occasion of the famous Shakespeare Jubilee.
One of the most remarkable masks upon this occasion was James Boswell, Esq., in the
dress of an armed Corsican Chief. He entered the amphitheatre about twelve o’clock. He
wore a short dark-coloured coat of coarse cloth, scarlet waistcoat and breeches, and black
spatter-dashes; his cap or bonnet was of black cloth; on the front of it was embroidered in
gold letters, “Viva la Liberta,” and on one side of it was a handsome blue feather and
cockade, so that it had an elegant as well as a warlike appearance. On the breast of his coat
was sewed a Moor’s head, the crest of Corsica, surrounded with branches of laurel. He had
also a cartridge-pouch into which was stuck a stiletto, and on his left side a pistol was hung
upon the belt of his cartridge-pouch. He had a fusee slung across his shoulder, wore no
powder in his hair, but had it plaited at full length with a knot of blue ribbon at the end of
it. He had, by way of staff, a very curious vine all of one piece, with a bird finely carved
upon it emblematical of the sweet bard of Avon. He wore no mask, saying that it was not
proper for a gallant Corsican. So soon as he came into the room he drew universal
attention. The novelty of the Corsican dress, its becoming appearance, and the character of
that brave nation concurred to distinguish the armed Corsican Chief.

May we not suppose that several bottles of “Old Hock” contributed to
his enjoyment of this occasion? Here is the other one:
Boswell, the author, is a most excellent man: he is of an ancient family in the West of
Scotland, upon which he values himself not a little. At his nativity there appeared omens of
his future greatness. His parts are bright, and his education has been good. He has travelled
in post-chaises miles without number. He is fond of seeing much of the world. He eats of
every good dish, especially apple pie. He drinks Old Hock. He has a very fine temper. He
is somewhat of a humorist and a little tinctured with pride. He has a good manly
countenance, and he owns himself to be amorous. He has infinite vivacity, yet is observed
at times to have a melancholy cast. He is rather fat than lean, rather short than tall, rather
young than old. His shoes are neatly made, and he never wears spectacles.

The success of “Corsica” was not very great, but it sufficed to turn
Boswell’s head completely. He spent as much time in London as he could
contrive to, and led there the life of a dissipated man of fashion. He
quarreled with his father, and after a series of escapades with women of the
town and love-affairs with heiresses, he finally married his cousin, Margaret
Montgomerie, a girl without a fortune. Much to Boswell’s disgust, his
father, on the very same day, married for the second time, and married his
cousin.
For a time after marriage he seemed to take his profession seriously, but
he deceived neither his father nor his clients. The old man said that Jamie
was simply taking a toot on a new horn. Meanwhile Boswell never allowed
his interest in Johnson to cool for a moment. When he was in London,—and
he went there on one excuse or another as often as his means permitted,—
he was much with Johnson; and when he was at home, he was constantly
worrying Johnson for some evidence of his affection for him. Finally
Johnson writes, “My regard for you is greater almost than I have words to
express” (this from the maker of a dictionary); “but I do not chuse to be
always repeating it; write it down in the first leaf of your pocketbook, and
never doubt of it again.”
Neither wife nor father could understand the feeling of reverence and
affection which their Jamie had for Johnson. I always delight in the story of
his father saying to an old friend, “There’s nae hope for Jamie, mon. Jamie
is gaen clean gyte. What do you think, mon? He’s done wi’ Paoli—he’s off
wi’ the land-louping scoundrel of a Corsican; and whose tail do you think
he has pinned himself to now, mon? A dominie, mon—an auld dominie: he
keeped a schule, and ca’d it an academy.”
Mrs. Boswell, a sensible, cold, rather shadowy person, saw but little of
Johnson, and was satisfied that it should be so. There is one good story to
her credit. Unaccustomed to the ways of genius, she caught Johnson, who
was nearsighted, one evening burnishing a lighted candle on her carpet to
make it burn more brightly, and remarked, “I have seen many a bear led by
a man, but never before have I seen a man led by a bear.” Boswell was just
the fellow to appreciate this, and promptly repeated it to Johnson, who
failed to see the humor of it.
In 1782 his father died and he came into the estate, but by his
improvident management he soon found himself in financial difficulties.
Johnson’s death two years later removed a restraining influence that he
much needed. He tried to practice law, but he was unsuccessful. Never an
abstemious man, he now drank heavily and constantly, and as constantly
resolved to turn over a new leaf.
Shortly after Johnson’s death, Boswell published his “Journal of the Tour
of the Hebrides,” which reached a third edition within the year and
established his reputation as a writer of a new kind, in which anecdotes and
conversation are woven into a narrative with a fidelity and skill which were
as easy to him as they were impossible to others.
The great success of this book encouraged him to begin, and continue to
work upon, the great biography of Johnson on which his fame so securely
rests. Others had published before him. Mrs. Piozzi’s “Anecdotes of the
Late Samuel Johnson” had sold well, and Hawkins, the “unclubable
Knight,” as Johnson called him, had been commissioned by the booksellers
of London to write a formal biography, which appeared in 1787; while of
lesser publications there was seemingly no end; nevertheless, Boswell
persevered, and wrote his friend Temple that his
mode of biography which gives not only a history of Johnson’s visible progress through the
world, and of his publications, but a view of his mind in his letters and conversations, is the
most perfect that can be conceived, and will be more of a life than any work that has yet
appeared.

He had been preparing for the task for more than twenty years; he had, in
season and out, been taking notes of Johnson’s conversations, and Johnson
himself had supplied him with much of the material. Thus in poverty,
interrupted by periods of dissipation, amid the sneers of many, he continued
his work. While it was in progress his wife died, and he, poor fellow, justly
upbraided himself for his neglect of her.

[Figure: Dr. Johnson in traveling dress, as described in Boswell’s Tour. Engraved by Trotter]

Meanwhile, a “new horn” was presented to him. He had, or thought he
had, a chance of being elected to Parliament, or at least of securing a place
under government; but in all this he was destined to be disappointed. It
would be difficult to imagine conditions more unfavorable to sustained
effort than those under which Boswell labored. He was desperately hard up.
Always subject to fits of the blues, which amounted almost to melancholia,
he many a time thought of giving up the task from which he hoped to derive
fame and profit. He considered selling his rights in the publication for a
thousand pounds. But it would go to his heart, he said, to accept such a
sum; and again, “I am in such bad spirits that I have fear concerning it—I
may get no profit, nay, may lose—the public may be disappointed and think
I have done it poorly—I may make enemies, and even have quarrels.” Then
the depression would pass and he could write: “It will be, without
exception, the most entertaining book you ever read.” When his friends
heard that the Life would make two large volumes quarto, and that the price
was two guineas, they shook their heads and Boswell’s fears began again.
At last, on May 16, 1791, the book appeared, with the imprint of Charles
Dilly, in the Poultry; and so successful was it that by August twelve
hundred copies had been disposed of, and the entire edition was exhausted
before the end of the year. The writer confesses to such a passion for this
book that of this edition he owns at present four copies in various states, the
one he prizes most having an inscription in Boswell’s hand: “To James
Boswell, Esquire, Junior, from his affectionate father, the Authour.” Of
other editions—but why display one’s weakness?
“Should there,” in Boswell’s phrase, “be any cold-blooded and morose
mortals who really dislike it,” I am sorry for them. To me it has for thirty
years been a never-ending source of profit—and pleasure, which is as
important. It is a book to ramble in—and with. I have never, I think, read it
through from cover to cover, as the saying is, but some day I will;
meanwhile let me make a confession. There are parts of it which are deadly
dull; the judicious reader will skip these without hint from me. I have,
indeed, always had a certain sympathy with George Henry Lewes, who for
years threatened to publish an abridgment of it. It could be done: indeed, the
work could be either expanded or contracted at will; but every good
Boswellian will wish to do this for himself; tampering with a classic is
somewhat like tampering with a will—it is good form not to.

What is really needed is a complete index to the sayings of Johnson: his
dicta, spoken or written. It would be an heroic task, but heroic tasks are
constantly being undertaken. My friend Osgood, of Princeton, a ripe scholar
and an ardent Johnsonian, has been devoting the scanty leisure of years to a
concordance of Spenser. No one less competent than he should undertake to
supervise such a labor of love.
It will be remembered that the Bible is not lacking in quotations, nor is
Shakespeare; but these sources of wisdom aside, Boswell, quoting Johnson,
supplies us more frequently with quotations than any other author whatever.
Could the irascible old Doctor come to earth again, and with that wonderful
memory of his call to mind the purely casual remarks which he chanced to
make to Boswell, he would surely be amazed to hear himself quoted, and to
learn that his obiter dicta had become fixed in the minds of countless
thousands who perhaps have never heard his name.
I chanced the other day to stop at my broker’s office to see how much I
had lost in an unexpected drop in the market, and to beguile the time,
picked up a market letter in which this sentence met my eye: “The
unexpected and perpendicular decline in the stock of Golden Rod mining
shares has left many investors sadder if not wiser. When will the public
learn that investors in securities of this class are only indulging themselves
in proving the correctness of Franklin’s [sic] adage, that the expectation of
making a profit in such securities is simply the triumph of hope over
experience?” Good Boswellians will hardly need to be reminded that this is
Dr. Johnson on marriage. He had something equally wise to say, too, on the
subject of “shares”; but in this instance he was speaking of a man’s second
venture into matrimony, his first having proved very unhappy.

Most men, when they write a book of memoirs in which hundreds of
living people are mentioned, discreetly postpone publication until after they
and the chief personages of the narrative are dead. Johnson refers to
Bolingbroke as a “cowardly scoundrel” for writing a book (charging a
blunderbuss, he called it) and leaving half a crown to a beggarly Scotchman
to pull the trigger after his death. Boswell spent some years in charging his
blunderbuss; he filled it with shot, great and small, and then, taking careful
aim, pulled the trigger.
Cries of rage, anguish, and delight instantly arose from all over the
kingdom. A vast number of living people were mentioned, and their merits
or failings discussed with an abandon which is one of the great charms of
the book to-day, but which, when it appeared, stirred up a veritable hornets’
nest. As some one very cleverly said, “Boswell has invented a new kind of
libel.” “A man who is dead once told me so and so”—what redress have
you in law? None! The only thing to do is to punch his head.
Fortunately Boswell escaped personal chastisement, but he made many
enemies and alienated some friends. Mrs. Thrale, by this time Mrs. Piozzi,
quite naturally felt enraged at Boswell’s contemptuous remarks about her,
and at his references to what Johnson said of her while he was enjoying the
hospitality of Streatham. The best of us like to criticize our friends behind
their backs; and Johnson could be frank, and indeed brutal, on occasion.
Mrs. Boscawen, the wife of the admiral, on the other hand, had no reason to
be displeased when she read: “If it is not presumptuous in me to praise her,
I would say that her manners are the best of any lady with whom I ever had
the happiness to be acquainted.”
Bishop Percy, shrewdly suspecting that Boswell’s judgment was not to
be trusted, when he complied with his request for some material for the
Life, desired that his name might not be mentioned in the work; to which
Boswell replied that it was his intention to introduce as many names of
eminent persons as he could, adding, “Believe me, my Lord, you are not the
only Bishop to grace my pages.” We may suspect that he, like many
another, took up the book with fear and trembling, and put it down in a
rage.
Wilkes, too, got a touch of tar, but little he cared; the best beloved and
the best hated man in England, he probably laughed, properly thinking that
Boswell could do little damage to his reputation. But what shall we say of
Lady Diana Beauclerk’s feelings when she read the stout old English epithet
which Johnson had applied to her? Johnson’s authorized biographer, Sir
John Hawkins, dead and buried “without his shoes and stawkin’s,” as the
old jingle goes, had sneered at Boswell and passed on; verily he hath his
reward. Boswell accused him of stupidity, inaccuracy, and writing fatiguing
and disgusting “rigmarole.” His daughter came to the rescue of his fame,
and Boswell and she had a lively exchange of letters; indeed Boswell, at all
times, seemed to court that which most men shrink from, a discussion of
questions of veracity with a woman.
But on the whole the book was well received, and over his success
Boswell exulted, as well he might; he had achieved his ambition, he had
written his name among the immortals. With its publication his work was
done. He became more and more dissipated. His sober hours he devoted to
schemes for self-reform and a revision of the text for future editions. He
was engaged on a third printing when death overtook him. The last words
he wrote—the unfinished letter to his old friend Temple—have already been
quoted. The pen which he laid down was taken up by his son, who finished
the letter. From him we learn the sad details of his death. He passed away
on May 19, 1795, in his fifty-fifth year.
Like many another man, Boswell was always intending to reform, and
never did. His practice was ever at total variance with his principles. In
opinions he was a moralist; in conduct he was—otherwise. Let it be
remembered, however, that he was of a generous, open-hearted, and loving
disposition. A clause in his will, written in his own hand, sheds important
light upon his character. “I do beseech succeeding heirs of entail to be kind
to the tenants, and not to turn out old possessors to get a little more rent.”
What were the contemporary opinions of Boswell? Walpole did not like
him, but Walpole liked few. Paoli was his friend; with Goldsmith and with
Garrick he had been intimate. Mrs. Thrale and he did not get along well
together; he could not bear the thought that she saw more of Johnson than
he, and he was jealous of her influence over him. Fanny Burney did not like
him, and declined to give him some information which he very naturally
wanted for his book, because she wanted to use it herself. Gibbon thought
him terribly indiscreet, which, compared with Gibbon, he certainly was.
Reynolds and he were firm friends—the great book is dedicated to Sir
Joshua.
Of Boswell, Johnson wrote during their journey in Scotland, “There is
no house where he is not received with kindness and respect”; and
elsewhere, “He never left a house without leaving a wish for his return”;
also, “He was a man who finds himself welcome wherever he goes and
makes friends faster than he can want them”; and “He was the best traveling
companion in the world.” If there is a greater test than this, I do not know it.
It is summering and wintering with a man in a month. Burke said of him
that “good humor was so natural to him as to be scarcely a virtue to him.” I
know many admirable men of whom this cannot be said.
Several years ago, being in Ayrshire, I found myself not far from
Auchinleck; and although I knew that Boswell’s greatest editor, Birkbeck
Hill, had experienced a rebuff upon his attempt to visit the old estate which
Johnson had described as “very magnificent and very convenient,” I
determined, out of loyalty to James Boswell, to make the attempt. I thought
that perhaps American nerve would succeed where English scholarship had
failed.
We had spent the night at Ayr, and early next morning I inquired the cost
of a motor-trip to take my small party over to Auchinleck; and I was careful
to pronounce the word as though spelled Afflek, as Boswell tells us to.
“To where, sir?”
“Afflek,” I repeated.
The man seemed dazed. Finally I spelled it for him,
“A-u-c-h-i-n-l-e-c-k.”
“Ah, sir, Auchinleck,”—in gutturals the types will not reproduce,—“that
would be two guineas, sir.”
“Very good,” I said; “pronounce it your own way, but let me have the
motor.”
We were soon rolling over a road which Boswell must have taken many
times, but certainly never so rapidly or luxuriously. How Dr. Johnson would
have enjoyed the journey! I recalled his remark, “Sir, if I had no duties and
no reference to futurity, I would spend my life driving briskly in a post-
chaise with a pretty woman.” Futurity was not bothering me and I had a
pretty woman, my wife, by my side. Moreover, to complete the Doctor’s
remark, she was “one who could understand me and add something to the
conversation.” We set out in high spirits.
As we approached the house by a fine avenue bordered by venerable
trees,—no doubt those planted by the old laird, who delighted in such work,
—my courage almost failed me; but I had gone too far to retire. To the
servant who responded to my ring I stated my business, which seemed
trivial enough.
I might as well have addressed a graven image. At last it spoke. “The
family are away. The instructions are that no one is to be admitted to the
house under pain of instant dismissal.”
Means elsewhere successful failed me here.
“You can walk in the park.”
“Thanks, but I did not come to Scotland to walk in a park. Perhaps you
can direct me to the church where Boswell is buried.”
“You will find the tomb in the kirk in the village.”
Coal has been discovered on the estate, and the village, a mile or two
away, is ugly, and, to judge from the number of places where beer and
spirits could be had, their consumption would seem to be the chief
occupation of the population. I found the kirk, with door securely locked.
Would I try for the key at the minister’s? I would; but the minister was
away for the day. Would I try the sexton? I would; but he, too, was away,
and I found myself in the midst of a crowd of barefooted children who
embarrassed me by their profitless attentions. It was cold and it began to
rain. I remembered that we were not far from Greenock where “when it
does not rain, it snaws.”
My visit had not been a success; I cannot recommend a Boswell
pilgrimage. I wished that I was in London, and bethought me of Johnson’s
remark that “the noblest prospect in Scotland is the high-road that leads to
England.” On that high-road my party made no objection to setting out.
I once heard an eminent college professor speak disparagingly of
Boswell’s “Life of Johnson,” saying that it was a mere literary slop-pail into
which Boswell dropped scraps of all kinds—gossip, anecdotes and scandal,
literary and biographical refuse generally. I stood aghast for a moment; then
my commercial instinct awakened. I endeavored to secure this nugget of
criticism in writing, with permission to publish it over the author’s name. In
vain I offered a rate per word that would have aroused the envy of a
Kipling. My friend pleaded “writer’s cramp,” or made some other excuse,
and it finally appeared that, after all, this was only one of the cases where I
had neglected, in Boswell’s phrase, to distinguish between talk for the sake
of victory and talk with the desire to inform and illustrate. Against this
opinion there is a perfect chorus of praise rendered by a full choir.[11]
SAMUEL JOHNSON. Painted by Sir J. Reynolds. Engraved by Heath.

The great scholar Jowett confessed that he had read the book fifty times.
Carlyle said, “Boswell has given more pleasure than any other man of this
time, and perhaps, two or three excepted, has done the world greater
service.” Lowell refers to the “Life” as a perfect granary of discussion and
conversation. Leslie Stephen says that his fondness for reading began and
would end with Boswell’s “Life of Johnson.” Robert Louis Stevenson
wrote: “I am taking a little of Boswell daily by way of a Bible. I mean to
read him now until the day I die.” It is one of the few classics which is not
merely talked about and taken as read, but is constantly being read; and I
love to think that perhaps not a day goes by when some one, somewhere,
does not open the book for the first time and become a confirmed
Boswellian.
“What a wonderful thing your English literature is!” a learned Hungarian
once said to me. “You have the greatest drama, the greatest poetry, and the
greatest fiction in the world, and you are the only nation that has any
biography.” The great English epic is Boswell’s “Life of Johnson.”
VII

A LIGHT-BLUE STOCKING
SOMETIME, when seated in your library, as it becomes too dark to read and
is yet too light,—to ring for candles, I was going to say, but nowadays we
simply touch a button,—let your thoughts wander over the long list of
women who have made for themselves a place in English literature, and see
if you do not agree with me that the woman you would like most to meet in
the flesh, were it possible, would be Mrs. Piozzi, born Hester Lynch
Salusbury, but best known to us as Mrs. Thrale.
Let us argue the matter. It may at first seem almost absurd to mention the
wife of the successful London brewer, Henry Thrale, in a list which would
include the names of Fanny Burney, Jane Austen, George Eliot, the Brontës,
and Mrs. Browning; but the woman I have in mind should unite feminine
charm with literary gifts: she should be a woman whom you would honestly
enjoy meeting and whom you would be glad to find yourself seated next to
at dinner.
The men of the Johnsonian circle affected to love “little Burney,” but
was it not for the pleasure her “Evelina” gave them rather than for anything
in the author herself? According to her own account, she was so easily
embarrassed as to be always “retiring in confusion,” or “on the verge of
swooning.” It is possible that we would find this rather limp young lady a
trifle tiresome.
Jane Austen was actually as shy and retiring as Fanny Burney affected to
be. She could hardly have presided gracefully in a drawing-room in a
cathedral city; much less would she have been at home among the wits in a
salon in London.
Of George Eliot one would be inclined to say, as Dr. Johnson said of
Burke when he was ill, “If I should meet Burke now it would kill me.”
Perhaps it would not kill one to meet George Eliot, but I suspect few men
would care for an hour’s tête-à-tête with her without a preliminary oiling of
their mental machinery—a hateful task.
The Brontës were geniuses undoubtedly, particularly Emily, but one
would hardly select the author of “Wuthering Heights” as a companion for a
social evening.
Mrs. Browning, with her placid smile and tiresome ringlets, was too
deeply in love with her husband. After all, the woman one enjoys meeting
must be something of a woman of the world. She need not necessarily be a
good wife or mother. We are provided with the best of wives and at the
moment are not on the lookout for a good mother.
It may at once be admitted that as a mother Mrs. Thrale was not a
conspicuous success; but she was a woman of charm, with a sound mind in
a sound body. Although she could be brilliant in conversation, she would let
you take the lead if you were able to; but she was quite prepared to take it
herself rather than let the conversation flag; and she must have been a very
exceptional woman, to steady, as she did, a somewhat roving husband, to
call Dr. Johnson to order, and upon occasion to reprove Burke, even while
entertaining the most brilliant society of which London at the period could
boast.
At the time when we first make her acquaintance, she was young and
pretty, the mistress of a luxurious establishment; and if she was not
possessed of literary gifts herself, it may fairly be said that she was the
cause of literature in others.
In these days, when women, having everything else, want the vote also
(and I would give it to them promptly and end the discussion), it may be
suggested that to shine by a reflected light is to shine not at all. Frankly,
Mrs. Thrale owes her position in English letters, not to anything important
that she herself did or was capable of doing, but to the eminence of those
she gathered about her. But her position is not the less secure; she was a
charming and fluffy person; and as firmly as I believe that women have
come to stay, so firmly am I of the opinion that, in spite of all the well-
meaning efforts of some of their sex to prevent it, a certain, and, thank God,
sufficient number of women will stay charming and fluffy to the end of the
chapter.
On one subject only could Mrs. Thrale be tedious—her pedigree. I have
it before me, written in her own bold hand, and I confess that it seems very
exalted indeed. She would not have been herself had she not stopped in
transcribing it to relate how one of her ancestors, Katherine Tudor de
Berayne, cousin and ward of Queen Elizabeth and a famous heiress, as she
was returning from the grave of her first husband, Sir John Salusbury, was
asked in marriage by Maurice Wynne of Gwydir, who was amazed to learn
that he was too late, as she had already engaged herself to Sir Richard
Clough. “But,” added the lady, “if in the providence of God I am
unfortunate enough to survive him, I consent to be the lady of Gwydir.” Nor
does the tale end here, for she married yet another, and having sons by all
four husbands, she came to be called “Mam y Cymry,”—Mother of Wales,
—and no doubt she deserved the appellation.
With such marrying blood in her veins it is easily understood that, as
soon as Thrale’s halter was off her neck,—this sporting phrase, I regret to
say, is Dr. Johnson’s,—she should think of marrying again; and that having
the first time married to please her family, she should, at the second
venture, marry to please herself. But this chapter is moving too rapidly—the
lady is not yet born.

Hester Lynch Salusbury’s birthplace was Bodvel, in Wales, and the year,
1741. She was an only child, very precocious, with a retentive memory. She
soon became the plaything of the elderly people around her, who called her
“Fiddle.” Her father had the reputation of being a scamp, and it fell to her
uncle’s lot to direct, somewhat, her education. Handed from one relation to
another, she quickly adapted herself to her surroundings. Her mother taught
her French; a tutor, Latin; Quin, the actor, taught her to recite; Hogarth
painted her portrait; and the grooms of her grandmother, whom she visited
occasionally, made her an accomplished horsewoman. In those days
education for a woman was highly irregular, but judging from the results in
the case of Mrs. Thrale and her friends, who shall say that it was
ineffective? We have no Elizabeth Carters nowadays, good at translating
Epictetus, and—we have it on high authority—better at making a pudding.
Study soon became little Hester’s delight. At twelve years she wrote for
the newspapers; also, she used to rise at four in the morning to study, which
her mother would not have allowed had she known of it. I have a letter
written many years afterwards in which she says: “My mother always told
me I ruined my Figure and stopt my Growth by sitting too long at a Writing
Desk, though ignorant how much Time I spent at it. Dear Madam, was my
saucy Answer,—

“Tho’ I could reach from Pole to Pole
And grasp the Ocean with my Span,
I would be measur’d by my Soul.
The Mind’s the Standard of the Man.”

She is quoting Dr. Watts from memory evidently, and improving, perhaps,
upon the original.
But little girls grow up and husbands must be found for them. Henry
Thrale, the son of a rich Southwark brewer, was brought forward by her
uncle; while her father, protesting that he would not have his only child
exchanged for a barrel of “bitter,” fell into a rage and died of an apoplexy.
Her dot was provided by the uncle; her mother did the courting, with little
opposition on the part of the lady and no enthusiasm on the part of the
suitor. So, without love on either side, she being twenty-two and her
husband thirty-five, she became Mrs. Thrale. “My uncle,” she records in her
journal, “went with us to the church, gave me away, dined with us at
Streatham after the ceremony, and then left me to conciliate as best I could
a husband who had never thrown away five minutes of his time upon me
unwitnessed by company till after the wedding day was done.”

More happiness came from this marriage than might have been expected.
Henry Thrale, besides his suburban residence, Streatham, had two other
establishments, one adjoining the brewery in Southwark, where he lived in
winter, and another, an unpretentious villa at the seaside. He also
maintained a stable of horses and a pack of hounds at Croydon; but,
although a good horsewoman, Mrs. Thrale was not permitted to join her
husband in his equestrian diversions; indeed, her place in her husband’s
establishment was not unlike that of a woman in a seraglio. She was
allowed few pleasures, and but one duty was impressed upon her, namely,
that of supplying an heir to the estate; to this duty she devoted herself
unremittingly.
In due time a child was born, a daughter; and while this was of course
recognized as a mistake, it was believed to be one which could be corrected.
Meanwhile Thrale was surprised to find that his wife could think and
talk—that she had a mind of her own. The discovery dawned slowly upon
him, as did the idea that the pleasure of living in the country may be
enhanced by hospitality. Finally the doors of Streatham Park were thrown
open. For a time her husband’s bachelor friends and companions were the
only company. Included among these was one Arthur Murphy, who had
been un maître de plaisir to Henry Thrale in the gay days before his
marriage, when they had frequented the green rooms and Ranelagh
together. It was Murphy who suggested that “Dictionary Johnson” might be
secured to enliven a dinner-party, and then followed some discussion as to
the excuse which should be given Johnson for inviting him to the table of
the rich brewer. It was finally suggested that he be invited to meet a minor
celebrity, James Woodhouse, the shoemaker poet.
Johnson rose to the bait,—Johnson rose easily to any bait which would
provide him a good dinner and lift him out of himself,—and the dinner
passed off successfully. Mrs. Thrale records that they all liked each other so
well that a dinner was arranged for the following week, without the
shoemaker, who, having served his purpose, disappears from the record.
And now, and for twenty years thereafter, we find Johnson enjoying the
hospitality of the Thrales, which opened for him a new world. When he was
taken ill, not long after the introduction, Mrs. Thrale called on him in his
stuffy lodgings in a court off Fleet Street, and suggested that the air of
Streatham would be good for him. Would he come to them? He would. He
was not the man to deny himself the care of a young, rich, and charming
woman, who would feed him well, understand him, and add to the joys of
conversation. From that time on, whether at their residence in Deadman’s
Place in Southwark, or at Streatham, or at Brighton, even on their journeys,
the Thrales and Johnson were constantly together; and when he went on a
journey alone, as was sometimes the case, he wrote long letters to his
mistress or his master, as he affectionately called his friends.
Who gained most by this intercourse? It would be hard to say. It is a fit
subject for a debate, a copy of Boswell’s “Life of Johnson” to go to the
successful contestant. Johnson summed up his obligations to the lady in the
famous letter written just before her second marriage, probably the last he
ever wrote her. “I wish that God may grant you every blessing, that you
may be happy in this world ... and eternally happy in a better state; and
whatever I can contribute to your happiness I am ready to repay for that
kindness which soothed twenty years of a life radically wretched.”
On the other hand, the Thrales secured what, perhaps unconsciously,
they most desired, social position and distinction. At Streatham they
entertained the best, if not perhaps the very highest, society of the time.
Think for a moment of the intimates of this house, whose portraits, painted
by Reynolds, hung in the library. There were my Lords Sandys and
Westcote, college friends of Thrale; there were Johnson and Goldsmith;
Garrick and Burke; Burney, and Reynolds himself, and a number of others,
all from the brush of the great master; and could we hear the voices which
from time to time might have been heard in the famous room, we should
recognize Boswell and Piozzi, Baretti, and a host of others; and would it be
necessary for the servant to announce the entrance of the great Mrs.
Siddons, or Mrs. Garrick, or Fanny Burney, or Hannah More, or Mrs.
Montagu, or any of the other ladies who later formed that famous coterie
which came to be known as the Blue-Stockings?
But Johnson was the Thrales’ first lion and remained their greatest. He
first gave Streatham parties distinction. The master of the house enjoyed
having the wits about him, but was not one himself. Johnson said of him
that “his mind struck the hours very regularly but did not mark the
minutes.” It was his wife who, by her sprightliness and her wit and
readiness, kept the ball rolling, showing infinite tact and skill in drawing
out one and, when necessary, repressing another; asking—when the Doctor
was not speaking—for a flash of silence from the company that a newcomer
might be heard.
But I am anticipating. All this was not yet. A salon such as she created at
Streatham Park is not the work of a month or of a year.