Introduction to Machine Learning with Applications in Information Security, 1st Edition, Mark Stamp

The document is an introduction to machine learning with a focus on its applications in information security, authored by Mark Stamp. It outlines the aims and scope of the Machine Learning & Pattern Recognition Series, which includes various topics such as computational intelligence and natural language processing. The book provides a comprehensive overview of machine learning concepts, tools, and techniques relevant to information security.

INTRODUCTION TO

MACHINE
LEARNING with
APPLICATIONS
in INFORMATION
SECURITY
Chapman & Hall/CRC
Machine Learning & Pattern Recognition Series

SERIES EDITORS

Ralf Herbrich
Amazon Development Center
Berlin, Germany

Thore Graepel
Microsoft Research Ltd.
Cambridge, UK

AIMS AND SCOPE

This series reflects the latest advances and applications in machine learning and pattern
recognition through the publication of a broad range of reference works, textbooks, and
handbooks. The inclusion of concrete examples, applications, and methods is highly encouraged.
The scope of the series includes, but is not limited to, titles in the areas of machine learning,
pattern recognition, computational intelligence, robotics, computational/statistical learning
theory, natural language processing, computer vision, game AI, game theory, neural networks,
computational neuroscience, and other relevant topics, such as machine learning applied to
bioinformatics or cognitive science, which might be proposed by potential contributors.

PUBLISHED TITLES

BAYESIAN PROGRAMMING
Pierre Bessière, Emmanuel Mazer, Juan-Manuel Ahuactzin, and Kamel Mekhnacha
UTILITY-BASED LEARNING FROM DATA
Craig Friedman and Sven Sandow
HANDBOOK OF NATURAL LANGUAGE PROCESSING, SECOND EDITION
Nitin Indurkhya and Fred J. Damerau
COST-SENSITIVE MACHINE LEARNING
Balaji Krishnapuram, Shipeng Yu, and Bharat Rao
COMPUTATIONAL TRUST MODELS AND MACHINE LEARNING
Xin Liu, Anwitaman Datta, and Ee-Peng Lim
MULTILINEAR SUBSPACE LEARNING: DIMENSIONALITY REDUCTION OF
MULTIDIMENSIONAL DATA
Haiping Lu, Konstantinos N. Plataniotis, and Anastasios N. Venetsanopoulos
MACHINE LEARNING: An Algorithmic Perspective, Second Edition
Stephen Marsland
SPARSE MODELING: THEORY, ALGORITHMS, AND APPLICATIONS
Irina Rish and Genady Ya. Grabarnik
A FIRST COURSE IN MACHINE LEARNING, SECOND EDITION
Simon Rogers and Mark Girolami
INTRODUCTION TO MACHINE LEARNING WITH APPLICATIONS IN
INFORMATION SECURITY
Mark Stamp
Chapman & Hall/CRC
Machine Learning & Pattern Recognition Series

INTRODUCTION TO

MACHINE
LEARNING with
APPLICATIONS
in INFORMATION
SECURITY

Mark Stamp
San Jose State University
California
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2018 by Taylor & Francis Group, LLC


CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper

International Standard Book Number-13: 978-1-138-62678-2 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize
to copyright holders if permission to publish in this form has not been obtained. If any copyright material
has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information storage
or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization
that provides licenses and registration for a variety of users. For organizations that have been granted a
photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at


http://www.taylorandfrancis.com

and the CRC Press Web site at


http://www.crcpress.com
To Melody, Austin, and Miles.
Contents

Preface
About the Author
Acknowledgments

1 Introduction
  1.1 What Is Machine Learning?
  1.2 About This Book
  1.3 Necessary Background
  1.4 A Few Too Many Notes

I Tools of the Trade

2 A Revealing Introduction to Hidden Markov Models
  2.1 Introduction and Background
  2.2 A Simple Example
  2.3 Notation
  2.4 The Three Problems
    2.4.1 HMM Problem 1
    2.4.2 HMM Problem 2
    2.4.3 HMM Problem 3
    2.4.4 Discussion
  2.5 The Three Solutions
    2.5.1 Solution to HMM Problem 1
    2.5.2 Solution to HMM Problem 2
    2.5.3 Solution to HMM Problem 3
  2.6 Dynamic Programming
  2.7 Scaling
  2.8 All Together Now
  2.9 The Bottom Line
  2.10 Problems

3 A Full Frontal View of Profile Hidden Markov Models
  3.1 Introduction
  3.2 Overview and Notation
  3.3 Pairwise Alignment
  3.4 Multiple Sequence Alignment
  3.5 PHMM from MSA
  3.6 Scoring
  3.7 The Bottom Line
  3.8 Problems

4 Principal Components of Principal Component Analysis
  4.1 Introduction
  4.2 Background
    4.2.1 A Brief Review of Linear Algebra
    4.2.2 Geometric View of Eigenvectors
    4.2.3 Covariance Matrix
  4.3 Principal Component Analysis
  4.4 SVD Basics
  4.5 All Together Now
    4.5.1 Training Phase
    4.5.2 Scoring Phase
  4.6 A Numerical Example
  4.7 The Bottom Line
  4.8 Problems

5 A Reassuring Introduction to Support Vector Machines
  5.1 Introduction
  5.2 Constrained Optimization
    5.2.1 Lagrange Multipliers
    5.2.2 Lagrangian Duality
  5.3 A Closer Look at SVM
    5.3.1 Training and Scoring
    5.3.2 Scoring Revisited
    5.3.3 Support Vectors
    5.3.4 Training and Scoring Re-revisited
    5.3.5 The Kernel Trick
  5.4 All Together Now
  5.5 A Note on Quadratic Programming
  5.6 The Bottom Line
  5.7 Problems

6 A Comprehensible Collection of Clustering Concepts
  6.1 Introduction
  6.2 Overview and Background
  6.3 K-Means
  6.4 Measuring Cluster Quality
    6.4.1 Internal Validation
    6.4.2 External Validation
    6.4.3 Visualizing Clusters
  6.5 EM Clustering
    6.5.1 Maximum Likelihood Estimator
    6.5.2 An Easy EM Example
    6.5.3 EM Algorithm
    6.5.4 Gaussian Mixture Example
  6.6 The Bottom Line
  6.7 Problems

7 Many Mini Topics
  7.1 Introduction
  7.2 k-Nearest Neighbors
  7.3 Neural Networks
  7.4 Boosting
    7.4.1 Football Analogy
    7.4.2 AdaBoost
  7.5 Random Forest
  7.6 Linear Discriminant Analysis
  7.7 Vector Quantization
  7.8 Naïve Bayes
  7.9 Regression Analysis
  7.10 Conditional Random Fields
    7.10.1 Linear Chain CRF
    7.10.2 Generative vs Discriminative Models
    7.10.3 The Bottom Line on CRFs
  7.11 Problems

8 Data Analysis
  8.1 Introduction
  8.2 Experimental Design
  8.3 Accuracy
  8.4 ROC Curves
  8.5 Imbalance Problem
  8.6 PR Curves
  8.7 The Bottom Line
  8.8 Problems

II Applications

9 HMM Applications
  9.1 Introduction
  9.2 English Text Analysis
  9.3 Detecting Undetectable Malware
    9.3.1 Background
    9.3.2 Signature-Proof Metamorphic Generator
    9.3.3 Results
  9.4 Classic Cryptanalysis
    9.4.1 Jakobsen’s Algorithm
    9.4.2 HMM with Random Restarts

10 PHMM Applications
  10.1 Introduction
  10.2 Masquerade Detection
    10.2.1 Experiments with Schonlau Dataset
    10.2.2 Simulated Data with Positional Information
  10.3 Malware Detection
    10.3.1 Background
    10.3.2 Datasets and Results

11 PCA Applications
  11.1 Introduction
  11.2 Eigenfaces
  11.3 Eigenviruses
    11.3.1 Malware Detection Results
    11.3.2 Compiler Experiments
  11.4 Eigenspam
    11.4.1 PCA for Image Spam Detection
    11.4.2 Detection Results

12 SVM Applications
  12.1 Introduction
  12.2 Malware Detection
    12.2.1 Background
    12.2.2 Experimental Results
  12.3 Image Spam Revisited
    12.3.1 SVM for Image Spam Detection
    12.3.2 SVM Experiments
    12.3.3 Improved Dataset

13 Clustering Applications
  13.1 Introduction
  13.2 K-Means for Malware Classification
    13.2.1 Background
    13.2.2 Experiments and Results
    13.2.3 Discussion
  13.3 EM vs K-Means for Malware Analysis
    13.3.1 Experiments and Results
    13.3.2 Discussion

Annotated Bibliography

Index
Preface

“Perhaps it hasn’t one,” Alice ventured to remark.


“Tut, tut, child!” said the Duchess.
“Everything’s got a moral, if only you can find it.”
— Lewis Carroll, Alice in Wonderland

For the past several years, I’ve been teaching a class on “Topics in Information
Security.” Each time I taught this course, I’d sneak in a few more machine
learning topics. For the past couple of years, the class has been turned on
its head, with machine learning being the focus, and information security
only making its appearance in the applications. Unable to find a suitable
textbook, I wrote a manuscript, which slowly evolved into this book.
In my machine learning class, we spend about two weeks on each of the
major topics in this book (HMM, PHMM, PCA, SVM, and clustering). For
each of these topics, about one week is devoted to the technical details in
Part I, and another lecture or two is spent on the corresponding applications
in Part II. The material in Part I is not easy; by including relevant
applications, the material is reinforced, and the pace is more reasonable.
I also spend a week covering the data analysis topics in Chapter 8, and
several of the mini topics in Chapter 7 are covered, based on time constraints
and student interest.¹
Machine learning is an ideal subject for substantive projects. In topics
classes, I always require projects, which are usually completed by pairs of
students, although individual projects are allowed. At least one week is allocated
to student presentations of their project results.
A suggested syllabus is given in Table 1. This syllabus should leave time
for tests, project presentations, and selected special topics. Note that the
applications material in Part II is intermixed with the material in Part I.
Also note that the data analysis chapter is covered early, since it’s relevant
to all of the applications in Part II.
¹ Who am I kidding? Topics are selected based on my interests, not student interest.

Table 1: Suggested syllabus

Chapter                            Hours  Coverage
1. Introduction                      1    All
2. Hidden Markov Models              3    All
9. HMM Applications                  2    All
8. Data Analysis                     3    All
3. Profile Hidden Markov Models      3    All
10. PHMM Applications                2    All
4. Principal Component Analysis      3    All
11. PCA Applications                 2    All
5. Support Vector Machines           3    All
12. SVM Applications                 3    All
6. Clustering                        3    All
13. Clustering Applications          2    All
7. Mini-topics                       6    LDA and selected topics
Total                               36

My machine learning class is taught at the beginning graduate level. For
an undergraduate class, it might be advisable to slow the pace slightly.
Regardless of the level, labs would likely be helpful. However, it’s important to
treat labs as supplemental to (as opposed to a substitute for) lectures.
Learning challenging technical material requires studying it multiple times
in multiple different ways, and I’d say that the magic number is three. It’s no
accident that students who read the book, attend the lectures, and
conscientiously work on homework problems learn this material well. If you are trying
to learn this subject on your own, the author has posted his lecture videos
online, and these might serve as a (very poor) substitute for live lectures.²
I’m also a big believer in learning by programming: the more code that you
write, the better you will learn machine learning.

Mark Stamp
Los Gatos, California
April, 2017

² In my experience, in-person lectures are infinitely more valuable than any recorded or
online format. Something happens in live classes that will never be fully duplicated in any
dead (or even semi-dead) format.

About the Author

My work experience includes more than seven years at the National Security
Agency (NSA), which was followed by two years at a small Silicon Valley
startup company. Since 2002, I have been a card-carrying member of the
Computer Science faculty at San Jose State University (SJSU).
My love affair with machine learning began during the early 1990s, when
I was working at the NSA. In my current job at SJSU, I’ve supervised vast
numbers of master’s student projects, most of which involve some combination
of information security and machine learning. In recent years, students have
become even more eager to work on machine learning projects, which I would
like to ascribe to the quality of the book that you have before you and my
magnetic personality, but instead, it’s almost certainly a reflection of trends
in the job market.
I do have a life outside of work.³ Recently, kayak fishing and sailing my
Hobie kayak in the Monterey Bay have occupied most of my free time. I also
ride my mountain bike through the local hills and forests whenever possible.
In case you are a masochist, a more complete autobiography can be found at

http://www.sjsu.edu/people/mark.stamp/

If you have any comments or questions about this book (or anything else)
you can contact me via email at mark.stamp@sjsu.edu. And if you happen
to be local, don’t hesitate to stop by my office to chat.

³ Of course, here I am assuming that what I do for a living could reasonably be classified
as work. My wife (among others) has been known to dispute that assumption.

Acknowledgments

The first draft of this book was written while I was on sabbatical during the
spring 2014 semester. I first taught most of this material in the fall semester
of 2014, then again in fall 2015, and yet again in fall 2016. After the third
iteration, I was finally satisfied that the manuscript had the potential to be
book-worthy.
All of the students in these three classes deserve credit for helping to
improve the book to the point where it can now be displayed in public without
excessive fear of ridicule. Here, I’d like to single out the following students
for their contributions to the applications in Part II.

Topic       Students
HMM         Sujan Venkatachalam, Rohit Vobbilisetty
PHMM        Lin Huang, Swapna Vemparala
PCA         Ranjith Jidigam, Sayali Deshpande, Annapurna Annadatha
SVM         Tanuvir Singh, Annapurna Annadatha
Clustering  Chinmayee Annachhatre, Swathi Pai, Usha Narra

Extra special thanks go to Annapurna Annadatha and Fabio Di Troia.
In addition to her major contributions to two of the applications chapters,
Annapurna helped to improve the end-of-chapter exercises. Fabio assisted
with most of my recent students’ projects and he is a co-author on almost
all of my recent papers. I also want to thank Eric Filiol, who suggested
broadening the range of applications. This was excellent advice that greatly
improved the book.
Finally, I want to thank Randi Cohen and Veronica Rodriguez at the
Taylor & Francis Group. Without their help, encouragement, and patience,
this book would never have been published.
A textbook is like a large software project, in that it must contain bugs.
All errors in this book are solely the responsibility of your humble scribe.
Please send me any errors that you find, and I will keep an updated errata
list on the textbook website.

Chapter 1

Introduction

I took a speed reading course and read War and Peace in twenty minutes.
It involves Russia.
— Woody Allen

1.1 What Is Machine Learning?


For our purposes, we’ll view machine learning as a form of statistical
discrimination, where the “machine” does the heavy lifting. That is, the computer
“learns” important information, saving us humans from the hard work of
trying to extract useful information from seemingly inscrutable data.
For the applications considered in this book, we typically train a model,
then use the resulting model to score samples. If the score is sufficiently high,
we classify the sample as being of the same type as was used to train the
model. And thanks to the miracle of machine learning, we don’t have to
work too hard to perform such classification. Since the model parameters are
(more-or-less) automatically extracted from training data, machine learning
algorithms are sometimes said to be data driven.
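The train-then-score workflow just described can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the one-feature "model" (a mean and standard deviation), the function names `train`, `score`, and `classify`, the sample data, and the threshold of -2.0 are all made up for this sketch and are not taken from any technique covered later in the book.

```python
from statistics import mean, stdev

def train(samples):
    """Fit a toy one-feature model: the mean and standard deviation of the training data."""
    return {"mu": mean(samples), "sigma": stdev(samples)}

def score(model, x):
    """Higher means a better match: the negated absolute z-score of x under the model."""
    return -abs(x - model["mu"]) / model["sigma"]

def classify(model, x, threshold=-2.0):
    """Label x as the training type when its score clears the (assumed) threshold."""
    return score(model, x) >= threshold

# Hypothetical training data: one numeric feature measured on samples of a single type.
model = train([9.8, 10.1, 10.0, 10.3, 9.9])

print(classify(model, 10.2))  # close to the training data
print(classify(model, 25.0))  # far from the training data
```

Note that the parameters `mu` and `sigma` come entirely from the training data, which is the sense in which such a method is data driven. The threshold, on the other hand, is a choice we have to make ourselves.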
Machine learning techniques can be successfully applied to a wide range
of important problems, including speech recognition, natural language
processing, bioinformatics, stock market analysis, information security, and the
homework problems in this book. Additional useful applications of machine
learning seem to be found on a daily basis—the set of potential applications
is virtually unlimited.
It’s possible to treat any machine learning algorithm as a black box and, in
fact, this is a major selling point of the field. Many successful machine
learners simply feed data into their favorite machine learning black box, which,
surprisingly often, spits out useful results. While such an approach can work,
the primary goal of this book is to provide the reader with a deeper
understanding of what is actually happening inside those mysterious machine
learning black boxes.
Why should anyone care about the inner workings of machine learning
algorithms when a simple black box approach can (and often does) suffice? If
you are like your curious author, you hate black boxes, and you want to know
how and why things work as they do. But there are also practical reasons
for exploring the inner sanctum of machine learning. As with any technical
field, the cookbook approach to machine learning is inherently limited. When
applying machine learning to new and novel problems, it is often essential to
have an understanding of what is actually happening “under the covers.” In
addition to being the most interesting cases, such applications are also likely
to be the most lucrative.
By way of analogy, consider a medical doctor (MD) in comparison to a
nurse practitioner (NP).¹ It is often claimed that an NP can do about 80%
to 90% of the work that an MD typically does. And the NP requires less
training, so when possible, it is cheaper to have NPs treat people. But, for
challenging or unusual or non-standard cases, the higher level of training of
an MD may be essential. So, the MD deals with the most challenging and
interesting cases, and earns significantly more for doing so. The aim of this
book is to enable the reader to earn the equivalent of an MD in machine
learning.
The bottom line is that the reader who masters the material in this book
will be well positioned to apply machine learning techniques to challenging
and cutting-edge applications. Most such applications would likely be beyond
the reach of anyone with a mere black box level of understanding.

1.2 About This Book


The focus of this book is on providing a reasonable level of detail for a
reasonably wide variety of machine learning algorithms, while constantly reinforcing
the material with realistic applications. But, what constitutes a reasonable
level of detail? I’m glad you asked.
While the goal here is for the reader to obtain a deep understanding of
the inner workings of the algorithms, there are limits.² This is not a math
book, so we don’t prove theorems or otherwise dwell on mathematical theory.
Although much of the underlying math is elegant and interesting, we don’t
spend any more time on the math than is absolutely necessary. And, we’ll
sometimes skip a few details, and on occasion, we might even be a little bit
sloppy with respect to mathematical niceties. The goal here is to present
topics at a fairly intuitive level, with (hopefully) just enough detail to clarify
the underlying concepts, but not so much detail as to become overwhelming
and bog down the presentation.³

¹ A physician assistant (PA) is another medical professional that is roughly comparable
to a nurse practitioner.
² However, these limits are definitely not of the kind that one typically finds in a calculus
book.

In this book, the following machine learning topics are covered in
chapter-length detail.

Topic                                   Where
Hidden Markov Models (HMM)              Chapter 2
Profile Hidden Markov Models (PHMM)     Chapter 3
Principal Component Analysis (PCA)      Chapter 4
Support Vector Machines (SVM)           Chapter 5
Clustering (K-Means and EM)             Chapter 6

Several additional topics are discussed in a more abbreviated (section-length)
format. These mini-topics include the following.

Topic                                   Where
k-Nearest Neighbors (k-NN)              Section 7.2
Neural Networks                         Section 7.3
Boosting and AdaBoost                   Section 7.4
Random Forest                           Section 7.5
Linear Discriminant Analysis (LDA)      Section 7.6
Vector Quantization (VQ)                Section 7.7
Naïve Bayes                             Section 7.8
Regression Analysis                     Section 7.9
Conditional Random Fields (CRF)         Section 7.10

Data analysis is critically important when evaluating machine learning
applications, yet this topic is often relegated to an afterthought. But that’s
not the case here, as we have an entire chapter devoted to data analysis and
related issues.
To access the textbook website, point your browser to

http://www.cs.sjsu.edu/~stamp/ML/

where you’ll find links to PowerPoint slides, lecture videos, and other relevant
material. An updated errata list is also available. And for the reader’s benefit,
all of the figures in this book are available in electronic form, and in color.
In addition, extensive malware and image spam datasets can be found on
the textbook website. These or similar datasets were used in many of the
applications discussed in Part II of this book.

³ Admittedly, this is a delicate balance, and your unbalanced author is sure that he didn’t
always achieve an ideal compromise. But you can rest assured that it was not for lack of
trying.

1.3 Necessary Background


Given the title of this weighty tome, it should be no surprise that most of
the examples are drawn from the field of information security. For a solid
introduction to information security, your humble author is partial to the
book [137]. Many of the machine learning applications in this book are
specifically focused on malware. For a thorough—and thoroughly enjoyable—
introduction to malware, Aycock’s book [12] is the clear choice. However,
enough background is provided so that no outside resources should be neces-
sary to understand the applications considered here.
Many of the exercises in this book require some programming, and basic
computing concepts are assumed in a few of the application sections. But
anyone with a modest amount of programming experience should have no
trouble with this aspect of the book.
Most machine learning techniques do ultimately rest on some fancy math.
For example, hidden Markov models (HMM) build on a foundation of dis-
crete probability, principal component analysis (PCA) is based on sophisti-
cated linear algebra, Lagrange multipliers (and calculus) are used to show
how and why a support vector machine (SVM) really works, and statistical
concepts abound. We’ll review the necessary linear algebra, and generally
cover relevant math and statistics topics as needed. However, we do assume
some knowledge of differential calculus—specifically, finding the maximum
and minimum of “nice” functions.

1.4 A Few Too Many Notes


Note that the applications presented in this book are largely drawn from your
author’s industrious students’ research projects. Note also that the applica-
tions considered here were selected because they illustrate various machine
learning techniques in relatively straightforward scenarios. In particular, it is
important to note that applications were not selected because they necessarily
represent the greatest academic research in the history of academic research.
It’s a noteworthy (and unfortunate) fact of life that the primary function of
much academic research is to impress the researcher’s (few) friends with his
or her extreme cleverness, while eschewing practicality, utility, and clarity.
In contrast, the applications presented here are supposed to help demystify
machine learning techniques.
Part I

Tools of the Trade

Chapter 2

A Revealing Introduction to
Hidden Markov Models

The cause is hidden. The effect is visible to all.


— Ovid

2.1 Introduction and Background


Not surprisingly, a hidden Markov model (HMM) includes a Markov pro-
cess that is “hidden,” in the sense that we cannot directly observe the state
of the process. But we do have access to a series of observations that are
probabilistically related to the underlying Markov model.
While the formulation of HMMs might initially seem somewhat contrived,
there exist a virtually unlimited number of problems where the technique
can be applied. Best of all, there are efficient algorithms, making HMMs
extremely practical. Another very nice property of an HMM is that structure
within the data can often be deduced from the model itself.
In this chapter, we first consider a simple example to motivate the HMM
formulation. Then we dive into a detailed discussion of the HMM algorithms.
Realistic applications—mostly from the information security domain—can be
found in Chapter 9.
This is one of the most detailed chapters in the book. A reason for going
into so much depth is that once we have a solid understanding of this partic-
ular machine learning technique, we can then compare and contrast it to the
other techniques that we’ll consider. In addition, HMMs are relatively easy
to understand—although the notation can seem intimidating, once you have
the intuition, the process is actually fairly straightforward.1
1
To be more accurate, your dictatorial author wants to start with HMMs, and that’s all
that really matters.


The bottom line is that this chapter is the linchpin for much of the remain-
der of the book. Consequently, if you learn the material in this chapter well,
it will pay large dividends in most subsequent chapters. On the other hand,
if you fail to fully grasp the details of HMMs, then much of the remaining
material will almost certainly be more difficult than is necessary.
HMMs are based on discrete probability. In particular, we’ll need some
basic facts about conditional probability, so in the remainder of this section,
we provide a quick overview of this crucial topic.
The notation “|” denotes “given” information, so that $P(A \mid B)$ is read as
“the probability of $A$, given $B$.” For any two events $A$ and $B$, we have

\[ P(A \text{ and } B) = P(A)\, P(B \mid A). \tag{2.1} \]

For example, suppose that we draw two cards without replacement from a
standard 52-card deck. Let $A = \{\text{1st card is ace}\}$ and
$B = \{\text{2nd card is ace}\}$. Then

\[ P(A \text{ and } B) = P(A)\, P(B \mid A) = 4/52 \cdot 3/51 = 1/221. \]

In this example, $P(B)$ depends on what happens in the first event $A$, so we
say that $A$ and $B$ are dependent events. On the other hand, suppose we flip
a fair coin twice. Then the probability that the second flip comes up heads
is 1/2, regardless of the outcome of the first coin flip, so these events are
independent. For dependent events, the “given” information is relevant when
determining the sample space. Consequently, in such cases we can view the
information to the right of the “given” sign as defining the space over which
probabilities will be computed.
We can rewrite equation (2.1) as

\[ P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}. \]

This expression can be viewed as the definition of conditional probability.
For an important application of conditional probability, see the discussion of
naïve Bayes in Section 7.8 of Chapter 7.
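As a quick sanity check, the two-card example and the definition of conditional probability can be verified by brute-force enumeration. The sketch below is illustrative only: the card labeling (cards 0 through 3 are the aces) and the variable names are assumptions, not notation from the text.

```python
from fractions import Fraction
from itertools import permutations


def is_ace(card):
    """Cards are labeled 0..51; by assumption, labels 0..3 are the four aces."""
    return card < 4


# Every ordered draw of two distinct cards is equally likely.
draws = list(permutations(range(52), 2))

# P(A): the first card is an ace.
p_a = Fraction(sum(1 for c1, _ in draws if is_ace(c1)), len(draws))

# P(A and B): both cards are aces.
p_a_and_b = Fraction(sum(1 for c1, c2 in draws if is_ace(c1) and is_ace(c2)), len(draws))

# P(B | A), from the definition of conditional probability.
p_b_given_a = p_a_and_b / p_a

print(p_a, p_b_given_a, p_a_and_b)  # 1/13 1/17 1/221
```

Note that 1/13 = 4/52 and 1/17 = 3/51, so the enumeration agrees with the product computed by hand.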
We’ll often use the shorthand “$A, B$” for the joint probability which, in
reality, is the same as “$A$ and $B$.” Also, in discrete probability, “$A$ and $B$” is
equivalent to the intersection of the sets $A$ and $B$, and sometimes we’ll want
to emphasize this set intersection. Consequently, throughout this section

\[ P(A \text{ and } B) = P(A, B) = P(A \cap B). \]

Finally, matrix notation is used frequently in this chapter. A review of
matrices and basic linear algebra can be found in Section 4.2.1 of Chapter 4,
although no linear algebra is required in this chapter.

2.2 A Simple Example


Suppose we want to determine the average annual temperature at a particular
location on earth over a series of years. To make it more interesting, suppose
the years we are focused on lie in the distant past, before thermometers were
invented. Since we can’t go back in time, we instead look for indirect evidence
of the temperature.
To simplify the problem, we only consider “hot” and “cold” for the av-
erage annual temperature. Suppose that modern evidence indicates that the
probability of a hot year followed by another hot year is 0.7 and the proba-
bility that a cold year is followed by another cold year is 0.6. We’ll assume
that these probabilities also held in the distant past. This information can
be summarized as

\[
\begin{array}{c|cc}
  & H & C \\ \hline
H & 0.7 & 0.3 \\
C & 0.4 & 0.6
\end{array}
\tag{2.2}
\]

where $H$ is “hot” and $C$ is “cold.”
Next, suppose that current research indicates a correlation between the
size of tree growth rings and temperature. For simplicity, we only consider
three different tree ring sizes, small, medium, and large, denoted $S$, $M$, and $L$,
respectively. Furthermore, suppose that based on currently available
evidence, the probabilistic relationship between annual temperature and tree
ring sizes is given by

\[
\begin{array}{c|ccc}
  & S & M & L \\ \hline
H & 0.1 & 0.4 & 0.5 \\
C & 0.7 & 0.2 & 0.1
\end{array}
\tag{2.3}
\]

For this system, we’ll say that the state is the average annual temperature,
either $H$ or $C$. The transition from one state to the next is a Markov
process,2 since the next state depends only on the current state and the fixed
probabilities in (2.2). However, the actual states are “hidden” since we can’t
directly observe the temperature in the past.
Although we can’t observe the state (temperature) in the past, we can
observe the size of tree rings. From (2.3), tree rings provide us with prob-
abilistic information regarding the temperature. Since the underlying states
are hidden, this type of system is known as a hidden Markov model (HMM).
Our goal is to make effective and efficient use of the observable information,
so as to gain insight into various aspects of the Markov process.
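To make the generative picture concrete, the following sketch simulates the process just described: a hidden temperature evolves according to (2.2), and each year emits a tree ring size according to (2.3). The function name `simulate`, the choice of starting state, and the fixed random seed are assumptions made for illustration.

```python
import random

STATES = ["H", "C"]      # hot, cold
RINGS = ["S", "M", "L"]  # small, medium, large

# Row for each current state, in the column order of STATES and RINGS.
A = {"H": [0.7, 0.3], "C": [0.4, 0.6]}            # transition probabilities (2.2)
B = {"H": [0.1, 0.4, 0.5], "C": [0.7, 0.2, 0.1]}  # ring-size probabilities (2.3)


def simulate(num_years, first_state, rng):
    """Return (hidden temperatures, observed ring sizes) for num_years."""
    states, rings = [], []
    state = first_state
    for _ in range(num_years):
        states.append(state)
        # The observation depends only on the current hidden state.
        rings.append(rng.choices(RINGS, weights=B[state])[0])
        # The next state depends only on the current state (Markov property).
        state = rng.choices(STATES, weights=A[state])[0]
    return states, rings


hidden, observed = simulate(10, "H", random.Random(0))
print("hidden:  ", "".join(hidden))
print("observed:", "".join(observed))
```

In the HMM setting, only the `observed` sequence would be available; the `hidden` sequence is exactly what we would like to infer.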
2
A Markov process where the current state only depends on the previous state is said
to be of order one. In a Markov process of order n, the current state depends on the n
consecutive preceding states. In any case, the “memory” is finite—much like your absent-
minded author’s memory, which seems to become more and more finite all the time. Let’s
see, now where was I?

For this HMM example, the state transition matrix is

\[
A = \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix}, \tag{2.4}
\]

which comes from (2.2), and the observation matrix is

\[
B = \begin{bmatrix} 0.1 & 0.4 & 0.5 \\ 0.7 & 0.2 & 0.1 \end{bmatrix}, \tag{2.5}
\]

which comes from (2.3). For this example, suppose that the initial state
distribution, denoted by $\pi$, is

\[
\pi = \begin{bmatrix} 0.6 & 0.4 \end{bmatrix}, \tag{2.6}
\]

that is, the chance that we start in the $H$ state is 0.6 and the chance that
we start in the $C$ state is 0.4. The matrices $\pi$, $A$, and $B$ are row stochastic,
which is just a fancy way of saying that each row satisfies the requirements
of a discrete probability distribution (i.e., each element is between 0 and 1,
and the elements of each row sum to 1).
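The row stochastic property is easy to check mechanically. The sketch below stores the three matrices from this example as plain Python lists; the helper name `is_row_stochastic` and the numerical tolerance are assumptions introduced here.

```python
import math

# Model parameters from (2.4)-(2.6); state order is (hot, cold) and
# observation order is (small, medium, large).
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.1, 0.4, 0.5],
     [0.7, 0.2, 0.1]]
pi = [[0.6, 0.4]]  # stored as a 1 x 2 matrix so the same check applies


def is_row_stochastic(matrix, tol=1e-9):
    """Return True if every row is a discrete probability distribution."""
    return all(
        all(0.0 <= x <= 1.0 for x in row)
        and math.isclose(sum(row), 1.0, abs_tol=tol)
        for row in matrix
    )


print(all(is_row_stochastic(m) for m in (pi, A, B)))  # True
```

The tolerance is there because floating-point sums such as 0.7 + 0.2 + 0.1 need not equal 1 exactly.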
Now, suppose that we consider a particular four-year period of interest
from the distant past. For this particular four-year period, we observe the
series of tree ring sizes $S, M, S, L$. Letting 0 represent $S$, 1 represent $M$, and 2
represent $L$, this observation sequence is denoted as

\[ \mathcal{O} = (0, 1, 0, 2). \tag{2.7} \]

We might want to determine the most likely state sequence of the Markov
process given the observations (2.7). That is, we might want to know the most
likely average annual temperatures over this four-year period of interest. This
is not quite as clear-cut as it seems, since there are different possible inter-
pretations of “most likely.” On the one hand, we could define “most likely”
as the state sequence with the highest probability from among all possible
state sequences of length four. Dynamic programming (DP) can be used to
efficiently solve this problem. On the other hand, we might reasonably define
“most likely” as the state sequence that maximizes the expected number of
correct states. An HMM can be used to find the most likely hidden state
sequence in this latter sense.
It’s important to realize that the DP and HMM solutions to this problem
are not necessarily the same. For example, the DP solution must, by defini-
tion, include valid state transitions, while this is not the case for the HMM.
And even if all state transitions are valid, the HMM solution can still differ
from the DP solution, as we’ll illustrate in an example below.
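Because this example has only 2^4 = 16 possible state sequences, both notions of “most likely” can be checked by exhaustive enumeration. The sketch below is a brute-force illustration using the parameters (2.4) through (2.6) and the observations (2.7); it is not the efficient DP or HMM algorithm, and the names `joint_probability`, `dp_answer`, and `hmm_answer` are assumptions introduced here.

```python
from itertools import product

STATES = "HC"  # index 0 = hot, index 1 = cold
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.1, 0.4, 0.5], [0.7, 0.2, 0.1]]
O = [0, 1, 0, 2]  # tree ring sizes S, M, S, L


def joint_probability(X):
    """P(X, O): probability of state sequence X together with observations O."""
    p = pi[X[0]] * B[X[0]][O[0]]
    for t in range(1, len(O)):
        p *= A[X[t - 1]][X[t]] * B[X[t]][O[t]]
    return p


# Score all 16 candidate state sequences.
scores = {X: joint_probability(X) for X in product(range(2), repeat=len(O))}

# "Most likely" in the DP sense: the single highest-scoring sequence.
best_path = max(scores, key=scores.get)
dp_answer = "".join(STATES[i] for i in best_path)

# "Most likely" in the HMM sense: at each position, pick the state with
# the largest total probability, summed over all sequences.
hmm_answer = ""
for t in range(len(O)):
    totals = [sum(p for X, p in scores.items() if X[t] == i) for i in range(2)]
    hmm_answer += STATES[totals.index(max(totals))]

print(dp_answer, hmm_answer)  # CCCH CHCH
```

For these parameters the two answers differ in the second year, even though every state transition is valid, which is precisely the distinction drawn above. Enumeration scales exponentially in the sequence length, which is why efficient algorithms are needed in practice.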
Before going into more detail, we need to deal with the most challenging
aspect of HMMs—the notation. Once we have the notation, we’ll discuss the
Kiot, 6,
San Marte’s view, 99-100, 107-08, 121,
and Wolfram, 261-63.

Klinschor, 253, 263.

Knight Errantry, 229.

Knighthood, prototype of in Celtic tradition, 231.

Knights of the Red Branch, 231.

Knowles’ Said and Saiyid, 196.

Koch, Kyffhäuser Sage, 197.

Köhler, 195.

Kundry in Wagner, 254-55, 263.


See Loathly Damsel.

Küpp on Pseudo-Chrestien, 8, 126,


and the branch, 193, 262.

Kynddelw, 219.

Lambar, 83-84, 86, 183.

Lame King, see Maimed King.

Lance, 109,
and Grail legend according to Birch-Hirschfeld, 111, 113, 121.

Lancelot, 83, 84, 108, 110, 112, 118, 119, 123, 172-173, 180, 240,
245.

Latin original of French romances probable, 122.

Liebrecht, 197-98.

Llyr Llediath, 219-20.

Loathly Damsel, 87,


and Rosette, 114,
in Mabinogi and Chrestien, 136,
hero’s cousin, 139-41,
double origin of in romances, 205-06,
and Wagner, 254.

Longis, 70.

Luces de Gast, 118-19.

Luces (Lucius), 91, 219.

Lufamour, 147.

Lug Lamhfhada, 184, 189, 192.

Mabinogi of Peredur (generally Mabinogi sometimes Peredur)


numbered H 3, 5, 66, 68, 69,
Villemarqué on, 97-98, 89,
Simrock on, 100, 101,
Nash, 102, 104,
Hucher, 106,
lateness of according to Birch-Hirschfeld, 114-115, 125-26,
relation to Conte du Graal, 131-37,
dwarves incident in, 134,
greater delicacy in Blanchefleur incident, 135,
blood drops incident, 137-38,
differences with Chrestien, 138-39,
machinery of Quest in, 139-42,
relation to Manessier, 142-44,
origin and development of, 143-145,
special indebtedness to Chrestien, 145, 146,
relation to Sir Perceval, 148-49,
counsels in, 150,
apparent absence of Grail from, 151,
comparison with Great Fool tale, 154-57,
with Great Fool Lay, 161-62, 164,
with Gerbert’s witch incident, 168-69, 171,
visit to Talismans Castle in, 172-73 and 176, 180, 181, 183, 184,
190, 216,
fusion of numerous Celtic tales in, 225-26,
Sex-relations in, 241, 256.

Maidens’ Castle, parallels to in Celtic tradition, 191-94.

Maimed or Lame or Sick King, 66, 83-88, 90, 91, 109,


parallel with Arthur, 122,
probable absence from Proto Mabinogi, 145,
belongs to Feud Quest, 198,
parallel to Fionn, 202, 237.

Malory, 236.

Manaal, 84.

Manannan mac Lir, 192-94, 208,


and Bran, 219.

Manessier, numbered A III, 1-2,


date etc., 4-5, 69-71, 73-74, 77, 81, 88, 92, 95, 110, 121, 138,
relation to the Mabinogi, 142-46, 168-69, 171, 175,
disregard of question, 180-82, 199, 245-46.
Manus, 189-90.

Mapes or Map, 5, 104, 105,


not author of Queste or Grand St. Graal according to Birch-
Hirschfeld, 117-19.

Martin’s views, 121-26,


Kyffhäuser hypothesis criticised, 197, 198,
Wolfram and Gerbert, 262.

Meaux, 120.

Menglad, 232.

Merlin, 92, 114, 124.

Merlin, Borron’s poem, 2, 64d, 105, 106, 112-13, 117.

Meyer, Kuno, 209, 233.

Minnedienst, 240-41.

Modred, 122.

Montsalvatch, 66.

Mordrains, 90, 109-10, 120, 173.

Morgan la Fay, 122.

Morvan lez Breiz, 148, 158, 162.

Moys or Moses, 88-90, 106, 109, 112, 116.

Mythic conceptions in the romances, 205.


Nasciens, 76, 83, 85, 120.

Nash, 102.

Nibelungenlied, 230, 234, 248.

Nicodemus, 71.

Noisi, 137, 233.

O’Daly, 159-61, 163.

Odin, 100-01.

O’Donovan, 185, 209, 213.

Oengus of the Brug, 191-92,


and swanmaid, 196.

O’Flanagan, 233.

Ogma, 188.

Oisin, 195, 200,


and Gwion, 210, 232.

O’Kearney, 201.

Orgueilleuse, Celtic character of, 124 and 232,


illustrates mediæval morality, 240-41, 263.

Osiris, 101.
Pagan essence of Grail etc. in the Christianised romances, 238.

Partinal, 81, 88, 142-43.

Parzival, 101, 252-53.


See Perceval and Wolfram.

Paulin-Paris, 5,
explanation of word Grail, 103, 111, 116-17, 119.

Pearson on the Veronica legend, 222,


and St. Brandan, 265.

Peleur, 83.

Pelleans or Pellehem, 83-86, 90.

Pelles, 83-86, 90.

Perceval, Perceval-Quest, type hero of Quest, 66-67, 72, 78,


relation to the Grail-keeper, 80-86, 88-89, 91-92,
oldest hero of Quest, 93, 94, 98, 101, 102-04,
according to Birch-Hirschfeld, 110-119, 125,
in Didot-Perceval and Conte du Graal, 127-31,
in Mabinogi and Conte du Graal, 131-45,
relation to (bespelled) cousin, 139-42,
relation of existing versions to earliest form, 146,
in the Thornton MS. romance, 147-51,
hero of Expulsion and Return Formula, 153-56,
parallel with Highland folk-tales, 157-58,
relation to Twin Brethren folk-tale and dualism in, 162-64, 169,
versions of Quest, 171-76,
visit to the Maidens’ Castle, 178-79, 180, 181,
significance of Didot-Perceval form, 182, 187,
and sword, 189,
Castle of Maidens, 191, 195, 199,
parallel with Diarmaid, 202,
possible hero of Haunted Castle form, 204-05,
relation to Fisher, 207,
his silence, 211-14, 226,
superiority to Galahad Quest, 236, 237-38, 240-41, 245, 247, 254,
256, 261-62.
See also Parzival and Peredur.

Perceval’s aunt, 79.

Perceval’s sister, 83-84, 163.

Perceval’s uncle, 78.

Perceval le Gallois, numbered G 3, authorship, 6, 65-66, 69, 104,


121, 126, 246.

Peredur (hero of Mabinogi = Perceval), Peredur-saga, 106,


mother of, 115, 132-36,
parallel to Tom of the Goat-skin, 134,
the sword test, 138,
hero of the stag hunt, 139-42, 143,
original form of saga, 144-45, 153-54, 157, 162, 163, 164, 168-69,
and Fionn, 187 and 203, 220,
fish absent from, 224,
genesis and growth of, 225-227, 228,
Blanchefleur incident in, 241.
See Perceval.

Peronnik l’idiot, 125, 158.

Perseus, 256.

Petrus, 77, 82, 88-90, 106, 109, 112,


connection with Geoffrey conversion legend, 219.
Pfaffe Amis, 265.

Pilate, 65, 70.

Potter Thompson and Arthur, 198, 262.

Potvin, 1, 2, 6,
his views, 104, 174, 177.

Prester John, 100.

Procopius, 191.

Promised or Good Knight, and Grail Keeper, 80-86,


Galahad as, 85-86,
work of, 86-91,
qualifications of, 92-93, 107, 109.

Prophecy incident in Grail romances, 156.

Pseudo-Chrestien, 8, 209.

Pseudo-Gautier, numbered aIIa, 2, 15-16, 70, 72, 74, 77, 79, 81, 95.

Pseudo-Manessier, numbered aIIIa, 2, 19, 72-73.

Queste del St. Graal, numbered D 2-3, varying redactions


distinguished typographically, 38, 65-67, 72, 75-76, 79,
three drafts of, 83-86, 90-91,
glorification of virginity in, 93, 95, 103, 107,
relation to Grand St. Graal, 108-09,
to Conte du Graal, 110-11, 112, 113,
authorship of, 117-20, 121, 126, 131, 146,
visit to Grail Castle in, 172-73, 180, 183, 186, 207, 218, 220, 222,
224, 226, 236,
ideal of, 238-40 and 243-44,
ideal criticised, 243-44,
merits of, 244-45, 246,
inferiority to Wolfram, 250, 251.

Question, Birch-Hirschfeld’s opinion, 171, 180,


belongs to Unspelling Quest, 181-82, 191, 196, 203,
Wolfram’s presentment, 249-50.

Red Knight, 147-49, 155-56, 162, 189.

Renan on Celtic poetry, 234-35.

Rhys, 198, 209, 211,


Bran legend, 219-20, 265.

Rich Fisher or King. See Fisher King.

Riseut, 141.

Robert de Borron. See Borron.

Rochat, 19,
his views, 101-02.

Roland, 229, 232.

Roménie, 118.

Rosette, 130, 141.


See Loathly Damsel.

Salmon of Wisdom, 209-10.


San Marte, views, 99-100, 101-02,
and Wolfram, 250-5.

Sarras, 72, 77, 79.

Schröder, Brandan legend, 264-65.

Seat, empty or Perillous, 81-82, 88-90.

Secret words, 73, 89, 179.

Seraphe, 108.

Sex-relations in Middle Ages, 240-42.

Siegfried, 157, 162, 203, 210, 232-33.

Simei, 90.

Simrock, views, 100-101, 103, 132, 134, 164, 251, 261-62.

Skeat, 104.

Skene, 219-20.

Sleep and the Magic Castle myth, 202-03.

Sleeping Beauty, parallel with Heinrich’s version, 203,


ethical import of, 258.

Solomon’s sword, 84.


See Sword.

Sons of Usnech, 137, 233.

Sorceresses of Gloucester, 101, 139, 156.


Spontaneity of folk tradition, 254, 257-58.

Stag Hunt in Conte du Graal and Mabinogi, 139-40,


in Didot-Perceval, 141,
parallel with Lay of Great Fool, 162.

Steinbach on Sir Perceval, 147-50.

Stephens, 219-20.

Stokes, 188, 200, 233.

Suetonius, 116.

Sword, 113, 142,


belongs more to Feud Quest, 180-82,
found also in Unspelling Quest, 183,
of Lug, 184,
in Celtic myth, 187-90, 198-99.

Taboo and Geasa, 214.

Taliesin, 97, 186,


and Oisin, 210-11.

Templars, 100.

Tennyson, 236, 244.

Tethra, 188.

Thor, Irish parallels to, 200-01.

Thornton MS. Sir Perceval (often simply Sir Perceval), numbered I 4,


66, 68-69, 101-02, 125, 126,
Steinbach’s theory of, 147-50,
criticised, 149,
absence of Grail from, 151,
connection with Great Fool tale, 154-58, 162, 164-65,
witch incident, 169, 190, 225.

Tír-na n-Og, 191, 195, 223, 248, 264.

Titurel, 66.

Titus, 107.

Trinity, symbolizing of, 88.

Tuatha de Danann, treasures of, 184-85, 189-92, 223, 230.

Two Brothers tale, 157, 162-63.

Ultonian cycle, 185.

Unspelling Quest, 181,


Celtic parallels to, 190-206, 208.

Urban (Urlain), 83, 84, 183.

Van Santen, 252.

Vanishing of Bespelled Castle, 202-03.

Veronica (Verrine), 79, 116,


Ward’s theory, 222.

Vespasian, 107, 116.


Vessel in Celtic myth, 184,
in Ultonian cycle, 185,
in Welsh myth, 186,
in Celtic folk-tales, 187.
See Grail.

Villemarqué, views 97-98, 101, 131, 148.

Virginity, 247.

Wagner, 252-54.

Ward, 220, 222.

Wartburg Krieg and Brandan legend, 264.

William of Malmesbury, 105,


Zarncke’s opinion of, 107, 115,
Ward’s opinion of, 220.

Windisch, 188, 219.

Witch who brings the dead to life, 165-69.

Wolfram von Eschenbach, numbered F 3, sources, 6, 25-26, 65-67,


69,
and Gerbert, 92, 99-102, 104, 107, 121-25, 150, 157,
brother incident in, 164, 172-73,
branch in, 193,
magician lord, 199,
account of mediæval morality, 240-41, 246,
ideal of, 248-52, 254, 255, 256,
pattern for future growth of legend, 261,
relation to Chrestien, 261-63.
Woman in Celtic tradition, 231-33.

Wülcker, Evangelium Nicodemi, 220-21.

Zarncke, views, 106-07, 115, 132, 220.

HARRISON AND SONS,


PRINTERS IN ORDINARY TO HER MAJESTY,
ST. MARTIN’S LANE, LONDON.

Footnotes:
[1] Fully described by Potvin, VI, lxix, etc.
[2] Potvin, VI, lxxv, etc.
[3] Birch-Hirschfeld: Die Sage vom Gral, 8vo., Leipzig, 1877, p. 81.
[4] Birch-Hirschfeld, p. 89.
[5] Birch-Hirschfeld, p. 110.
[6] Birch-Hirschfeld, p. 232, quoting the colophon of a Paris MS.,
after Paulin Paris, Cat. des MSS. français, vol. ii, pp. 361, etc.
[7] Birch-Hirschfeld, p. 143.
[8] This prologue is certainly not Chrestien’s work; but there is no
reason to doubt that it embodies a genuine tradition, and affords
valuable hints for a reconstruction of the original form of the story.
Cf. Otto Küpp in Zeitschrift für deutsche Philologie, vol. xvii., No. 1.
[9] Potvin’s text, from the Mons MS., is taken as basis.
[10] Several MSS. here intercalate the history of Joseph of
Arimathea: Joseph of Barimacie had the dish made; with it he
caught the blood running from the Saviour’s body as it hung on the
Cross, he afterwards begged the body of Pilate; for the devotion
showed the Grail he was denounced to the Jews, thrown into prison,
delivered thence by the Lord, exiled together with the sister of
Nicodemus, who had an image of the Lord. Joseph and his
companions came to the promised land, the White Isle, a part of
England. There they warred against them of the land. When Joseph
was short of food he prayed to the Creator to send him the Grail
wherein he had gathered the holy blood, after which to them that
sat at table the Grail brought bread and wine and meat in plenty. At
his death, Joseph begged the Grail might remain with his seed, and
thus it was that no one, of however high condition, might see it save
he was of Joseph’s blood. The Rich Fisher was of that kin, and so
was Greloguevaus, from whom came Perceval.
It is hardly necessary to point out that this must be an interpolation,
as if Gauvain had really learnt all there was to be told concerning the
Grail, there would have been no point in the reproaches addressed
him by the countryfolk. The gist of the episode is that he falls asleep
before the tale is all told.
[11] The existence of this fragment shows the necessity of collating
all the MSS. of the Conte du Graal and the impossibility of arriving at
definite conclusions respecting the growth of the work before this is
done. The writer of this version evidently knew nothing of Queste or
Grand St. Graal, whilst he had knowledge of Borron’s poem, a fact
the more remarkable since none of the other poets engaged upon
the Conte du Graal knew of Borron, so far, at least, as can be
gathered from printed sources. It is hopeless in the present state of
knowledge to do more than map out approximately the leading
sections of the work.
[12] It is by no means clear to me that Gerbert’s portion of the
Conte du Graal is an interpolation. I am rather inclined to look upon
it as an independent finish. As will be shown later on, it has several
features in common with both Mabinogi and Wolfram, features
pointing to a common prototype.
[13] In the solitary MS. which gives this version, it follows, as has
already been stated, prose versions of Robert de Borron’s undoubted
poems, “Joseph of Arimathea” and “Merlin.”
[14] Birch-Hirschfeld, in his Summary (p. 37, l. 22) or his MS.
authority, B.M., xix, E. iii., has transposed the relationships.
[15] And buried it, adds B. H. in his Summary, whether on MS.
authority or not I cannot say, but the Welsh translation has—“there
was a period of 240 years” (an obvious mistake on the part of the
translator) “after the passion of J. C. when Jos. of A. came; he who
buried J. C. and drew him down from the cross.”
[16] Thus was Evelach called as a Christian, adds B. H. Here W.
agrees with Furnivall.
[17] Here Birch-Hirschfeld’s Summary agrees with W.
[18] B. H. agrees with W.
[19] According to B. H., the recluse tells him he has fought with his
friends, whereupon, ashamed, he hurries off.
[20] B. H. here agrees with W.
[21] B. H. has five candles.
[22] B. H.: “When will the Holy Vessel come to still the pain I feel?
Never suffered man as I.”
[23] B. H. agrees with W.
[24] B. H. agrees with Furnivall.
[25] B. H., the ninth.
[26] B. H., the vision is that of a crowned old man, who with two
knights worships the cross.
[27] B. H., Nasciens.
[28] B. H. has all this passage, save that the references to the vision
at the cross-ways seem omitted.
[29] B. H., the latter.
[30] B. H., in Chaldee.
[31] B. H., Labran slays Urban.
[32] The 1488 text has Urban.
[33] B. H., Thus was the King wounded, and he was Galahad’s
grandfather.
[34] It does not appear from B. H.’s Summary whether his text
agrees with F. or W.
[35] B. H., seven knights.
[36] B. H., that was the Castle of Corbenic where the Holy Grail was
kept.
[37] B. H., the Castle of the Maimed King.
[38] B. H., ten. Obviously a mistake on the part of his text, as the
nine with the three Grail questers make up twelve, the number of
Christ’s disciples.
[39] B. H., three.
[40] B. H. agrees with F.
[41] One cannot see from B. H. whether his text agrees with F. or W.
[42] B. H. agrees with F.
[43] It will be advisable to give here the well-known passage from
the chronicle of Helinandus, which has been held by most
investigators to be of first-rate importance in determining the date of
the Grand St. Graal. The chronicle ends in the year 1204, and must
therefore have been finished in that or the following year, and as the
passage in question occurs in the earlier portion of the work it may
be dated about two years earlier (Birch-Hirschfeld, p. 33). “Hoc
tempore (717-719) in Britannia cuidam heremitae demonstrata fuit
mirabilis quaedam visio per angelum de Joseph decurione nobili, qui
corpus domini deposuit de cruce et de catino illo vel paropside, in
quo dominus caenavit cum discipulis suis, de quo ab eodem
heremita descripta est historia quae dicitur gradale. Gradalis autem
vel gradale gallice dicitur scutella lata et aliquantulum profunda, in
qua preciosae dapes divitibus solent apponi gradatim, unus
morsellus post alium in diversis ordinibus. Dicitur et vulgari nomine
greal, quia grata et acceptabilis est in ea comedenti, tum propter
continens, quia forte argentea est vel de alia preciosa materia, tum
propter contentum .i. ordinem multiplicem dapium preciosarum.
Hanc historiam latine scriptam invenire non potui sed tantum gallice
scripta habetur a quibusdam proceribus, nec facile, ut aiunt, tota
inveniri potest.”
The Grand St. Graal is the only work of the cycle now existing to
which Helinandus’ words could refer; but it is a question whether he
may not have had in view a work from which the Grand St. Graal
took over its introduction. Helinandus mentions the punning origin of
the word “greal” (infra, p. 76), which is only hinted at in the Grand
St. Graal, but fully developed elsewhere, e.g., in the Didot-Perceval
and in Borron’s poem.
Another point of great interest raised by this introduction will be
found dealt with in Appendix B.
[44] The MS. followed by Furnivall has an illustration, in which
Joseph is represented as sitting under the Cross and collecting the
blood from the sides and feet in the basin.
[45] MS. reading.
[46] I have not thought it necessary to give a summary of the prose
romance Perceval le Gallois. One will be found in Birch-Hirschfeld,
pp. 123-134. The version, though offering many interesting features,
is too late and unoriginal to be of use in the present investigation.
[47] Cf. p. 78 as to this passage.
[48] It is forty-two years, according to D. Queste (p. 119), after the
Passion that Joseph comes to Sarras.
[49] It is plain that B I is abridged in the passage dealt with, from
the following fact: Joseph (v. 2,448, etc.) praying to Christ for help,
reminds Him of His command, that when he (Joseph) wanted help
he should come “devant ce veissel precieus Où est votre sans
glorieus.” Now Christ’s words to Joseph in the prison say nothing
whatever about any such recommendation; but E, Grand St. Graal,
does contain a scene between our Lord and Joseph, in which the
latter is bidden, “Et quant tu vauras à moi parler si ouuerras l’arche
en quel lieu que tu soies” (I, 38-39) from which the conclusion may
be drawn that B I represents an abridged and garbled form of the
prototype of E.
[50] In the Mabinogi of Branwen, the daughter of Llyr, the warriors
cast into the cauldron of renovation come forth on the morrow
fighting men as good as they were before, except that they are not
able to speak (Mab., p. 381).
[51] The version summarised by Birch-Hirschfeld.
[52] Curiously enough this very text here prints Urban as the name
of the Maimed King; Urban is the antagonist of Lambar, the father of
the Maimed King in the original draft of the Queste, and his mention
in this place in the 1488 text seems due to a misprint. In the episode
there is a direct conflict of testimony between the first and second
drafts, Lambar slaying Urlain in the former, Urlain Lambar in the
latter.
[53] This account agrees with that of the second draft of the Queste,
in which Urlain slays Lambar.
[54] Only one beholder of the Quest is alluded to, although in the
Queste, from which the Grand St. Graal drew its account, three
behold the wonders of the Grail.
[55] This, of course, belongs to the second of the two accounts we
have found in the poem respecting the Promised Knight, the one
which makes him the grandson and not the son merely of Brons.
[56] The object of the Quest according to Heinrich von dem Türlin
will be found dealt with in Chapter VII.
[57] This is one of a remarkable series of points of contact between
Gerbert and Wolfram von Eschenbach.
[58] It almost looks as if the author of C were following here a
version in which the hero only has to go once to the Grail Castle;
nothing is said about Perceval’s first unsuccessful visit, and Merlin
addresses Perceval as if he were telling him for the first time about
matters concerning which he must be already fully instructed.
[59] It is remarkable, considering the scanty material at his disposal,
how accurate Schulz’ analysis is, and how correct much of his
argumentation.
[60] Wagner has admirably utilised this hint of Simrock’s in his
Parsifal, when his Kundry (the loathly damsel of Chrestien and the
Mabinogi) is Herodias. Cf. infra, Ch. X.
[61] Excepting, of course, the late fifteenth and early sixteenth
century Paris imprints, which represented as a rule, however, the
latest and most interpolated forms, and Mons. Fr. Michel’s edition of
Borron’s poem.
[62] Hucher’s argument from v. 2817 (supra p. 106) that the poem
knew of the Grand St. Graal is, however, not met.
[63] Vide p. 200, for Birch-Hirschfeld’s summary comparison of the
two works, and cf. infra p. 127.
[64] Cf. infra p. 128, for a criticism of this statement.
[65] Opera V. 410: Unde et vir ille eloquio clarus W. Mapus,
Oxoniensis archidiaconus (cujus animae propitietur Deus) solita
verborum facetia et urbanitate praecipua dicere pluris et nos in hunc
modum convenire solebat: “Multa, Magister Geralde, scripsistis et
multum adhuc scribitis, et nos multa diximus. Vos scripta dedistis et
nos verba.”
[66] Printed in full, Hucher, I. 156, etc.
[67] Printed by Hucher, I. p. 35, etc.
[68] The remainder of Birch-Hirschfeld’s work is devoted to proving
that Chrestien was the only source of Wolfram von Eschenbach, the
latter’s Kiot being imagined by him to justify his departure from
Chrestien’s version; departures occasioned by his dissatisfaction with
the French poet’s treatment of the subject on its moral and spiritual
side. This element in the Grail problem will be found briefly dealt
with, Appendix A.
[69] I have not thought it necessary, or even advisable, to notice
what the “Encyclopædia Britannica” (Part XLI, pp. 34, 35) and some
other English “authorities” say about the Grail legends.
[70] They are brought together by Hucher, vol. i, p. 383, etc.
[71] In the preface to the second volume of his edition of Chrestien’s
works (Halle, 1887), W. Förster distinguishes Peredur from the Lady
of the Fountain and from Geraint, which he looks upon as simple
copies of Chrestien’s poems dealing with the same subjects. Peredur
has, he thinks, some Welsh features.
[72] It is perhaps only a coincidence that in Gautier the “pucelle de
malaire” is named Riseut la Bloie, and that Rosette la Blonde is the
name of the loathly damsel whom Perceval meets in company of the
Beau Mauvais, and whom Birch-Hirschfeld supposes to have
suggested to Chrestien his loathly damsel, the Grail messenger. But
from the three versions one gets the following:—Riseut (Gautier),
loathly damsel (Didot-Perceval), Grail messenger (Chrestien), =
Peredur’s cousin, who in the Mabinogi is the loathly Grail messenger,
and the protagonist in the stag-hunt.
[73] I have not thought it necessary to discuss seriously the
hypothesis that Chrestien may have used the Mabinogi as we now
have it. The foregoing statement of the facts is sufficient to negative
it.
[74] The Counsels. Chrestien (v. 1,725, etc.): aid dames and damsels,
for he who honoureth them not, his honour is dead; serve them
likewise; displease them not in aught; one has much from kissing a
maid if she will to lie with you, but if she forbid, leave it alone; if she
have ring, or wristband, and for love or at your prayer give it, ’tis
well you take it. Never have comradeship with one for long without
seeking his name; speak ever to worthy men and go with them; ever
pray in churches and monasteries (then follows a dissertation on
churches and places of worship generally). Mabinogi (p. 83):
wherever a church, repeat there thy Paternoster; if thou see meat
and drink, and none offer, take; if thou hear an outcry, especially of
a woman, go towards it; if thou see a jewel, take and give to
another to obtain praise thereby; pay thy court to a fair woman,
whether she will or no, thus shalt thou render thyself a better man
than before. (In the italicised passage the Mabinogi gives the direct
opposite of Chrestien, whom he has evidently misunderstood.) Sir
Perceval (p. 16): “Luke thou be of mesure Bothe in haulle and
boure, And fonde to be fre.” “There thou meteste with a knyghte, Do
thi hode off, I highte, and haylse hym in hy” (He interprets the
counsel to be of measure by only taking half the food and drink he
finds at the board of the lady of the tent. The kissing of the lady of
the tent which follows is in no way connected with his mother’s
counsel.) Wolfram: “Follow not untrodden paths; bear thyself ever
becomingly; deny no man thy greeting; accept the teaching of a
greybeard; if ring and greeting of a fair woman are to be won strive
thereafter, kiss her and embrace her dear body, for that gives luck
and courage, if so she be chaste and worthy.” Beside the mother’s
counsels Perceval is admonished by Gonemans or the personage
corresponding to him. In Chrestien (2,838, et seq.) he is to deny
mercy to no knight pleading for it; to take heed he be not over-
talkful; to aid and counsel dames and damsels and all others
needing his counsel; to go often to church; not to quote his mother’s
advice, rather to refer to him (Gonemans). In the Mabinogi he is to
leave the habits and discourse of his mother; if he see aught to
cause him wonder not to ask its meaning. In Wolfram he is not to
have his mother always on his lips; to keep a modest bearing; to
help all in need, but to give wisely, not heedlessly; and in especial
not to ask too much; to deny no man asking mercy; when he has
laid by his arms to let no traces thereof be seen, but to wash hands
and face from stain of rust, thereby shall ladies be pleased; to hold
women in love and honour; never to seek to deceive them (as he
might do many), for false love is fleeting and men and women are
one as are sun and daylight.—There seems to me an evident
progression in the ethical character of these counsels. Originally they
were doubtless purely practical and somewhat primitive of their
nature. As it is, Chrestien’s words sound very strange to modern
ears.
[75] In the notes to my two articles in the “Folk-Lore Record” will be
found a number of references establishing this fact.
[76] The hero renews his strength after his various combats by
rubbing himself with the contents of a vessel of balsam. He has
moreover to enter a house the door of which closes to of itself (like
the Grail Castle Portcullis in Wolfram), and which kills him. He is
brought to life by the friendly raven. The mysterious carlin also
appears, “there was a turn of her nails about her elbows, and a twist
of her hoary hair about her toes, and she was not joyous to look
upon.” She turns the hero’s companions into stone, and to unspell
them he must seek a bottle of living water and rub it upon them,
when they will come out alive. This is like the final incident in many
stories of the Two Brothers class. Cf. note, p. 162.
[77] O’Daly’s version consists of 158 quatrains; Campbell’s of 63.
The correspondence between them, generally very close (frequently
verbal), is shown by the following table:—

O’D., 1, 2.         C., 1, 2.
—                   C., 3.
O’D., 3.            C., 4.
O’D., 4-15.         —
O’D., 16.           C., 4.
O’D., 17-24.        C., 5-12.
O’D., 25.           —
—                   C., 13-15.
O’D., 26-47.        C., 16-36.
O’D., 48-56.        —
O’D., 57-61.        C., 37-40.
O’D., 62.           —
O’D., 63-65.        C., 41-43.
O’D., 66.           C., 45.
O’D., 67.           C., 44.
O’D., 68, 69.       C., 46, 47.
O’D., 70.           C., 49.
O’D., 71.           C., 48.
—                   C., 50.
O’D., 72.           C., 52.
O’D., 73.           —
O’D., 74.           C., 53.
O’D., 75.           C., 54.
O’D., 76-80.        C., 55-59.
O’D., 81-134.       —
O’D., 135, 136.     C., 60, 61.
—                   C., 62.
O’D., 137.          —
O’D., 138.          C., 63.
O’D., 139-158.      —

[78] Of this widely spread group, Grimm’s No. 60, Die zwei Brüder,
may be taken as a type. The brethren eat heart and liver of the gold
bird and thereby get infinite riches, are schemed against by a
goldsmith, who would have kept the gold bird for himself, seek their
fortunes throughout the world accompanied by helping beasts, part
at crossways, leaving a life token to tell each one how the other
fares; the one delivers a princess from a dragon, is cheated of the
fruit of the exploit by the Red Knight, whom after a year he
confounds, wins the princess, and, after a while, hunting a magic
hind, falls victim to a witch. His brother, learning his fate through the
life token, comes to the same town, is taken for the young king even
by the princess, but keeps faith to his brother by laying a bare sword
twixt them twain at night. He then delivers from the witch’s spells
his brother, who, learning the error caused by the likeness, and
thinking advantage had been taken of it, in a fit of passion slays
him, but afterwards, hearing the truth, brings him back to life again.
Grimm has pointed out in his notes the likeness between this story
and that of Siegfried (adventures with Mimir, Fafnir, Brunhilde, and
Gunnar). In India the tale figures in Somadeva’s Katha Sarit Sagara
(Brockhaus’ translation, ii., 142, et seq.). The one brother is
transformed into a demon through accidental sprinkling from a body
burning on a bier. He is in the end released from this condition by his
brother’s performing certain exploits, but there is no similarity of
detail. Other variants are Zingerle (p. 131) where the incident occurs
of the hero’s winning the king’s favour by making his bear dance
before him; this I am inclined to look upon as a weakened
recollection of the incident of a hero’s making a princess laugh,
either by playing antics himself or making an animal of his play them
(see supra, p. 134, Kennedy’s Irish Tale). Grimm also quotes Meier
29 and 58, but these are only variants of the dragon-killing incident.
In the variant of 29, given p. 306, the hero makes the king laugh,
and in both stories occurs the familiar incident of the hero coming
unknown into a tournament and overcoming all enemies, as in
Peredur (Inc. 9). Wolf., p. 369, is closer, and here the hero is
counselled by a grey mannikin whom he will unspell if he succeeds.
Stier, No. I. (not p. 67, as Grimm erroneously indicates) follows
almost precisely the same course as Grimm’s 60, save that there are
three brothers. Gaal, p. 195, has the magic gold bird opening, but
none of the subsequent adventures tally. Schott, No. 11, is also cited
by Grimm, but mistakenly; it belongs to the faithful-servant group.
Very close variants come from Sweden (Cavallius-Oberleitner, Va,
Vb) and Italy (Pentamerone, I. 7 and I. 9). The Swedish tales have
the miraculous conception opening, which is a prominent feature in
tales belonging to the Expulsion and Return group (e.g., Perseus,
Cu-Chulaind, and Taliesin), but present otherwise very nearly the
same incidents as Grimm. The second of the Italian versions has the
miraculous conception opening so characteristic of this group of folk-
tales, and of the allied formula group, the attainment of riches
consequent upon eating the heart of a sea dragon, the tournament
incident (though without the disguise of the hero), the stag hunt,
wherein the stag, an inimical wizard haunting the wood, is a
cannibal and keeps the captured hero for eating. In the story of the
delivery by the second brother, the separating sword incident occurs.
The first version opens with what is apparently a distorted and
weakened form of the hero’s clearing a haunted house of its
diabolical inmates (see infra Ch. VII., Gawain) and then follows very
closely Grimm’s Two Brothers, save that the alluring witch is young
and fair, the whole tale being made to point the moral, “more luck
than wit.” Straparola, a 3, is a variant of the dragon fight incident
alone. It is impossible not to be struck by the fact that in this widely
spread group of tales are to be found some of the most
characteristic incidents of the Perceval and allied Great Fool group.
The only version, however, which brings the two groups into formal
contact is O’Daly’s form of the Great Fool.
[79] The brother feature appears likewise in Wolfram von
Eschenbach, where Parzival’s final and hardest struggle is against
the unknown brother, as the Great Fool’s is against the Gruagach.
This may be added to other indications that Wolfram did have some
other version before him besides Chrestien’s.
[80] I cannot but think that these words have connection with the
incident in the English Sir Perceval of the hero’s throwing into the
flames and thus destroying his witch enemy.
[81] I must refer to my Mabinogion Studies, I. Branwen for a
discussion of the relation of this tale with Branwen and with the
Teutonic Heldensage.
[82] Another parallel is afforded by the tale of Conall Gulban
(Campbell, III., 274). Conall, stretched wounded on the field, sees
“when night grew dark a great Turkish carlin, and she had a white
glaive of light with which she could see seven miles behind her and
seven miles before her; and she had a flask of balsam carrying it.”
The dead men are brought to life by having three drops of balsam
put into their mouths. The hero wins both flask and glaive.
[83] Cf. my Branwen for remarks on the mythological aspect of the
ballad. It should be noted that most of the ballads traditionally
current in the Highlands are of semi-literary origin, i.e., would seem
to go back to the compositions of mediæval Irish bards, who often
sprinkled over the native tradition a profusion of classical and
historical names. I do not think the foreign influence went farther
than the “names” of some personages, and such as it is, is more at
work in the ballads than in the tales.
[84] This may seem to conflict with the statement made above (p.
145), that the Mabinogi probably took over the maimed uncle from
Chrestien. But there were in all probability several forms of the
story; that hinted at in Chrestien and found in Manessier had its
probable counterpart in Celtic tradition as well as that found in
Gerbert. It is hardly possible to determine what was the form found
in the proto-Mabinogi, the possibility of its having been exactly the
same as that of Gerbert is in no way affected by the fact that the
Mabinogi, as we now have it, has in this respect been influenced by
Chrestien. Meanwhile Birch-Hirschfeld’s hypothesis that Gerbert’s
section of the Conte du Graal is an interpolation between Gautier
and Manessier is laid open to grave doubt. It is far more likely that
Gerbert’s work was an independent and original attempt to provide
an ending for Chrestien’s unfinished poem, and that he had before
him a different version of the original from that used by Gautier and
Manessier.
[85] It occurs also in Peredur (Inc. 16), where the hero comes to the
Castle of the Youths, who, fighting every day against the Addanc of
the Cave, are each day slain, and each day brought to life by being
anointed in a vessel of warm water and with precious balsam.
[86] For the second time, if Gerbert’s continuation be really intended
for our present text of Gautier, and if Potvin’s summary of Gerbert is
to be relied upon; Birch-Hirschfeld seemingly differs from him here,
and makes the King at once mention the flaw.
[87] It may be worth notice that v. 35,473 is the same as Chrestien,
v. 4,533.
[88] It is evident that, although in the MS. in which this version is
found it is followed by Manessier’s section, the poem was intended
by Gerbert to end here.
[89] Told at other times, and notably by Gautier himself (Inc. 21), of
Perceval, where the feature of a dead knight lying on the altar is
added.
[90] According to the Montpellier MS., which here agrees
substantially with Potvin’s text (the Mons MS.), this is Gauvain’s
second visit to the Grail Castle. At his first visit he had been
subjected to the sword test and had slept. The mystic procession is
made up as follows:—Squire with lance; maidens with plate; two
squires with candlesticks; fair maiden weeping, in her hands a
“graal;” four squires with the bier, on which lies the knight and the
broken sword. Gauvain would fain learn about these things, but is
bidden first to make the sword whole. On his failure he is told
Vous n’avez par encore tant fet
D’armes, que vous doiez savoir, etc.,
and then goes to sleep. His awakening finds him in a marsh.
[91] It may be conjectured that the magic vessel which preserves to
this enchanted folk the semblance of life passes into the hero’s
possession when he asks about it, and that deprived of it their
existence comes to an end, as would that of the Anses without the
Apples of Iduna. I put this into a note, as I have no evidence in
support of the theory. But read in the light of this conjecture some
hitherto unnoticed legend may supply the necessary link of
testimony.
[92] Nearly all the objections to the view suggested in the text may
be put aside as due to insufficient recognition of the extent to which
the two formulas have been mingled, but there is one which seems
to me of real moment. The wasting of the land which I have looked
upon as belonging to the unspelling formula, is traced by the Queste
to the blow struck by King Lambar against King Urlain, a story
which, as we have seen, is very similar to that which forms the
groundwork of one at least of the models followed by the Conte du
Graal in its version of the feud quest. It does not seem likely that the
Queste story is a mere echo of that found in the Conte du Graal, nor
that the fusion existed so far back as in a model common to both.
But the second alternative is possible.
[93] I do not follow M. Hucher upon the (as it seems to me) very
insecure ground of Gaulish numismatic art. The object which he
finds figured in pre-Christian coins may be a cauldron—and it may
not—and even if it is a cauldron it may have no such significance as
he ascribes to it.
[94] Cf. as to Lug D’Arbois de Jubainville, Cycle Mythologique
Irlandais; Paris, 1884, p. 178. He was revered by all Celtic races, and
has left his trace in the name of several towns, chief among them
Lug-dunum = Lyons. In so far as the Celts had departmental gods,
he was the god of handicraft and trade; but cf. as to this Rhys, Hibb.
Lect., pp. 427-28.
[95] Cf. D’Arbois de Jubainville, op. cit., pp. 269-290. The Dagda—the
good god—seems to have been head of the Irish Olympus. A legend
anterior to the eleventh century, and belonging probably to the
oldest stratum of Celtic myth, ascribes to him power over the earth:
without his aid the sons of Miledh could get neither corn nor milk. It
is, therefore, no wonder to find him possessor of the magic
cauldron, which may be looked upon as a symbol of fertility, and, as
such, akin to similar symbols in the mythology of nearly every
people.
[96] Cf. as to the mythic character of the Tuatha de Danann,
D’Arbois de Jubainville, op. cit., and my review of his work, Folk-Lore
Journal, June, 1884.
[97] I at one time thought that the prohibition to reveal the “secret
words,” which is such an important element in Robert de Borron’s
version, might be referred to the same myth-root as the instances in
the text. There is little or no evidence to sustain such a hazardous
hypothesis. Nevertheless it is worth while drawing attention in this
place to that prohibition, for which I can offer no adequate
explanation.
[98] Powers of darkness and death. Tethra their king reigns in an
island home. It is from thence that the maiden comes to lure away
Connla of the Golden Hair, as is told in the Leabhar na-h-Uidhre,
even as the Grail messenger comes to seek Perceval—“’tis a land in
which is neither death nor old age—a plain of never ending
pleasure,” the counterpart, in fact, of that Avalon to which Arthur is
carried off across the lake by the fay maiden, that Avalon which, as
we see in Robert de Borron, was the earliest home of the Grail-host.
[99] Cf. D’Arbois de Jubainville, op. cit. p. 188.
[100] When Cuchulainn was opposing the warriors of Ireland in their
invasion of Ulster one of his feats is to make smooth chariot-poles
out of rough branches of trees by passing them through his clenched
hand, so that however bent and knotted they were they came from
his hands even, straight, and smooth. Tain bo Cualgne, quoted by
Windisch, Rev. Celt., Vol. V.
[101] This epithet recalls Lug, of whom it is the stock designation.
Now Lug was par excellence the craftsman’s god; he, too, at the
battle of Mag Tured acted as a sort of armourer-general to the
Tuatha de Danann. A dim reminiscence of this may be traced in the
words which the folk-tale applies to Ullamh l.f., “he was the one
special man for taking their arms.”
[102] Cf. my Aryan Expulsion and Return formula, pp. 8, 13, for
variants of these incidents in other stories belonging to this cycle
and in the allied folk-tales.
[103] This incident is only found in the living Fionn-sage, being
absent from all the older versions, and yet, as the comparison with
the allied Perceval sage shows, it is an original and essential feature.
How do the advocates of the theory that the Ossianic cycle is a
recent mass of legend, growing out of the lives and circumstances of
historical men, account for this development along the lines of a
formula with which, ex hypothesi, the legend has nothing to do? The
Fionn-sage, it is said, has been doctored in imitation of the
Cuchulainn-sage, but the assertion (which though boldly made has
next to no real foundation) cannot be made in the case of the Conte
du Graal. Mediæval Irish bards and unlettered Highland peasants did
not conspire together to make Fionn’s adventures agree with those
of Perceval.
[104] In the Gawain form of the feud quest found in Gautier, the
knight whose death he sets forth to avenge is slain by the cast of a
dart. Can this be brought into connection with the fact that Perceval
slays with a cast of his dart the Red Knight, who, according to the
Thornton romance, is his father’s slayer?
[105] This prose tale precedes an oral version of one of the
commonest Fenian poems, which in its present shape obviously goes
back to the days when the Irish were fighting against Norse
invaders. The poem, which still lives in Ireland as well as in the
Highlands, belongs to that later stage of development of the Fenian
cycle, in which Fionn and his men are depicted as warring against
the Norsemen. It is totally dissimilar from the prose story
summarised above, and I am inclined to look upon the prose as
belonging to a far earlier stage in the growth of the cycle, a stage in
which the heroes were purely mythical and their exploits those of
mythical heroes generally.
[106] The prohibition seems to be an echo of the widely-spread one
which forbids the visitor to the otherworld tasting the food of the
dead, which, if he break, he is forfeit to the shades. The most
famous instance of this myth is that of Persephone.
[107] Cf. Procopius quoted by Elton, Origins of English History, p. 84.
[108] Prof. Rhys, Hibbert Lectures for 1886, looks upon him as a
Celtic Zeus. He dispossessed his father of the Brug by fraud, as Zeus
dispossessed Kronos by force.
[109] D’Arbois de Jubainville, op. cit., p. 275. Rhys, op. cit., p. 149.
[110] M. Duvau, Revue Celtique, Vol. IX., No. 1, has translated the
varying versions of the story.
[111] Like many of the older Irish tales the present form is confused
and obscure, but it is easy to arrive at the original.
[112] The part in brackets is found in one version only of the story.
Of the two versions each has retained certain archaic features not to
be found in the other.
[113] Summarised by D’Arbois de Jubainville, op. cit., p. 323.
[114] D’Arbois de Jubainville, p. 326.
[115] Otto Küpp, Z.f.D. Phil. xvii, i, 68, examining Wolfram’s version
sees in the branch guarded by Gramoflanz and broken by Parzival a
trace of the original myth underlying the story. Gramoflanz is
connected with the Magic Castle (one of the inmates of which is his
sister), or with the otherworld. Küpp’s conjecture derives much force
from the importance given to the branch in the Irish tales as part of
the gear of the otherworld.
[116] This recalls the fact that Oengus of the Brug fell in love with a
swanmaid. See text and translation Revue Celtique, Vol. III., pp.
341, et seq. The story is alluded to in the catalogue of epic tales
(dating from the tenth century) found in the Book of Leinster.
[117] In a variant from Kashmir (Knowles’ Folk-tales of Kashmir,
London, 1888, p. 75, et seq.), Saiyid and Said, this tale is found
embedded in a twin-brethren one.
[118] Frederick (I.) Barbarossa is a mistake, as old as the
seventeenth century (cf. Koch, Sage vom Kaiser Friedrich in
Kyffhäuser, Leipzig, 1886), for Frederick II., the first German
Emperor of whom the legend was told. The mistake was caused by
the fact that Frederick took the place of a German red-bearded god,
probably Thor, hence the later identification with the red-bearded
Frederick, instead of with that great opponent of the Papacy whose
death away in Italy the German party refused for many years to
credit.
[119] Unless the passage relating to Carl the Great quoted by Grimm
(D.M., III., 286) from Mon. Germ. Hist., Vol. VIII., 215, “inde
fabulosum illud confictum de Carolo Magno, quasi de mortuis in id
ipsum resuscitato, et alio nescio quo nihilominus redivivo,” be older.
[120] Liebrecht’s edition of the Otia Imperialia, Hanover, 1856, p. 12,
and note p. 55.
[121] Martin Zur Gralsage, p. 31, arguing from the historical
connection of Frederick II. with Sicily, thinks that the localisation of
this Arthurian legend in that isle was the reason of its being
associated with the Hohenstauffen; in other words, the famous
German legend would be an indirect offshoot of the Arthurian cycle. I
cannot follow Martin here. I see no reason for doubting the
genuineness of the traditions collected by Kuhn and Schwartz, or for
disbelieving that Teutons had this myth as well as Celts. It is no part
of my thesis to exalt Celtic tradition at the expense of German;
almost all the parallels I have adduced between the romances and
Celtic mythology and folk-lore could be matched from those of
Germany. But the romances are historically associated with Celtic
tradition, and the parallels found in the latter are closer and more
numerous than those which could be recovered from German
tradition. It is, therefore, the most simple course to refer the
romances to the former instead of to the latter.
[122] See Grimm, D.M., Ch. XXXII.; Fitzgerald, Rev. Celt., IV., 198;
and the references in Liebrecht, op. cit.
[123] Personally communicated by the Rev. Mr. Sorby, of Sheffield.
[124] In Chrestien the part of the Magician Lord is little insisted
upon. But in Wolfram he is a very important personage. It may here
be noted that the effects which are to follow in Chrestien the doing
away with the enchantments of this Castle, answer far more
accurately to the description given by the loathly Grail-Maiden of the
benefits which would have accrued had Perceval put the question at
the Court of the Fisher King than to anything actually described as
the effect of that question being put, either by Gautier, Manessier, or
Gerbert. This castle seems, too, to be the one in which lodge the
Knights, each having his lady love with him, which the loathly
maiden announces to be her home.
[125] Kennedy follows in the main Oss. Soc., Vol. II, pp. 118, et
seq., an eighteenth century version translated by Mr. O’Kearney. This
particular episode is found, pp. 147, et seq. I follow the Oss. Soc.
version in preference to Kennedy’s where they differ.
[126] The story as found in Heinrich may be compared with the folk-
tale of the Sleeping Beauty. She is a maiden sunk in a death-in-life
sleep together with all her belongings until she be awakened by the
kiss of the destined prince. May we not conjecture that in an older
form of the story than any we now possess, the court of the princess
vanished when the releasing kiss restored her to real life and left her
alone with the prince? The comparison has this further interest, that
the folk-tale is a variant of an old myth which figures prominently in
the hero-tales of the Teutonic race (Lay of Skirni, Lay of Swipday
and Menglad, Saga of Sigurd and Brunhild), and that in its most
famous form Siegfried, answering in Teutonic myth to Fionn, is its
hero. But Peredur is a Cymric Fionn, so that the parallel between the
two heroes, Celtic and Teutonic, is closer than at first appears when
Siegfried is compared only to his Gaelic counterpart.
[127] I have not examined Gawain’s visit to the Magic Castle in
detail, in the first place because it only bears indirectly upon the
Grail-Quest, and then because I hope before very long to study the
personality of Gawain in the romances, and to throw light upon it
from Celtic mythic tradition in the same way that I have tried in the
foregoing pages to do in the case of Perceval.
[128] Kennedy, Legendary Fictions, p. 154, et seq.
[129] Grimm, Vol. III., p. 9 (note to Märchen von einem der auszog
das Fürchten zu lernen), gives a number of variants. It should be
noted that in this story there is the same mixture of incidents of the
Magic Castle and Haunted Castle forms as in the romances.
Moreover, one of the trials to which the hero’s courage is subjected
is the bringing into the room of a coffin in which lies a dead man,
just as in Gawain’s visit to the Grail Castle. Again, as Grimm notes,
but mistakenly refers to Perceval instead of to Gawain, the hero has