Java for Data Science
Richard M. Reese
Jennifer L. Reese
BIRMINGHAM - MUMBAI
Java for Data Science
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or
transmitted in any form or by any means, without the prior written permission of the
publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the
information presented. However, the information contained in this book is sold without
warranty, either express or implied. Neither the authors, nor Packt Publishing, and its
dealers and distributors will be held liable for any damages caused or alleged to be caused
directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the
companies and products mentioned in this book by the appropriate use of capitals.
However, Packt Publishing cannot guarantee the accuracy of this information.
Richard has written several Java books and a C Pointer book. He uses a concise and easy-to-
follow approach to topics at hand. His Java books have addressed EJB 3.1, updates to Java 7
and 8, certification, jMonkeyEngine, natural language processing, functional programming,
and networks.
Richard would like to thank his wife, Karla, for her continued support, and the staff of
Packt Publishing for their work in making this a better book.
Jennifer L. Reese studied computer science at Tarleton State University. She also earned her
M.Ed. from Tarleton in December 2016. She currently teaches computer science to high-
school students. Her research interests include the integration of computer science concepts
with other academic disciplines, increasing diversity in computer science courses, and the
application of data science to the field of education.
She previously worked as a software engineer developing software for county- and district-
level government offices throughout Texas. In her free time she enjoys reading, cooking,
and traveling—especially to any destination with a beach. She is a musician and appreciates
a variety of musical genres.
I would like to thank Dad for his inspiration and guidance, Mom for her patience and
perspective, and Jace for his support and always believing in me.
About the Reviewers
Walter Molina is a UI and UX developer from Villa Mercedes, San Luis, Argentina. His
skills include, but are not limited to, HTML5, CSS3, and JavaScript. He uses these
technologies at a Jedi/ninja level (along with a plethora of JavaScript libraries) in his daily
work as a frontend developer at Tachuso, a creative content agency. He holds a bachelor's
degree in computer science and is a member of the School of Engineering at local National
University, where he teaches programming skills to second- and third-year students. His
LinkedIn profile is https://ar.linkedin.com/in/waltermolina.
Shilpi Saxena is an IT professional and also a technology evangelist. She is an engineer who
has had exposure to various domains (IOT and cloud computing space, healthcare, telecom,
hiring, and manufacturing). She has experience in all the aspects of conception and
execution of enterprise solutions. She has been architecting, managing, and delivering
solutions in the big data space for the last 3 years; she also handles a high-performance and
geographically distributed team of elite engineers.
Shilpi has more than 14 years (3 years in the big data space) of experience in the
development and execution of various facets of enterprise solutions both in the products
and services dimensions of the software industry. An engineer by degree and profession,
she has worn various hats, such as developer, technical leader, product owner, tech
manager, and so on, and has seen all the flavors that the industry has to offer. She has
architected and worked through some of the pioneers' production implementations in big
data on Storm and Impala with autoscaling in AWS.
Shilpi has also authored Real-time Analytics with Storm and Cassandra
(https://www.packtpub.com/big-data-and-business-intelligence/learning-real-time-analytics-storm-and-cassandra)
and Real time Big Data Analytics
(https://www.packtpub.com/big-data-and-business-intelligence/real-time-big-data-analytics)
with Packt Publishing.
www.PacktPub.com
eBooks, discount offers, and more
Did you know that Packt offers eBook versions of every book published, with PDF and
ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a
print book customer, you are entitled to a discount on the eBook copy. Get in touch with us
at customercare@packtpub.com for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a
range of free newsletters and receive exclusive discounts and offers on Packt books and
eBooks.
https://www.packtpub.com/mapt
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book
library. Here, you can search, access, and read Packt's entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Customer Feedback
Thank you for purchasing this Packt book. We take our commitment to improving our
content and products to meet your needs seriously—that's why your feedback is so
valuable. Whatever your feelings about your purchase, please consider leaving a review on
this book's Amazon page. Not only will this help us, more importantly it will also help
others in the community to make an informed decision about the resources that they invest
in to learn.
You can also review for us on a regular basis by joining our reviewers' club. If you're
interested in joining, or would like to learn more about the benefits we offer, please
contact us: customerreviews@packtpub.com.
Table of Contents
Preface
Chapter 1: Getting Started with Data Science
Problems solved using data science
Understanding the data science problem-solving approach
Using Java to support data science
Acquiring data for an application
The importance and process of cleaning data
Visualizing data to enhance understanding
The use of statistical methods in data science
Machine learning applied to data science
Using neural networks in data science
Deep learning approaches
Performing text analysis
Visual and audio analysis
Improving application performance using parallel techniques
Assembling the pieces
Summary
Chapter 2: Data Acquisition
Understanding the data formats used in data science applications
Overview of CSV data
Overview of spreadsheets
Overview of databases
Overview of PDF files
Overview of JSON
Overview of XML
Overview of streaming data
Overview of audio/video/images in Java
Data acquisition techniques
Using the HttpURLConnection class
Web crawlers in Java
Creating your own web crawler
Using the crawler4j web crawler
Web scraping in Java
Using API calls to access common social media sites
Using OAuth to authenticate users
Handling Twitter
Handling Wikipedia
Handling Flickr
Handling YouTube
Searching by keyword
Summary
Chapter 3: Data Cleaning
Handling data formats
Handling CSV data
Handling spreadsheets
Handling Excel spreadsheets
Handling PDF files
Handling JSON
Using JSON streaming API
Using the JSON tree API
The nitty gritty of cleaning text
Using Java tokenizers to extract words
Java core tokenizers
Third-party tokenizers and libraries
Transforming data into a usable form
Simple text cleaning
Removing stop words
Finding words in text
Finding and replacing text
Data imputation
Subsetting data
Sorting text
Data validation
Validating data types
Validating dates
Validating e-mail addresses
Validating ZIP codes
Validating names
Cleaning images
Changing the contrast of an image
Smoothing an image
Brightening an image
Resizing an image
Converting images to different formats
Summary
Chapter 4: Data Visualization
Understanding plots and graphs
Visual analysis goals
Creating index charts
Creating bar charts
Using country as the category
Using decade as the category
Creating stacked graphs
Creating pie charts
Creating scatter charts
Creating histograms
Creating donut charts
Creating bubble charts
Summary
Chapter 5: Statistical Data Analysis Techniques
Working with mean, mode, and median
Calculating the mean
Using simple Java techniques to find mean
Using Java 8 techniques to find mean
Using Google Guava to find mean
Using Apache Commons to find mean
Calculating the median
Using simple Java techniques to find median
Using Apache Commons to find the median
Calculating the mode
Using ArrayLists to find multiple modes
Using a HashMap to find multiple modes
Using Apache Commons to find multiple modes
Standard deviation
Sample size determination
Hypothesis testing
Regression analysis
Using simple linear regression
Using multiple regression
Summary
Chapter 6: Machine Learning
Supervised learning techniques
Decision trees
Decision tree types
Decision tree libraries
Using a decision tree with a book dataset
Testing the book decision tree
Support vector machines
Using an SVM for camping data
Testing individual instances
Bayesian networks
Using a Bayesian network
Unsupervised machine learning
Association rule learning
Using association rule learning to find buying relationships
Reinforcement learning
Summary
Chapter 7: Neural Networks
Training a neural network
Getting started with neural network architectures
Understanding static neural networks
A basic Java example
Understanding dynamic neural networks
Multilayer perceptron networks
Building the model
Evaluating the model
Predicting other values
Saving and retrieving the model
Learning vector quantization
Self-Organizing Maps
Using a SOM
Displaying the SOM results
Additional network architectures and algorithms
The k-Nearest Neighbors algorithm
Instantaneously trained networks
Spiking neural networks
Cascading neural networks
Holographic associative memory
Backpropagation and neural networks
Summary
Chapter 8: Deep Learning
Deeplearning4j architecture
Acquiring and manipulating data
Reading in a CSV file
Configuring and building a model
Using hyperparameters in ND4J
Instantiating the network model
Training a model
Testing a model
Deep learning and regression analysis
Preparing the data
Setting up the class
Reading and preparing the data
Building the model
Evaluating the model
Restricted Boltzmann Machines
Reconstruction in an RBM
Configuring an RBM
Deep autoencoders
Building an autoencoder in DL4J
Configuring the network
Building and training the network
Saving and retrieving a network
Specialized autoencoders
Convolutional networks
Building the model
Evaluating the model
Recurrent Neural Networks
Summary
Chapter 9: Text Analysis
Implementing named entity recognition
Using OpenNLP to perform NER
Identifying location entities
Classifying text
Word2Vec and Doc2Vec
Classifying text by labels
Classifying text by similarity
Understanding tagging and POS
Using OpenNLP to identify POS
Understanding POS tags
Extracting relationships from sentences
Using OpenNLP to extract relationships
Sentiment analysis
Downloading and extracting the Word2Vec model
Building our model and classifying text
Summary
Chapter 10: Visual and Audio Analysis
Text-to-speech
Using FreeTTS
Getting information about voices
Gathering voice information
Understanding speech recognition
Using CMUSphinx to convert speech to text
Obtaining more detail about the words
Extracting text from an image
Using Tess4j to extract text
Identifying faces
Using OpenCV to detect faces
Classifying visual data
Creating a Neuroph Studio project for classifying visual images
Training the model
Summary
Chapter 11: Mathematical and Parallel Techniques for Data Analysis
Implementing basic matrix operations
Using GPUs with DeepLearning4j
Using map-reduce
Using Apache's Hadoop to perform map-reduce
Writing the map method
Writing the reduce method
Creating and executing a new Hadoop job
Various mathematical libraries
Using the jblas API
Using the Apache Commons math API
Using the ND4J API
Using OpenCL
Using Aparapi
Creating an Aparapi application
Using Aparapi for matrix multiplication
Using Java 8 streams
Understanding Java 8 lambda expressions and streams
Using Java 8 to perform matrix multiplication
Using Java 8 to perform map-reduce
Summary
Chapter 12: Bringing It All Together
Defining the purpose and scope of our application
Understanding the application's architecture
Data acquisition using Twitter
Understanding the TweetHandler class
Extracting data for a sentiment analysis model
Building the sentiment model
Processing the JSON input
Cleaning data to improve our results
Removing stop words
Performing sentiment analysis
Analysing the results
Other optional enhancements
Summary
Index
Preface
In this book, we examine Java-based approaches to the field of data science. Data science is
a broad topic and includes such subtopics as data mining, statistical analysis, audio and
video analysis, and text analysis. A number of Java APIs provide support for these topics.
The ability to apply these specific techniques allows for the creation of new, innovative
applications able to handle the vast amounts of data available for analysis.
This book takes an expansive yet cursory approach to various aspects of data science. A
brief introduction to the field is presented in the first chapter. Subsequent chapters cover
significant aspects of data science, such as data cleaning and the application of neural
networks. The last chapter combines topics discussed throughout the book to create a
comprehensive data science application.
Chapter 2, Data Acquisition, demonstrates how to acquire data from a number of sources,
including Twitter, Wikipedia, and YouTube. The first step of a data science application is to
acquire data.
Chapter 3, Data Cleaning, explains that once data has been acquired, it needs to be cleaned.
This can involve such activities as removing stop words, validating the data, and data
conversion.
Chapter 4, Data Visualization, shows that while numerical processing is a critical step in
many data science tasks, people often prefer visual depictions of the results of analysis. This
chapter demonstrates various Java approaches to this task.
Chapter 5, Statistical Data Analysis Techniques, reviews basic statistical techniques, including
regression analysis, and demonstrates how various Java APIs provide statistical support.
Statistical analysis is key to many data analysis tasks.
Chapter 7, Neural Networks, explains that neural networks can be applied to solve a variety
of data science problems. In this chapter, we explain how they work and demonstrate the
use of several different types of neural networks.
Chapter 8, Deep Learning, shows that deep learning algorithms are often described as
multilevel neural networks. Java provides significant support in this area, and we will
illustrate the use of this approach.
Chapter 9, Text Analysis, explains that significant portions of available datasets exist in
textual formats. The field of natural language processing has advanced considerably and is
frequently used in data science applications. We demonstrate various Java APIs used to
support this type of analysis.
Chapter 10, Visual and Audio Analysis, tells us that data science is not restricted to text
processing. Many social media sites use visual data extensively. This chapter illustrates the
Java support available for this type of analysis.
Chapter 11, Mathematical and Parallel Techniques for Data Analysis, investigates the support
provided for low-level math operations and how they can be supported in a multiple
processor environment. Data analysis, at its heart, necessitates the ability to manipulate and
analyze large quantities of numeric data.
Chapter 12, Bringing It All Together, examines how the integration of the various
technologies introduced in this book can be used to create a data science application. This
chapter begins with data acquisition and incorporates many of the techniques used in
subsequent chapters to build a complete application.
Conventions
In this book, you will find a number of text styles that distinguish between different kinds
of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text are shown as follows: "The getResult method returns a
SpeechResult instance which holds the result of the processing." Database table names,
folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter
handles are shown as follows: "The KevinVoiceDirectory contains two voices: kevin
and kevin16."
New terms and important words are shown in bold. Words that you see on the screen, for
example, in menus or dialog boxes, appear in the text like this: "Select the Images category
and then filter for Labeled for reuse."
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this
book: what you liked or disliked. Reader feedback is important for us as it helps us develop
titles that you will really get the most out of. To send us general feedback, simply
e-mail feedback@packtpub.com, and mention the book's title in the subject of your
message. If there is a topic that you have expertise in and you are interested in either
writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you
to get the most from your purchase.
1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.
Once the file is downloaded, please make sure that you unzip or extract the folder using the
latest version of:
The code bundle for the book is also hosted on GitHub at
https://github.com/PacktPublishing/Java-for-Data-Science. We also have other code bundles
from our rich catalog of books and videos available at https://github.com/PacktPublishing/.
Check them out!
Errata
Although we have taken every care to ensure the accuracy of our content, mistakes do
happen. If you find a mistake in one of our books (maybe a mistake in the text or the code),
we would be grateful if you could report this to us. By doing so, you can save other readers
from frustration and help us improve subsequent versions of this book. If you find any
errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting
your book, clicking on the Errata Submission Form link, and entering the details of your
errata. Once your errata are verified, your submission will be accepted and the errata will
be uploaded to our website or added to any list of existing errata under the Errata section of
that title.
Piracy
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At
Packt, we take the protection of our copyright and licenses very seriously. If you come
across any illegal copies of our works in any form on the Internet, please provide us with
the location address or website name immediately so that we can pursue a remedy.
We appreciate your help in protecting our authors and our ability to bring you valuable
content.
Questions
If you have a problem with any aspect of this book, you can contact us
at questions@packtpub.com, and we will do our best to address the problem.
Chapter 1: Getting Started with Data Science
Data science is not a single science as much as it is a collection of various scientific
disciplines integrated for the purpose of analyzing data. These disciplines draw on various
statistical and mathematical techniques and include:
Computer science
Data engineering
Visualization
Domain-specific knowledge and approaches
With the advent of cheaper storage technology, more and more data has been collected and
stored, permitting previously unfeasible processing and analysis of data. With this analysis
came the need for various techniques to make sense of the data. These large sets of data,
when used to analyze data and identify trends and patterns, became known as big data.
This in turn gave rise to cloud computing and concurrent techniques such as map-reduce,
which distributed the analysis process across a large number of processors, taking
advantage of the power of parallel processing.
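The map-reduce idea can be illustrated on a single machine with Java 8 parallel streams. The following is only a minimal sketch, not a distributed implementation: the class name WordCountSketch, the method countWords, and the sample lines are assumptions made for illustration. The map step splits each line into words, and the reduce step groups identical words and counts them, with the stream spreading the work across the available processor cores.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class WordCountSketch {

    // Map phase: split each line into lowercase words.
    // Reduce phase: group identical words and count the occurrences.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.parallelStream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Sample input invented for the sketch.
        List<String> lines = Arrays.asList(
                "big data needs parallel processing",
                "parallel processing spreads the work");
        // The map's iteration order is not specified, so the printed order may vary.
        System.out.println(countWords(lines));
    }
}
```

A framework such as Hadoop applies the same two phases across many machines rather than across the cores of one.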
The process of analyzing big data is not simple, and it has evolved into a specialization for
developers who became known as data scientists. Drawing upon a myriad of technologies
and expertise, they are able to analyze data to solve problems that previously were
either not envisioned or were too difficult to solve.
Early big data applications were typified by the emergence of search engines capable of
more powerful and accurate searches than their predecessors. For example, AltaVista was
an early popular search engine that was eventually superseded by Google. While big data
applications were not limited to these search engine functionalities, these applications laid
the groundwork for future work in big data.
The term, data science, has been used since 1974 and evolved over time to include statistical
analysis of data. The concepts of data mining and data analytics have been associated with
data science. Around 2008, the term data scientist appeared and was used to describe a
person who performs data analysis. A more in-depth discussion of the history of data
science can be found at
http://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/#3d9ea08369fd.
This book aims to take a broad look at data science using Java and will briefly touch on
many topics. The reader will likely find topics of interest and pursue them in
greater depth independently. The purpose of this book, however, is simply to introduce the
reader to the significant data science topics and to illustrate how they can be addressed
using Java.
There are many algorithms used in data science. In this book, we do not attempt to explain
how they work except at an introductory level. Rather, we are more interested in explaining
how they can be used to solve problems. Specifically, we are interested in knowing how
they can be used with Java.
Data mining is a popular application area for data science. In this activity, large quantities
of data are processed and analyzed to glean information about the dataset, to provide
meaningful insights, and to develop meaningful conclusions and predictions. It has been
used to analyze customer behavior, to detect relationships between what may appear to be
unrelated events, and to make predictions about future behavior.
Machine learning is an important aspect of data science. This technique allows the
computer to solve various problems without needing to be explicitly programmed. It has
been used in self-driving cars, speech recognition, and web searches. In data mining, the
data is extracted and processed. With machine learning, computers use the data to take
some sort of action.
Acquiring the data: Before we can process the data, it must be acquired. The data
is frequently stored in a variety of formats and will come from a wide range of
data sources.
Cleaning the data: Once the data has been acquired, it often needs to be
converted to a different format before it can be used. In addition, the data needs
to be processed, or cleaned, so as to remove errors, resolve inconsistencies, and
otherwise put it in a form ready for analysis.
Analyzing the data: This can be performed using a number of techniques
including:
Statistical analysis: This uses a multitude of statistical approaches
to provide insight into data. It includes simple techniques and
more advanced techniques such as regression analysis.
AI analysis: These can be grouped as machine learning, neural
networks, and deep learning techniques:
Machine learning approaches are characterized by
programs that can learn without being specifically
programmed to complete a specific task
Neural networks are built around models patterned
after the neural connection of the brain
Deep learning attempts to identify higher levels of
abstraction within a set of data
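The acquire, clean, and analyze steps listed above can be sketched in a few lines of Java. This is a deliberately small illustration under stated assumptions: the class PipelineSketch, its method names, and the hard-coded records are all hypothetical, and a real application would acquire data from files, databases, or web APIs and apply far richer cleaning and analysis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class PipelineSketch {

    // Acquire: the raw records are hard-coded so the sketch is self-contained.
    static List<String> acquire() {
        return Arrays.asList("12.5", " 7.0 ", "n/a", "", "10.5");
    }

    // Clean: trim whitespace and drop records that are not valid numbers.
    static List<Double> clean(List<String> raw) {
        List<Double> values = new ArrayList<>();
        for (String record : raw) {
            try {
                values.add(Double.parseDouble(record.trim()));
            } catch (NumberFormatException e) {
                // Malformed record: skipped here; imputation would be an alternative.
            }
        }
        return values;
    }

    // Analyze: a simple statistical summary, the arithmetic mean.
    static double mean(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<Double> cleaned = clean(acquire());
        System.out.println("Valid records: " + cleaned.size()); // prints "Valid records: 3"
        System.out.println("Mean: " + mean(cleaned));           // prints "Mean: 10.0"
    }
}
```

Keeping each step in its own method mirrors the way the steps are described: any one of them can be replaced (for example, swapping the mean for a regression) without touching the others.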
Complementing this set of tasks is the need to develop applications that are efficient. The
introduction of machines with multiple processors and GPUs contributes significantly to
the end result.
While the exact steps used will vary by application, understanding these basic steps
provides the basis for constructing solutions to many data science problems.
There is ample support provided for the basic data science tasks. These include multiple
ways of acquiring data, libraries for cleaning data, and a wide variety of analysis
approaches for tasks such as natural language processing and statistical analysis. There is
also a myriad of libraries supporting neural network types of analysis.
Java can be a very good choice for data science problems. The language provides both
object-oriented and functional support for solving problems. There is a large developer
community to draw upon and there exist multiple APIs that support data science tasks.
These are but a few reasons as to why Java should be used.
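As a small illustration of how the object-oriented and functional styles can be combined, consider the following sketch. The Reading class, the sensorsAbove method, and the sample values are assumptions invented for this example. A class models each observation, while a Java 8 stream pipeline with a lambda expression and a method reference filters and summarizes the data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class StyleSketch {

    // Object-oriented side: a small immutable class modelling one observation.
    static final class Reading {
        private final String sensor;
        private final double value;

        Reading(String sensor, double value) {
            this.sensor = sensor;
            this.value = value;
        }

        String getSensor() { return sensor; }
        double getValue() { return value; }
    }

    // Functional side: filter, transform, and collect with a stream pipeline.
    static List<String> sensorsAbove(List<Reading> readings, double threshold) {
        return readings.stream()
                .filter(r -> r.getValue() > threshold)
                .map(Reading::getSensor)
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample data invented for the sketch.
        List<Reading> readings = Arrays.asList(
                new Reading("north", 21.5),
                new Reading("south", 18.0),
                new Reading("east", 25.1),
                new Reading("north", 23.0));
        System.out.println(sensorsAbove(readings, 20.0)); // prints [east, north]
    }
}
```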
connects with the Ophites a sect called Sethites or Sethoites, the
main dogma he attributes to them being an attempt to identify
Christ with the Seth of Genesis. Epiphanius follows this last author
in this identification and calls them Sethians, but does not
expressly connect them with the Ophites, makes them an
Egyptian sect, and does not attribute to them serpent-worship.
The sectaries of this chapter are called in the rubric Sithiani,
altered to Sēthiani in the Summary of Book X, and the name is
not necessarily connected with that of the Patriarch. In the Bruce
Papyrus, a Power, good but subordinate to the Supreme God, is
mentioned, called “the Sitheus,” which may possibly, by analogy
with the late-Egyptian Si-Osiris and Si-Ammon, be construed “Son
of God.” Of their doctrines little can be made from Hippolytus’
brief but confused description. Their division of the cosmos into
three parts does not seem to differ much from that of the Peratæ,
although they make a sharper distinction than this last between
the world of light and that of darkness, which has led Salmon
(D.C.B. s.v., Ophites) to conjecture for them a Zoroastrian origin.
This is unlikely, and more attention is due to Hippolytus’ own
statement that they derived their doctrines from Musæus, Linus,
and Orpheus. In Forerunners it is sought to show that the Orphic
teaching was one of the foundations on which the fabric of
Gnosticism was reared, and the image of the earth as a matrix
was certainly familiar to the Greeks, who made Delphi its ὀμφαλός
or navel. Hence the imagery of the text, offensive as it is to our
ideas, would not have been so to them, and Epiphanius (Hær.,
XXXVIII, p. 510, Oehl.) knew of several writings, κατὰ τῆς
Ὑστέρας, or the Womb, which he says the sister sect of Cainites
called the maker of heaven and earth. In this case, we need not
take the story in the text about the generation by the bad or good
serpent as necessarily referring to the Incarnation. One of the
scenes in the Mysteries of Attis-Sabazius, and perhaps of those of
Eleusis also, seems to have shown the seduction by Zeus in
serpent-form of his virgin daughter Persephone and the birth
therefrom of the Saviour Dionysos who was but his father re-
born. This story of the fecundation of the earth-goddess by a
higher power in serpent shape seems to have been present in all
the religions of Western Asia, and was therefore extremely likely
to be caught hold of by an early form of Gnosticism. In no other
respect does this so-called “Sethian” heresy seem to have
anything in common with Christianity, and it may therefore
represent a pre-Christian form of Ophitism. The serpent in it is,
perhaps, neither bad nor good.
[250] τούτοις δοκεῖ, “it seems to them.”
[251] Cruice and Macmahon both translate this “into the same
nature with the spirit.”
[252] This anxiety of the higher powers to redeem from matter,
darkness or chaos, the scintilla of their own being which has
slipped into it, is the theme of all Gnosticism from the Ophites to
the Pistis Sophia and the Manichæan writings. See Forerunners,
II, passim.
[253] Or “the substances brought up to the sealer.”
[254] ἰδέαι. And so throughout.
[255] Schneidewin, Cruice, and Macmahon would here and
elsewhere read ὁ φαλλὸς. But see the next sentence about
pregnancy.
[256] ἐξετύπωσεν, “struck off.”
[257] πρωτόγονος. The others were “unbegotten” like the
highest world of the Peratæ and Naassenes.
[258] εἴδεσιν.
[259] Is this Ps. xxix. 3, 10 already quoted by the Naassene
author? Cf. p. 133 supra.
[260] This idea of a divine son superior to his father is common
to the whole Orphic cosmogony and leads to the dethroning of
Uranus by Kronos, Kronos by Zeus and finally of Zeus by
Dionysos. It is met with again in Basilides (see Book VII infra).
[261] A lacuna here which Cruice thus fills.
[262] This has not been previously described. Is the narrative
of the Fall alluded to?
[263] Cruice and Macmahon would translate “any other than
man’s.”
[264] Phil. ii. 7. The only quotation from the N.T. other than
that from Matt. used by the Sethians, if it be not, as I believe it is,
the interpolation of Hippolytus.
[265] ἀπελούσατο. Yet it may refer to baptism which preceded
initiation in nearly all the secret rites of the Pagan gods. Cf.
Forerunners, I, c. 2.
[266] The whole of this paragraph reads like an interpolation,
or rather as something which had got out of its place. The
statement about the physicists is directly at variance with the
opening of the next which attributes the Sethian teaching to the
Orphics. The triads he quotes are all of three “good” powers and
therefore would belong much more appropriately to the system of
the Peratæ. The quotation from Deut. iv. 11, he attributes to
several other heresiarchs.
[267] The codex has ὀμφαλός for ὁ φαλλὸς which is
Schneidewin’s emendation. No book attributed to Orpheus called
“Bacchica” has come down to us, but the Rape of Persephone was
a favourite theme with Orphic poets. Cf. Abel’s Orphica, pp. 209-
219.
[268] This is not improbable; but Hippolytus gives us no
evidence that this is the case, as Plutarch, from whom he quotes,
certainly did not connect the frescoes of Phlium in the
Peloponnesus (not Attica as he says) with the Sethians, nor does
the light in their story desire the water.
[269] This too is a stock quotation which has already done duty
for the Naassene author. Cf. p. 131 supra.
[270] So has this with the “Peratic.” Cf. p. 154 supra.
[271] κράσις ... μίξις.
[272] καταμεμῖχθαι λεπτῶς.
[273] τέχνη.
[274] Matt. x. 34.
[275] This again seems to be Hippolytus’ own repetition of a
simile which he met with in the Naassene author and which so
pleased him that he made use of it in his account of the Peratic
heresy as well as here. Cf. pp. 144 and 159 supra.
[276] ἅλας πηγνύμενον.
[277] Herodotus VI, 20, mentions the City of Ampe, but says
nothing there about the well which is described in c. 119 as at
Ardericca in Cissia.
[278] The title of the book is given in the text as Παράφρασις
Σήθ, which is a well-nigh impossible phrase.
[279] On the whole it may be said that this is the most suspect
of all the chapters in the Philosophumena, and that, if ever
Hippolytus was deceived into purchasing forged documents
according to Salmon and Stähelin’s theory, one of them appears
here. Much of it is mere verbiage as when, after having identified
Mind or Nous with the fragrance of the spirit, he again explains
that it is a ray of light sent from the perfect light, or when he
explains the difference between the three different kinds of law.
The quotations too are seldom new, nearly all of them appearing
in other chapters and are, if it were possible, more than usually
inapposite, while almost the only new one is inaccurate. The
sentence about the Paraphrase (of) Seth, if that is the actual title
of the book, does not suggest that Hippolytus is quoting from that
work, nor does the phrase, “he says,” occur with anything like the
frequency of its use in e. g., the Naassene chapter. On the whole,
then, it seems probable that in this Hippolytus was not copying or
extracting from any written document, but was writing down, to
the best of his recollection, the statements of some convert who
professed to be able to reveal its teaching. It is significant in this
respect that when the summary in Book X had to be made, the
summarizer makes no attempt to abbreviate the statement of the
supposed tenets of the Sethians, but merely copies out the part of
the chapter in which they are described, entirely omitting the
stories of the frescoed porch at Phlium and the oil-well at Ampa.
[280] Nothing is known of this Justinus, whose name is not
mentioned by any other patristic writer, and there is no sure
means of fixing his date. Macmahon, relying apparently on the
last sentence of the chapter, would make him a predecessor of
Simon Magus, and therefore contemporary with the Apostles’ first
preaching. This is extremely unlikely, and Salmon on the other
hand (D.C.B., s.v., “Justinus the Gnostic”) considers his heresy
should be referred to “the latest stage of Gnosticism” which, if
taken literally, would make it long posterior to Hippolytus. The
source of his doctrine is equally obscure; for although Hippolytus
classes him with the Ophites, the serpent in his system is certainly
not good and plays as hostile a part towards man as the serpent
of Genesis, while his supreme Triad of the Good Being, an
intermediate power ignorant of the existence of his superior, and
the Earth, differs in all essential respects from the Ophite Trinity
of the First and Second Man and First Woman. Yet the names of
the world-creating angels and devils here given, bear a singular
likeness to those which Theodore bar Khôni in his Book of Scholia
attributes to the Ophites and also to those mentioned by Origen
as appearing on the Ophite Diagram. On the other hand, there
are many likenesses not only of ideas but of language between
the system of Justinus and that of Marcion, who also taught the
existence of a Supreme and Benevolent God and of a lower one,
harsh, but just, who was the unwitting author of the evil which is
in the world. This, indeed, leaves out of the account the third or
female power; but an Armenian account of Marcion’s doctrines
attributes to him belief in a female power also, called Hyle or
Matter and the spouse of the Just God of the Law, with whom her
relations are pretty much as described in the text. Justinus,
however, was not like Marcion a believing Christian; for he makes
his Saviour the son of Joseph and Mary and the mere mouthpiece
of the subaltern angel Baruch, while his account of the Crucifixion
differs materially from that of Marcion. The obscene stories he
tells about the protoplasts also appear in much later Manichæan
documents and seem to be drawn from the Babylonian tradition
of which the loves of the angels in the Book of Enoch are
probably also a survival. It is therefore not improbable that
Justinus, the Book of Enoch, the Ophites, and perhaps Marcion,
alike derived their tenets on these points from heathen myths of
the marriage of Heaven and Earth, which may possibly be traced
back to early Babylonian theories of cosmogony. Cf. Forerunners,
II, cc. 8 and 11, passim.
[281] Hippolytus, like the Gnostic writers, seems to know of an
oral as well as a written tradition from the Evangelists.
[282] Matt. x. 5. In the A.V. as here, τὰ ἔθνη, “the nations.”
[283] πρότερον διδάξας or “at first teaches.”
[284] ψυχαγωγίας χάριν. The reader must again be reminded
that while the ψυχή of the Greeks was what we should call
“mind,” the πνεῦμα is spirit, answering more to our word “soul.”
[285] παραμύθιον, a play upon μῦθος.
[286] 1 Cor. ii. 9.
[287] Lit., “guarded the secrets of silence.”
[288] Ps. cx. 4.
[289] “The Blessed.”
[290] παραπλάσει, “given it another form.” As a fact, Justinus’
quotation from Herodotus is singularly accurate, save as
afterwards noted.
[291] Herodotus, IV, 8-10.
[292] An island near Cadiz. The codex has Ἐρυθρᾶς, “the Red
Sea.”
[293] In Herodotus it is mares and a chariot.
[294] μιξοπάρθενος. A neologism.
[295] In Herodotus the prophecy is given by the girl.
[296] To explain the origin of the Scythian nation.
[297] Or perhaps, as above, “the things of the universe.”
[298] Supplied from the summary in Book X. So the Pistis
Sophia has a Power never otherwise described but not benevolent
who is called “the great unseen Forefather,” and seems to rule
over material things.
[299] There is nothing to show that Hippolytus or Justinus
knew this to be a plural.
[300] Seven names are missing from the text. Of the five given,
Michael, Amen and Gabriel are given in the chapter on the
Ophites in Theodore bar Khôni’s Book of Scholia as the first
angels created by God, the name of Baruch being replaced by
that of “the great Yah.” “Esaddæus” is probably El Shaddai, who is
said in the same book to be the angel sent to give the Law to the
Jews and to have treacherously persuaded them to worship
himself.
[301] Of these twelve names, Babel is written in bar Khôni as
Babylon and said to be masculo-feminine, Achamoth is the
Hebrew חכמת, Chochmah, Sophia, or Wisdom whom most
Gnostics called the Mother of Life, Naas is the Serpent as is
explained in the chapter on the Naassenes, Bel is Baal or the
Chaldæan Bel, for Belias we should probably read Beliar, the devil
of works like the Ascensio Isaiae, Kavithan should probably be
Leviathan, Adonaios is the Hebrew Adonai, or the Lord, while
Sael, Karkamenos and Lathen cannot be identified. Pharaoh and
“Samiel,” a homonym of Satan, appear in bar Khôni’s list of angels
who rule one or other of the ten heavens, and Adonaios and
Leviathan in the Ophite Diagram described by Celsus. Cf.
Forerunners, II, pp. 70 ff.
[302] Gen. ii. 8.
[303] So a Chinese Manichæan treatise lately discovered (see
Forerunners, II, p. 352) speaks of demons inhabiting the soul as
“trees.”
[304] ξύλον τοῦ εἰδέναι γνῶσιν κ.τ.λ., “the Tree of seeing
Knowledge,” etc.
[305] The context shows that it is the unity, etc., of Elohim and
Edem that is referred to.
[306] Cf. n. on p. 177 supra.
[307] Gen. i. 28.
[308] Macmahon, “viceregal”; but the “satrap” shows from
which country the story comes.
[309] Thus the Armenian version of Marcion’s theology (for
which see Forerunners, II, p. 217, n. 2) makes the “God of the
Law’s” withdrawal from Hyle or Matter, and his retirement to a
higher heaven, the cause of all man’s woes.
[310] Cf. Ps. cxvii. 19, 20; but the likeness is not exact.
[311] Ps. cx. 1.
[312] Lit., “until she wishes it not.”
[313] “Serpent.” See n. on p. 173 supra.
[314] Gen. ii. 16, 17.
[315] That these stories about the protoplasts endured into
Manichæan times, see M. Cumont’s La Cosmogonie Manichéenne,
Appendix I.
[316] Here again a power is referred to by its number instead
of its name, as with the Naassene author.
[317] Gal. v. 17.
[318] τὴν πλάσιν τὴν πονηράν, malam fictionem, Cr. Yet we
have been told nothing of any deceit by Edem towards her
partner.
[319] The Ophite Diagram, and bar Khôni’s authority both
figure the powers hostile to man as taking the shapes of these
animals.
[320] So one of the latest documents of the Pistis Sophia calls
the planet Aphrodite by a place-name, which in that case is
Bubastis.
[321] προφητεία.
[322] If these words are to be taken literally, Justinus was the
only heretic of early date who denied His divinity, and this would
distinguish him finally from Marcion. But the words are not
inconsistent with the Adoptionist view.
[323] These words are Miller’s suggestion.
[324] John xix. 26.
[325] παραθέμενος. So Luke xxiii. 46.
[326] ἐπριοποίησε. The derivation is absurd and the word if it
had any meaning would be something like “made like a saw.”
προποιέω would make the pun at which he seems to have been
striving.
[327] This was not the case, the statues of Priapus being
placed in gardens. The whole passage seems to have been
interpolated by some one ignorant of Greek and of Greek customs
or mythology.
[328] Isa. i. 2.
[329] τελεῖσθαι or “initiated.” In any case a mystical word.
[330] Lit., “washed”; but the context shows that it is baptism
which is in question. It played an important part not only in all
these heretical sects but in heathen “mysteries” like those of Isis
and Mithras.
[331] Hosea i. 2. The A.V. has “departing from the Lord.” Here
we have Edem clearly identified with the Earth goddess which is
the key to the whole of Justinus’ story.
[332] ταῖς ἑξῆς ... τὰς τῶν ἀκολούθων αἱρέσεων. Macmahon,
following Cruice, translates as above. It may well be, however,
that the “heresies which follow” only mean which follow in the
book.
[333] There is no reason to doubt Hippolytus’ assertion that
this chapter is compiled from a book called Baruch in which
Justinus set forth his own doctrines. The narrative therein is,
unlike that of the earlier chapters, perfectly coherent and plain,
and the author’s use of the historical present gives it a dramatic
form which is lacking from the oratio obliqua formerly employed.
Solecisms like the omission of the article are also rare, and the
very long sentences in which Hippolytus seems to have delighted
do not appear except in those passages where he is speaking in
his own person. Whether from this or from some other cause,
moreover, the transcription of it seems to have given less difficulty
to the scribe Michael than some of the other chapters, and there
is therefore far less need to constantly restore the text as in the
case of the quotations from Sextus Empiricus. On the whole,
therefore, we may assume that, as we have it, it is a genuine
summary of Justinus’ doctrines taken from a work by his own
hand.
Transcriber's Notes
Obvious typographical errors and variable spelling were corrected. The following
corrections have been made to the text:
Page Original New
5 leben Leben
12 recemmet récemment
25 δοκείν δοκεῖν
33 ἅ ἃ
45 αὐτῆ αὐτῇ
45 έξατμισθέντα ἐξατμισθέντα
45 πυκνωθὲντα πυκνωθέντα
45 κοὶλῳ κοίλῳ
57 σολλογιστικώτερον συλλογιστικώτερον
62 δασσαντο δάσσαντο
63 Λἰθήρ Αἰθήρ
63 καἰ καὶ
66 δἰ δι’
68 Mathescos Matheseos
69 δορυφορεἶσθαι δορυφορεῖσθαι
69 σομπάσχει συμπάσχει
71 sabacta subacta
72 ν ἐν
73 μερἰζεσθαί μερίζεσθαι
75 οί οἱ
80 Ideés Idées
80 σομφωνίᾳ συμφωνίᾳ
82 guess-work guesswork
83 Scientarum Scientiarum
85 ἀπαρτίσῄ ἀπαρτίσῃ
87 ἀγωνίξωνται ἀγωνίζωνται
92 Kapital Capitel
98 σκολόπενδριον σκολόπενδρον
98 ἀμορρύτων αὐτορρύτων
99 after-thought afterthought
103 windpipe wind-pipe
106 ἀπερίξυγον ἀπερίζυγον
109 ’εν ἐν
110 Manichéisine Manichéisme
111 positon position
113 Ιασίδαο Ἰασίδαο
113 ’ιδέας ἰδέας