100% found this document useful (14 votes)

928 views

Deep Learning Applications, Volume 2 M. Arif Wani all chapter instant download

Volume

Uploaded by

suyutsababa

Available Formats

Download as PDF, TXT or read online on Scribd

100% found this document useful (14 votes)

928 views

Deep Learning Applications, Volume 2 M. Arif Wani all chapter instant download

Volume

Uploaded by

suyutsababa

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 65

Experience Seamless Full Ebook Downloads for Every Genre at textbookfull.

com

Deep Learning Applications, Volume 2 M. Arif Wani

https://textbookfull.com/product/deep-learning-applications-
volume-2-m-arif-wani/

OR CLICK BUTTON

DOWNLOAD NOW

Explore and download more ebook at https://textbookfull.com

Recommended digital products (PDF, EPUB, MOBI) that
you can download immediately if you are interested.

Biota Grow 2C gather 2C cook Loucas

https://textbookfull.com/product/biota-grow-2c-gather-2c-cook-loucas/

textboxfull.com

Programming PyTorch for Deep Learning Creating and

Deploying Deep Learning Applications 1st Edition Ian
Pointer
https://textbookfull.com/product/programming-pytorch-for-deep-
learning-creating-and-deploying-deep-learning-applications-1st-
edition-ian-pointer/
textboxfull.com

Deep Learning Vol 2 From Basics to Practice Andrew

Glassner

https://textbookfull.com/product/deep-learning-vol-2-from-basics-to-
practice-andrew-glassner/

textboxfull.com

Computational Methods for Deep Learning: Theoretic,

Practice and Applications Wei Qi Yan

https://textbookfull.com/product/computational-methods-for-deep-
learning-theoretic-practice-and-applications-wei-qi-yan/

textboxfull.com
Deep Learning in Python An Object Oriented Programming 1st
Edition Hong M. Lei [Lei

https://textbookfull.com/product/deep-learning-in-python-an-object-
oriented-programming-1st-edition-hong-m-lei-lei/

textboxfull.com

Machine Learning for Economics and Finance in TensorFlow

2: Deep Learning Models for Research and Industry Isaiah
Hull
https://textbookfull.com/product/machine-learning-for-economics-and-
finance-in-tensorflow-2-deep-learning-models-for-research-and-
industry-isaiah-hull/
textboxfull.com

Deep learning in computer vision: principles and

applications First Edition. Edition Mahmoud Hassaballah

https://textbookfull.com/product/deep-learning-in-computer-vision-
principles-and-applications-first-edition-edition-mahmoud-hassaballah/

textboxfull.com

Deep Learning Book Ian Goodfellow

https://textbookfull.com/product/deep-learning-book-ian-goodfellow/

textboxfull.com

Deep Learning on Windows: Building Deep Learning Computer

Vision Systems on Microsoft Windows Thimira Amaratunga

https://textbookfull.com/product/deep-learning-on-windows-building-
deep-learning-computer-vision-systems-on-microsoft-windows-thimira-
amaratunga/
textboxfull.com
Advances in Intelligent Systems and Computing 1232

M. Arif Wani
Taghi M. Khoshgoftaar
Vasile Palade Editors

Deep Learning
Applications,
Volume 2
Advances in Intelligent Systems and Computing

Volume 1232

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing,
Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering,
University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University,
Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas
at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao
Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology,
University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute
of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro,
Rio de Janeiro, Brazil
Ngoc Thanh Nguyen , Faculty of Computer Science and Management,
Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering,
The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications
on theory, applications, and design methods of Intelligent Systems and Intelligent
Computing. Virtually all disciplines such as engineering, natural sciences, computer
and information science, ICT, economics, business, e-commerce, environment,
healthcare, life science are covered. The list of topics spans all the areas of modern
intelligent systems and computing such as: computational intelligence, soft comput-
ing including neural networks, fuzzy systems, evolutionary computing and the fusion
of these paradigms, social intelligence, ambient intelligence, computational neuro-
science, artificial life, virtual worlds and society, cognitive science and systems,
Perception and Vision, DNA and immune based systems, self-organizing and
adaptive systems, e-Learning and teaching, human-centered and human-centric
computing, recommender systems, intelligent control, robotics and mechatronics
including human-machine teaming, knowledge-based paradigms, learning para-
digms, machine ethics, intelligent data analysis, knowledge management, intelligent
agents, intelligent decision making and support, intelligent network security, trust
management, interactive entertainment, Web intelligence and multimedia.
The publications within “Advances in Intelligent Systems and Computing” are
primarily proceedings of important conferences, symposia and congresses. They
cover significant recent developments in the field, both of a foundational and
applicable character. An important characteristic feature of the series is the short
publication time and world-wide distribution. This permits a rapid and broad
dissemination of research results.
** Indexing: The books of this series are submitted to ISI Proceedings,
EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/11156

M. Arif Wani Taghi M. Khoshgoftaar
• •

Vasile Palade
Editors

Deep Learning Applications,

Volume 2

123
Editors
M. Arif Wani Taghi M. Khoshgoftaar
Department of Computer Science Computer and Electrical Engineering
University of Kashmir Florida Atlantic University
Srinagar, India Boca Raton, FL, USA

Vasile Palade
Faculty of Engineering and Computing
Coventry University
Coventry, UK

ISSN 2194-5357 ISSN 2194-5365 (electronic)

Advances in Intelligent Systems and Computing
ISBN 978-981-15-6758-2 ISBN 978-981-15-6759-9 (eBook)
https://doi.org/10.1007/978-981-15-6759-9

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface

Machine learning algorithms have influenced many aspects of our day-to-day living
and transformed major industries around the world. Fueled by an exponential
growth of data, improvements in computer hardware, scalable cloud resources, and
accessible open-source frameworks, machine learning technology is being used by
companies in big and small alike for innumerable applications. At home, machine
learning models are suggesting TV shows, movies, and music for entertainment,
providing personalized ecommerce suggestions, shaping our digital social net-
works, and improving the efficiency of our appliances. At work, these data-driven
methods are filtering our emails, forecasting trends in productivity and sales, tar-
geting customers with advertisements, improving the quality of video conferences,
and guiding critical decisions. At the frontier of machine learning innovation are
deep learning systems, a class of multi-layered networks is capable of automatically
learning meaningful hierarchical representations from a variety of structured and
unstructured data. Breakthroughs in deep learning allow us to generate new rep-
resentations, extract knowledge, and draw inferences from raw images, video
streams, text and speech, time series, and other complex data types. These powerful
deep learning methods are being applied to new and exciting real-world problems in
medical diagnostics, factory automation, public safety, environmental sciences,
autonomous transportation, military applications, and much more.
The family of deep learning architectures continues to grow as new methods and
techniques are developed to address a wide variety of problems. A deep learning
network is composed of multiple layers that form universal approximators capable
of learning any function. For example, the convolutional layers in Convolutional
Neural Networks use shared weights and spatial invariance to efficiently learn
hierarchical representations from images, natural language, and temporal data.
Recurrent Neural Networks use backpropagation through time to learn from vari-
able length sequential data. Long Short-Term Memory networks are a type of
recurrent network capable of learning order dependence in sequence prediction
problems. Deep Belief Networks, Autoencoders, and other unsupervised models
generate meaningful latent features for downstream tasks and model the underlying
concepts of distributions by reconstructing their inputs. Generative Adversarial

v
vi Preface

Networks simultaneously learn generative models capable of producing new data

from distribution and discriminative models that can distinguish between real and
artificial images. Transformer Networks combine encoders and decoders with
attention layers for improved sequence-to-sequence learning. Network architecture
search automates the designs of these deep models by optimizing performance over
the hyperparameter space. As a result of these advances, and many others, deep
learning is revolutionizing complex problem domains with state-of-the-art results
and, in some cases, is a way superior to the human performances.
This book explores some of the latest applications in deep learning and includes
a variety of architectures and novel deep learning techniques. Deep models are
trained to recommend products, diagnose medical conditions or faults in industrial
machines, detect when a human falls, and recognize solar panels in aerial images.
Sequence models are used to capture driving behaviors and identify radio trans-
mitters from temporal data. Residual networks are used to detect human targets in
indoor environments, algorithm incorporating thresholding strategy is used to
identify fraud within highly imbalanced data, and hybrid methods are used to locate
vehicles during satellite outages. Multi-adversarial variational autoencoder network
is used for image synthesis and classification and finally parameter continuation
method is used for non-convex optimization of deep neural networks. We believe
that these recent deep learning methods and applications illustrated in this book
capture some of the most exciting advances in deep learning.

Srinagar, India M. Arif Wani

Boca Raton, USA Taghi M. Khoshgoftaar
Coventry, UK Vasile Palade
Contents

Deep Learning-Based Recommender Systems . . . . . . . . . . . . . . . . . . . . 1

Meshal Alfarhood and Jianlin Cheng
A Comprehensive Set of Novel Residual Blocks for Deep Learning
Architectures for Diagnosis of Retinal Diseases from Optical
Coherence Tomography Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Sharif Amit Kamran, Sourajit Saha, Ali Shihab Sabbir,
and Alireza Tavakkoli
Three-Stream Convolutional Neural Network for Human
Fall Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Guilherme Vieira Leite, Gabriel Pellegrino da Silva, and Helio Pedrini
Diagnosis of Bearing Faults in Electrical Machines Using Long
Short-Term Memory (LSTM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Russell Sabir, Daniele Rosato, Sven Hartmann, and Clemens Gühmann
Automatic Solar Panel Detection from High-Resolution Orthoimagery
Using Deep Learning Segmentation Networks . . . . . . . . . . . . . . . . . . . . 101
Tahir Mujtaba and M. Arif Wani
Training Deep Learning Sequence Models to Understand
Driver Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Shokoufeh Monjezi Kouchak and Ashraf Gaffar
Exploiting Spatio-Temporal Correlation in RF Data
Using Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Debashri Roy, Tathagata Mukherjee, and Eduardo Pasiliao
Human Target Detection and Localization with Radars
Using Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
Michael Stephan, Avik Santra, and Georg Fischer

vii
viii Contents

Thresholding Strategies for Deep Learning with Highly Imbalanced

Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Justin M. Johnson and Taghi M. Khoshgoftaar
Vehicular Localisation at High and Low Estimation Rates During
GNSS Outages: A Deep Learning Approach . . . . . . . . . . . . . . . . . . . . . 229
Uche Onyekpe, Stratis Kanarachos, Vasile Palade,
and Stavros-Richard G. Christopoulos
Multi-Adversarial Variational Autoencoder Nets for Simultaneous
Image Generation and Classiﬁcation . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
Abdullah-Al-Zubaer Imran and Demetri Terzopoulos
Non-convex Optimization Using Parameter Continuation Methods
for Deep Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
Harsh Nilesh Pathak and Randy Clinton Paffenroth

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299

Editors and Contributors

About the Editors

Dr. M. Arif Wani is a Professor at the University of Kashmir, having previously

served as a Professor at California State University, Bakersﬁeld. He completed his
M.Tech. in Computer Technology at the Indian Institute of Technology, Delhi, and
his Ph.D. in Computer Vision at Cardiff University, UK. His research interests are
in the area of machine learning, with a focus on neural networks, deep learning,
inductive learning, and support vector machines, and with application to areas that
include computer vision, pattern recognition, classiﬁcation, prediction, and analysis
of gene expression datasets. He has published many papers in reputed journals and
conferences in these areas. Dr. Wani has co-authored the book ‘Advances in Deep
Learning,’ co-edited the book ‘Deep Learning Applications,’ and co-edited 17
conference proceeding books in machine learning and applications area. He is a
member of many academic and professional bodies, e.g., the Indian Society for
Technical Education, Computer Society of India, and IEEE USA.

Dr. Taghi M. Khoshgoftaar is the Motorola Endowed Chair professor of the

Department of computer and electrical engineering and Computer Science, Florida
Atlantic University, and the Director of NSF Big Data Training and Research
Laboratory. His research interests are in big data analytics, data mining and
machine learning, health informatics and bioinformatics, social network mining,
and software engineering. He has published more than 750 refereed journal and
conference papers in these areas. He was the Conference Chair of the IEEE
International Conference on Machine Learning and Applications (ICMLA 2019).
He is the Co-Editor-in-Chief of the Journal of Big Data. He has served on orga-
nizing and technical program committees of various international conferences,
symposia, and workshops. He has been a Keynote Speaker at multiple international

ix
x Editors and Contributors

conferences and has given many invited talks at various venues. Also, he has served
as North American Editor of the Software Quality Journal, was on the editorial
boards of the journals Multimedia Tools and Applications, Knowledge and
Information Systems, and Empirical Software Engineering, and is on the editorial
boards of the journals Software Quality, Software Engineering and Knowledge
Engineering, and Social Network Analysis and Mining.

Dr. Vasile Palade is currently a Professor of Artiﬁcial Intelligence and Data

Science at Coventry University, UK. He previously held several academic and
research positions at the University of Oxford—UK, University of Hull—UK, and
the University of Galati—Romania. His research interests are in the area of machine
learning, with a focus on neural networks and deep learning, and with main
application to image processing, social network data analysis and web mining,
smart cities, health, among others. Dr. Palade is author and co-author of more than
170 papers in journals and conference proceedings as well as several books on
machine learning and applications. He is an Associate Editor for several reputed
journals, such as Knowledge and Information Systems and Neurocomputing. He
has delivered keynote talks to international conferences on machine learning and
applications. Dr. Vasile Palade is an IEEE Senior Member.

Contributors

Meshal Alfarhood Department of Electrical Engineering and Computer Science,

University of Missouri-Columbia, Columbia, USA
M. Arif Wani Department of Computer Science, University of Kashmir, Srinagar,
India
Jianlin Cheng Department of Electrical Engineering and Computer Science,
University of Missouri-Columbia, Columbia, USA
Stavros-Richard G. Christopoulos Institute for Future Transport and Cities,
Coventry University, Coventry, UK;
Faculty of Engineering, Coventry University, Coventry, UK
Randy Clinton Paffenroth Worcester Polytechnic Institute, Mathematical
Sciences Computer Science & Data Science, Worcester, MA, USA
Gabriel Pellegrino da Silva Institute of Computing, University of Campinas,
Campinas, SP, Brazil
Georg Fischer Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen,
Germany
Ashraf Gaffar Arizona State University, Tempe, USA
Editors and Contributors xi

Clemens Gühmann Chair of Electronic Measurement and Diagnostic Technology

& Technische Universität Berlin, Berlin, Germany
Sven Hartmann SEG Automotive Germany GmbH, Stuttgart, Germany
Abdullah-Al-Zubaer Imran University of California, Los Angeles, CA, USA
Justin M. Johnson Florida Atlantic University, Boca Raton, FL, USA
Sharif Amit Kamran University of Nevada, Reno, NV, USA
Stratis Kanarachos Faculty of Engineering, Coventry University, Coventry, UK
Taghi M. Khoshgoftaar Florida Atlantic University, Boca Raton, FL, USA
Shokoufeh Monjezi Kouchak Arizona State University, Tempe, USA
Guilherme Vieira Leite Institute of Computing, University of Campinas,
Campinas, SP, Brazil
Tahir Mujtaba Department of Computer Science, University of Kashmir,
Srinagar, India
Tathagata Mukherjee Computer Science, University of Alabama, Huntsville,
AL, USA
Harsh Nilesh Pathak Expedia Group, Seattle, WA, USA
Uche Onyekpe Institute for Future Transport and Cities, Coventry University,
Coventry, UK;
Research Center for Data Science, Coventry University, Coventry, UK
Vasile Palade Research Center for Data Science, Coventry University, Coventry,
UK
Eduardo Pasiliao Munitions Directorate, Air Force Research Laboratory, Eglin
AFB, Valparaiso, FL, USA
Helio Pedrini Institute of Computing, University of Campinas, Campinas, SP,
Brazil
Daniele Rosato SEG Automotive Germany GmbH, Stuttgart, Germany
Debashri Roy Computer Science, University of Central Florida, Orlando, FL,
USA
Russell Sabir SEG Automotive Germany GmbH, Stuttgart, Germany;
Chair of Electronic Measurement and Diagnostic Technology & Technische
Universität Berlin, Berlin, Germany
Ali Shihab Sabbir Center for Cognitive Skill Enhancement, Independent
University Bangladesh, Dhaka, Bangladesh
xii Editors and Contributors

Sourajit Saha Center for Cognitive Skill Enhancement, Independent University

Bangladesh, Dhaka, Bangladesh
Avik Santra Inﬁneon Technologies AG, Neubiberg, Germany
Michael Stephan Inﬁneon Technologies AG, Neubiberg, Germany;
Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany
Alireza Tavakkoli University of Nevada, Reno, NV, USA
Demetri Terzopoulos University of California, Los Angeles, CA, USA
Deep Learning-Based Recommender
Systems

Meshal Alfarhood and Jianlin Cheng

Abstract The term “information overload” has gained popularity over the last few
years. It defines the difficulties people face in finding what they want from a huge
volume of available information. Recommender systems have been recognized to be
an effective solution to such issues, such that suggestions are made based on users’
preferences. This chapter introduces an application of deep learning techniques in
the domain of recommender systems. Generally, collaborative filtering approaches,
and Matrix Factorization (MF) techniques in particular, are widely known for their
convincing performance in recommender systems. We introduce a Collaborative
Attentive Autoencoder (CATA) that improves the matrix factorization performance
by leveraging an item’s contextual data. Specifically, CATA learns the proper features
from scientific articles through the attention mechanism that can capture the most
pertinent parts of information in order to make better recommendations. The learned
features are then incorporated into the learning process of MF. Comprehensive exper-
iments on three real-world datasets have shown our method performs better than other
state-of-the-art methods according to various evaluation metrics. The source code of
our model is available at: https://github.com/jianlin-cheng/CATA.

This chapter is an extended version of our published paper at the IEEE ICMLA conference 2019
[1]. This chapter incorporates new experimental contributions compared to the original confere-
nce paper.

M. Alfarhood (B) · J. Cheng

Department of Electrical Engineering and Computer Science,
University of Missouri-Columbia, Columbia, USA
e-mail: may82@missouri.edu
J. Cheng
e-mail: chengji@missouri.edu

© The Editor(s) (if applicable) and The Author(s), under exclusive license 1
to Springer Nature Singapore Pte Ltd. 2021
M. A. Wani et al. (eds.), Deep Learning Applications, Volume 2,
Advances in Intelligent Systems and Computing 1232,
https://doi.org/10.1007/978-981-15-6759-9_1
2 M. Alfarhood and J. Cheng

1 Introduction

The era of e-commerce has vastly changed people’s lifestyles during the first part
of the twenty-first century. People today tend to do many of their daily routines
online, such as shopping, reading the news, and watching movies. Nevertheless,
consumers often face difficulties while exploring related items such as new fashion
trends because they are not aware of their existence due to the overwhelming amount
of information available online. This phenomenon is widely known as “information
overload”. Therefore, Recommender Systems (RSs) are a critical solution for helping
users make decisions when there are lots of choices. RSs have been integrated into
and have become an essential part of every website due to their impact on increasing
customer interactions, attracting new customers, and growing businesses’ revenue.
Scientific article recommendation is a very common application for RSs. It keeps
researchers updated on recent related work in their field. One traditional way to
find relevant articles is to go through the references section in other articles. Yet,
this approach is biased toward heavily cited articles, such that new relevant articles
with higher impact have less chance to be found. Another method is to search for
articles using keywords. Although this technique is popular among researchers, they
must filter out a tremendous number of articles from the search results to retrieve
the most suitable articles. Moreover, all users get the same search results with the
same keywords, and these results are not personalized based on the users’ personal
interests. Thus, recommendation systems can address this issue and help scientists
and researchers find valuable articles while being aware of recent related work.
Over the last few decades, a lot of effort has been made by both academia and
industry on proposing new ideas and solutions for RSs, which ultimately help ser-
vice providers in adopting such models in their system architecture. The research in
RSs has evolved remarkably following the Netflix prize competition1 in 2006, where
the company offered one million dollars for any team that could improve their rec-
ommendation accuracy by 10%. Since that time, collaborative filtering models and
matrix factorization techniques in particular have become the most common models
due to their effective performance. Generally, recommendation models are classified
into three categories: Collaborative Filtering Models (CF), Content-Based Filter-
ing models (CBF), and hybrid models. CF models [2–4] focus on users’ histories,
such that users with similar past behaviors tend to have similar future tastes. On the
other hand, CBF models work by learning the item’s features from its informational
description, such that two items are possibly similar to each other if they share more
characteristics. For example, two songs are similar to each other if they both share
the same artist, genre, tempo, energy, etc. However, similarities between items in CF
models are different such that two items are likely similar to each other once they are
rated by multiple users in the same manner, even though those items have different
characteristics.

1 www.netflixprize.com.
Deep Learning-Based Recommender Systems 3

Generally, CF models function better than CBF models. However, CF performance

drops substantially when users or items have an insufficient amount of feedback
data. This problem is defined as the data sparsity problem. To tackle data sparseness,
hybrid models have been widely proposed in recent works [5–8], in which content
information, used in CBF models, is incorporated with CF models to improve the
system performance. Hybrid models are divided into two sub-categories according
to how models are trained: loosely coupled models and tightly coupled models [7].
Loosely coupled models train CF and CBF models separately, like ensembles, and
then determine the final score based on the scores of the two separated models. On
the other hand, the tightly coupled models train both CF and CBF models jointly.
In joint training, both models cooperate with one another to calculate the final score
under the same loss function.
Even though traditional recommendation approaches have achieved great success
over the last years, they still have shortcomings in accurately modeling complex
(e.g., non-linear) relationship between users and items. Alternatively, deep neural
networks are universal function approximators that are capable of modeling any con-
tinuous function. Recently, Deep Learning (DL) has become an effective approach
for most data mining problems. DL meets recommendation systems in the last few
years. One of the first works that applied DL concept for CF recommendations was
Restricted Boltzmann Machines (RBM) [4]. However, this approach was not deep
enough (two layers only) to learn users’ tastes from their histories, and it also did not
take contextual information into consideration. Recently, Collaborative Deep Learn-
ing (CDL) [7] has become a very popular deep learning technique in RSs due to its
promising performance. CDL can be viewed as an updated version of Collaborative
Topic Regression (CTR) [5] by substituting the Latent Dirichlet Allocation (LDA)
topic modeling with a Stacked Denoising Autoencoder (SDAE) to learn from item
contents, and then integrating the learned latent features into a Probabilistic Matrix
Factorization (PMF). Lately, Collaborative Variational Autoencoder (CVAE) [8] has
been proposed to learn deep item latent features via a variational autoencoder. The
authors show that their model learns better item features than CDL because their
model infers the latent variable distribution in latent space instead of observation
space. However, both CDL and CVAE models assume that all parts of their model’s
contribution are the same for their final predictions.
Hence, in this work, we propose a deep learning-based model named Collaborative
Attentive Autoencoder (CATA) for recommending scientific articles. In particular,
we integrate the attention mechanism into our unsupervised deep learning process
to identify an item’s features. We learn the item’s features from the article’s textual
information (e.g., the article’s title and abstract) to enhance the recommendation
quality. The compressed low-dimensional representation learned by the unsupervised
model is incorporated then into the matrix factorization approach for our ultimate
recommendation. To demonstrate the capability of our proposed model to generate
more relevant recommendations, we conduct inclusive experiments on three real-
world datasets, which are taken from the CiteULike2 website, to evaluate CATA

2 www.citeulike.org.
4 M. Alfarhood and J. Cheng

against multiple recent works. The experimental results prove that our model can
extract more constructive information from an article’s contextual data than other
models. More importantly, CATA performs very well where the data sparsity is
extremely high.
The remainder of this chapter is organized in the following manner. First, we
demonstrate the matrix factorization method in Sect. 2. We introduce our model,
CATA, in Sect. 3. The experimental results of our model against the state-of-the-art
models are discussed thoroughly in Sect. 4. We then conclude our work in Sect. 5.

2 Background

Our work is designed and evaluated on recommendations with implicit feedback.

Thus, in this section, we describe the well-known collaborative filtering approach,
Matrix Factorization, for implicit feedback problems.

2.1 Matrix Factorization

Matrix Factorization (MF) [2] is the most popular CF method, mainly due to its
simplicity and efficiency. The idea behind MF is to decompose the user-item matrix,
R ∈ Rn×m , into two lower dimensional matrices, U ∈ Rn×d and V ∈ Rm×d , such
that the inner product of U and V will approximate the original matrix R, where d
is the dimension of the latent factors, such that d min(n, m). n and m correspond
to the number of users and items in the system. Figure 1 illustrates the MF process.

R ≈ U · VT (1)

MF optimizes the values of U and V by minimizing the sum of the squared

difference between the actual values and the model predictions with adding two
regularization terms, as shown here:
Ii j λu λv
L= (ri j − u i v Tj )2 + u i 2 + v j 2 (2)
i, j∈R
2 2 2

where Ii j is an indicator function that equals 1 if useri has rated item j , and 0 if
otherwise. Also, ||U || and ||V || are the Euclidean norms, and λu , λv are two regu-
larization terms preventing the values of U and V from being too large. This avoid
model overfitting.
Explicit data, such as ratings (ri j ) are not regularly available. Therefore, Weighted
Regularized Matrix Factorization (WRMF) [9] introduces two modifications to the
previous objective function to make it work for implicit feedback. The optimization
Deep Learning-Based Recommender Systems 5

Fig. 1 Matrix factorization illustration

process in this case runs through all user-item pairs with different confidence levels
assigned to each pair, as in the following:
ci j λu λv
L= ( pi j − u i v Tj )2 + u i 2 + v j 2 (3)
i, j∈R
2 2 2

where pi j is the user preference score with a value of 1 when useri and item j have
an interaction, and 0 otherwise. ci j is a confidence variable where its value shows
how confident the user like the item. In general, ci j = a when pi j = 1, and ci j = b
when pi j = 0, such that a > b > 0.
Stochastic Gradient Decent (SGD) [10] and Alternating Least Squares (ALS) [11]
are two optimization methods that can be used to minimize the objective function
of MF in Eq. 2. The first method, SGD, loops over each single training sample and
then computes the prediction error as ei j = ri j − u i v Tj . The gradient of the objective
function with respect to u i and v j can be computed as follows:

∂L
=− Ii j (ri j − u i v Tj )v j + λu u i
∂u i j
(4)
∂L
=− Ii j (ri j − u i v Tj )u i + λv v j
∂v j i

After calculating the gradient, SGD updates the user and item latent factors in the
opposite direction of the gradient using the following equations:
6 M. Alfarhood and J. Cheng
⎛ ⎞

ui ← ui + α ⎝ Ii j ei j v j − λu u i ⎠
j
(5)

vj ← vj + α Ii j ei j u i − λ j v j
i

where α is the learning rate.

Even though SGD is easy to implement and generally faster than ALS in some
cases, it is not suitable to use with implicit feedback, since looping over each single
training sample is not practical. ALS works better in this case. ALS iteratively opti-
mizes U while V is fixed, and vice versa. This optimization process is repeated until
the model converges.
To determine what user and item vector values minimize the objective function
for implicit data (Eq. 3), we first take the derivative of L with respect to u i .

∂L
=− ci j ( pi j − u i v Tj )v j + λu u i
∂u i j

0 = −Ci (Pi − u i V T )V + λu u i
0 = −Ci V Pi + Ci V u i V T + λu u i (6)
V Ci Pi = u i V Ci V + λu u i
T

V Ci Pi = u i (V Ci V T + λu I )
ui = V Ci Pi (V Ci V T + λu I )−1
ui = (V Ci V T + λu I )−1 V Ci Pi

where I is the identity matrix.

Similarly, taking the derivative of L with respect to v j leads to

v j = (U C j U T + λv I )−1 U C j P j (7)

3 Proposed Model

In this section, we illustrate our proposed model in depth. The intuition behind our
model is to learn the latent factors of items in PMF with the use of available side
textual contents. We use an attentive unsupervised model to catch more plentiful
information from the available data. The architecture of our model is displayed in
Fig. 2. We first define the problem with implicit feedback before we go through the
details of our model.
Deep Learning-Based Recommender Systems 7

λu λv

X̂ j

Decoder

Attention

Ui Vj Zj X Softmax

Rij Encoder

Xj
i = 1:n j = 1:m

Fig. 2 Collaborative attentive autoencoder architecture

3.1 Problem Definition

User-item interaction data is the primary source for training recommendation

engines. This data can be either collected in an explicit or implicit manner. In explicit
data, users directly express their opinion about an item using the rating system to
show how much they like that item. The user’s ratings usually vary from one-star to
five-stars with five being very interested and one being not interested. This type of
data is very useful and reliable due to the fact that it represents the actual feeling of
users about items. However, users’ ratings occasionally are not available due to the
difficulty of obtaining users’ explicit opinions. In this case, implicit feedback can be
obtained indirectly from the user’s behavior such as user clicks, bookmarks, or the
time spent viewing an item. For instance, if a user listens to a song 10 times in the
last two days, he or she most likely likes this song. Thus, implicit data is more preva-
lent and easier to collect, but it is generally less reliable than explicit data. Also, all
the observed interactions in implicit data constitute positive feedback, but negative
feedback is missing. This problem is also defined as the one-class problem.
There are multiple previous works aiming to deal with the one-class problem. A
simple solution is to treat all missing data as negative feedback. However, this is
not true because the missing (unobserved) interaction could be positive if the user is
aware of the item existing. Therefore, using this strategy to build a model might result
in a misleading model due to faulty assumptions at the outset. On the contrary, if
8 M. Alfarhood and J. Cheng

we treat all missing data as unobserved data without considering including negative
feedback in the model training, the corresponding trained model is probably useless
since it is only trained on positive data. As a result, sampling negative feedback
from positive feedback is one practical solution for this problem, which has been
proposed by [12]. In addition, Weighted Regularized Matrix Factorization (WRMF)
[9] is another proposed solution that introduces a confidence variable that works as
a weight to measure how likely a user is to like an item.
In general, the recommendation problem with implicit data is usually formulated
as follows:
1, if there is user-item interaction
Rnm = (8)
0, otherwise

where the ones in implicit feedback represent all the positive feedback. However,
it is important to note that a value of 0 does not imply always negative feedback.
It may be that users are not aware of the existence of those items. In addition,
the user-item interaction matrix (R) is usually highly imbalanced, such that the
number of the observed interactions is much less than the number of the unobserved
interactions. In other words, matrix R is very sparse, meaning that users only interact
explicitly or implicitly with a very small number of items compared to the total
number of items in this matrix. Sparsity is one frequent problem in RSs, which brings
a real challenge for any proposed model to have the capability to provide effective
personalized recommendations under this situation. The following sections explain
our methodology, where we aim to eliminate the influence of the aforementioned
problems.

3.2 The Attentive Autoencoder

Autoencoder [13] is an unsupervised learning neural network that is useful for com-
pressing high-dimensional input data into a lower dimensional representation while
preserving the abstract nature of the data. The autoencoder network is generally
composed of two main components, i.e., the encoder and the decoder. The encoder
takes the input and encodes it through multiple hidden layers and then generates a
compressed representative vector, Z j . The encoding function can be formulated as
Z j = f (X j ). Subsequently, the decoder can be used then to reconstruct and estimate
the original input, Xˆ j , using the representative vector, Z j . The decoder function can
be formulated as Xˆ j = f (Z j ). Each the encoder and the decoder usually consist of
the same number of hidden layers and neurons. The output of each hidden layer is
computed as follows:

h () = σ (h (−1) W () + b() ) (9)

Deep Learning-Based Recommender Systems 9

where () is the layer number, W is the weights matrix, b is the bias vector, and σ
is a non-linear activation function. We use the Rectified Linear Unit (ReLU) as the
activation function.
Our model takes input from the article’s textual data, X j = {x 1 , x 2 , . . . , x s },
where x i is a value between [0, 1] and s represents the vocabulary size of the arti-
cles’ titles and abstracts. In other words, the input of our autoencoder network is
a normalized bag-of-words histograms of filtered vocabularies of the articles’ titles
and abstracts.
Batch Normalization (BN) [14] has been proven to be a proper solution for the
internal covariant shift problem, where the layer’s input distribution in deep neural
networks changes across the time of training, and causes difficulty in training the
model. In addition, BN can work as a regularization procedure like Dropout [15]
in deep neural networks. Accordingly, we apply a batch normalization layer after
each hidden layer in our autoencoder to obtain a stable distribution from each layer’s
output.
Furthermore, we use the idea of the attention mechanism to work between the
encoder and the decoder, such that only the relevant parts of the encoder output are
selected for the input reconstruction. Attention in deep learning can be described
simply as a vector of weights to show the importance of the input elements. Thus,
the intuition behind attention is that not all parts of the input are equally significant,
i.e., only few parts are significant for the model. We first calculate the scores as the
probability distribution of the encoder’s output using the so f tmax(.) function.

ezc
f (z c ) = zd
(10)
de

The probability distribution and the encoder output are then multiplied using
element-wise multiplication function to get Z j .
We use the attentive autoencoder to pretrain the items’ contextual information
and then integrate the compressed representation, Z j , in computing the items’ latent
factors, V j , from the matrix factorization method. The dimension space of Z j and V j
are set to be equal to each other. Finally, we adopt the binary cross-entropy (Eq. 11)
as the loss function we want to minimize in our attentive autoencoder model.

L=− yk log( pk ) − (1 − yk ) log(1 − pk ) (11)
k

where yk corresponds to the correct labels and pk corresponds to the predicted

values.
10 M. Alfarhood and J. Cheng

3.3 Probabilistic Matrix Factorization

Probabilistic Matrix Factorization (PMF) [3] is a probabilistic linear model where

the prior distributions of the latent factors and users’ preferences are drawn from
Gaussian normal distribution.

u i ∼ N (0, λ−1
u I)
v j ∼ N (0, λ−1
v I) (12)
pi j ∼ N (u i v Tj , σ 2 )

We integrate the items’ contents, trained through the attentive autoencoder, into
PMF. Therefore, the objective function in Eq. 3 has been changed slightly to become
ci j λu λv
L= ( pi j − u i v Tj )2 + u i 2 + v j − θ (X j )2 (13)
i, j∈R
2 2 2

where θ (X j ) = Encoder (X j ) = Z j .
Thus, taking the partial derivative of our previous objective function with respect
to both u i and v j results in the following equations that minimize our objective
function the most

u i = (V Ci V T + λu I )−1 V Ci Pi
(14)
v j = (U C j U T + λv I )−1 U C j P j + λv θ (X j )

We optimize the values of u i and v j using the Alternating Least Squares (ALS)
optimization method.

3.4 Prediction

After our model has been trained and the latent factors of users and articles, U and
V , are identified, we calculate our model’s prediction scores of useri and each article
as the dot product of vector u i with all vectors in V as scor esi = u i V T . Then, we
sort all articles based on our model predication scores in descending order, and then
recommend the top-K articles for that useri . We go through all users in U in our
evaluation and report the average performance among all users. The overall process
of our approach is illustrated in Algorithm 1.
Deep Learning-Based Recommender Systems 11

Algorithm 1: CATA algorithm

1 pretrain autoencoder with input X ;
2 Z ← θ(X );
3 U, V ← Initialize with random values;
4 while <NOT converge> do
5 for <each user i > do
6 u i ← update using Equation 14;
7 end for
8 for <each article j > do
9 vi ← update using Equation 14;
10 end for
11 end while
12 for <each user i > do
13 scor esi ← u i V T ;
14 sort(scor esi ) in descending order;
15 end for
16 Evaluate the top-K recommendations;

4 Experiments

In this section, we conduct extensive experiments aiming to answer the following

research questions:
• RQ1: How does our proposed model, CATA, perform against state-of-the-art meth-
ods?
• RQ2: Does adding the attention mechanism actually improve our model perfor-
mance?
• RQ3: How could different values of the regularization parameters (λu and λv )
affect CATA performance?
• RQ4: What is the impact of different dimension values of users and items’ latent
space on CATA performance?
• RQ5: How many training epochs are sufficient for pretraining our autoencoder?
Before answering these research questions, we first describe the datasets used in our
evaluations, the evaluation metrics, and the baseline approaches we use to evaluate
our model against.

4.1 Datasets

Three scientific article datasets are used to evaluate our model against the state-of-
the-art methods. All datasets are collected from CiteULike website. The first dataset
is called Citeulike-a, which is collected by [5]. It has 5,551 users, 16,980 articles, and
204,986 user-article pairs. The sparseness of this dataset is extremely high, where
only around 0.22% of the user-article matrix has interactions. Each user has at least
12 M. Alfarhood and J. Cheng

ten articles in his or her library. On average, each user has 37 articles in his or her
library and each article has been added to 12 users’ libraries. The second dataset is
called Citeulike-t, which is collected by [6]. It has 7,947 users, 25,975 articles, and
134,860 user-article pairs. This dataset is actually sparser than the first one with only
0.07% available user-article interactions. Each user has at least three articles in his
or her library. On average, each user has 17 articles in his or her library and each
article has been added to five users’ libraries. Lastly, Citeulike-2004–2007 is the third
dataset, and it is collected by [16]. It is three times bigger than the previous ones with
regard to the user-article matrix. It has 3,039 users, 210,137 articles, and 284,960
user-article pairs. This dataset is the sparsest in this experiment, with a sparsity equal
to 99.95%. Each user has at least ten articles in his or her library. On average, each
user has 94 articles in his or her library and each article has been added only to one
user library. Brief statistics of the datasets are shown in Table 1.
Title and abstract of each article are given in each dataset. The average number
of words per article in both title and abstract after our text preprocessing is 67 words
in Citeulike-a, 19 words in Citeulike-t, and 55 words in Citeulike-2004–2007. We
follow the same preprocessing techniques as the state-of-the-art models in [5, 7,
8]. A five-stage procedure to preprocess the textual content is displayed in Fig. 3.
Each article title and abstract are combined together and then are preprocessed such
that stop words are removed. After that, top-N distinct words based on the TF-IDF
measurement are picked out. 8,000 distinct words are selected for the Citeulike-a
dataset, 20,000 distinct words are selected for the Citeulike-t dataset, and 19,871
distinct words are selected for the Citeulike-2004–2007 dataset to form the bag-of-
words histogram, which are then normalized into values between 0 and 1 based on
the vocabularies’ occurrences.

Table 1 Descriptions of citeulike datasets

Dataset #Users #Articles #Pairs Sparsity (%)
Citeulike-a 5,551 16,980 204,986 99.78
Citeulike-t 7,947 25,975 134,860 99.93
Citeulike-2004–2007 3,039 210,137 284,960 99.95

Fig. 3 A five-stage procedure for preprocessing articles’ titles and abstracts

Deep Learning-Based Recommender Systems 13

Fig. 4 Ratio of articles that have been added to ≤N users’ libraries

Figure 4 shows the ratio of articles that have been added to five or fewer users’
libraries. For example, 15, 77, and 99% of the articles in Citeulike-a, Citeulike-t, and
Citeulike-2004–2007, respectively, are added to five or fewer users’ libraries. Also,
only 1% of the articles in Citeulike-a have been added only to one user library, while
the rest of the articles have been added to more than this number. On the contrary,
13, and 77% of the articles in Citeulike-t and Citeulike-2004–2007 have been added
only to one user library. This proves the sparseness of the data with regard to articles
as we go from one dataset to another.

4.2 Evaluation Methodology

We follow the state-of-the-art techniques [6–8] to generate our training and testing
sets. For each dataset, we create two versions of the dataset for sparse and dense
settings. In total, six dataset cases are used in our evaluation. To form the sparse
(P = 1) and the dense (P = 10) datasets, P items are randomly selected from each
user library to generate the training set while the remaining items from each user
library are used to generate the testing set. As a result, when P = 1, only 2.7, 5.9,
and 1.1% of the data entries are used to generate the training set in Citeulike-a,
Citeulike-t, and Citeulike-2004–2007, respectively. Similarly, 27.1, 39.6, and 10.7%
of the data entries are used to generate the training set when P = 10 as Fig. 5 shows.
14 M. Alfarhood and J. Cheng

(a) Citeulike-a (b) Citeulike-t (c) Citeulike-2004-2007

Fig. 5 The percentage of the data entries that forms the training and testing sets in all citeulike
datasets

We use recall and Discounted Cumulative Gain (DCG) as our evaluation metrics
to test how our model performs. Recall is usually used to evaluate recommender
systems with implicit feedback. However, precision is not favorable to use with
implicit feedback because the zero value in the user-article interaction matrix has
two meanings: either the user is not interested in the article, or the user is not aware
of the existence of this article. Therefore, using the precision metric only assumes
that for each zero value the user is not interested in the article, which is not the case.
Recall per user can be measured using the following formula:

Relevant Articles ∩ K Recommended Articles

recall@K = (15)
Relevant Articles
where K is set manually in the experiment and represents the top K articles of each
user. We set K = 10, 50, 100, 150, 200, 250, and 300 in our evaluations. The overall
recall can be calculated as the average recall among all users. If K equals the number
of articles in the dataset, recall will have a value of 1.
Recall, however, does not take into account the ranking of articles within the
top-K recommendations, as long as they are in the top-K list. However, DCG does.
DCG shows the capability of the recommendation engine to recommend articles at
the top of the ranking list. Articles in higher ranked K positions have more value than
others. The DCG among all users can be measured using the following equation:
|U |
1 rel(i)
K
DCG@K = (16)
|U | u=1 i=1 log2 (i + 1)

where |U | is the total number of users, i is the rank of the top-K articles recommended
by the model, and rel(i) is an indicator function that outputs 1 if the article at rank i
is a relevant article, and 0 otherwise.
Deep Learning-Based Recommender Systems 15

4.3 Baselines

We evaluate our approach against the following baselines described below:

• POP: Popular predictor is a non-personalized recommender system. It recom-
mends the most popular articles in the training set, such that all users get identical
recommendations. It is widely used as the baseline for personalized recommen-
dation models.
• CDL: Collaborative Deep Learning (CDL) [7] is a deep Bayesian model that
jointly models both user-item interaction data and items’ content via a Stacked
Denoising Autoencoder (SDAE) with a Probabilistic Matrix Factorization (PMF).
• CML+F: Collaborative Metric Learning (CML) [17] is a metric learning model
that pulls items liked by a user closer to that user. Recommendations are then
generated based on the K-Nearest Neighbor of each user. CML+F additionally
uses a neural network with two fully connected layers to train items’ features
(articles’ tags in this chapter) to update items’ location. CML+F has been shown
to have a better performance than CML.
• CVAE: Collaborative Variational Autoencoder (CVAE) [8] is a probabilistic
model that jointly models both user-item interaction data and items’ content
through a Variational Autoencoder (VAE) with a Probabilistic Matrix Factoriza-
tion (PMF). It can be considered as the baseline of our proposed approach since
CVAE and CATA share the same strategy.
For hyper-parameter settings, we set the confidence variables (i.e., a and b) to
a = 1, and b = 0.01. These are the same values used in CDL and CVAE as well.
Also, a four-layer network is used to construct our attentive autoencoder. The four-
layer network has the following shape “#Vocabularies-400-200-100-50-100-200-
400-#Vocabularies”.

4.4 Experimental Results

For each dataset, we repeat the data splitting four times with different random splits
of training and testing set, which has been previously described in the evaluation
methodology section. We use one split as a validation experiment to find the optimal
parameters of λu and λv for our model and the state-of-the-art models as well. We
search a grid of the following values {0.01, 0.1, 1, 10, 100} and the best values on
the validation experiment have been reported in Table 2. The other three splits are
used to report the average performance of our model against the baselines. In this
section, we address the research questions that have been previously defined in the
beginning of this section.
16 M. Alfarhood and J. Cheng

Table 2 Parameter settings for λu and λv based on the validation experiment

Approach Citeulike-a Citeulike-t Citeulike-2004–2007
Sparse Dense Sparse Dense Sparse Dense
λu λv λu λv λu λv λu λv λu λv λu λv
CDL 0.01 10 0.01 10 0.01 10 0.01 10 0.01 10 0.01 10
CVAE 0.1 10 1 10 0.1 10 0.1 10 0.1 10 0.1 10
CATA 10 0.1 10 0.1 10 0.1 10 0.1 10 0.1 10 0.1

4.4.1 RQ1

To evaluate how our model performs, we conduct quantitative and qualitative com-
parisons to answer this question. Figures 6, 7, 8, and 9 show the performance of the
top-K recommendations under the sparse and dense settings in terms of recall and
DCG. First, the sparse cases are very challenging for any proposed model since there
is less data for training. In the sparse setting where there is only one article in each
user’s library in the training set, our model, CATA, outperforms the baselines in all
datasets in terms of recall and DCG, as Figs. 6 and 7 show. More importantly, CATA
outperforms the baselines by a wide margin in the Citeulike-2004–2007 dataset,
where it is actually sparser and contains a huge number of articles. This validates the
robustness of our model against data sparsity.

(a) Citeulike-a (b) Citeulike-t (c) Citeulike-2004-2007

Fig. 6 Recall performance under the sparse setting, P = 1

(a) Citeulike-a (b) Citeulike-t (c) Citeulike-2004-2007

Fig. 7 DCG performance under the sparse setting, P = 1

Another Random Scribd Document
with Unrelated Content
substance on gaming and women. Their study was to appear splendid in
dress, sage and astute in speech; and he who was quickest with biting
phrase was wisest and most esteemed (VII. xxviii. 155).
Force and necessity, not written promises and obligations, make princes
keep faith (VIII. xxii. 198).

But most of his exposition is not added; it is welded. The narrative

itself is made expository by a constant chain of cause and effect. It
is clear both in its events and in their significance for policy. We
learn at every turn not only what Florence did, but why; and we
forecast the result. Stefano Porcari, lamenting the decay of the
Church (VI. xxix. 101), is inspired by Petrarch’s “Spirto gentil.” The
account of the conspiracy nipped by the Pope is rather a story plot
than a story. Macchiavelli is content to suggest that it was operatic.
He is not concerned to work out its story values; he is bent on its
historical significance. The spectacles at the wedding of Lorenzo to
Clarice (VII. xxi. 148) are not elaborated descriptively; they are
summed up as indicative of the habit of the time. So is handled (VII.
xxxiii. 162) Professor Cola Montano’s doctrinaire enthusiasm for
republics and scorn of tyrants. His pupils find the issue in
assassination. The splendid audience of the Pope (VIII. xxxvi. 218)
to the ambassadors of Florence for reconciliation is at once
description and argument. Thus the progress of the Istorie fiorentine
is simultaneously of facts and of ideas. It is analyzed narrative.
Fused also is the style. Heightened for the orations (II. xxxiv; III.
v, xi, xiii, xxiii; IV. xxi; V. viii, xi, xxi, xxiv; VI. xx; VII. xxiii; VIII. x), it
is never decorated, never diffused, so ascetically conformed to its
message as never to obtrude. This is not negatively the art that
knows how to conceal itself, but positively the art that is devoted
singly. True in the choice of words, it is expert in the telling emphasis
of sentences. Its reasoned balances suffice without the empty
iteration of English euphuism. They are played never for display,
always for point. The Latin period, welcome to the habit of
Macchiavelli’s mind, is rarely pushed to a conformity that would in
the vernacular have seemed artificial. Macchiavelli’s sentences are in
logic fifty years ahead of the French and the English; but they do not
force his own vernacular.
Chapter IX
ESSAYS

1. DISCUSSIONS ON POLITICS AND SOCIETY

Two Italian books of the early sixteenth century became so
famous as to be almost proverbial. Written about the same time,
Macchiavelli’s Principe (1513) and Castiglione’s Cortegiano (1514)
are complementary. Macchiavelli expounds princely policy in war and
in the truces between wars; Castiglione leads princely leisure into
culture. The policy and the culture are parts of the same Italian
world; but the two books are in sharpest contrast. Macchiavelli’s
facts are strictly analyzed; Castiglione’s are habitually idealized.
Macchiavelli’s style is stripped and so fused with the message as to
be inseparable; Castiglione’s is ample, manipulating the decorative
diffuseness of its time and its setting to elegance. Macchiavelli’s
economy is insistent, urgent; Castiglione’s is gracious, deliberate,
suggestive, rising to oratory. Both men used their thorough control
of Latin to shape their writing of Italian prose; but Macchiavelli was
applying rather such compression as that of Tacitus, Castiglione the
composition of Cicero.
It is Macchiavelli’s triumph that consideration of his doctrine has
quite submerged his style.

I have not adorned nor distended this book with ample cadences, nor
with precious or magnificent words or any other extrinsic charm or
ornament, such as many are wont to use for descriptive decoration; for I
have wished that nothing might win it praise, in other words that it should
be acceptable only for the truth of its matter and the gravity of its subject
(Dedication to Lorenzo).
Since my object is to write something useful to him who understands it,
I have thought it more fitting to follow rather the effectual truth of the
thing itself than its concept [immaginazione] (Opening of xv).

His name soon became a byword; for Englishmen and Frenchmen

found it easier to denounce Italian statecraft than to explain wherein
their own was different. Formulated for Italian despots, his doctrine
that the safety and independence of the state are paramount over
any consideration of justice or mercy became more and more
sinister in terms of the rising new national monarchs beyond his ken.
In the composition of the whole Macchiavelli was still young. He had
not yet achieved the sure control felt in his Istorie fiorentine.
Masterly already in expository analysis, eloquent in its close, the
Principe has not a compelling logical sequence.
In sequence and in detail the Cortegiano is more mature than
Macchiavelli’s Principe. Castiglione kept it by him ten years. The final
revision (Codex Laurentianus, Rome, 1524) was published at
Florence in 1528. All this care left the diction unpretentious.
Scholarly without pedantry, Castiglione even forestalls the Tuscans
by openly proclaiming his right to Lombard words. “I have written in
my own tongue, and as I speak, and to those who speak as I do.”
Thinking often of rhetoric, feeling the Latin period and attentive to
clausula, he applies his lore to Italian sentences without stiffness or
formality, happily reconciling gravity with ease. Encomium, inevitable
in his subject and his time, is oftener implied than dilated. The plan
of the dialogue is taken from Cicero’s De oratore. Reminiscence in
detail is negligible. Castiglione’s imitation is not the common
Renaissance borrowing of passages; it is the adaptation of Cicero’s
plan for presenting the typical Roman statesman to survey of the
typical Italian. Thus the dialogue is Ciceronian in proceeding logically
from point to point. Within the frame of Cicero the conduct of the
book expands the dialogue toward conversation. This is not dramatic
dialogue; nor is it imitation of the Platonic quest. Rather Castiglione’s
intention was to realize the human scene, to flavor the point with
the speaker; and his achievement in suggesting the gracious
interchange of the court of Urbino has been found quite as
significant as the conclusions of his debates.
For the Cortegiano is one of the few Renaissance books that have
endured the test of time. Details of place and time have been made
to carry so much larger human suggestion that it has been reprinted
again and again; it has been widely translated; it has today an
audience not only of special students, but of the many more who
love literature. Though the very term “courtier” is obsolete, though
the particular social function soon faded, the book endures. It is not
only the best of Renaissance dialogues; it is a classic.
The Utopia (1516) of Sir Thomas More, beginning as a dialogue on
certain social evils in England, passes to descriptive exposition of a
state organized and operated solely for the common weal. Though
the name Utopia means “nowhere,” this polity is described as the
actual experience of a returned traveler. The literary form is thus
reminiscent of Lucian, whom More ten years before had translated
with Erasmus. It is reminiscent also of Plato, of the travelers’ tales
popular in that age of discovery and explanation, and more faintly of
those distant or fortunate isles (îles lointaines) which had often been
posed as abodes of idealized communities. But though these hints
were doubtless intended, they are incidental. They fade as we read
on.
Unfortunately for More’s literary reputation, most of us read his
best-known book only in a pedestrian translation (Ralph Robinson,
1551; second edition, 1556). Keeping much of the vivacity of the
diction, this is quite unequal to More’s flexible Latin rhythms.[84] For
More, as for Poliziano and Leonardo Aretino, Erasmus and Buchanan,
Latin was a primary language. But whereas Erasmus had, so to
speak, no effective vernacular, More’s literary achievement in English
is both distinguished in itself and ahead of his time. In spite of some
uncertain ascriptions, we may be fairly sure that the English version
of his Richard III,[85] as well as the Latin, is his own.
Continued discussion of the prince and the state moved Sir
Thomas Elyot (1490?-1546) to make an English compilation for the
widening circle of readers, The Governour (1531, ed. H. S. Croft,
London, 1883, 2 vols.). “I have nowe enterprised,” he says in a
proem to Henry VIII, “to describe in our vulgare tunge the fourme of
a juste publike weale, whiche mater I have gathered as well of the
sayenges of moste noble autours (grekes and latynes) as by myne
owne experience.” But the “governour” and the “juste publike weale”
receive no consistent discussion.

The opening chapters, postulating order, proceed thence to honour

(i.e., rank), and so to one sovereign. Their review of history is very slight;
and from Chapter iv Book I is occupied rather with the education of a
gentleman. Book II is composed mainly of exempla to illustrate the virtues
appropriate to high position; and Book III adds little more than further
classified aggregation.

With no further design, without even a distinct idea, The

Governour has of course no logical progress. Lawyer and something
of a diplomat, Elyot was not a thinker. Reading widely without
discrimination, and sometimes apparently at second hand, he
compiled under headings. His later Bankette of Sapience (second
edition? 1542) is a collection of sententiae arranged alphabetically
under abstinence, adversity, affection, ambition, authoritie, amitie,
apparaile, almsdeede, accusation, arrogance, etc. His Governour,
though its headings have more logic, is hardly consecutive. In
sources as in topics the book is a miscellany.

I. vii, viii, for instance, on a gentlemanly, not a professional knowledge

of music, painting, and sculpture, suggest the Cortegiano; xii inquires
“why gentilmen in this present time be not equal in doctryne to the
auncient noblemen”; xiv proposes exempla for law students. After finding
England deficient in the fine arts (140), he returns to law students with a
recommendation of rhetoric, and thereupon itemizes it (149) under status,
inventio, etc. By the end of the book he has passed from prudence to
chess, archery, tennis, and bowls.

Elyot’s diction, though he wishes to “augment our Englysshe

tongue,” is Latinized sparingly. Copie in the sense of the Latin copia,
was fairly common in his time. He adds, e.g., allecte and allectyve,
coarted, fatigate, fucate, illecebrous, infuded, propise, and provecte.
His generally unpretentious habit is sometimes concretely racy.
Jean Bodin’s treatise on historical method (Methodus ad facilem
historiarum cognitionem, 1566),[86] giving high praise to
Guicciardini, differs from him in conception. For Bodin, history is less
a progress in time than a thesaurus of exempla.

Dividing it into human, natural, and divine, he would have us begin with
a chronological reference table (ii), proceed to a more detailed survey,
such as Funck’s or Melanchthon’s, advance to the histories of particular
nations, Jews, Greeks, Romans, and then to such smaller communities as
Rhodes, Venice, and Sicily, with constant attention to geography.
In iii, De locis historiarum recte instituendis, the topics are first the
commonplaces of encomium: birth, endowments, achievements, morals,
culture. From the family, which for Bodin is the starting point of history,
we are to proceed to the organization of the state and the developments
of the arts.
De historicorum delectu (iv) has many specific and acute estimates of
both ancients and moderns. “Somehow those who are active in wars and
affairs (44) shy at writing; and those who have given themselves
somewhat more to literature are so possessed with its charms and
sweetness as hardly to think in other terms.” Bodin himself is broad
enough to praise both Plutarch and Tacitus.
De recto historiarum iudicio (v), beginning with geography, proceeds to
regional traits. The approach is suggestive; but the development is little
more than aggregation under those dubious headings Northern and
Southern, Eastern and Western.
At this point (vi) Bodin begins the analysis of the state: the elemental
family, the citizen, the magistrate, the king. “Macchiavelli, indeed, the first
after some twelve hundred years since the barbarians to write on the
state, has won general currency; but there is no doubt that he would have
written several things more truly and better if he had added legal tradition
(usus) to his knowledge of ancient philosophers and historians” (140).
Monarchy is found to be the ideal form of government. The golden age of
primitive peace and happiness is proved to be a senile fancy (vii). Let us
rather, relying on the science of numbers, De temporis universi ratione
(viii), compute the recurrence of historical “cycles.” Strange conclusion to
so much hard reasoning!

Systematically analytical, the book is easier to consult than to

follow; but its Latin style is of that sincere, capable, unpretentious
sort which had been established for history by the Italians. The
political ideas of the Methodus are carried out by the same
systematic analysis in Bodin’s second book, Les Six Livres de la
république, 1576.[87] Greek and Latin political usage is made by a
long wall of citations to support, with other proofs from history, the
theory of absolute monarchy.
Such support of the new monarchies by a reasoned theory based
on ancient history did not pass unchallenged. George Buchanan,
with more literary competence in Latin, though with less knowledge
of politics, offered for his little Scotland a theory of monarchy
answerable to the people (De jure regni apud Scotos dialogus,
1579).[88] The preface, addressed to James VI, keeps a tutorial
tone, as of one still laying down the law. The occasion put forth for
the Ciceronian dialogue is French reprobation of Scotch politics. How
shall this be met? The method is evident from the first three points.

To distinguish a king from a tyrant, we must remember that society is

founded not only on utility, but on natural law implanted by God. A king is
typically shepherd, leader, governor, physician, created not for his own
ends, but for the welfare of his people (1-6).
Kingship, being an ars based on prudentia, needs guidance by laws (8).
Objection: who would be king on these terms? Answer: ancient history
and doctrine show motives higher than lust for power and wealth (9).
These two points being iterated in summary for transition, the third is
the need not only of laws, but of a council (11-14).

The many exempla from ancient and modern history confirming or

challenging the a priori progress of the dialogue do not touch the
recent events that raised the question. Scotch history is used even
less specifically than ancient to confirm the theory of limited
monarchy. But though Buchanan does not prove that recent politics
were an application of his theory, he makes the theory itself
interesting and sometimes persuasive.
The Latin style has more liveliness, expertness, and range than
Bodin’s. But the argument, though urgent as well as scholastically
ingenious, remains unconvincing. After debating general
considerations inconclusively, it falls back at last on the particular
customs and needs of Scotland. These are not applied specifically
enough to be determining. The expertness of the dialogue is rather
literary than argumentative.
Brought down to the market place by printing, controversy by the
end of the century was learning the ways of journalism in
pamphlets. Meantime printing had opened such compilation as
Elyot’s, samples of learning for those eager readers who had not
gone to school with the Latin manuals of Erasmus.
The best of these sixteenth-century discussions, the piercing
urgency of Macchiavelli, the charming exposition of Castiglione, the
philosophical survey of More, the systematic analysis of Bodin, the
hot attack of Buchanan, are all essays in that modern sense of the
word which applies it to consecutive exposition involving argument.
They show essay-writing of this kind—which was to move more
surely in the seventeenth century—already on a firm footing. They
recognize the Italian tradition of history in abjuring the decorative
dilation which was habitual in other fields. They show Latin and
vernacular side by side, and vernacular prose gaining point and
finish from the Latin commanded by all their writers. They are a
solid literary achievement of the Renaissance.

2. MONTAIGNE
The other kind of essay, the literary form that has kept the original
meaning of attempt, sketch, experiment, had its pace set late in the
sixteenth century by Montaigne. Nothing could be farther removed
than his habit from tidy system or consecutive argument. Devoted to
the reading of history, and eager to share its profits, he had no mind
to follow the Italian tradition of writing history. Essai in his practice is
not the settling of a subject, but the trying. He makes one approach,
then another, suggesting relations that he does not carry out. With
many exempla he invites us to accumulate philosophy of living. If we
do not coöperate, if we do not think them over, his essays remain
collections of items in memorable phrase, without compelling
sequence of ideas. For Montaigne is not the kind of philosopher who
integrates a system; he is a sage. He has the sage’s oral habit. No
writing conveys more the impression of thinking aloud. Again and
again he writes as if making up his mind, not before utterance, but
by the very process of utterance. Macchiavelli, or Bodin, having
made up his mind fully and finally, tries to convince us; Montaigne,
as if making up his in our company, throws out suggestions.
True, some few of his essays are more consecutive developments
of what he has concluded. His early and widely quoted Education of
Children (II. xxvi) has even some logical progress.
But logical sequence is not Montaigne’s habit. His many
revisions[89] show him leaning more and more on the aggregation of
separate suggestions. He changes words, he adds instances, but he
does not seek a stricter order.

But I am going off a little to the left of my theme.... I, who take more
pains with the weight and usefulness of my discourses than with their
order and sequence, need not fear to lodge here, a little off the track, a
fine story (II. xxvii).
This bundling of so many various pieces is made on condition that I put
hand to it only when urged by too lax a leisure, and only when I am at
home (II. xxxvii, opening).

His usual lack of sequence, then, is not careless. The careless

fumbling that comes from muddled thinking he ridicules.

They themselves do not yet know what they mean, and you see them
stammer in bringing it forth, and judge that their labor is not in childbirth,
but in conception, and that they are only licking what is not yet formed (I.
xxvi).

As to sequence he even catechizes himself.

Is it not making bricks without straw, or very like, to build books
without science and without art? The fantasies of music are conducted by
art, mine by chance.

And his answer is very earnest.

At least I have this from my course of study (discipline), that never a

man treated a subject that he understood and knew better than I do the
one that I have undertaken, and that in this subject I am the most
learned man alive; secondly, that no one ever penetrated farther into its
material, nor peeled more sedulously its parts and their consequences,
nor reached more precisely and fully the end that he had proposed for his
job. To accomplish this, I need bring no more than fidelity. That I have,
the most sincere and pure that is to be found (III. ii).

Montaigne’s method, then, is deliberate.[90] If he passes, as in

Des coches (III. vi), from examples of lavish display to the cruelty of
Spanish conquest in Mexico and frankly begins his last paragraph
with retumbons à nos coches, that is because he usually prefers to
take us on a journey around his idea. Hundreds of readers have
found the talk of such a guide on the way more winsome than the
conclusions of others after they have come home.
The art of growing an idea by successive additions sets the pace
also for his sentences. Knowing Latin, he tells us, as a native
language, and better than French, he puts aside Cicero for Seneca.
This is more than the rejection of Ciceronianism, more than
preference for Seneca’s philosophy; it is in detail the same
aggregative method that he uses for the composition of a whole
essay. That vernacular sentences were commonly more aggregative
than those of Augustan Latin may have been a reason for his
choosing the vernacular. At any rate, he keeps the two languages
quite apart. Instead of applying his Latin to the pointing of his
French sentences, he prefers to let them accumulate as in talk.

(1) They do still worse who keep the revelation of some intention of
hatred toward their neighbor for their last will,
(2) having hid it during their lives,
(3) and show that they care little for their own honor,
(4) irritating the offense by bringing it to mind,
(5) instead of bringing it to conscience,
(6) not knowing how, even in view of death, to let their grudge die,
(7) and extending its life beyond their own. (I. vii.)

The sentence might easily have been recast in a Latin period;

Montaigne prefers to let it reach its climax by accumulation.

(1) Nature has furnished us, as with feet for walking, so with foresight
to guide our lives,
(2) foresight not so ingenious, robust, and pretentious as the sort that
explores (invention),
(3) but as things come, easy, quiet, and healthful,
(4) and doing very well what other people say,
(5) in those who have the knack of using it simply and regularly,
(6) that is to say, naturally. (III. xiii.)

So his epigrams are comparatively few and simple. His many

memorable sayings are not paraded as sententiae.

It is not a soul, not a body, that we are educating; it is a man (I. xxvi).
Unable to regulate events, I regulate myself, and adjust myself to them
if they do not adjust themselves to me (II. xvii).
The teaching that could not reach their souls has stayed on their lips
(III. iii).
Between ourselves, two things have always seemed to me in singular
accord, supercelestial opinions and subterranean morals (III. xiii).

For Montaigne’s shrewd summaries prevail less often by balanced

sentences than by concrete diction.

I am seldom seized by these violent passions. My sensibility is naturally

dense; and I encrust and thicken it daily by discourse (I. ii).
Anybody’s job is worth sounding; a cowherd’s, a mason’s, a passer-by’s,
all should be turned to use, and each lend its wares; for everything comes
handy in the kitchen (I. xxvi).

Such sentences, such diction, are not only his practice; they are
part of his literary theory.

The speech that I like is simple and direct, the same on paper as on the
lips, speech succulent and prompt (nerveux), curt and compact, not so
much delicate and smoothed as vehement and brusque—Haec demum
sapiet dictio quae feriet—rather tough than tiresome, shunning
affectation, irregular, loose, and bold, each bit for itself, not pedantic, not
scholastic, not legal, but rather soldierly (I. xxvi).
The urgent metrical sentence of poetry seems to me to soar far more
suddenly and strike with a sharper shock [The figure is of a falcon] (I.
xxvi).
These good people (Vergil and Lucretius) had no need of keen and
subtle antitheses. Their diction is all full, and big with a natural and
constant force. They are all epigram, not only the tail, but the head, the
stomach, and the feet.... It is an eloquence not merely soft and faultless;
it is prompt and firm, not so much pleasing as filling and quickening the
strongest minds. When I see those brave forms of expression, so vivid, so
deep, I do not call it good speaking; I call it good thinking (III. v).[91]

So he cannot stomach that Renaissance imitation which ran to

borrowing, nor that display of Latin style for itself which published
even private letters.

Those indiscreet writers of our century who go sowing in their worthless

works whole passages from the ancients to honor themselves (I. xxvi).
But it surpasses all baseness of heart in persons of their rank that they
have sought to derive a principal part of their fame from chatter and
gossip, even to using the private letters written to their friends (I. xl).

So he is impatient with the unreality of romance.

Going to war only after having announced it, and often after having
assigned the hour and place of battle (I. v).
Those Lancelots, Amadis, Huons, and such clutter of books to amuse
children (I. xxvi).

Reviewing contemporary criticism of poetry, he says: “We have

more poets than judges and interpreters of poetry; it is easier to
make it than to know it” (I. xxxvii). “You may make a fool of yourself
anywhere else,” he warns, “but not in poetry” (II. xvii). So there is
no room for mediocre poetry.

Popular, purely natural poetry has simplicities and graces comparable

with the eminent beauty of poetry artistically perfect, as is evident in the
Gascon villanelles and in songs brought to us from illiterate peoples.
Mediocre poetry, which is neither the one nor the other, is disdained,
without honor or even esteem (I. liv).

Dismissing in a scornful phrase “the Spanish and Petrarchist fanciful

elevations” (II. x), he exactly estimates the Latin poets of his time as
“good artisans in that craft” (II. xvii). Perhaps a certain significance,
therefore, attaches to his repeating the current complacency with
regard to French poetry.

I think it has been raised to the highest degree it will ever attain; and in
those directions in which Ronsard and Du Bellay excel I find them hardly
below the ancient perfection (II. xvii).

Elsewhere, and habitually, Montaigne’s attitude toward the classics

was quite different from the habit of the Renaissance. He sought not
so much the Augustans as Seneca and the Plutarch of Amyot.

Je n’ay dressé commerce avec aucun livre solide sinon Plutarque et

Seneque, où je puyse comme les Danaides, remplissant et versant sans
cesse (I. xxvi).

These, and even Cicero and Vergil, he sought not for style, but for
philosophy and morals. That sounder classicism of composition
which, through the Italian tradition of history, had animated
Renaissance essayists of the stricter sort he put aside. He was not
interested in the ancient rhetoric of composition, nor, to judge from
his slight attention to it, in that field of ancient poetic. He quotes
both Dante and Tasso, but not in that aspect. He is not interested in
the growing appreciation of Aristotle’s Poetic. In this disregard of
composition, indeed, he was of the Renaissance; but he rejected and
even repudiated Renaissance pursuit of classicism in style. There he
adopted the sound doctrine of Quintilian and scornfully, to use his
own word, abjured borrowed plumes and decorative dilation. If we
use the word classical in its typical Renaissance connotation, we
must call Montaigne, as well as Rabelais, anti-classical. Unlike as
they are otherwise, they agree in satirizing Renaissance classicism.
The positive aspect of this rejection is Montaigne’s homely
concreteness. Trying to teach his readers, not to dazzle them, he is
very carefully specific. To leave no doubt of his meaning, he will
have it not merely accepted, but felt. Therefore he is more than
specific; he is concrete. Imagery for him is not mythology; it is of
native vintage.
“In this last scene between death and us there is no more
pretending. We have to speak French; we have to show how much
that is good and clean is left at the bottom of the pot” (I. xix). Such
expression strikes us not as wit, not as an aristocrat’s catering to the
new public, but as the sincere use of sensory terms to animate
ideas. If it reminds us sometimes of popular preaching, that is
because Montaigne was a sage.
FOOTNOTES
[1] In H. Chamard, Les Origines de la poésie française de la
Renaissance (Paris, 1920), p. 256.
[2] Bembo, Prose, II. xxi (Venice, 1525).
[3] Allen, Age of Erasmus, p. 121.
[4] É. Egger, L’Hellénisme en France (Paris, 1869), pp. 358-359;
see Monnier, II, 134 for modern estimate of Renaissance Greek
texts.
[5] Prose, I, vi (1525).
[6] Egger, p. 398.
[7] Ibid., p. 205.
[8] Edition of Osgood, pp. 119, 193.
[9] Probably the source of Rabelais’s Abbey of Thelème. He had
read the book.
[10] Page references to 1596 edition.
[11] Edited by Louis Humbert, Paris, 1914.
[12] Sir John Cheke, however, spoke as a scholar when he
wrote to Hoby: “I am of opinion that our own tung shold be
written cleane and pure, vnmixt and vnmangeled with borrowing
of other tunges.” Quoted in Arber’s Introduction to Ascham’s
Scholemaster, p. 5.
[13] Parodied by Orationes obscurorum virorum (before 1515),
which was part of the Reuchlin-Pfefferkorn controversy.
[14] This is the exercise called by the ancients declamatio. See
ARP (Ancient Rhetoric and Poetic) and a letter of Erasmus, May 1,
1506.
[15] Bartholomaei Riccii De imitatione libri tres (Venice, 1545),
folio 38 verso. See below, Chapter III, Sect. 3.
[16] MRP (Medieval Rhetoric and Poetic) I and II.
[17] Ep. 221 in Migne’s Patrologia latina (Vol. 199, p. 247),
which dates it 1167; Ep. 223, p. 389, in the collection of the
letters of Gerbert, John of Salisbury, and Stephen of Tournay
printed by Ruette (Paris, 1611). The letter is translated MRP 209.
[18] Apologia dei dialoghi, opening; p. 516 of the Venice, 1596,
edition.
[19] For De oratore, see ARP.
[20] Minturno, Arte poetica, is mere catechism. Perionius hardly
achieves dialogue at all; his interlocutors merely interrupt.
[21] Analecta hymnica.
[22] For the pattern of the classical rhetoric, see ARP.
[23] MRP.
[24] Paul Spaak, Jean Lemaire (Paris, 1926).
[25] Pierre Villey, Les Grands Écrivains du xviᵉ siècle, I, 83-97,
110-148.
[26] Evvres de Louize Labé, Lionnoize, revues et corrigées par
la dite dame, à Lion, par Jean de Tournes, MDLVI (dedicatory
epistle dated 1555).
[27] Each stanza of the Epithalamion ends with a longer line (6
beats), which is the common refrain. The other lines have
generally five beats, but the sixth and eleventh have only three;
and this variation is occasionally extended. Generally there is a
rhyme-shift after the eleventh line, but not a break (11 lines on 5
rhymes [or 4] plus 7 lines on 3 rhymes [or 4]). A few stanzas are
lengthened to nineteen lines (11 plus 8). Thus the typical
variations in this triumph of metrical interweaving are as follows,
the underlined letters indicating the lines of three beats:

Stanza I ababccbcbdd effeegg

/
II ababccdcdff/ g h h g g h
h
IV ababccdcdee fgghhii
/
III. & a b a b c c d c d e e f g g f h h i i (19
VIII / lines
)

[28] Œuvres complètes de P. de Ronsard, ed. par Paul

Laumonier (Paris, 1914-1919), I, 316.
[29] London, Wynkyn de Worde, 1515.
[30] For Petrarch and Boccaccio, see Carrara, La poesia
pastorale, pp. 88-111.
[31] Edited by M. Scherillo (Torino, 1888).
[32] Written 1573; published 1580; edited by Angelo Solerti
(Torino, 1901).
[33] Le Premier Livre d’Amadis de Gaule, publié sur l’édition
originale par Hugues Vaganay (Paris, 1918), 2 vols.
[34] For Alamanni, see Henri Hauvette, Un Exilé florentin ...
Luigi Alamanni (Paris, 1903).
[35] Edited by G. B. Weston (Bari, 2 vols.).
[36] So I. iii. 31, 51; v. 13, 56; vi. 54; ix. 36; xi. 46; and
throughout the poem.
[37] I. xxii is fabliau; and so, in various degrees, the stories
inserted at I. vi. 22, xiii. 29, xxix. 3; II. i. 22, xiii. 9, xxvi. 22; III.
ii. 47.
[38] E. Donadone, Torquato Tasso (Venice, 1928).
[39] The stanzas are adapted by Spenser, FQ, Book II. xii. 74-
75.
[40] Diocletian-giants-Brutus-Hogh-Gormet-Hercules, II. x. 7;
Tristan-nymphs-Latona’s son, VI. ii. 25.
[41] Chapter VII.
[42] For Seneca, see ARP.
[43] Opera, II, 2.
[44] “Acta fuit Burdegalae Anno MDXLIII” in the colophon can
hardly mean merely that the play was finished in that year.
[45] On tragicomedy, see H. C. Lancaster, The French
Tragicomedy, Its Origins and Development from 1552 to 1628
(Baltimore, 1907).
[46] For Garnier in England, see A. M. Witherspoon, The
Influence of Robert Garnier on Elizabethan Drama (New Haven,
1924).
[47] For Plautus and Terence, see ARP.
[48] “Politian was in 1471, at the request of Cardinal Francesco
Gonzaga, despatched to Mantua by Lorenzo de’ Medici to prepare
an entertainment for the reception of Duke Galeazzo Maria
Sforza. The Orfeo, a lyric pastoral in dramatic form, prophetic of
so much that was later to come, was the contribution of the
brilliant humanist and poet to the Duke’s entertainment. It stands
close to the fountainhead of European secular drama.” H. M.
Ayres, preface to his translation of the Orfeo in Romanic Review,
XX (January, 1929), 1.
[49] See Chapter IV.
[50] Alfred Mortier, Un Dramaturge populaire ... Ruzzante.
Œuvres compl. traduites pour la première fois (Paris, 1926).
[51] ARP and MRP.
[52] For Aristotle’s Poetic, see ARP.
[53] For discussion of the romances, see Chapter V. For Giraldi’s
novelle, see Chapter VIII, 1, c.
[54] ARP.
[55] For Hermogenes, see MRP, pp. 23 ff.
[56] References are to the second edition of 1581. See also F.
M. Padelford, Select Translations from Scaliger’s Poetics (New
York, 1905).
[57] See above, Du Bellay, Chapter II, pp. 3, 6.
[58] See H. B. Charlton, Castelvetro’s Theory of Poetry
(Manchester, 1913).
[59] Lodge’s feebler Defence of Poetry (1579) has little other
interest than the historical, i.e., as a reply to Gosson’s attack on
the stage.
[60] In Smith’s reprint shortened by summary.
[61] Gregory Smith, II, 327-355.
[62] Gregory Smith, II, 356-384; Arthur Colby Sprague, Samuel
Daniel, Poems and a Defence of Ryme (Cambridge, Harvard
University Press, 1930).
[63] Patrizzi’s refutation of Tasso, 68, 116, 144/5, 173, 175.
[64] Nevertheless two of his references (V. 116; VI. 125)
suggest, perhaps without his intention, a relation between Plato’s
Symposium and Aristotle’s idea of creative imitation.
[65] Pellissier’s long introduction and valuable notes, though
they need a few corrections by later studies, remain one of the
most important surveys of the French development of poetic in
the sixteenth century.
[66] But Vauquelin with Tasso bids poets leave pagan myth for
Christian themes, though perhaps he refers only to subject; and
he recognizes the place of Montemayor’s Diana among pastorals.
[67] For Aristotle’s imitation, see ARP, pages 139 ff.
[68] D. L. Clark, Rhetoric and Poetry in the Renaissance (New
York, 1922).
[69] Cf. in Chapter VII Giraldi’s theory of the romance.
[70] This is inferred from a commendatory letter of Bartolomeo
Cavalcanti prefixed to this fourth (1580) edition.
[71] For editions and translations, see Louis Berthé de
Besaucèle, J.-B. Giraldi (thesis at the University of Aix-en-
Provence, Paris, 1920), pp. 109, 255, 258; for the French
translator, Gabriel Chappuys, see p. 261.
[72] See above, p. 198.
[73] For the Gorgian figures, see MRP and Croll’s introduction
to his edition of Euphues.
[74] Samuel Lee Wolff, The Greek Romances in Elizabethan
Prose Fiction (New York, Columbia University Press, 1912).
[75] Op. cit., pp. 173 seq. The quotation is at p. 177.
[76] H. Brown, Rabelais in English Literature (Harvard Press,
1933), p. 19.
[77] Cf. Budé, Chapter I, for Renaissance complacency.
[78] Above, Chapter II.
[79] J. Plattard, François Rabelais (Paris, 1932), p. 194.
[80] Ibid., p. 140.
[81] Ibid., pp. 115 seq.
[82] Plattard, p. 117.
[83] For narratio, see ARP.
[84] In the prefatory epistle to Petrus Aegidius about two-thirds
of the first hundred clauses conform to the cursus of the curial
dictamen (MRP). These clauses compose about twenty sentences
ending: planus, 6 (30%); tardus, 2 (10%); velox, 7 (35%);
unconformed, 5 (25%). Inconclusive, this may be worth further
study.
[85] See above, Chapter VIII.
[86] Citations are from Jacobus Stoer’s edition of 1595.
[87] The fourth edition, cited here, by Gabriel Cartier, 1599.
Meantime Bodin had published in 1586 a revised edition in Latin,
De re publica libri vi.
[88] Edition cited Edinburgh (Freebairn), 1715, Opera omnia,
ed. Thomas Ruddiman, Vol. I.
[89] See F. Strowski, Montaigne (Paris, 1931).
[90] “Qu’il n’est rien si contraire à mon style qu’une narration
estendue (i.e., narratio, sustained exposition); je me recouppe si
souvent à fault d’haleine; je n’ay ni composition ny explication qui
vaille” (I. xxi).
[91] This is the doctrine of Quintilian, whom he quotes. ARP.
INDEX

Abraham and Isaac, 136

Accademia della Crusca, 30
Achilleis (Statius), 185
Achilles Tatius, 79, 87, 188
Acciajuoli, Donato, 214
Actores octo, see Auctores octo
Adrian, Cardinal Corneto, 24
Aeneid, 102, 104, 114, 121, 126, 127, 165, 185
Aeschines, 26
Alain de Lille, 9, 10
Alamanni, Luigi, 13, 67, 96-98
Alberti, Leone Battista, 27
Aldus Manutius, 9, 20
Alexandrian literature, 7, 78, 90
Alexandrianism, 123, 188
Allegory, 123, 130, 133
Alliteration, 87, 129, 199
Alunno, Francesco, 31, 37
Amadis of Gaul, 95
Amants fortunés, Les, 194
Ameto (Boccaccio), 82, 84
Aminta (Tasso), 87, 147, 153
Amyot, Jacques, 23, 90, 238
Anacreon, 21, 79
Annales d’Aquitaine (Bouchet), 134
Anthology, 7, 21, 79, 104
Antigone (Garnier), 142
Antigone (Sophocles), 142
Aphthonius, 21
Apollonius Rhodius, 21, 79, 188
Apuleius, 93, 179, 188
Aquila, Serafino d’, 69
Aquinas, 8, 9
Arcadia (Sannazaro), 83-87
Arcadia (Sidney), 90, 202
Aretino, Leonardo, see Bruni, Leonardo
Ariosto, 9, 11, 24, 30, 67, 91, 96, 101, 104, 111-23, 131, 142,
143, 147, 159, 160, 164, 168, 177, 179, 184, 205
Aristotle, 15, 20, 25, 26, 53, 55, 61, 62, 63, 133, 145, 158, 159,
164, 165, 166, 167, 168, 169, 172, 175, 176, 177, 178,
179, 184, 185, 186, 187, 189, 238
“Ars poetica” (Horace), 10, 15, 133, 155, 158, 161, 163, 164,
165, 171, 186, 188
Arte of English Poesie (Campion), 183
Arte of English Poesie (Puttenham), 182
Arte poetica (Minturno), 44n, 168
Arthurian cycle, 96, 98
Art of rhetorique, The (Wilson), 62
Art poétique, L’ (Peletier), 163-64
Art poétique, L’ (Ronsard), 175
Art poétique, L’ (Vauquelin de la Fresnaye), 186
Arts poétiques, 15, 127
Ascham, Roger, 37, 38n, 183
Astrophel and Stella (Sidney), 77
As You Like It (Shakspere), 146
Athenaeus, 21
Auctores octo, 10, 81
Augustan Latin, 18, 44, 58, 65, 234
Aulus Gellius, 10, 188
Aurispa, 20
Ausonius, 10, 79, 188
Avarchide (Alamanni), 96
Ayres, H. M., quoted, 147n

Baif, 34
Balade, 10
Balsamo-Crivelli, Gustavo, 190
Bandello, Matteo, 190-94, 197, 198-99
Bankette of Sapience (Elyot), 227
Baptistes (Buchanan), 139
Basia (Secundus), 66
Bede, 9
Bellay, Cardinal du, 211
Bellay, Joachim du, 10, 32, 34, 69, 163, 238
Belleau, Remi, 21
Belleforest, 198
Bembo, 9, 15, 22, 26, 28, 30, 31, 36, 87, 170, 179, 188
Beolco, Angelo, see Ruzzante
Berni, 111
Besaucèle, 195n
Bessarion, Johannes, 20, 39
Beza, 179
Bibbiena, Cardinal, 179
Blason, 118
Boccaccio, 5, 6, 13, 14, 15, 20, 23, 27, 28, 29, 31, 32, 60, 67,
82, 84, 87, 104, 121, 153, 170, 177, 179, 185, 186, 194-96
Bodin, 228, 229, 230, 230n, 231, 232
Boece, Hector, 216
Boethius, 9, 25, 37, 85
Boiardo, Matteo Maria, 11, 91, 92, 93, 94, 101-11, 120, 124,
125, 127, 159
Bouchet, Jean, 134
Boulanger, André, 163
Bradamante (Garnier), 142
Brocardo, Jacopo, 63
Brome Abraham and Isaac, 136
Brown, H., 202n
Browne, Sir Thomas, 52
Bruni, Leonardo, 39, 214-17, 226
Buchanan, George, 17, 137-39, 179, 216, 226, 230, 231
Bucolics (Vergil), 80, 165
Bucolicum carmen (Petrarch), 82
Budé, Guillaume, 4, 22, 34, 207n
Bundy, M. W., 162

Caesar, 18, 216

Caesarius, Joannes, 55
Callimachus, 72, 79
Calpurnius, 87
Camillo, Giulio, 50, 55, 56, 170, 171, 188
Campion, Thomas, 183
Capranica, Cardinal, 39
Carde of Fancie (Greene), 201
Carolingian cycle, 100, 101, 124
Cartier, Gabriel, 230n
Castelvetro, 176, 185, 189
Castiglione, Baldassare, 12, 30, 43, 53, 67, 114, 177, 223-25,
231
Cathonet, 10, 81
Cato, 58, 81
Catullus, 57, 58, 65, 66, 72, 86, 87, 170
Cavalcanti, Bartolomeo, 63, 195n
Caviceo, Jacopo, 39, 40
Caxton, William, 37, 96, 98, 130
Certaine notes of instruction concerning ... verse or rime ...
(Gascoigne), 180
Cervantes, 132, 213
Chamard, H., 8n
Champier, Symphorien, 93
Chansons de geste, 124
Chappuys, Gabriel, 195n
Charlton, H. B., 176n
Châteillon, Sébastien, 52
Châtelaine de Vergi, 194
Chaucer, 5, 13, 14, 15, 24, 37, 67, 70, 89, 96, 108, 121, 122,
153, 180, 181, 183, 198
Cheke, Sir John, 38n
Chemnicensis, 181
Chivalry, 11, 92, 124, 130, 131
Chronicles, 214
Chrysolaras, Manuel, 20
Cicero, 10, 15, 18, 26, 28, 30, 40, 41, 43-45, 53, 62, 63, 163,
164, 169, 175, 178, 188, 202, 214, 216, 223, 225, 230,
234, 238
Ciceronianism, 44, 45, 48, 49, 51, 53, 57
Cicero thesaurus, 46, 49
Cinthio, Giraldi, 29, 158-61, 188
Clark, D. L., 189n
Classicism, 3, 7, 9, 39, 45, 79, 187, 239
Claudian, 10, 72, 79, 87, 160, 174, 188
Colonna, Francesco, 25, 202, 209
Comedy, 133, 134, 146-54
Comedy of Errors (Shakspere), 147
Comes, Natalis, see Conti, Natale
Compleint d’amour, 118
Concorde des deux langages (Lemaire), 31, 69
Conflictus, 43
Conti, Natale, 23
Cook, Albert S., 178
Copia (Erasmus), 54
Corneille, 53, 145, 154
Cornélie (Garnier), 141
Corpus Christi cycles, 135
Cortegiano (Castiglione), 12, 43, 53, 223, 224
Cortesi, Paolo, 41, 48
Court shows, medieval, 133, 147
Croft, H. S., 227
Cursus, 42
Cuspidius, 211
Cyclical romances, 98, 100, 101, 119
Cyropaedia (Xenophon), 93

Daniel, Samuel, 183

Dante, 5, 8, 13, 14, 27, 28, 29, 30, 31, 32, 60, 66, 82, 87, 110,
114, 119, 132, 158, 162, 168, 169, 177, 179, 238
Daphnis and Chloe (Longus), 23, 79, 90
Dati, Agostino, 39
De arte poetica (Vida), 155-58
Débat, 42
Débat de Folie et d’Amour (Labé), 71
De bello italico adversos gothos gesto historia (Bruni), 215
Decameron (Boccaccio), 14, 60, 194, 195
De causis corruptarum artium (Vives), 54
De contemptu mundi, 10
De elegantiae linguae latinae, 8, 19
Defence of Ryme (Daniel), 183
Defense of Poesy (Sidney), 89, 178-80
Defense of Poetry (Lodge), 179n
Deffense et illustration de la langue française (Du Bellay), 32
De imitatione (Ricci), 50, 57
De inventione (Cicero), 10, 45
De jure regni apud Scotos dialogus (Buchanan), 230
De laudibus D. Eusebii (Dati), 39
De linguae gallicae origine (Périon), 33
Delivery, 54
Deloney, Thomas, 202
Demosthenes, 21, 26, 28, 63
Denores, Jason, 185
De oratore (Cicero), 15, 43, 44, 45, 53, 225
De poeta (Minturno), 13, 15, 164-67;
excerpt, 4
De ratione dicendi (Vives), 54
De re publica (Bodin), 230n
De senectute (Cicero), 208
De studio literarum (Budé), 4
De Tournes, 13
De tradendis disciplinis (Vives), 54
Dialogo delle lingue (Speroni), 26
Dialogues, 6, 39, 42, 225
Dialogus Ciceronianus (Erasmus), 49
Diana (Montemayor), 186n
Dictamen, 42
Diomedes, 10
Dionysius of Halicarnassus, 158
Discorsi dell’ arte poetica ... (Tasso), 176-78
Discorsi ... intorno al comporre de i romanzi, delle commedie, e
delle tragedie (Cinthio), 158-61;
excerpt, 29
Discourse of English Poetrie (Webbe), 180
Discussions of Tuscan (Tomitano), 59
Discussions on politics and society, 223-32
Dispositio, 54, 60, 64, 176
Divina Commedia (Dante), 5, 132, 158, 168
Dolce, Lodovico, 30
Donadone, E., 125
Donati, Edouardo, 180
Donatus, 10
Don Quixote (Cervantes), 132, 213
Welcome to our website – the ideal destination for book lovers and
knowledge seekers. With a mission to inspire endlessly, we offer a
vast collection of books, ranging from classic literary works to
specialized publications, self-development books, and children's
literature. Each book is a new journey of discovery, expanding
knowledge and enriching the soul of the reade

Our website is not just a platform for buying books, but a bridge
connecting readers to the timeless values of culture and wisdom. With
an elegant, user-friendly interface and an intelligent search system,
we are committed to providing a quick and convenient shopping
experience. Additionally, our special promotions and home delivery
services ensure that you save time and fully enjoy the joy of reading.