An Introduction to Parallel
Programming

SECOND EDITION

Peter S. Pacheco
University of San Francisco

Matthew Malensek
University of San Francisco
Table of Contents

Cover image

Title page

Copyright

Dedication

Preface

Chapter 1: Why parallel computing

1.1. Why we need ever-increasing performance

1.2. Why we're building parallel systems

1.3. Why we need to write parallel programs

1.4. How do we write parallel programs?

1.5. What we'll be doing

1.6. Concurrent, parallel, distributed

1.7. The rest of the book


1.8. A word of warning

1.9. Typographical conventions

1.10. Summary

1.11. Exercises

Bibliography

Chapter 2: Parallel hardware and parallel software

2.1. Some background

2.2. Modifications to the von Neumann model

2.3. Parallel hardware

2.4. Parallel software

2.5. Input and output

2.6. Performance

2.7. Parallel program design

2.8. Writing and running parallel programs

2.9. Assumptions

2.10. Summary

2.11. Exercises

Bibliography

Chapter 3: Distributed memory programming with MPI


3.1. Getting started

3.2. The trapezoidal rule in MPI

3.3. Dealing with I/O

3.4. Collective communication

3.5. MPI-derived datatypes

3.6. Performance evaluation of MPI programs

3.7. A parallel sorting algorithm

3.8. Summary

3.9. Exercises

3.10. Programming assignments

Bibliography

Chapter 4: Shared-memory programming with Pthreads

4.1. Processes, threads, and Pthreads

4.2. Hello, world

4.3. Matrix-vector multiplication

4.4. Critical sections

4.5. Busy-waiting

4.6. Mutexes

4.7. Producer–consumer synchronization and semaphores

4.8. Barriers and condition variables


4.9. Read-write locks

4.10. Caches, cache-coherence, and false sharing

4.11. Thread-safety

4.12. Summary

4.13. Exercises

4.14. Programming assignments

Bibliography

Chapter 5: Shared-memory programming with OpenMP

5.1. Getting started

5.2. The trapezoidal rule

5.3. Scope of variables

5.4. The reduction clause

5.5. The parallel for directive

5.6. More about loops in OpenMP: sorting

5.7. Scheduling loops

5.8. Producers and consumers

5.9. Caches, cache coherence, and false sharing

5.10. Tasking

5.11. Thread-safety

5.12. Summary
5.13. Exercises

5.14. Programming assignments

Bibliography

Chapter 6: GPU programming with CUDA

6.1. GPUs and GPGPU

6.2. GPU architectures

6.3. Heterogeneous computing

6.4. CUDA hello

6.5. A closer look

6.6. Threads, blocks, and grids

6.7. Nvidia compute capabilities and device architectures

6.8. Vector addition

6.9. Returning results from CUDA kernels

6.10. CUDA trapezoidal rule I

6.11. CUDA trapezoidal rule II: improving performance

6.12. Implementation of trapezoidal rule with warpSize thread


blocks

6.13. CUDA trapezoidal rule III: blocks with more than one warp

6.14. Bitonic sort

6.15. Summary
6.16. Exercises

6.17. Programming assignments

Bibliography

Chapter 7: Parallel program development

7.1. Two n-body solvers

7.2. Sample sort

7.3. A word of caution

7.4. Which API?

7.5. Summary

7.6. Exercises

7.7. Programming assignments

Bibliography

Chapter 8: Where to go from here

Bibliography

Bibliography

Bibliography

Index
Copyright
Morgan Kaufmann is an imprint of Elsevier
50 Hampshire Street, 5th Floor, Cambridge, MA 02139,
United States

Copyright © 2022 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or


transmitted in any form or by any means, electronic or
mechanical, including photocopying, recording, or any
information storage and retrieval system, without
permission in writing from the publisher. Details on how to
seek permission, further information about the Publisher's
permissions policies and our arrangements with
organizations such as the Copyright Clearance Center and
the Copyright Licensing Agency, can be found at our
website: www.elsevier.com/permissions.

This book and the individual contributions contained in it


are protected under copyright by the Publisher (other than
as may be noted herein).
Cover art: “seven notations,” nickel/silver etched plates,
acrylic on wood structure, copyright © Holly Cohn

Notices
Knowledge and best practice in this field are constantly
changing. As new research and experience broaden our
understanding, changes in research methods,
professional practices, or medical treatment may become
necessary.
Practitioners and researchers must always rely on their
own experience and knowledge in evaluating and using
any information, methods, compounds, or experiments
described herein. In using such information or methods
they should be mindful of their own safety and the safety
of others, including parties for whom they have a
professional responsibility.

To the fullest extent of the law, neither the Publisher nor


the authors, contributors, or editors, assume any liability
for any injury and/or damage to persons or property as a
matter of products liability, negligence or otherwise, or
from any use or operation of any methods, products,
instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data


A catalog record for this book is available from the Library
of Congress

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the
British Library

ISBN: 978-0-12-804605-0

For information on all Morgan Kaufmann publications


visit our website at https://www.elsevier.com/books-and-
journals

Publisher: Katey Birtcher


Acquisitions Editor: Stephen Merken
Content Development Manager: Meghan Andress
Publishing Services Manager: Shereen Jameel
Production Project Manager: Rukmani Krishnan
Designer: Victoria Pearson

Typeset by VTeX
Printed in the United States of America

Last digit is the print number: 9 8 7 6 5 4 3 2 1


Dedication

To the memory of Robert S. Miller


Preface
Parallel hardware has been ubiquitous for some time
now: it's difficult to find a laptop, desktop, or server that
doesn't use a multicore processor. Cluster computing is
nearly as common today as high-powered workstations
were in the 1990s, and cloud computing is making
distributed-memory systems as accessible as desktops. In
spite of this, most computer science majors graduate with
little or no experience in parallel programming. Many
colleges and universities offer upper-division elective
courses in parallel computing, but since most computer
science majors have to take a large number of required
courses, many graduate without ever writing a
multithreaded or multiprocess program.
It seems clear that this state of affairs needs to change.
Whereas many programs can obtain satisfactory
performance on a single core, computer scientists should
be made aware of the potentially vast performance
improvements that can be obtained with parallelism, and
they should be able to exploit this potential when the need
arises.
An Introduction to Parallel Programming was written to
partially address this problem. It provides an introduction
to writing parallel programs using MPI, Pthreads, OpenMP,
and CUDA, four of the most widely used APIs for parallel
programming. The intended audience is students and
professionals who need to write parallel programs. The
prerequisites are minimal: a college-level course in
mathematics and the ability to write serial programs in C.
The prerequisites are minimal, because we believe that
students should be able to start programming parallel
systems as early as possible. At the University of San
Francisco, computer science students can fulfill a
requirement for the major by taking a course on which this
text is based immediately after taking the “Introduction to
Computer Science I” course that most majors take in the
first semester of their freshman year. It has been our
experience that there really is no reason for students to
defer writing parallel programs until their junior or senior
year. To the contrary, the course is popular, and students
have found that using concurrency in other courses is much
easier after having taken this course.
If second-semester freshmen can learn to write parallel
programs by taking a class, then motivated computing
professionals should be able to learn to write parallel
programs through self-study. We hope this book will prove
to be a useful resource for them.
The Second Edition
It has been nearly ten years since the first edition of
An Introduction to Parallel Programming was published.
During that time much has changed in the world of parallel
programming, but, perhaps surprisingly, much also remains
the same. Our intent in writing this second edition has been
to preserve the material from the first edition that
continues to be generally useful, but also to add new
material where we felt it was needed.
The most obvious addition is the inclusion of a new
chapter on CUDA programming. When the first edition was
published, CUDA was still very new. It was already clear
that the use of GPUs in high-performance computing would
become very widespread, but at that time we felt that
GPGPU wasn't readily accessible to programmers with
relatively little experience. In the last ten years, that has
clearly changed. Of course, CUDA is not a standard, and
features are added, modified, and deleted with great
rapidity. As a consequence, authors who use CUDA must
present a subject that changes much faster than a
standard, such as MPI, Pthreads, or OpenMP. In spite of
this, we hope that our presentation of CUDA will continue
to be useful for some time.
Another big change is that Matthew Malensek has come
onboard as a coauthor. Matthew is a relatively new
colleague at the University of San Francisco, but he has
extensive experience with both the teaching and
application of parallel computing. His contributions have
greatly improved the second edition.
About This Book
As we noted earlier, the main purpose of the book is to
teach parallel programming in MPI, Pthreads, OpenMP, and
CUDA to an audience with a limited background in
computer science and no previous experience with
parallelism. We also wanted to make the book as flexible as
possible so that readers who have no interest in learning
one or two of the APIs can still read the remaining material
with little effort. Thus the chapters on the four APIs are
largely independent of each other: they can be read in any
order, and one or two of these chapters can be omitted.
This independence has some cost: it was necessary to
repeat some of the material in these chapters. Of course,
repeated material can be simply scanned or skipped.
On the other hand, readers with no prior experience with
parallel computing should read Chapter 1 first. This
chapter attempts to provide a relatively nontechnical
explanation of why parallel systems have come to dominate
the computer landscape. It also provides a short
introduction to parallel systems and parallel programming.
Chapter 2 provides technical background on computer
hardware and software. Chapters 3 to 6 provide
independent introductions to MPI, Pthreads, OpenMP, and
CUDA, respectively. Chapter 7 illustrates the development
of two different parallel programs using each of the four
APIs. Finally, Chapter 8 provides a few pointers to
additional information on parallel computing.
We use the C programming language for developing our
programs, because all four API's have C-language
interfaces, and, since C is such a small language, it is a
relatively easy language to learn—especially for C++ and
Java programmers, since they will already be familiar with
C's control structures.
Classroom Use
This text grew out of a lower-division undergraduate
course at the University of San Francisco. The course
fulfills a requirement for the computer science major, and it
also fulfills a prerequisite for the undergraduate operating
systems, architecture, and networking courses. The course
begins with a four-week introduction to C programming.
Since most of the students have already written Java
programs, the bulk of this introduction is devoted to the
use of pointers in C.1 The remainder of the course provides
introductions first to programming in MPI, then Pthreads
and/or OpenMP, and it finishes with material covering
CUDA.
We cover most of the material in Chapters 1, 3, 4, 5, and
6, and parts of the material in Chapters 2 and 7. The
background in Chapter 2 is introduced as the need arises.
For example, before discussing cache coherence issues in
OpenMP (Chapter 5), we cover the material on caches in
Chapter 2.
The coursework consists of weekly homework
assignments, five programming assignments, a couple of
midterms and a final exam. The homework assignments
usually involve writing a very short program or making a
small modification to an existing program. Their purpose is
to ensure that the students stay current with the
coursework, and to give the students hands-on experience
with ideas introduced in class. It seems likely that their
existence has been one of the principal reasons for the
course's success. Most of the exercises in the text are
suitable for these brief assignments.
The programming assignments are larger than the
programs written for homework, but we typically give the
students a good deal of guidance: we'll frequently include
pseudocode in the assignment and discuss some of the
more difficult aspects in class. This extra guidance is often
crucial: it's easy to give programming assignments that will
take far too long for the students to complete.
The results of the midterms and finals and the
enthusiastic reports of the professor who teaches operating
systems suggest that the course is actually very successful
in teaching students how to write parallel programs.
For more advanced courses in parallel computing, the
text and its online supporting materials can serve as a
supplement so that much of the material on the syntax and
semantics of the four APIs can be assigned as outside
reading.
The text can also be used as a supplement for project-
based courses and courses outside of computer science
that make use of parallel computation.
Support Materials
An online companion site for the book is located at
www.elsevier.com/books-and-journals/book-
companion/9780128046050. This site will include errata
and complete source for the longer programs we discuss in
the text. Additional material for instructors, including
downloadable figures and solutions to the exercises in the
book, can be downloaded from
https://educate.elsevier.com/9780128046050.
We would greatly appreciate readers' letting us know of
any errors they find. Please send email to
mmalensek@usfca.edu if you do find a mistake.
Acknowledgments
In the course of working on this book we've received
considerable help from many individuals. Among them we'd
like to thank the reviewers of the second edition, Steven
Frankel (Technion) and Il-Hyung Cho (Saginaw Valley State
University), who read and commented on draft versions of
the new CUDA chapter. We'd also like to thank the
reviewers who read and commented on the initial proposal
for the book: Fikret Ercal (Missouri University of Science
and Technology), Dan Harvey (Southern Oregon
University), Joel Hollingsworth (Elon University), Jens
Mache (Lewis and Clark College), Don McLaughlin (West
Virginia University), Manish Parashar (Rutgers University),
Charlie Peck (Earlham College), Stephen C. Renk (North
Central College), Rolfe Josef Sassenfeld (The University of
Texas at El Paso), Joseph Sloan (Wofford College), Michela
Taufer (University of Delaware), Pearl Wang (George Mason
University), Bob Weems (University of Texas at Arlington),
and Cheng-Zhong Xu (Wayne State University). We are also
deeply grateful to the following individuals for their
reviews of various chapters of the book: Duncan Buell
(University of South Carolina), Matthias Gobbert
(University of Maryland, Baltimore County), Krishna Kavi
(University of North Texas), Hong Lin (University of
Houston–Downtown), Kathy Liszka (University of Akron),
Leigh Little (The State University of New York), Xinlian Liu
(Hood College), Henry Tufo (University of Colorado at
Boulder), Andrew Sloss (Consultant Engineer, ARM), and
Gengbin Zheng (University of Illinois). Their comments and
suggestions have made the book immeasurably better. Of
course, we are solely responsible for remaining errors and
omissions.
Slides and the solutions manual for the first edition were
prepared by Kathy Liszka and Jinyoung Choi, respectively.
Thanks to both of them.
The staff at Elsevier has been very helpful throughout
this project. Nate McFadden helped with the development
of the text. Todd Green and Steve Merken were the
acquisitions editors. Meghan Andress was the content
development manager. Rukmani Krishnan was the
production editor. Victoria Pearson was the designer. They
did a great job, and we are very grateful to all of them.
Our colleagues in the computer science and mathematics
departments at USF have been extremely helpful during
our work on the book. Peter would like to single out Prof.
Gregory Benson for particular thanks: his understanding of
parallel computing—especially Pthreads and semaphores—
has been an invaluable resource. We're both very grateful
to our system administrators, Alexey Fedosov and Elias
Husary. They've patiently and efficiently dealt with all of
the “emergencies” that cropped up while we were working
on programs for the book. They've also done an amazing
job of providing us with the hardware we used to do all
program development and testing.
Peter would never have been able to finish the book
without the encouragement and moral support of his
friends Holly Cohn, John Dean, and Maria Grant. He will
always be very grateful for their help and their friendship.
He is especially grateful to Holly for allowing us to use her
work, seven notations, for the cover.
Matthew would like to thank his colleagues in the USF
Department of Computer Science, as well as Maya
Malensek and Doyel Sadhu, for their love and support.
Most of all, he would like to thank Peter Pacheco for being
a mentor and infallible source of advice and wisdom during
the formative years of his career in academia.
Our biggest debt is to our students. As always, they
showed us what was too easy and what was far too difficult.
They taught us how to teach parallel computing. Our
deepest thanks to all of them.
1 Interestingly, a number of students have said that they
found the use of C pointers more difficult than MPI
programming.
Chapter 1: Why parallel
computing
From 1986 to 2003, the performance of microprocessors
increased, on average, more than 50% per year [28]. This
unprecedented increase meant that users and software
developers could often simply wait for the next generation
of microprocessors to obtain increased performance from
their applications. Since 2003, however, single-processor
performance improvement has slowed to the point that in
the period from 2015 to 2017, it increased at less than 4%
per year [28]. This difference is dramatic: at 50% per year,
performance will increase by almost a factor of 60 in 10
years, while at 4%, it will increase by about a factor of 1.5.
Furthermore, this difference in performance increase has
been associated with a dramatic change in processor
design. By 2005, most of the major manufacturers of
microprocessors had decided that the road to rapidly
increasing performance lay in the direction of parallelism.
Rather than trying to continue to develop ever-faster
monolithic processors, manufacturers started putting
multiple complete processors on a single integrated circuit.
This change has a very important consequence for
software developers: simply adding more processors will
not magically improve the performance of the vast majority
of serial programs, that is, programs that were written to
run on a single processor. Such programs are unaware of
the existence of multiple processors, and the performance
of such a program on a system with multiple processors
will be effectively the same as its performance on a single
processor of the multiprocessor system.
All of this raises a number of questions:

• Why do we care? Aren't single-processor systems


fast enough?
• Why can't microprocessor manufacturers continue
to develop much faster single-processor systems?
Why build parallel systems? Why build systems
with multiple processors?
• Why can't we write programs that will automatically
convert serial programs into parallel programs,
that is, programs that take advantage of the
presence of multiple processors?

Let's take a brief look at each of these questions. Keep in


mind, though, that some of the answers aren't carved in
stone. For example, the performance of many applications
may already be more than adequate.

1.1 Why we need ever-increasing performance


The vast increases in computational power that we've been
enjoying for decades now have been at the heart of many of
the most dramatic advances in fields as diverse as science,
the Internet, and entertainment. For example, decoding the
human genome, ever more accurate medical imaging,
astonishingly fast and accurate Web searches, and ever
more realistic and responsive computer games would all
have been impossible without these increases. Indeed,
more recent increases in computational power would have
been difficult, if not impossible, without earlier increases.
But we can never rest on our laurels. As our computational
power increases, the number of problems that we can
seriously consider solving also increases. Here are a few
examples:

• Climate modeling. To better understand climate


change, we need far more accurate computer
models, models that include interactions between
the atmosphere, the oceans, solid land, and the ice
caps at the poles. We also need to be able to make
detailed studies of how various interventions might
affect the global climate.
• Protein folding. It's believed that misfolded proteins
may be involved in diseases such as Huntington's,
Parkinson's, and Alzheimer's, but our ability to study
configurations of complex molecules such as
proteins is severely limited by our current
computational power.
• Drug discovery. There are many ways in which
increased computational power can be used in
research into new medical treatments. For example,
there are many drugs that are effective in treating a
relatively small fraction of those suffering from some
disease. It's possible that we can devise alternative
treatments by careful analysis of the genomes of the
individuals for whom the known treatment is
ineffective. This, however, will involve extensive
computational analysis of genomes.
• Energy research. Increased computational power
will make it possible to program much more detailed
models of technologies, such as wind turbines, solar
cells, and batteries. These programs may provide
the information needed to construct far more
efficient clean energy sources.
• Data analysis. We generate tremendous amounts of
data. By some estimates, the quantity of data stored
worldwide doubles every two years [31], but the vast
majority of it is largely useless unless it's analyzed.
As an example, knowing the sequence of nucleotides
in human DNA is, by itself, of little use.
Understanding how this sequence affects
development and how it can cause disease requires
extensive analysis. In addition to genomics, huge
quantities of data are generated by particle
colliders, such as the Large Hadron Collider at
CERN, medical imaging, astronomical research, and
Web search engines—to name a few.

These and a host of other problems won't be solved without


tremendous increases in computational power.

1.2 Why we're building parallel systems


Much of the tremendous increase in single-processor
performance was driven by the ever-increasing density of
transistors—the electronic switches—on integrated circuits.
As the size of transistors decreases, their speed can be
increased, and the overall speed of the integrated circuit
can be increased. However, as the speed of transistors
increases, their power consumption also increases. Most of
this power is dissipated as heat, and when an integrated
circuit gets too hot, it becomes unreliable. In the first
decade of the twenty-first century, air-cooled integrated
circuits reached the limits of their ability to dissipate heat
[28].
Therefore it is becoming impossible to continue to
increase the speed of integrated circuits. Indeed, in the last
few years, the increase in transistor density has slowed
dramatically [36].
But given the potential of computing to improve our
existence, there is a moral imperative to continue to
increase computational power.
How then, can we continue to build ever more powerful
computers? The answer is parallelism. Rather than building
ever-faster, more complex, monolithic processors, the
industry has decided to put multiple, relatively simple,
complete processors on a single chip. Such integrated
circuits are called multicore processors, and core has
become synonymous with central processing unit, or CPU.
In this setting a conventional processor with one CPU is
often called a single-core system.
1.3 Why we need to write parallel programs
Most programs that have been written for conventional,
single-core systems cannot exploit the presence of multiple
cores. We can run multiple instances of a program on a
multicore system, but this is often of little help. For
example, being able to run multiple instances of our
favorite game isn't really what we want—we want the
program to run faster with more realistic graphics. To do
this, we need to either rewrite our serial programs so that
they're parallel, so that they can make use of multiple
cores, or write translation programs, that is, programs that
will automatically convert serial programs into parallel
programs. The bad news is that researchers have had very
limited success writing programs that convert serial
programs in languages such as C, C++, and Java into
parallel programs.
This isn't terribly surprising. While we can write
programs that recognize common constructs in serial
programs, and automatically translate these constructs into
efficient parallel constructs, the sequence of parallel
constructs may be terribly inefficient. For example, we can
view the multiplication of two matrices as a sequence
of dot products, but parallelizing a matrix multiplication as
a sequence of parallel dot products is likely to be fairly slow
on many systems.
An efficient parallel implementation of a serial program
may not be obtained by finding efficient parallelizations of
each of its steps. Rather, the best parallelization may be
obtained by devising an entirely new algorithm.
As an example, suppose that we need to compute n
values and add them together. We know that this can be
done with the following serial code:
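A minimal sketch of such a serial sum in C-style pseudocode, assuming a hypothetical function Compute_next_value that produces each value, might look like this:

   sum = 0;
   for (i = 0; i < n; i++) {
      x = Compute_next_value(. . .);
      sum += x;
   }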
Now suppose we also have p cores and that p is much smaller than n. Then each core can form a partial sum of approximately n/p values:
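One possible per-core sketch, assuming a hypothetical my_ prefix for each core's private variables and the same hypothetical Compute_next_value function, is:

   my_sum = 0;
   for (my_i = my_first_i; my_i < my_last_i; my_i++) {
      my_x = Compute_next_value(. . .);
      my_sum += my_x;
   }

Here my_first_i and my_last_i would be chosen so that each core is responsible for roughly n/p of the values.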

Here the prefix on the variable names indicates that each core is using its own, private variables, and each core can execute this block of code independently of the other cores.
After each core completes execution of this code, its private sum variable will store the sum of the values computed by its calls to the compute function. For example, if there are eight cores, n = 24, and the 24 calls return the values

1, 4, 3, 9, 2, 8, 5, 1, 1, 6, 2, 7, 2, 5, 0, 4, 1, 8, 6, 5,
1, 2, 3, 9,
then the values stored in the cores' private sums might be

8, 19, 7, 15, 7, 13, 12, 14,

respectively.
Here we're assuming the cores are identified by nonnegative integers in the range 0, 1, ..., p - 1, where p is the number of cores.
When the cores are done computing their partial sums, they can form a global sum by sending their results to a designated “master” core, which can add their results:
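In pseudocode (a sketch only; the names follow the hypothetical per-core sketch above), the master's part might look like this:

   if (I'm the master core) {
      sum = my_sum;
      for each core other than myself {
         receive value from core;
         sum += value;
      }
   } else {
      send my_sum to the master;
   }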

In our example, if the master core is core 0, it would add
the values 8 + 19 + 7 + 15 + 7 + 13 + 12 + 14 = 95.
But you can probably see a better way to do this—
especially if the number of cores is large. Instead of making
the master core do all the work of computing the final sum,
we can pair the cores so that while core 0 adds in the result
of core 1, core 2 can add in the result of core 3, core 4 can
add in the result of core 5, and so on. Then we can repeat
the process with only the even-ranked cores: 0 adds in the
result of 2, 4 adds in the result of 6, and so on. Now cores
divisible by 4 repeat the process, and so on. See Fig. 1.1.
The circles contain the current value of each core's sum,
and the lines with arrows indicate that one core is sending
its sum to another core. The plus signs indicate that a core
is receiving a sum from another core and adding the
received sum into its own sum.
FIGURE 1.1 Multiple cores forming a global sum.

For both “global” sums, the master core (core 0) does


more work than any other core, and the length of time it
takes the program to complete the final sum should be the
length of time it takes for the master to complete. However,
with eight cores, the master will carry out seven receives
and adds using the first method, while with the second
method, it will only carry out three. So the second method
results in an improvement of more than a factor of two. The
difference becomes much more dramatic with large
numbers of cores. With 1000 cores, the first method will
require 999 receives and adds, while the second will only
require 10—an improvement of almost a factor of 100!
The first global sum is a fairly obvious generalization of
the serial global sum: divide the work of adding among the
cores, and after each core has computed its part of the
sum, the master core simply repeats the basic serial
addition—if there are p cores, then it needs to add p values.
The second global sum, on the other hand, bears little
relation to the original serial addition.
The point here is that it's unlikely that a translation
program would “discover” the second global sum. Rather,
there would more likely be a predefined efficient global
sum that the translation program would have access to. It
could “recognize” the original serial loop and replace it
with a precoded, efficient, parallel global sum.
We might expect that software could be written so that a
large number of common serial constructs could be
recognized and efficiently parallelized, that is, modified so
that they can use multiple cores. However, as we apply this
principle to ever more complex serial programs, it becomes
more and more difficult to recognize the construct, and it
becomes less and less likely that we'll have a precoded,
efficient parallelization.
Thus we cannot simply continue to write serial programs;
we must write parallel programs, programs that exploit the
power of multiple processors.

1.4 How do we write parallel programs?


There are a number of possible answers to this question,
but most of them depend on the basic idea of partitioning
the work to be done among the cores. There are two widely
used approaches: task-parallelism and data-parallelism.
In task-parallelism, we partition the various tasks carried
out in solving the problem among the cores. In data-
parallelism, we partition the data used in solving the
problem among the cores, and each core carries out more
or less similar operations on its part of the data.
As an example, suppose that Prof P has to teach a section
of “Survey of English Literature.” Also suppose that Prof P
has one hundred students in her section, so she's been
assigned four teaching assistants (TAs): Mr. A, Ms. B, Mr. C,
and Ms. D. At last the semester is over, and Prof P makes
up a final exam that consists of five questions. To grade the
exam, she and her TAs might consider the following two
options: each of them can grade all one hundred responses
to one of the questions; say, P grades question 1, A grades
question 2, and so on. Alternatively, they can divide the one
hundred exams into five piles of twenty exams each, and
each of them can grade all the papers in one of the piles; P
grades the papers in the first pile, A grades the papers in
the second pile, and so on.
In both approaches the “cores” are the professor and her
TAs. The first approach might be considered an example of
task-parallelism. There are five tasks to be carried out:
grading the first question, grading the second question,
and so on. Presumably, the graders will be looking for
different information in question 1, which is about
Shakespeare, from the information in question 2, which is
about Milton, and so on. So the professor and her TAs will
be “executing different instructions.”
On the other hand, the second approach might be
considered an example of data-parallelism. The “data” are
the students' papers, which are divided among the cores,
and each core applies more or less the same grading
instructions to each paper.
The first part of the global sum example in Section 1.3
would probably be considered an example of data-
parallelism. The data are the values computed by the compute function, and each core carries out roughly the same operations on its assigned elements: it computes its required values and adds them together. The second part of the first global sum example
might be considered an example of task-parallelism. There
are two tasks: receiving and adding the cores' partial sums,
which is carried out by the master core; and giving the
partial sum to the master core, which is carried out by the
other cores.
When the cores can work independently, writing a
parallel program is much the same as writing a serial
program. Things get a great deal more complex when the
cores need to coordinate their work. In the second global
sum example, although the tree structure in the diagram is
very easy to understand, writing the actual code is
relatively complex. See Exercises 1.3 and 1.4.
Unfortunately, it's much more common for the cores to
need coordination.
In both global sum examples, the coordination involves
communication: one or more cores send their current
partial sums to another core. The global sum examples
should also involve coordination through load balancing.
In the first part of the global sum, it's clear that we want
the amount of time taken by each core to be roughly the
same as the time taken by the other cores. If the cores are
identical, and each call to the compute function requires the same
amount of work, then we want each core to be assigned
roughly the same number of values as the other cores. If,
for example, one core has to compute most of the values,
then the other cores will finish much sooner than the
heavily loaded core, and their computational power will be
wasted.
A third type of coordination is synchronization. As an
example, suppose that instead of computing the values to be added, the values are read in. Say, the values are stored in an array that is read in by the master core:
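A sketch of this step, assuming the array is called x, might be:

   if (I'm the master core)
      for (i = 0; i < n; i++)
         read x[i];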

In most systems the cores are not automatically synchronized. Rather, each core works at its own pace. In this case, the problem is that we don't want the other cores to race ahead and start computing their partial sums before the master is done initializing the array and making it available to the other cores. That is, the cores need to wait before starting execution of the partial-sum code. We need to add a point of synchronization between the initialization of the array and the computation of the partial sums:
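A sketch with a synchronization point added, using a hypothetical function Synchronize_cores that no core returns from until every core has called it, might be:

   if (I'm the master core)
      for (i = 0; i < n; i++)
         read x[i];
   Synchronize_cores();   /* every core waits here until all cores arrive */
   my_sum = 0;
   for (my_i = my_first_i; my_i < my_last_i; my_i++)
      my_sum += x[my_i];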

The idea here is that each core will wait in the synchronization function until all the cores have entered the function—in particular, until the master core has entered this function.
Currently, the most powerful parallel programs are
written using explicit parallel constructs, that is, they are
written using extensions to languages such as C, C++, and
Java. These programs include explicit instructions for
parallelism: core 0 executes task 0, core 1 executes task 1,
…, all cores synchronize, …, and so on, so such programs
are often extremely complex. Furthermore, the complexity
of modern cores often makes it necessary to use
considerable care in writing the code that will be executed
by a single core.
There are other options for writing parallel programs—
for example, higher level languages—but they tend to
sacrifice performance to make program development
somewhat easier.

1.5 What we'll be doing


We'll be focusing on learning to write programs that are
explicitly parallel. Our purpose is to learn the basics of
programming parallel computers using the C language and
four different APIs or application program interfaces:
the Message-Passing Interface or MPI, POSIX threads
or Pthreads, OpenMP, and CUDA. MPI and Pthreads are
libraries of type definitions, functions, and macros that can
be used in C programs. OpenMP consists of a library and
some modifications to the C compiler. CUDA consists of a
library and modifications to the C++ compiler.
You may well wonder why we're learning about four
different APIs instead of just one. The answer has to do
with both the extensions and parallel systems. Currently,
there are two main ways of classifying parallel systems: one
is to consider the memory that the different cores have
access to, and the other is to consider whether the cores
can operate independently of each other.
In the memory classification, we'll be focusing on
shared-memory systems and distributed-memory
systems. In a shared-memory system, the cores can share
access to the computer's memory; in principle, each core
can read and write each memory location. In a shared-
memory system, we can coordinate the cores by having
them examine and update shared-memory locations. In a
distributed-memory system, on the other hand, each core
has its own, private memory, and the cores can
communicate explicitly by doing something like sending
messages across a network. Fig. 1.2 shows schematics of
the two types of systems.

FIGURE 1.2 (a) A shared memory system and (b) a


distributed memory system.
The second classification divides parallel systems
according to the number of independent instruction
streams and the number of independent data streams. In
one type of system, the cores can be thought of as
conventional processors, so they have their own control
units, and they are capable of operating independently of
each other. Each core can manage its own instruction
stream and its own data stream, so this type of system is
called a Multiple-Instruction Multiple-Data or MIMD
system.
An alternative is to have a parallel system with cores that
are not capable of managing their own instruction streams:
they can be thought of as cores with no control unit.
Rather, the cores share a single control unit. However, each
core can access either its own private memory or memory
that's shared among the cores. In this type of system, all
the cores carry out the same instruction on their own data,
so this type of system is called a Single-Instruction
Multiple-Data or SIMD system.
In a MIMD system, it's perfectly feasible for one core to
execute an addition while another core executes a multiply.
In a SIMD system, two cores either execute the same
instruction (on their own data) or, if they need to execute
different instructions, one executes its instruction while the
other is idle, and then the second executes its instruction
while the first is idle. In a SIMD system, we couldn't have
one core executing an addition while another core executes
a multiplication. The system would have to do something
like this:

   Time   First core   Second core
    1     Addition     Idle
    2     Idle         Multiply
Since you're used to programming a processor with its
own control unit, MIMD systems may seem more natural to
you. However, as we'll see, there are many problems that
are very easy to solve using a SIMD system. As a very
simple example, suppose we have three arrays, each with n
elements, and we want to add corresponding entries of the
first two arrays to get the values in the third array. The
serial pseudocode might look like this:

Now suppose we have n SIMD cores, and each core is


assigned one element from each of the three arrays: core i
is assigned elements x[i], y[i], and z[i]. Then our program can
simply tell each core to add its x- and y-values to get the z
value:
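That is, each core i would execute something like the single statement

   z[i] = x[i] + y[i];

on its own assigned elements.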

This type of system is fundamental to modern Graphics


Processing Units or GPUs, and since GPUs are extremely
powerful parallel processors, it's important that we learn
how to program them.
Our different APIs are used for programming different
types of systems:

• MPI is an API for programming distributed memory


MIMD systems.
• Pthreads is an API for programming shared memory
MIMD systems.
• OpenMP is an API for programming both shared
memory MIMD and shared memory SIMD systems,
although we'll be focusing on programming MIMD
systems.
• CUDA is an API for programming Nvidia GPUs,
which have aspects of all four of our classifications:
shared memory and distributed memory, SIMD, and
MIMD. We will, however, be focusing on the shared
memory SIMD and MIMD aspects of the API.

1.6 Concurrent, parallel, distributed


If you look at some other books on parallel computing or
you search the Web for information on parallel computing,
you're likely to also run across the terms concurrent
computing and distributed computing. Although there
isn't complete agreement on the distinction between the
terms parallel, distributed, and concurrent, many authors
make the following distinctions:

• In concurrent computing, a program is one in which


multiple tasks can be in progress at any instant [5].
• In parallel computing, a program is one in which
multiple tasks cooperate closely to solve a problem.
• In distributed computing, a program may need to
cooperate with other programs to solve a problem.

So parallel and distributed programs are concurrent, but


a program such as a multitasking operating system is also
concurrent, even when it is run on a machine with only one
core, since multiple tasks can be in progress at any instant.
There isn't a clear-cut distinction between parallel and
distributed programs, but a parallel program usually runs
multiple tasks simultaneously on cores that are physically
close to each other and that either share the same memory
or are connected by a very high-speed network. On the
other hand, distributed programs tend to be more “loosely
coupled.” The tasks may be executed by multiple
computers that are separated by relatively large distances,
and the tasks themselves are often executed by programs
that were created independently. As examples, our two
concurrent addition programs would be considered parallel
by most authors, while a Web search program would be
considered distributed.
But beware, there isn't general agreement on these
terms. For example, many authors consider shared-memory
programs to be “parallel” and distributed-memory
programs to be “distributed.” As our title suggests, we'll be
interested in parallel programs—programs in which closely
coupled tasks cooperate to solve a problem.

1.7 The rest of the book


How can we use this book to help us write parallel
programs?
First, when you're interested in high performance,
whether you're writing serial or parallel programs, you
need to know a little bit about the systems you're working
with—both hardware and software. So in Chapter 2, we'll
give an overview of parallel hardware and software. In
order to understand this discussion, it will be necessary to
review some information on serial hardware and software.
Much of the material in Chapter 2 won't be needed when
we're getting started, so you might want to skim some of
this material and refer back to it occasionally when you're
reading later chapters.
The heart of the book is contained in Chapters 3–7.
Chapters 3, 4, 5, and 6 provide a very elementary
introduction to programming parallel systems using C and
MPI, Pthreads, OpenMP, and CUDA, respectively. The only
prerequisite for reading these chapters is a knowledge of C
programming. We've tried to make these chapters
independent of each other, and you should be able to read
them in any order. However, to make them independent, we
did find it necessary to repeat some material. So if you've
read one of the three chapters, and you go on to read
another, be prepared to skim over some of the material in
the new chapter.
Chapter 7 puts together all we've learned in the
preceding chapters. It develops two fairly large programs
using each of the four APIs. However, it should be possible
to read much of this even if you've only read one of
Chapters 3, 4, 5, or 6. The last chapter, Chapter 8, provides
a few suggestions for further study on parallel
programming.

1.8 A word of warning


Before proceeding, a word of warning. It may be tempting
to write parallel programs “by the seat of your pants,”
without taking the trouble to carefully design and
incrementally develop your program. This will almost
certainly be a mistake. Every parallel program contains at
least one serial program. Since we almost always need to
coordinate the actions of multiple cores, writing parallel
programs is almost always more complex than writing a
serial program that solves the same problem. In fact, it is
often far more complex. All the rules about careful design
and development are usually far more important for the
writing of parallel programs than they are for serial
programs.

1.9 Typographical conventions


We'll make use of the following typefaces in the text:

• Program text, displayed or within running text, will be set in its own typeface.
• Definitions are given in the body of the text, and the
term being defined is printed in boldface type: A
parallel program can make use of multiple cores.
• When we need to refer to the environment in which a program is being developed, we'll assume that we're using a UNIX shell, and we'll use a dollar sign ($) to indicate the shell prompt.

• We'll specify the syntax of function calls with fixed argument lists by including a sample argument list. For example, the integer absolute value function, abs, in stdlib.h, might have its syntax specified with
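For instance, the specification might look something like

   int abs(int n);

where the sample argument list shows that abs takes a single int argument.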

For more complicated syntax, we'll enclose required content in angle brackets < > and optional content in square brackets [ ]. For example, the C if statement might have its syntax specified as follows:
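For instance, a sketch of such a specification:

   if ( <expression> )
      <statement1>
   [ else
      <statement2> ]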
This says that the if statement must include an expression enclosed in parentheses, and the right parenthesis must be followed by a statement. This statement can be followed by an optional else clause. If the else clause is present, it must include a second statement.

1.10 Summary
For many years we've reaped the benefits of having ever-
faster processors. However, because of physical limitations,
the rate of performance improvement in conventional
processors has decreased dramatically. To increase the
power of processors, chipmakers have turned to multicore
integrated circuits, that is, integrated circuits with multiple
conventional processors on a single chip.
Ordinary serial programs, which are programs written
for a conventional single-core processor, usually cannot
exploit the presence of multiple cores, and it's unlikely that
translation programs will be able to shoulder all the work
of converting serial programs into parallel programs—
programs that can make use of multiple cores. As software
developers, we need to learn to write parallel programs.
When we write parallel programs, we usually need to
coordinate the work of the cores. This can involve
communication among the cores, load balancing, and
synchronization of the cores.
In this book we'll be learning to program parallel
systems, so that we can maximize their performance. We'll
be using the C language with four different application
program interfaces or APIs: MPI, Pthreads, OpenMP, and
CUDA. These APIs are used to program parallel systems
that are classified according to how the cores access
memory and whether the individual cores can operate
independently of each other.
In the first classification, we distinguish between shared-
memory and distributed-memory systems. In a shared-
memory system, the cores share access to one large pool of
memory, and they can coordinate their actions by accessing
shared memory locations. In a distributed-memory system,
each core has its own private memory, and the cores can
coordinate their actions by sending messages across a
network.
In the second classification, we distinguish between
systems with cores that can operate independently of each
other and systems in which the cores all execute the same
instruction. In both types of system, the cores can operate
on their own data stream. So the first type of system is
called a multiple-instruction multiple-data or MIMD
system, and the second type of system is called a single-
instruction multiple-data or SIMD system.
MPI is used for programming distributed-memory MIMD
systems. Pthreads is used for programming shared-memory
MIMD systems. OpenMP can be used to program both
shared-memory MIMD and shared-memory SIMD systems,
although we'll be looking at using it to program MIMD
systems. CUDA is used for programming Nvidia graphics
processing units or GPUs. GPUs have aspects of all four
types of system, but we'll be mainly interested in the
shared-memory SIMD and shared-memory MIMD aspects.
Concurrent programs can have multiple tasks in
progress at any instant. Parallel and distributed
programs usually have tasks that execute simultaneously.
There isn't a hard and fast distinction between parallel and
distributed, although in parallel programs, the tasks are
usually more tightly coupled.
Parallel programs are usually very complex. So it's even
more important to use good program development
techniques with parallel programs.
1.11 Exercises
1.1 Devise formulas for the functions that calculate my_first_i and my_last_i in the per-core sketch of the global sum example. Remember that each core should be assigned roughly the same number of computations in the loop. Hint: First consider the case when n is evenly divisible by p.
1.2 We've implicitly assumed that each call to the compute function requires roughly the same amount of work as the other calls. How would you change your answer to the preceding question if call i requires i + 1 times as much work as the call with i = 0? How would you change your answer if the first call (i = 0) requires 2 milliseconds, the second call (i = 1) requires 4, the third (i = 2) requires 6, and so on?
1.3 Try to write pseudocode for the tree-structured global sum illustrated in Fig. 1.1. Assume the number of cores is a power of two (1, 2, 4, 8, …). Hint: Use a variable (call it divisor) to determine whether a core should send its sum or receive and add. The divisor should start with the value 2 and be doubled after each iteration. Also use a variable (call it core_difference) to determine which core should be partnered with the current core. It should start with the value 1 and also be doubled after each iteration. For example, in the first iteration 0 % divisor = 0 and 1 % divisor = 1, so 0 receives and adds, while 1 sends. Also in the first iteration 0 + core_difference = 1 and 1 - core_difference = 0, so 0 and 1 are paired in the first iteration.
1.4 As an alternative to the approach outlined in the preceding problem, we can use C's bitwise operators to implement the tree-structured global sum. To see how this works, it helps to write down the binary (base 2) representation of each of the core ranks and note the pairings during each stage (shown here for eight cores):

   Core   Binary rank   Stage 1 partner   Stage 2 partner   Stage 3 partner
    0        000              1                 2                 4
    1        001              0                 -                 -
    2        010              3                 0                 -
    3        011              2                 -                 -
    4        100              5                 6                 0
    5        101              4                 -                 -
    6        110              7                 4                 -
    7        111              6                 -                 -

From the table, we see that during the first stage each core is paired with the core whose rank differs in the rightmost or first bit. During the second stage, cores that continue are paired with the core whose rank differs in the second bit; and during the third stage, cores are paired with the core whose rank differs in the third bit. Thus if we have a binary bitmask that is 001 (binary) for the first stage, 010 for the second, and 100 for the third, we can get the rank of the core we're paired with by “inverting” the bit in our rank that is nonzero in the bitmask. This can be done using the bitwise exclusive or (^) operator.
Implement this algorithm in pseudocode using the
bitwise exclusive or and the left-shift operator.
1.5 What happens if your pseudocode in Exercise 1.3 or
Exercise 1.4 is run when the number of cores is not a
power of two (e.g., 3, 5, 6, 7)? Can you modify the
pseudocode so that it will work correctly regardless
of the number of cores?
1.6 Derive formulas for the number of receives and
additions that core 0 carries out using
a. the original pseudocode for a global sum, and
b. the tree-structured global sum.
Make a table showing the numbers of receives and additions carried out by core 0 when the two sums are used with various numbers of cores.
1.7 The first part of the global sum example—when
each core adds its assigned computed values—is
usually considered to be an example of data-
parallelism, while the second part of the first global
sum—when the cores send their partial sums to the
master core, which adds them—could be considered
to be an example of task-parallelism. What about the
second part of the second global sum—when the
cores use a tree structure to add their partial sums?
Is this an example of data- or task-parallelism? Why?
1.8 Suppose the faculty members are throwing a party
for the students in the department.
a. Identify tasks that can be assigned to the
faculty members that will allow them to use task-
parallelism when they prepare for the party.
Work out a schedule that shows when the various
tasks can be performed.
b. We might hope that one of the tasks in the
preceding part is cleaning the house where the
party will be held. How can we use data-
parallelism to partition the work of cleaning the
house among the faculty?
c. Use a combination of task- and data-parallelism
to prepare for the party. (If there's too much
work for the faculty, you can use TAs to pick up
the slack.)
1.9 Write an essay describing a research problem in
your major that would benefit from the use of
parallel computing. Provide a rough outline of how
parallelism would be used. Would you use task- or
data-parallelism?
Chapter 2: Parallel
hardware and parallel
software
It's perfectly feasible for specialists in disciplines other
than computer science and computer engineering to write
parallel programs. However, to write efficient parallel
programs, we often need some knowledge of the underlying
hardware and system software. It's also very useful to have
some knowledge of different types of parallel software, so
in this chapter we'll take a brief look at a few topics in
hardware and software. We'll also take a brief look at
evaluating program performance and a method for
developing parallel programs. We'll close with a discussion
of what kind of environment we might expect to be working
in, and a few rules and assumptions we'll make in the rest
of the book.
This is a long, broad chapter, so it may be a good idea to
skim through some of the sections on a first reading so that
you have a good idea of what's in the chapter. Then, when a
concept or term in a later chapter isn't quite clear, it may
be helpful to refer back to this chapter. In particular, you
may want to skim over most of the material in
“Modifications to the von Neumann Model,” except “The
Basics of Caching.” Also, in the “Parallel Hardware”
section, you can safely skim the material on
“Interconnection Networks.” You can also skim the material
on “SIMD Systems” unless you're planning to read the
chapter on CUDA programming.