An Introduction to Parallel
Programming
SECOND EDITION
Peter S. Pacheco
University of San Francisco
Matthew Malensek
University of San Francisco
Table of Contents
Cover image
Title page
Copyright
Dedication
Preface
1.10. Summary
1.11. Exercises
Bibliography
2.6. Performance
2.9. Assumptions
2.10. Summary
2.11. Exercises
Bibliography
3.8. Summary
3.9. Exercises
Bibliography
4.5. Busy-waiting
4.6. Mutexes
4.11. Thread-safety
4.12. Summary
4.13. Exercises
Bibliography
5.10. Tasking
5.11. Thread-safety
5.12. Summary
5.13. Exercises
Bibliography
6.13. CUDA trapezoidal rule III: blocks with more than one warp
6.14. Bitonic sort
6.15. Summary
6.16. Exercises
Bibliography
7.5. Summary
7.6. Exercises
Bibliography
Index
Copyright
Morgan Kaufmann is an imprint of Elsevier
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
Notices
Knowledge and best practice in this field are constantly changing.
As new research and experience broaden our understanding,
changes in research methods, professional practices, or medical
treatment may become necessary.
Practitioners and researchers must always rely on their own
experience and knowledge in evaluating and using any
information, methods, compounds, or experiments described
herein. In using such information or methods they should be
mindful of their own safety and the safety of others, including
parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the
authors, contributors, or editors, assume any liability for any injury
and/or damage to persons or property as a matter of products
liability, negligence or otherwise, or from any use or operation of
any methods, products, instructions, or ideas contained in the
material herein.
ISBN: 978-0-12-804605-0
Typeset by VTeX
Printed in the United States of America
Now suppose we also have p cores, and p is much smaller than n. Then each core can
form a partial sum of approximately n/p values:
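The block each core executes might look something like the following
sketch (the names here are illustrative; Compute_next_value() stands for
whatever computation produces each value, and my_first_i and my_last_i
delimit this core's block of roughly n/p values):

    /* Each core sums its own block of values into its private my_sum. */
    my_sum = 0;
    for (my_i = my_first_i; my_i < my_last_i; my_i++) {
        my_x = Compute_next_value(my_i);
        my_sum += my_x;
    }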
Here the prefix my_ indicates that each core is using its own, private
variables, and each core can execute this block of code
independently of the other cores.
After each core completes execution of this code, its variable my_sum
will store the sum of the values computed by its calls to
Compute_next_value. For example, if there are eight cores, n = 24, and the 24 calls to
Compute_next_value return the values
1, 4, 3, 9, 2, 8, 5, 1, 1, 6, 2, 7, 2, 5, 0, 4, 1, 8, 6, 5, 1, 2, 3, 9,
then the values stored in my_sum on cores 0 through 7 will be 8, 19, 7,
15, 7, 13, 12, and 14, respectively. When the cores are done computing
their values of my_sum, they can form a global sum by sending their
results to a designated "master" core, which adds them up.
In our example, if the master core is core 0, it would add the values
8 + 19 + 7 + 15 + 7 + 13 + 12 + 14 = 95.
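Viewed serially, this first global sum amounts to a simple loop on the
master core, which performs one receive and one addition for each of the
other p − 1 cores. A minimal C sketch of ours (not the authors' code) that
mimics it on an array of partial sums:

    /* Serial simulation of the first global sum: "core 0" adds in the
       partial sum received from each of the other p-1 cores. */
    double master_sum(const double my_sums[], int p) {
        double sum = my_sums[0];               /* the master's own partial sum */
        for (int core = 1; core < p; core++)   /* one receive and one add per core */
            sum += my_sums[core];
        return sum;
    }

With the eight partial sums above, master_sum returns 95.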
But you can probably see a better way to do this—especially if the
number of cores is large. Instead of making the master core do all the
work of computing the final sum, we can pair the cores so that while
core 0 adds in the result of core 1, core 2 can add in the result of core
3, core 4 can add in the result of core 5, and so on. Then we can
repeat the process with only the even-ranked cores: 0 adds in the
result of 2, 4 adds in the result of 6, and so on. Now cores divisible
by 4 repeat the process, and so on. See Fig. 1.1. The circles contain
the current value of each core's sum, and the lines with arrows
indicate that one core is sending its sum to another core. The plus
signs indicate that a core is receiving a sum from another core and
adding the received sum into its own sum.
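To make the pattern concrete, the following small C program (a sketch of
ours, not code from the text) simulates the tree-structured sum serially:
at each stage the divisor doubles, and each core that is still active adds
in the partial sum of the core divisor ranks above it.

    #include <stdio.h>

    /* Serial simulation of the tree-structured global sum of Fig. 1.1. */
    double tree_sum(double my_sums[], int p) {
        for (int divisor = 1; divisor < p; divisor *= 2)         /* one pass per stage */
            for (int rank = 0; rank + divisor < p; rank += 2 * divisor)
                my_sums[rank] += my_sums[rank + divisor];        /* "receive and add" */
        return my_sums[0];                                       /* core 0 holds the total */
    }

    int main(void) {
        double my_sums[] = {8, 19, 7, 15, 7, 13, 12, 14};  /* partial sums from the example */
        printf("total = %g\n", tree_sum(my_sums, 8));      /* prints total = 95 */
        return 0;
    }

With eight cores there are three stages, which matches the three receives
and adds performed by core 0 in Fig. 1.1.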
For both “global” sums, the master core (core 0) does more work
than any other core, and the length of time it takes the program to
complete the final sum should be the length of time it takes for the
master to complete. However, with eight cores, the master will carry
out seven receives and adds using the first method, while with the
second method, it will only carry out three. So the second method
results in an improvement of more than a factor of two. The
difference becomes much more dramatic with large numbers of
cores. With 1000 cores, the first method will require 999 receives and
adds, while the second will only require 10 (since ⌈log₂ 1000⌉ = 10)—an
improvement of almost a factor of 100!
The first global sum is a fairly obvious generalization of the serial
global sum: divide the work of adding among the cores, and after
each core has computed its part of the sum, the master core simply
repeats the basic serial addition—if there are p cores, then it needs to
add p values. The second global sum, on the other hand, bears little
relation to the original serial addition.
The point here is that it's unlikely that a translation program
would “discover” the second global sum. Rather, there would more
likely be a predefined efficient global sum that the translation
program would have access to. It could “recognize” the original
serial loop and replace it with a precoded, efficient, parallel global
sum.
We might expect that software could be written so that a large
number of common serial constructs could be recognized and
efficiently parallelized, that is, modified so that they can use
multiple cores. However, as we apply this principle to ever more
complex serial programs, it becomes more and more difficult to
recognize the construct, and it becomes less and less likely that we'll
have a precoded, efficient parallelization.
Thus we cannot simply continue to write serial programs; we
must write parallel programs, programs that exploit the power of
multiple processors.
Now suppose we have n SIMD cores, and each core is assigned one
element from each of the three arrays: core i is assigned elements
x[i], y[i], and z[i]. Then our program can simply tell each core to add its
x- and y-values to get the z value:
z[i] = x[i] + y[i];
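For comparison, a conventional single-core system would carry out the same
computation as n successive additions, as in the following C loop (a sketch
of ours); on the SIMD system, all n cores execute the single addition in
lockstep, each on its own elements:

    /* Serial version of the same vector addition: one core, n additions. */
    for (int i = 0; i < n; i++)
        z[i] = x[i] + y[i];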
• Definitions are given in the body of the text, and the term
being defined is printed in boldface type: A parallel
program can make use of multiple cores.
• When we need to refer to the environment in which a
program is being developed, we'll assume that we're using a
UNIX shell, such as bash, and we'll use a $ to indicate the shell
prompt:
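For example, a command to compile a program might be displayed as
follows (the particular command is only an illustration):

$ gcc -g -Wall -o my_prog my_prog.c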
1.10 Summary
For many years we've reaped the benefits of having ever-faster
processors. However, because of physical limitations, the rate of
performance improvement in conventional processors has decreased
dramatically. To increase the power of processors, chipmakers have
turned to multicore integrated circuits, that is, integrated circuits
with multiple conventional processors on a single chip.
Ordinary serial programs, which are programs written for a
conventional single-core processor, usually cannot exploit the
presence of multiple cores, and it's unlikely that translation
programs will be able to shoulder all the work of converting serial
programs into parallel programs—programs that can make use of
multiple cores. As software developers, we need to learn to write
parallel programs.
When we write parallel programs, we usually need to coordinate
the work of the cores. This can involve communication among the
cores, load balancing, and synchronization of the cores.
In this book we'll be learning to program parallel systems, so that
we can maximize their performance. We'll be using the C language
with four different application program interfaces or APIs: MPI,
Pthreads, OpenMP, and CUDA. These APIs are used to program
parallel systems that are classified according to how the cores access
memory and whether the individual cores can operate
independently of each other.
In the first classification, we distinguish between shared-memory
and distributed-memory systems. In a shared-memory system, the
cores share access to one large pool of memory, and they can
coordinate their actions by accessing shared memory locations. In a
distributed-memory system, each core has its own private memory,
and the cores can coordinate their actions by sending messages
across a network.
In the second classification, we distinguish between systems with
cores that can operate independently of each other and systems in
which the cores all execute the same instruction. In both types of
system, the cores can operate on their own data stream. So the first
type of system is called a multiple-instruction multiple-data or
MIMD system, and the second type of system is called a single-
instruction multiple-data or SIMD system.
MPI is used for programming distributed-memory MIMD
systems. Pthreads is used for programming shared-memory MIMD
systems. OpenMP can be used to program both shared-memory
MIMD and shared-memory SIMD systems, although we'll be
looking at using it to program MIMD systems. CUDA is used for
programming Nvidia graphics processing units or GPUs. GPUs
have aspects of all four types of system, but we'll be mainly
interested in the shared-memory SIMD and shared-memory MIMD
aspects.
Concurrent programs can have multiple tasks in progress at any
instant. Parallel and distributed programs usually have tasks that
execute simultaneously. There isn't a hard and fast distinction
between parallel and distributed, although in parallel programs, the
tasks are usually more tightly coupled.
Parallel programs are usually very complex. So it's even more
important to use good program development techniques with
parallel programs.
1.11 Exercises
From the table, we see that during the first stage each core is
paired with the core whose rank differs in the rightmost or
first bit. During the second stage, cores that continue are
paired with the core whose rank differs in the second bit; and
during the third stage, cores are paired with the core whose
rank differs in the third bit. Thus if we have a binary value bitmask
that is 001₂ for the first stage, 010₂ for the second, and
100₂ for the third, we can get the rank of the core we're
paired with by "inverting" the bit in our rank that is nonzero
in bitmask. This can be done using the bitwise exclusive or (^)
operator.
Implement this algorithm in pseudocode using the bitwise
exclusive or and the left-shift operator.
1.5 What happens if your pseudocode in Exercise 1.3 or Exercise
1.4 is run when the number of cores is not a power of two
(e.g., 3, 5, 6, 7)? Can you modify the pseudocode so that it
will work correctly regardless of the number of cores?
1.6 Derive formulas for the number of receives and additions
that core 0 carries out using
a. the original pseudocode for a global sum, and
b. the tree-structured global sum.
Make a table showing the numbers of receives and additions
carried out by core 0 when the two sums are used with 2, 4, 8, . . . ,
1024 cores.
1.7 The first part of the global sum example—when each core
adds its assigned computed values—is usually considered to
be an example of data-parallelism, while the second part of
the first global sum—when the cores send their partial sums
to the master core, which adds them—could be considered to
be an example of task-parallelism. What about the second
part of the second global sum—when the cores use a tree
structure to add their partial sums? Is this an example of
data- or task-parallelism? Why?
1.8 Suppose the faculty members are throwing a party for the
students in the department.
a. Identify tasks that can be assigned to the faculty
members that will allow them to use task-parallelism
when they prepare for the party. Work out a schedule
that shows when the various tasks can be performed.
b. We might hope that one of the tasks in the preceding
part is cleaning the house where the party will be held.
How can we use data-parallelism to partition the work
of cleaning the house among the faculty?
c. Use a combination of task- and data-parallelism to
prepare for the party. (If there's too much work for the
faculty, you can use TAs to pick up the slack.)
1.9 Write an essay describing a research problem in your major
that would benefit from the use of parallel computing.
Provide a rough outline of how parallelism would be used.
Would you use task- or data-parallelism?
Chapter 2: Parallel hardware and
parallel software
It's perfectly feasible for specialists in disciplines other than
computer science and computer engineering to write parallel
programs. However, to write efficient parallel programs, we often
need some knowledge of the underlying hardware and system
software. It's also very useful to have some knowledge of different
types of parallel software, so in this chapter we'll take a brief look at
a few topics in hardware and software. We'll also take a brief look at
evaluating program performance and a method for developing
parallel programs. We'll close with a discussion of what kind of
environment we might expect to be working in, and a few rules and
assumptions we'll make in the rest of the book.
This is a long, broad chapter, so it may be a good idea to skim
through some of the sections on a first reading so that you have a
good idea of what's in the chapter. Then, when a concept or term in a
later chapter isn't quite clear, it may be helpful to refer back to this
chapter. In particular, you may want to skim over most of the
material in “Modifications to the von Neumann Model,” except “The
Basics of Caching.” Also, in the “Parallel Hardware” section, you can
safely skim the material on “Interconnection Networks.” You can
also skim the material on “SIMD Systems” unless you're planning to
read the chapter on CUDA programming.