Pro TBB: C++ Parallel Programming with Threading Building Blocks, 1st Edition, Michael Voss

Pro TBB
C++ Parallel Programming with
Threading Building Blocks

Michael Voss
Rafael Asenjo
James Reinders
Pro TBB: C++ Parallel Programming with Threading Building Blocks
Michael Voss, Austin, Texas, USA
Rafael Asenjo, Málaga, Spain
James Reinders, Portland, Oregon, USA

ISBN-13 (pbk): 978-1-4842-4397-8
ISBN-13 (electronic): 978-1-4842-4398-5


https://doi.org/10.1007/978-1-4842-4398-5

Copyright © 2019 by Intel Corporation


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on
microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation,
computer software, or by similar or dissimilar methodology now known or hereafter developed.
Open Access This book is licensed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives
4.0 International License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits any noncommercial use,
sharing, distribution and reproduction in any
medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if you modified the licensed material. You do not have permission under this
license to share adapted material derived from this book or parts of it.
The images or other third party material in this book are included in the book’s Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the book’s Creative Commons license and
your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every
occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to
the benefit of the trademark owner, with no intention of infringement of the trademark.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as
such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the
authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made.
The publisher makes no warranty, express or implied, with respect to the material contained herein.
Managing Director, Apress Media LLC: Welmoed Spahr
Acquisitions Editor: Natalie Pao
Development Editor: James Markham
Coordinating Editor: Jessica Vakili
Cover designed by eStudioCalamar
Cover image designed by Freepik (www.freepik.com)
Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor,
New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit
www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science +
Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.
For information on translations, please e-mail rights@apress.com, or visit http://www.apress.com/rights-permissions.
Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also
available for most titles. For more information, reference our Print and eBook Bulk Sales web page at
http://www.apress.com/bulk-sales.
Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub
via the book’s product page, located at www.apress.com/978-1-4842-4397-8. For more detailed information, please visit
http://www.apress.com/source-code.
Printed on acid-free paper.
Table of Contents
About the Authors

Acknowledgments

Preface

Part 1
Chapter 1: Jumping Right In: “Hello, TBB!”
Why Threading Building Blocks?
Performance: Small Overhead, Big Benefits for C++
Evolving Support for Parallelism in TBB and C++
Recent C++ Additions for Parallelism
The Threading Building Blocks (TBB) Library
Parallel Execution Interfaces
Interfaces That Are Independent of the Execution Model
Using the Building Blocks in TBB
Let’s Get Started Already!
Getting the Threading Building Blocks (TBB) Library
Getting a Copy of the Examples
Writing a First “Hello, TBB!” Example
Building the Simple Examples
Building on Windows Using Microsoft Visual Studio
Building on a Linux Platform from a Terminal
A More Complete Example
Starting with a Serial Implementation
Adding a Message-Driven Layer Using a Flow Graph
Adding a Fork-Join Layer Using a parallel_for
Adding a SIMD Layer Using a Parallel STL Transform


Chapter 2: Generic Parallel Algorithms
Functional / Task Parallelism
A Slightly More Complicated Example: A Parallel Implementation of Quicksort
Loops: parallel_for, parallel_reduce, and parallel_scan
parallel_for: Applying a Body to Each Element in a Range
parallel_reduce: Calculating a Single Result Across a Range
parallel_scan: A Reduction with Intermediate Values
How Does This Work?
A Slightly More Complicated Example: Line of Sight
Cook Until Done: parallel_do and parallel_pipeline
parallel_do: Apply a Body Until There Are No More Items Left
parallel_pipeline: Streaming Items Through a Series of Filters

Chapter 3: Flow Graphs
Why Use Graphs to Express Parallelism?
The Basics of the TBB Flow Graph Interface
Step 1: Create the Graph Object
Step 2: Make the Nodes
Step 3: Add Edges
Step 4: Start the Graph
Step 5: Wait for the Graph to Complete Executing
A More Complicated Example of a Data Flow Graph
Implementing the Example as a TBB Flow Graph
Understanding the Performance of a Data Flow Graph
The Special Case of Dependency Graphs
Implementing a Dependency Graph
Estimating the Scalability of a Dependency Graph
Advanced Topics in TBB Flow Graphs

Chapter 4: TBB and the Parallel Algorithms of the C++ Standard Template Library
Does the C++ STL Library Belong in This Book?
A Parallel STL Execution Policy Analogy

A Simple Example Using std::for_each
What Algorithms Are Provided in a Parallel STL Implementation?
How to Get and Use a Copy of Parallel STL That Uses TBB
Algorithms in Intel’s Parallel STL
Capturing More Use Cases with Custom Iterators
Highlighting Some of the Most Useful Algorithms
std::for_each, std::for_each_n
std::transform
std::reduce
std::transform_reduce
A Deeper Dive into the Execution Policies
The sequenced_policy
The parallel_policy
The unsequenced_policy
The parallel_unsequenced_policy
Which Execution Policy Should We Use?
Other Ways to Introduce SIMD Parallelism

Chapter 5: Synchronization: Why and How to Avoid It
A Running Example: Histogram of an Image
An Unsafe Parallel Implementation
A First Safe Parallel Implementation: Coarse-Grained Locking
Mutex Flavors
A Second Safe Parallel Implementation: Fine-Grained Locking
A Third Safe Parallel Implementation: Atomics
A Better Parallel Implementation: Privatization and Reduction
Thread Local Storage, TLS
enumerable_thread_specific, ETS
combinable
The Easiest Parallel Implementation: Reduction Template
Recap of Our Options


Chapter 6: Data Structures for Concurrency
Key Data Structures Basics
Unordered Associative Containers
Map vs. Set
Multiple Values
Hashing
Unordered
Concurrent Containers
Concurrent Unordered Associative Containers
Concurrent Queues: Regular, Bounded, and Priority
Concurrent Vector

Chapter 7: Scalable Memory Allocation
Modern C++ Memory Allocation
Scalable Memory Allocation: What
Scalable Memory Allocation: Why
Avoiding False Sharing with Padding
Scalable Memory Allocation Alternatives: Which
Compilation Considerations
Most Popular Usage (C/C++ Proxy Library): How
Linux: malloc/new Proxy Library Usage
macOS: malloc/new Proxy Library Usage
Windows: malloc/new Proxy Library Usage
Testing our Proxy Library Usage
C Functions: Scalable Memory Allocators for C
C++ Classes: Scalable Memory Allocators for C++
Allocators with std::allocator<T> Signature
scalable_allocator
tbb_allocator
zero_allocator


cached_aligned_allocator
Memory Pool Support: memory_pool_allocator
Array Allocation Support: aligned_space
Replacing new and delete Selectively
Performance Tuning: Some Control Knobs
What Are Huge Pages?
TBB Support for Huge Pages
scalable_allocation_mode(int mode, intptr_t value)
TBBMALLOC_USE_HUGE_PAGES
TBBMALLOC_SET_SOFT_HEAP_LIMIT
int scalable_allocation_command(int cmd, void *param)
TBBMALLOC_CLEAN_ALL_BUFFERS
TBBMALLOC_CLEAN_THREAD_BUFFERS

Chapter 8: Mapping Parallel Patterns to TBB
Parallel Patterns vs. Parallel Algorithms
Patterns Categorize Algorithms, Designs, etc.
Patterns That Work
Data Parallelism Wins
Nesting Pattern
Map Pattern
Workpile Pattern
Reduction Patterns (Reduce and Scan)
Fork-Join Pattern
Divide-and-Conquer Pattern
Branch-and-Bound Pattern
Pipeline Pattern
Event-Based Coordination Pattern (Reactive Streams)


Part 2

Chapter 9: The Pillars of Composability
What Is Composability?
Nested Composition
Concurrent Composition
Serial Composition
The Features That Make TBB a Composable Library
The TBB Thread Pool (the Market) and Task Arenas
The TBB Task Dispatcher: Work Stealing and More
Putting It All Together
Looking Forward
Controlling the Number of Threads
Work Isolation
Task-to-Thread and Thread-to-Core Affinity
Task Priorities

Chapter 10: Using Tasks to Create Your Own Algorithms
A Running Example: The Sequence
The High-Level Approach: parallel_invoke
The Highest Among the Lower: task_group
The Low-Level Task Interface: Part One – Task Blocking
The Low-Level Task Interface: Part Two – Task Continuation
Bypassing the Scheduler
The Low-Level Task Interface: Part Three – Task Recycling
Task Interface Checklist
One More Thing: FIFO (aka Fire-and-Forget) Tasks
Putting These Low-Level Features to Work

Chapter 11: Controlling the Number of Threads Used for Execution
A Brief Recap of the TBB Scheduler Architecture
Interfaces for Controlling the Number of Threads


Controlling Thread Count with task_scheduler_init
Controlling Thread Count with task_arena
Controlling Thread Count with global_control
Summary of Concepts and Classes
The Best Approaches for Setting the Number of Threads
Using a Single task_scheduler_init Object for a Simple Application
Using More Than One task_scheduler_init Object in a Simple Application
Using Multiple Arenas with Different Numbers of Slots to Influence Where TBB Places Its Worker Threads
Using global_control to Control How Many Threads Are Available to Fill Arena Slots
Using global_control to Temporarily Restrict the Number of Available Threads
When NOT to Control the Number of Threads
Figuring Out What’s Gone Wrong

Chapter 12: Using Work Isolation for Correctness and Performance
Work Isolation for Correctness
Creating an Isolated Region with this_task_arena::isolate
Using Task Arenas for Isolation: A Double-Edged Sword
Don’t Be Tempted to Use task_arenas to Create Work Isolation for Correctness

Chapter 13: Creating Thread-to-Core and Task-to-Thread Affinity
Creating Thread-to-Core Affinity
Creating Task-to-Thread Affinity
When and How Should We Use the TBB Affinity Features?

Chapter 14: Using Task Priorities
Support for Non-Preemptive Priorities in the TBB Task Class
Setting Static and Dynamic Priorities
Two Small Examples
Implementing Priorities Without Using TBB Task Support

Chapter 15: Cancellation and Exception Handling������������������������������������������������ 387


How to Cancel Collective Work������������������������������������������������������������������������������������������������� 388
Advanced Task Cancellation������������������������������������������������������������������������������������������������������ 390

Explicit Assignment of TGC ..... 392
Default Assignment of TGC ..... 395
Exception Handling in TBB ..... 399
Tailoring Our Own TBB Exceptions ..... 402
Putting All Together: Composability, Cancellation, and Exception Handling ..... 405

Chapter 16: Tuning TBB Algorithms: Granularity, Locality, Parallelism, and Determinism ..... 411
Task Granularity: How Big Is Big Enough? ..... 412
Choosing Ranges and Partitioners for Loops ..... 413
An Overview of Partitioners ..... 415
Choosing a Grainsize (or Not) to Manage Task Granularity ..... 417
Ranges, Partitioners, and Data Cache Performance ..... 420
Using a static_partitioner ..... 428
Restricting the Scheduler for Determinism ..... 431
Tuning TBB Pipelines: Number of Filters, Modes, and Tokens ..... 433
Understanding a Balanced Pipeline ..... 434
Understanding an Imbalanced Pipeline ..... 436
Pipelines and Data Locality and Thread Affinity ..... 438
Deep in the Weeds ..... 439
Making Your Own Range Type ..... 439
The Pipeline Class and Thread-Bound Filters ..... 442

Chapter 17: Flow Graphs: Beyond the Basics ..... 451
Optimizing for Granularity, Locality, and Parallelism ..... 452
Node Granularity: How Big Is Big Enough? ..... 452
Memory Usage and Data Locality ..... 462
Task Arenas and Flow Graph ..... 477
Key FG Advice: Dos and Don’ts ..... 480
Do: Use Nested Parallelism ..... 480
Don’t: Use Multifunction Nodes in Place of Nested Parallelism ..... 481
Do: Use join_node, sequencer_node, or multifunction_node to Reestablish Order in a Flow Graph When Needed ..... 481

Do: Use the Isolate Function for Nested Parallelism ..... 485
Do: Use Cancellation and Exception Handling in Flow Graphs ..... 488
Do: Set a Priority for a Graph Using task_group_context ..... 492
Don’t: Make an Edge Between Nodes in Different Graphs ..... 492
Do: Use try_put to Communicate Across Graphs ..... 495
Do: Use composite_node to Encapsulate Groups of Nodes ..... 497
Introducing Intel Advisor: Flow Graph Analyzer ..... 501
The FGA Design Workflow ..... 502
The FGA Analysis Workflow ..... 505
Diagnosing Performance Issues with FGA ..... 507

Chapter 18: Beef Up Flow Graphs with Async Nodes ..... 513
Async World Example ..... 514
Why and When async_node? ..... 519
A More Realistic Example ..... 521

Chapter 19: Flow Graphs on Steroids: OpenCL Nodes ..... 535
Hello OpenCL_Node Example ..... 536
Where Are We Running Our Kernel? ..... 544
Back to the More Realistic Example of Chapter 18 ..... 551
The Devil Is in the Details ..... 561
The NDRange Concept ..... 562
Playing with the Offset ..... 568
Specifying the OpenCL Kernel ..... 569
Even More on Device Selection ..... 570
A Warning Regarding the Order Is in Order! ..... 574

Chapter 20: TBB on NUMA Architectures ..... 581
Discovering Your Platform Topology ..... 583
Understanding the Costs of Accessing Memory ..... 587
Our Baseline Example ..... 588
Mastering Data Placement and Processor Affinity ..... 589

Putting hwloc and TBB to Work Together ..... 595
More Advanced Alternatives ..... 601

Appendix A: History and Inspiration ..... 605
A Decade of “Hatchling to Soaring” ..... 605
#1 TBB’s Revolution Inside Intel ..... 605
#2 TBB’s First Revolution of Parallelism ..... 606
#3 TBB’s Second Revolution of Parallelism ..... 607
#4 TBB’s Birds ..... 608
Inspiration for TBB ..... 611
Relaxed Sequential Execution Model ..... 612
Influential Libraries ..... 613
Influential Languages ..... 614
Influential Pragmas ..... 615
Influences of Generic Programming ..... 615
Considering Caches ..... 616
Considering Costs of Time Slicing ..... 617
Further Reading ..... 618

Appendix B: TBB Précis ..... 623
Debug and Conditional Coding ..... 624
Preview Feature Macros ..... 626
Ranges ..... 626
Partitioners ..... 627
Algorithms ..... 628
Algorithm: parallel_do ..... 629
Algorithm: parallel_for ..... 631
Algorithm: parallel_for_each ..... 635
Algorithm: parallel_invoke ..... 636
Algorithm: parallel_pipeline ..... 638
Algorithm: parallel_reduce and parallel_deterministic_reduce ..... 641
Algorithm: parallel_scan ..... 645

Algorithm: parallel_sort ..... 648
Algorithm: pipeline ..... 651
Flow Graph ..... 653
Flow Graph: graph class ..... 654
Flow Graph: ports and edges ..... 655
Flow Graph: nodes ..... 655
Memory Allocation ..... 667
Containers ..... 673
Synchronization ..... 693
Thread Local Storage (TLS) ..... 699
Timing ..... 708
Task Groups: Use of the Task Stealing Scheduler ..... 709
Task Scheduler: Fine Control of the Task Stealing Scheduler ..... 710
Floating-Point Settings ..... 721
Exceptions ..... 723
Threads ..... 725
Parallel STL ..... 726

Glossary ..... 729

Index ..... 745

About the Authors
Michael Voss is a Principal Engineer in the Intel Architecture, Graphics and Software
Group at Intel. He has been a member of the TBB development team since before the
1.0 release in 2006 and was the initial architect of the TBB flow graph API. He is also
one of the lead developers of Flow Graph Analyzer, a graphical tool for analyzing data
flow applications targeted at both homogeneous and heterogeneous platforms. He
has co-authored over 40 published papers and articles on topics related to parallel
programming and frequently consults with customers across a wide range of domains to
help them effectively use the threading libraries provided by Intel. Prior to joining Intel
in 2006, he was an Assistant Professor in the Edward S. Rogers Department of Electrical
and Computer Engineering at the University of Toronto. He received his Ph.D. from the
School of Electrical and Computer Engineering at Purdue University in 2001.

Rafael Asenjo, Professor of Computer Architecture at the University of Malaga, Spain,
obtained a PhD in Telecommunication Engineering in 1997 and was an Associate
Professor at the Computer Architecture Department from 2001 to 2017. He was a
Visiting Scholar at the University of Illinois at Urbana-Champaign (UIUC) in 1996 and
1997 and a Visiting Research Associate at the same university in 1998. He was also a
Research Visitor at IBM T.J. Watson in 2008 and at Cray Inc. in 2011. He has been using
TBB since 2008 and over the last five years, he has focused on productively exploiting
heterogeneous chips leveraging TBB as the orchestrating framework. In 2013 and 2014
he visited UIUC to work on CPU+GPU chips. In 2015 and 2016 he also began researching
CPU+FPGA chips while visiting the University of Bristol. He served as General Chair for ACM
PPoPP’16 and as an Organization Committee member as well as a Program Committee
member for several HPC related conferences (PPoPP, SC, PACT, IPDPS, HPCA, EuroPar,
and SBAC-PAD). His research interests include heterogeneous programming models
and architectures, parallelization of irregular codes and energy consumption.

James Reinders is a consultant with more than three decades of experience in parallel
computing and is an author, coauthor, or editor of nine technical books related to parallel
programming. He has had the great fortune to help make key contributions to two of
the world’s fastest computers (#1 on the Top500 list) as well as many other supercomputers
and software developer tools. James finished 10,001 days (over 27 years) at Intel in
mid-2016, and now continues to write, teach, program, and do consulting in areas related to
parallel computing (HPC and AI).

Acknowledgments
Two people offered their early and continuing support for this project – Sanjiv Shah and
Herb Hinstorff. We are grateful for their encouragement, support, and occasional gentle
pushes.
The real heroes are reviewers who invested heavily in providing thoughtful and
detailed feedback on draft copies of the chapters within this book. The high quality
of their input helped drive us to allow more time for review and adjustment than we
initially planned. The book is far better as a result.
The reviewers are a stellar collection of users of TBB and key developers of TBB. It
is rare for a book project to have such an energized and supportive base of help in
refining a book. Anyone reading this book can know it is better because of these kind
souls: Eduard Ayguade, Cristina Beldica, Konstantin Boyarinov, José Carlos Cabaleiro
Domínguez, Brad Chamberlain, James Jen-Chang Chen, Jim Cownie, Sergey Didenko,
Alejandro (Alex) Duran, Mikhail Dvorskiy, Rudolf (Rudi) Eigenmann, George Elkoura,
Andrey Fedorov, Aleksei Fedotov, Tomás Fernández Pena, Elvis Fefey, Evgeny Fiksman,
Basilio Fraguela, Henry Gabb, José Daniel García Sánchez, Maria Jesus Garzaran,
Alexander Gerveshi, Darío Suárez Gracia, Kristina Kermanshahche, Yaniv Klein, Mark
Lubin, Anton Malakhov, Mark McLaughlin, Susan Meredith, Yeser Meziani, David
Padua, Nikita Ponomarev, Anoop Madhusoodhanan Prabha, Pablo Reble, Arch Robison,
Timmie Smith, Rubén Gran Tejero, Vasanth Tovinkere, Sergey Vinogradov, Kyle Wheeler,
and Florian Zitzelsberger.
We sincerely thank all those who helped, and we apologize to anyone who helped us
whom we failed to mention!
Mike (along with Rafa and James!) thanks all of the people who have been involved
in TBB over the years: the many developers at Intel who have left their mark on the
library, Alexey Kukanov for sharing insights as we developed this book, the open-source
contributors, the technical writers and marketing professionals that have worked on
documentation and getting the word out about TBB, the technical consulting engineers
and application engineers that have helped people best apply TBB to their problems, the
managers who have kept us all on track, and especially the users of TBB that have always
provided the feedback on the library and its features that we needed to figure out where
to go next. And most of all, Mike thanks his wife Natalie and their kids, Nick, Ali, and
Luke, for their support and patience during the nights and weekends spent on this book.
Rafa thanks his PhD students and colleagues for their feedback on making TBB
concepts gentler and more approachable: José Carlos Romero, Francisco
Corbera, Alejandro Villegas, Denisa Andreea Constantinescu, and Angeles Navarro.
He particularly thanks José Daniel García for his engrossing and informative conversations about
C++11, 14, 17, and 20; Aleksei Fedotov and Pablo Reble for helping with the
OpenCL_node examples; and especially his wife Angeles Navarro for her support and for taking
over some of his duties when he was mainly focused on the book.
James thanks his wife Susan Meredith – her patient and continuous support was
essential to making this book a possibility. Additionally, her detailed editing, which often
added so much red ink on a page that the original text was hard to find, made her one of
our valued reviewers.
As coauthors, we cannot thank each other enough. Mike and James have
known each other for years at Intel and feel fortunate to have come together on this book
project. It is difficult to adequately say how much Mike and James appreciate Rafa! How
lucky his students are to have such an energetic and knowledgeable professor! Without
Rafa, this book would have been much less lively and fun to read. Rafa’s command of
TBB made this book much better, and his command of the English language helped
correct the native English speakers (Mike and James) more than a few times. The three
of us enjoyed working on this book together, and we definitely spurred each other on to
great heights. It has been an excellent collaboration.
We thank Todd Green who initially brought us to Apress. We thank Natalie Pao, of
Apress, and John Somoza, of Intel, who cemented the terms between Intel and Apress
on this project. We appreciate the hard work by the entire Apress team through contract,
editing, and production.
Thank you all,
Mike Voss, Rafael Asenjo, and James Reinders

Preface
Think Parallel
We have aimed to make this book useful for those who are new to parallel programming
as well as those who are expert in parallel programming. We have also made this book
approachable for those who are comfortable only with C programming, as well as those
who are fluent in C++.
In order to address this diverse audience without “dumbing down” the book, we
have written this Preface to level the playing field.

What Is TBB
TBB is a solution for writing parallel programs in C++, and it has become the most
popular and extensive support for parallel programming in C++. It is widely used
for good reason. More than 10 years old, TBB has stood the test
of time and has been influential in the inclusion of parallel programming support in
the C++ standard. While C++11 made major additions for parallel programming, and
C++17 and C++2x take that even further, most of what TBB offers is much more than
what belongs in a language standard. TBB was introduced in 2006, so it contains
support for pre-C++11 compilers. We have simplified matters by taking a modern
look at TBB and assuming C++11. Common advice today is “if you don’t have a
C++11 compiler, get one.” Compared with the 2007 book on TBB, we think C++11,
with lambda support in particular, makes TBB both richer and easier to understand
and use.
TBB is simply the best way to write a parallel program in C++, and we hope to help
you be very productive in using TBB.


Organization of the Book and Preface


This book is organized into four major sections:

I. Preface: Background and fundamentals useful for understanding
the remainder of the book. Includes motivations for the TBB
parallel programming model, an introduction to parallel
programming, an introduction to locality and caches, an
introduction to vectorization (SIMD), and an introduction to
the features of C++ (beyond those in the C language) which are
supported or used by TBB.

II. Chapters 1–8: A book on TBB in its own right. Includes an
introduction to TBB sufficient to do a great deal of effective
parallel programming.

III. Chapters 9–20: Include special topics that give a deeper
understanding of TBB and parallel programming and deal with
nuances in both.

IV. Appendices A and B and Glossary: A collection of useful
information about TBB that you may find interesting,
including history (Appendix A) and a complete reference guide
(Appendix B).

Think Parallel
For those new to parallel programming, we offer this Preface to provide a foundation
that will make the remainder of the book more useful, approachable, and self-contained.
We have attempted to assume only a basic understanding of C programming and
introduce the key elements of C++ that TBB relies upon and supports. We introduce
parallel programming from a practical standpoint that emphasizes what makes parallel
programs most effective. For experienced parallel programmers, we hope this Preface
will be a quick read that provides a useful refresher on the key vocabulary and thinking
that allow us to make the most of parallel computer hardware.


After reading this Preface, you should be able to explain what it means to “Think
Parallel” in terms of decomposition, scaling, correctness, abstraction, and patterns.
You will appreciate that locality is a key concern for all parallel programming. You
will understand the philosophy of supporting task programming instead of thread
programming – a revolutionary development in parallel programming supported by TBB.
You will also understand the elements of C++ programming that are needed above and
beyond a knowledge of C in order to use TBB well.
The remainder of this Preface contains five parts:
(1) An explanation of the motivations behind TBB (begins on page xxi)

(2) An introduction to parallel programming (begins on page xxvi)

(3) An introduction to locality and caches, which we call “Locality and
the Revenge of the Caches”: the one aspect of hardware that we
feel is essential to comprehend for top performance with parallel
programming (begins on page lii)

(4) An introduction to vectorization (SIMD) (begins on page lx)

(5) An introduction to the features of C++ (beyond those in the C
language) which are supported or used by TBB (begins on page lxii)

Motivations Behind Threading Building Blocks (TBB)


TBB first appeared in 2006. It was the product of experts in parallel programming at
Intel, many of whom had decades of experience in parallel programming models,
including OpenMP. Many members of the TBB team had previously spent years helping
drive OpenMP to the great success it enjoys by developing and supporting OpenMP
implementations. Appendix A is dedicated to a deeper dive on the history of TBB and
the core concepts that go into it, including the breakthrough concept of task-stealing
schedulers.
Born in the early days of multicore processors, TBB quickly emerged as the most
popular parallel programming model for C++ programmers. TBB has evolved over its
first decade to incorporate a rich set of additions that have made it an obvious choice for
parallel programming for novices and experts alike. As an open source project, TBB has
enjoyed feedback and contributions from around the world.


TBB promotes a revolutionary idea: parallel programming should enable the
programmer to expose opportunities for parallelism without hesitation, and the
underlying programming model implementation (TBB) should map that to the hardware
at runtime.
Understanding the importance and value of TBB rests on understanding three
things: (1) we should program using tasks, not threads; (2) parallel programming
models do not need to be messy; and (3) scaling, performance, and performance
portability can be obtained with portable, low-overhead parallel programming models
such as TBB. We will dive into each of these three next because they are so important!
It is safe to say that the importance of these ideas was underestimated for a long time
before they emerged as cornerstones in our understanding of how to achieve effective,
and structured, parallel programming.

Program Using Tasks Not Threads


Parallel programming should always be done in terms of tasks, not threads. We cite an
authoritative and in-depth examination of this by Edward Lee at the end of this Preface.
In 2006, he observed that “For concurrent programming to become mainstream, we
must discard threads as a programming model.”
Parallel programming expressed with threads is an exercise in mapping an
application to the specific number of parallel execution threads on the machine we
happen to run upon. Parallel programming expressed with tasks is an exercise in
exposing opportunities for parallelism and allowing a runtime (e.g., TBB runtime)
to map tasks onto the hardware at runtime without complicating the logic of our
application.
Threads represent an execution stream that executes on a hardware thread for a
time slice and may be assigned other hardware threads for a future time slice. Parallel
programming in terms of threads fails because threads are too often used in a one-to-one
correspondence between threads (as in execution threads) and threads (as in hardware
threads, e.g., processor cores). A hardware thread is a physical capability, and the
number of hardware threads available varies from machine to machine, as do some
subtle characteristics of various thread implementations.
In contrast, tasks represent opportunities for parallelism. The ability to subdivide
tasks can be exploited, as needed, to fill available threads.


With these definitions in mind, a program written in terms of threads would have
to map each algorithm onto specific systems of hardware and software. This is not only
a distraction; it also causes a whole host of issues that make parallel programming more
difficult, less effective, and far less portable.
In contrast, a program written in terms of tasks allows a runtime mechanism, for
example, the TBB runtime, to map tasks onto the hardware that is actually present at
runtime. This removes the distraction of worrying about the number of actual hardware
threads available on a system. More importantly, in practice this is the only method
that opens up nested parallelism effectively. This is such an important capability that
we will revisit and emphasize the importance of nested parallelism in several chapters.

Composability: Parallel Programming Does Not Have to Be Messy
TBB offers composability for parallel programming, and that changes everything.
Composability means we can mix and match features of TBB without restriction. Most
notably, this includes nesting. Therefore, it makes perfect sense to have a parallel_for
inside a parallel_for loop. It is also okay for a parallel_for to call a subroutine, which
then has a parallel_for within it.
Supporting composable nested parallelism turns out to be highly desirable
because it exposes more opportunities for parallelism, and that results in more scalable
applications. OpenMP, for instance, is not composable with respect to nesting because
each level of nesting can easily cause significant overhead and consumption of resources
leading to exhaustion and program termination. This is a huge problem when you
consider that a library routine may contain parallel code, so we may experience issues
using a non-composable technique if we call the library while already doing parallelism.
No such problem exists with TBB, because it is composable. TBB solves this, in part, by
letting us expose opportunities for parallelism (tasks) while TBB decides at runtime
how to map them to hardware (threads).
This is the key benefit of coding in terms of tasks, which express available but
nonmandatory parallelism (see “relaxed sequential semantics” in Chapter 2), instead of
threads, which express mandatory parallelism. If a parallel_for were considered
mandatory, nesting would cause an explosion of threads, which in turn causes a whole
host of resource issues that can easily (and often do) crash programs when not
controlled. When parallel_for exposes available nonmandatory parallelism, the runtime
is free to use that information to match the capabilities of the machine in the most
effective manner.
We have come to expect composability in our programming languages, but most
parallel programming models have failed to preserve it (fortunately, TBB does preserve
composability!). Consider “if” and “while” statements. The C and C++ languages allow
them to freely mix and nest as we desire. Imagine this was not so, and we lived in a world
where a function called from within an if statement was forbidden to contain a while
statement! Hopefully, any suggestion of such a restriction seems almost silly. TBB brings
this type of composability to parallel programming by allowing parallel constructs to be
freely mixed and nested without restrictions, and without causing issues.

Scaling, Performance, and Quest for Performance Portability
Perhaps the most important benefit of programming with TBB is that it helps
create a performance portable application. We define performance portability as
the characteristic that allows a program to maintain a similar “percentage of peak
performance” across a variety of machines (different hardware, different operating
systems, or both). We would like to achieve a high percentage of peak performance on
many different machines without the need to change our code.
We would also like to see a 16× gain in performance on a 64-core machine vs. a
quad-core machine. For a variety of reasons, we will almost never see ideal speedup
(never say never: sometimes, due to an increase in aggregate cache size we can see more
than ideal speedup – a condition we call superlinear speedup).

WHAT IS SPEEDUP?

Speedup is formally defined to be the time to run sequentially (not in parallel) divided by the
time to run in parallel. If a program runs in 3 seconds normally, but in only 1 second on a
quad-core processor, we would say it has a speedup of 3×. Sometimes, we might speak of
efficiency, which is speedup divided by the number of processing cores. Our 3× would be 75%
efficient at using the parallelism.
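The definitions above can be written as a few lines of arithmetic. This is only a sketch: the function names speedup, efficiency, and ideal_scaling are our own and are not part of TBB.

```cpp
#include <cassert>

// Speedup: sequential run time divided by parallel run time.
double speedup(double sequential_seconds, double parallel_seconds) {
    assert(parallel_seconds > 0.0);
    return sequential_seconds / parallel_seconds;
}

// Efficiency: speedup divided by the number of processing cores used.
double efficiency(double sequential_seconds, double parallel_seconds, int cores) {
    assert(cores > 0);
    return speedup(sequential_seconds, parallel_seconds) / cores;
}

// Ideal (linear) gain when moving from one core count to a larger one.
double ideal_scaling(int from_cores, int to_cores) {
    assert(from_cores > 0);
    return static_cast<double>(to_cores) / from_cores;
}
```

With the numbers from the text, speedup(3.0, 1.0) is 3, efficiency(3.0, 1.0, 4) is 0.75, and ideal_scaling(4, 64) is 16.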

The ideal goal of a 16× gain in performance when moving from a quad-core machine
to one with 64 cores is called linear scaling or perfect scaling.


To accomplish this, we need to keep all the cores busy as we grow their
numbers – something that requires considerable available parallelism. We will dive
more into this concept of “available parallelism” starting on page xxxvii when we discuss
Amdahl’s Law and its implications.
For now, it is important to know that TBB supports high-performance programming
and helps significantly with performance portability. The high-performance support
comes because TBB introduces essentially no overhead, which allows scaling to proceed
without issue. Performance portability lets our application harness available parallelism
as new machines offer more.
In our confident claims here, we are assuming a world where dynamic task scheduling,
despite its slight additional overhead, is the most effective way to expose parallelism
and exploit it. This assumption has one fault: if we can program an application to
perfectly match the hardware, without any dynamic adjustments, we may find a few
percentage points gain in performance. Traditional High-Performance Computing
(HPC) programming, the name given to programming the world’s largest computers
for intense computations, has long had this characteristic in highly parallel scientific
computations. HPC developers who use OpenMP with static scheduling, and find that it
performs well for them, may find that the dynamic nature of TBB causes a slight
reduction in performance. Any advantage previously seen from such static scheduling is
becoming rarer for a variety of reasons. All programming, including HPC programming,
is increasing in complexity in a way that demands support for nested and dynamic
parallelism. We see this in all aspects of HPC programming as well, including
growth to multiphysics models, introduction of AI (artificial intelligence), and use of ML
(machine learning) methods. One key driver of additional complexity is the increasing
diversity of hardware, leading to heterogeneous compute capabilities within a single
machine. TBB gives us powerful options for dealing with these complexities, including
its flow graph features which we will dive into in Chapter 3.

It is clear that effective parallel programming requires a separation between
exposing parallelism in the form of tasks (programmer’s responsibility) and
mapping tasks to hardware threads (programming model implementation’s
responsibility).

Random documents with unrelated
content Scribd suggests to you:
the Molly Maguires, and get as high in their counsels as possible, in
order that he might reveal their secrets to the authorities, thereby
preventing outrages when possible and securing convictions where
he could not prevent. He was successful in both efforts. After many
months of work and peril he finally succeeded in securing sufficient
evidence to accomplish the conviction of a large number of the
members of the society, breaking down completely their customary
defense of an alibi. In all, nineteen Molly Maguires were hanged, and
a larger number imprisoned, and the power of the organization was
completely shattered.
This series of events is a remarkable illustration of the way in
which customs, and habits of thought, and standards of conduct,
which have grown up by a natural process, and are comprehensible if
not excusable in one land, may develop most alarming and
disgraceful features when transplanted to a new environment. The
essential strength of the Molly Maguires lay in that deep-seated
hatred of an informer which has become a pronounced feature of the
Irish character, as a result of the conditions to which they have been
subjected at home. Thus, while the great mass of the Irish settlers of
the anthracite region abhorred the principles and deeds of the Molly
Maguires, it was almost impossible to secure witnesses against
criminals whose identity was a matter of general knowledge, because
of the greater repugnance to the character of an informer. The
traditional hatred of the Irish peasant towards the landlord was, in
this country, diverted to the capitalist class in a wholly unreasonable
but efficient manner.
There is here, also, a striking demonstration of the capacity of a
relatively small group of turbulent and unassimilated foreigners so to
conduct themselves as to bring an undeserved disrepute upon their
whole group, and foster economic and social changes in society
which will last on long after they are all dead.[98]
While the Irish and Germans were dominating the immigration
situation on the Atlantic coast, the Chinese were occupying the
center of the stage in the west. The stream of Chinese immigration
became considerable at about the same time that the great increase
in European immigration was taking place on the other side of the
continent. As to its causes, Mrs. Mary Roberts Coolidge speaks as
follows: “The first effective contact of China with Western nations
was through the Opium War of 1840, which resulted in an increase of
Chinese taxes, a general disturbance of the laboring classes, and the
penetration of some slight knowledge of European ideas into the
maritime provinces. Although this prepared the way for the
emigration to the West, its precipitating cause lay in ‘the Golden
Romance’ that had filled the world,”—that is, the news of the
discovery of gold in California. “Masters of foreign vessels afforded
every facility to emigration, distributing placards, maps, and
pamphlets with highly colored accounts of the Golden Hills.... But
behind the opportunity afforded by foreign shipping and the
enticement of the discovery of gold lay deeper causes for emigration
—the poverty and ruin in which the inhabitants of Southeastern
China were involved by the great Taiping rebellion which began in
the summer of 1850. The terrors of war, famine, and plundering
paralyzed all industry and trade, and the agricultural classes of the
maritime districts especially were driven to Hong Kong and
Macao.”[99] By the end of 1852 there were in the neighborhood of
25,000 Chinese on the Pacific coast, almost all of them in California.
During the first few years of their coming, the Chinese in
California were welcomed, and were looked upon with favor. They
were industrious, tractable, and inoffensive, and were willing to
undertake the hard, menial, and disagreeable forms of labor—partly
work generally done by women—for which native labor was not
available under existing conditions. Their strange manners and
customs aroused nothing more than feelings of curiosity. But
gradually a feeling of opposition to them began to grow up, fomented
by the jealousy and race prejudice of the miners. Their peculiar
appearance and strange customs began to make them the objects of
suspicion and hatred. This feeling was intensified by the presence of
a large element of southerners in California, who classed all people of
dark skin—“South Americans, South Europeans, Kanakas, Malays, or
Chinese”—together as colored. Wild stories of their character and
habits began to circulate, and with each repetition gained strength
until they passed current as facts. Among these were the assertions
that the Chinese were practically all coolies, or labor slaves, that they
were highly immoral and vicious, that they had secret tribunals
which inflicted the death penalty without due process of law, that
they displaced native labor, that they could not be Christianized, that
they had no intention of remaining as permanent residents of the
country and would not assimilate with the natives, that they sent
money out of the country, etc. Most of these charges have been
proven to be either wholly false or highly exaggerated by recent
investigations, and were so recognized by the more sober and fair-
minded students of the subject at the time. But for the mass of the
people of the Pacific coast, and for many in other parts of the
country, they acquired all the force of established dogma, and their
reiteration passed for argument.
The Chinaman became the scapegoat for all the ills that afflicted
the youthful community, from whatever cause they really arose, and
in time an anti-Chinese declaration came to be essential for the
success of any political party or candidate. In such a state of public
opinion it was inevitable that their lot should be a hard one. They
were robbed, beaten, murdered, and persecuted in a variety of ways.
The foreign miners’ license tax was used against them in a
discriminating way which amounted to quasi legal plunder.
In 1876 the California State Legislature appointed a committee to
look into the matter of Chinese immigration and to make a report.
This was done in 1877, and although the resulting Address and
Memorial to Congress have had a large influence in forming public
opinion, and in shaping legislation, it appears that it was in fact a
purely political document, and that everything was arranged in
advance to secure a report which should accomplish a certain
definite result—the satisfaction of the workingmen of the state, and
the emphasizing of the necessity of federal legislation. The need of
this was strongly felt, because nearly all the acts passed by the coast
states against the Chinese had been declared either unconstitutional
or a violation of treaty.
In response to the repeated demands of the coast states for some
federal action, Congress in 1876 appointed a special committee on
Chinese immigration, which made what purported to be a thorough
investigation of the matter, and reported thereupon. The report was
wholly anti-Chinese. But this was inevitable, as it is apparent from a
careful study of the testimony, that the committee “came to its task
committed to an anti-Chinese conclusion and that it had no judicial
character whatever.”[100] The evidence was willfully distorted to
produce the desired result.
During all this time our relations with China had been nominally
subject to a series of treaties, beginning with that of 1844, and
including the famous Burlingame treaty of 1868. While the earlier
agreements did not specifically mention the rights of Chinese to
reside and trade in the United States, they were in fact allowed the
same privileges in these respects as the citizens of other nations. By
the treaty of 1868, however, the right of voluntary emigration was
definitely recognized as between the two countries on the basis of the
most favored nation; but the Chinese were not given the right of
naturalization. From this privilege they were definitely excluded by
the law of 1870.
It became evident in time that no federal legislation, satisfactory to
the politicians of the western states, could be secured under the
existing treaties. There arose accordingly a demand for a new treaty
which would allow the passage of laws which would include the
points desired by the western representatives, practically the
exclusion of all Chinese not belonging to the merchant class. In
response to this demand there was negotiated, after much conference
between the representatives of the two nations, a new treaty in 1880.
The most important feature of this new instrument was the right
conferred upon the government of the United States reasonably to
regulate, limit, or suspend, but not to prohibit, the coming or
residence of Chinese laborers, whenever it deemed that the interests
of the country demanded such action. It is under this treaty that the
various Chinese exclusion acts have been passed.
The first of these acts was passed in 1882, and provided for the
exclusion of Chinese laborers for a period of ten years. This was not
to apply to Chinese who were already in the country, or who should
enter within ninety days after the passage of the act. Such persons,
who desired to leave the country and return, were required to secure
a certificate, which by an amendatory act of 1884 was made the sole
evidence of the right of a Chinaman to return. This act also required
a certificate of the exempt classes, to be issued by the Chinese
government or such other foreign governments as they might be
subject to. The deportation of Chinese unlawfully in the country was
also provided for by these acts.
These laws were in many respects carelessly drawn and extremely
difficult of execution. In their application they entailed great expense
upon the United States government, and worked extreme hardship
and injustice to many Chinese. They were, nevertheless, effective as
regards their main purpose, for the volume of Chinese immigration
at once diminished exceedingly. The strictness of the exclusion was
increased by the act of 1888, which refused return to any Chinese
laborer unless he had a lawful wife, child, or parent in the United
States, or owned property of the value of $1000 or had debts due him
of like amount. The acts in force were extended for another ten years
by the act of 1892, and again indefinitely in 1902, in each case with
relatively unimportant modifications in detail.
This history of Chinese immigration is not a matter in which the
citizen of the United States can take much pride. Race prejudice,
bigotry, ignorance, and political ambition have played a prominent
part in the agitation, and have been instrumental in securing much of
the legislation. The attitude and conduct of the United States
contrasts unfavorably with the position of China, which has been one
of patient, courteous, dignified, but emphatic protest, and
willingness to coöperate in securing reasonable and beneficial
regulation. The boycott of 1905 has been her principal active reprisal.
In spite of these facts, however, it would be rash to assert that the
exclusion of Chinese laborers, by whatever unfortunate means
accomplished, has not been of actual benefit to the United States.
The assertion that the failure of the Chinese to assimilate has been
due more to race prejudice and exclusiveness on the part of
Americans than to unwillingness to be Americanized on the part of
the Chinese, does not do away with the fact of nonassimilation. Until
Americans are willing to fraternize on terms of social equality with
members of any race, there is great danger to national institutions in
the presence of large numbers of that race within the country.[101]
And when we reflect how enormous Chinese immigration might
easily have become in these recent years of quick and easy
transportation, and excessive activity of steamship agents, contract
labor agents, and others of their kind, it is apparent that if free
immigration had been allowed to these people of a widely diverse
race, we might now be facing a Chinese problem in this country
second in gravity only to the negro question.[102]
By the end of this period the conditions of life in America had so
changed as to diminish the general feeling of complacency toward
unlimited immigration. There was in particular a growing opposition
to contract labor, and an increased demand for federal control of the
immigration situation, especially as all state laws in regard to the
regulation of foreign immigration had been declared
unconstitutional in 1876. There was a conviction in the minds of
some thinkers that the United States no longer stood in need of an
increased labor force. These views were clearly expressed in an
article by Mr. A. B. Mason, published in 1874. Some of his statements
have a new ring. “The conditions that have hitherto greatly favored
immigration no longer exist in their full force.” “The labour market,
especially for agricultural labour, is overstocked.” “The especial
disadvantages of American labour more than counterbalance its
especial advantages.” “English labour is in the main as well off as
American labour.”[103] It is evident that the time was at hand when
the competition of the foreigner in the American labor market could
no longer be regarded with equanimity.
This sentiment did not bear fruit, however, until the year 1882.
The only federal legislation bearing on immigration after the repeal
of the favorable contract labor law in 1868 up to this date, was the act
of March 3, 1875, prohibiting the importation or immigration into
the United States of women for the purpose of prostitution, and also
prohibiting the immigration of criminals, convicted of other than
political offenses. This law, while couched in general terms, was an
outcome of the anti-Chinese agitation, and was passed with this race
particularly in mind.
CHAPTER VI
MODERN PERIOD. FEDERAL LEGISLATION

The year 1882 stands as a prominent landmark in the history of
immigration into the United States. In that year the total
immigration reached the figure of 788,992, a point which had never
been reached before and was not reached again until 1903. It
witnessed the climax of the movement from the Scandinavian
countries, and from Germany; only once since then has the
immigration from the United Kingdom reached the amount of that
year. It coincides almost exactly with the appearance of the streams
of immigration from Italy, Austria-Hungary, and Russia of sufficient
volume to command attention. In that year the first Chinese
exclusion act and the first inclusive federal immigration law were
passed. Consequently the year 1882 stands as a natural and logical
beginning of the modern period of immigration, a period during
which the immigration movement has been marked by
characteristics so peculiarly new and definite as to distinguish it
sharply from anything which went before. The discussion of
immigration during this period is in all its essentials the discussion
of a present-day problem.
One of the most distinctive and obvious characteristics of this
period has been the growth of a complicated body of federal
immigration laws. These have put the whole immigration question
on a new basis, and deserve to be considered in some detail. In the
following review, only those sections of the successive laws which
contain matter that is of general importance have been included. All
merely technical details and many of the provisions regarding
penalties and the practical administration of the laws have been
omitted.
Act of August 3, 1882. Section 1. A duty (commonly known as a
head tax) of fifty cents is to be levied for every passenger not a citizen
of the United States, who comes from any foreign port to any port of
the United States by steam or sail vessel. This duty is to be paid to
the collector of customs of the port, by the master, owner, agent, or
consignee of the vessel within twenty-four hours after entry. The
money so collected is to constitute an Immigrant Fund, to be used to
defray the expenses of regulating immigration, for the care of
immigrants, and the relief of such as are in distress, and in general
for carrying out the provisions of the act. This duty is to constitute a
lien upon the vessel until paid.
Section 2. The Secretary of the Treasury is charged with the
execution of this act, and with supervision over the business of
immigration into the United States. He is authorized to make
contracts with state boards and commissions, which are still charged
with the duty of examining ships arriving at ports of the state. Any
convict, lunatic, idiot, or any person unable to take care of himself or
herself without becoming a public charge shall not be permitted to
land.
Section 3. The Secretary of the Treasury is empowered to make
provisions to protect immigrants from fraud and loss, and to carry
out the law.
Section 4. All foreign convicts, except those convicted of political
offenses, shall be returned to the nations to which they belong and
from which they came. The expense of returning all persons not
permitted to land is to be borne by the owners of the vessel in which
they came.
Section 5. This act shall take effect immediately.
The salient points of this law are the imposition of a federal head
tax, the beginning of a list of excluded classes, the return of excluded
aliens, at the expense of the shipowners, and the assignment of the
immigration business to the Secretary of the Treasury, the actual
work of examination, however, still being done by the state boards.
The next act bearing on immigration was Section 22 of the act of
June 26, 1884, and was designed to correct a discrimination in favor
of land transportation contained in Section 1 of the act of 1882. It
provided that until the provisions of this section should be made
applicable to passengers coming into the United States by land
carriage, they should not apply to passengers coming in vessels
trading exclusively between ports of the United States and Canada
and Mexico.
Act of February 26, 1885. Section 1. It shall be “unlawful for any
person, company, partnership, or corporation, in any manner
whatsoever, to prepay the transportation, or in any way to assist or
encourage the importation or migration of any alien or aliens, any
foreigner or foreigners, into the United States, its Territories, or the
District of Columbia, under contract or agreement, parol or special,
express or implied, made previously to the importation or migration
of such alien or aliens, foreigner or foreigners, to perform labor or
service of any kind in the United States, its Territories, or the District
of Columbia.”
Section 2. All contracts of the above nature shall be void.
Section 3. Provides for a fine of $1000 for every violation of the
above provision, payable for each alien being party to such a
contract.
Section 4. The master of any vessel who knowingly brings in
contract laborers shall be fined not more than $500, and may also be
imprisoned for not more than six months.
Section 5. The following classes shall be excepted from the
provisions of the above sections: secretaries, servants, and domestics
of foreigners temporarily residing in the United States; skilled
workmen for any industry not now established in the United States,
provided that such labor cannot be otherwise obtained; actors,
artists, lecturers or singers, or persons employed strictly as personal
or domestic servants. This act shall not prevent any individual from
assisting any member of his family or any relative or personal friend
to come in for the purpose of settlement.
On February 23, 1887, there was an amendatory act passed to the
above act, specifically intrusting the Secretary of the Treasury with
the carrying out of its provisions, and providing for the return of
contract laborers in a manner similar to other excluded classes.
On October 19, 1888, the law of 1887 was amended, providing that
a person who has entered the country contrary to the contract labor
law, may be deported within one year at the expense of the owner of
the importing vessel, or if he came by land, of the person contracting
for his services.
The section containing the provision for excluding contract
laborers has been quoted verbatim to emphasize its extremely strict
and inclusive wording. It would be very difficult for any person who
had the slightest idea of what he was going to do in this country to
prove himself outside the letter of that law. The softening clauses of
the law are put in the form of exceptions, thus throwing the burden
of the proof upon the immigrant. The last amendment quoted is of
especial interest as introducing the principle of deportation after
landing.[104]
Act of March 3, 1891. Section 1. The following additions are made
to the excluded classes: paupers or persons likely to become a public
charge, persons suffering from a loathsome or a contagious disease,
polygamists, and any person whose ticket or passage is paid for with
the money of another, or who is assisted by others to come, unless it
is specifically proved that he does not belong to one of the excluded
classes, including contract laborers.
Section 3. Assisting or encouraging immigration by promise of
employment through advertising in a foreign country is declared
illegal, with the exception of the advertisements of state agencies.
Section 4. Encouragement or solicitation of immigration by
steamship or transportation companies, except by means of regular
advertisements giving an account of sailings, facilities, and terms is
declared illegal.
Section 5. The following are added to the excepted classes under
the contract labor law: ministers of any religious denomination,
persons belonging to any recognized profession, professors of
colleges and seminaries. Relatives and friends of persons in this
country are not hereafter to be excepted.
Section 6. Persons bringing in aliens not legally entitled to enter
are made liable to a fine of not more than $1000, or imprisonment
for not more than one year, or both.
Section 7. The office of Superintendent of Immigration is created,
to be under the Secretary of the Treasury.
Section 8. Shipmasters shall file with the proper officers a
manifest, giving the name, nationality, last residence, and
destination of each alien passenger. Inspection is to be made by
inspection officers before landing, or a temporary landing may be
made at a specified place. The medical examination is to be made by
surgeons of the Marine Hospital Service. During the temporary
landing, aliens are to be properly fed and cared for. Right of appeal
granted. Landing, or allowing to land, alien passengers at any other
time or place than that specified by the inspectors is made an offense
punishable by a (maximum) fine of $1000, or imprisonment for one
year, or both. The Secretary of the Treasury is empowered to
prescribe rules for the inspection of immigrants along the borders of
Canada, British Columbia, and Mexico. The duties and powers
previously vested in the state boards are now to go to the regular
inspection officers of the United States.
Section 10. All aliens who unlawfully come to the United States
are to be immediately sent back on the vessel in which they came, all
expenses in the meantime to be borne by the shipowner.
Section 11. Any alien who comes into the United States in
violation of law may be deported within one year, and any alien who
becomes a public charge within one year after landing, from causes
existing prior to this landing, may be deported. The expenses of all
deportations are to be borne by the transportation agency
responsible for bringing in the immigrant, if that is possible, and if
not, by the United States.
The items in this act particularly worthy of notice are the
following: extension of the excluded classes; prohibition of
encouraging immigration by advertising or solicitation, an attempt to
cure two serious evils, the success of which we shall have occasion to
note later; relatives and personal friends in this country no longer
excepted from the contract labor clause (this exception had almost
vitiated the former law); requirement of manifests; the complete
assumption of the work of inspection by the federal government;
extension of the principle of deportation to public charges.
Act of March 3, 1893. Section 1. Manifests greatly enlarged in
detail.
Section 2. Alien passengers are to be listed in convenient groups
of not more than thirty each, and given tickets corresponding to their
numbers on the manifests. The master of the vessel must certify that
he and the ship’s surgeon have made an examination of all the
immigrants before sailing, and believe none of them to belong to the
excluded classes.
Section 3. If the ship has no surgeon, examination must be made
by a competent surgeon hired by the transportation company.
Section 5. Immigrants who are not beyond any doubt entitled to
land are to be held for special inquiry by a board of not less than four
inspectors.
The noteworthy features in this law are examination at the expense
of the company at the port of embarkation, listing the immigrants in
groups of thirty, the institution of the boards of special inquiry.
August 18, 1894. Head tax is raised to $1.
March 2, 1895. The Superintendent of Immigration is hereafter to
be designated the Commissioner General of Immigration.
June 6, 1900. The Commissioner General of Immigration is made
responsible for the administration of the Chinese Exclusion Acts.
March 3, 1903. Section 1. The head tax is raised to $2, and is not to
apply to citizens of Canada, Cuba, or Mexico.
Section 2. The following are added to the debarred classes:
epileptics, persons who have been insane within five years previous,
persons who have had two or more attacks of insanity at any time
previously; professional beggars, anarchists, or persons who believe
in or advocate the overthrow by force or violence of the government
of the United States, or of all government or of all forms of law, or
the assassination of public officials; prostitutes, and persons who
procure or attempt to bring in prostitutes or women for the purpose
of prostitution; those who, within one year, have been deported
under the contract labor clause.
Section 3. The importation of prostitutes is forbidden under a
(maximum) penalty of five years’ imprisonment and a fine of $5000.
Section 9. The bringing in of any person afflicted with a
loathsome or a dangerous contagious disease by any person or
company, except railway lines, is forbidden. A fine of $100 is
attached if it appears that the disease might have been detected at
the time of embarkation.
Section 11. If a rejected alien is helpless from sickness, physical
disability, or infancy, and is accompanied by an alien whose
protection is required, both shall be returned in the usual way.
Section 20. The period of deportation for aliens who have come
into this country in violation of law, including those who have
become public charges within two years after landing, is raised to
two years.
Section 21. A similar provision for deportation within three years
is made for the above classes of aliens, with the exception of public
charges.
Section 24. The appointment of immigration inspectors and other
employees is put under the Civil Service rules.
Section 25. The boards of special inquiry are to consist of three
members. Either the alien or any dissenting member of the board
may appeal.
Section 39. Anarchists, etc., are not to be naturalized.
The important features of this act are the further extension of the
excluded classes; special attention and penalties with respect to
prostitutes; the period of deportation raised to two and three years.
Act of February 14, 1903. The Department of Commerce and Labor
is created, and the Commissioner General of Immigration is
transferred to it from the Treasury Department.
March 22, 1904. Newfoundland is added to the countries exempt
from the head tax.
June 29, 1906. The Bureau of Immigration is henceforth to be
called the Bureau of Immigration and Naturalization, and is to have
charge of the business of naturalization. A register is to be kept at
immigration stations, giving full information in regard to all aliens
arriving in the United States.
On February 20, 1907, there was passed an inclusive immigration
law, designed to include all of the previous laws, and repealing such
provisions of earlier laws as are not consistent with the present law.
The principal changes introduced by the new law are as follows:
Section 1. The head tax is raised to $4. It is not to be levied on
aliens who have resided for at least one year immediately preceding,
in Canada, Newfoundland, Cuba, or Mexico, nor on aliens in transit
through the United States.
Section 2. To the excluded classes are added imbeciles, feeble-
minded persons, persons afflicted with tuberculosis, persons not
included in any of the specifically excluded classes who have a
mental or physical deficiency which may affect their ability to earn a
living, persons who admit having committed a crime involving moral
turpitude, persons who admit their belief in the practice of polygamy,
women or girls coming into the United States for the purpose of
prostitution, or for any other immoral purpose, or persons who
attempt to bring in such women or girls, and all children under the
age of sixteen unaccompanied by one or both of their parents, at the
discretion of the Secretary of Commerce and Labor. Persons whose
tickets are paid for with the money of another must show
affirmatively that they were not paid for by any corporation, society,
association, municipality, or foreign government, either directly or
indirectly. This is not to apply to aliens in continuous transit through
the United States to foreign contiguous territory.
Section 3. The harboring of immoral women and girls in houses of
prostitution, or any other place for purposes of prostitution, within a
period of three years after their arrival, is made an offense
punishable in the same manner as importing them. Such women are
liable to deportation within three years.
Section 9. A fine of $100 is imposed on any person bringing in
aliens subject to any of the following disabilities: idiots, imbeciles,
epileptics, or persons afflicted with tuberculosis (or with a loathsome
or dangerous contagious disease), if these existed and might have
been detected previous to embarkation.
Section 12. It is made the duty of shipmasters taking alien
passengers out of the United States to furnish a report, before
sailing, giving the name, age, sex, nationality, residence in the United
States, occupation, and time of last arrival in the United States of
each such alien passenger.
Section 20. All deportations may be within three years.[105]
Section 25. Appeal from a decision of a board of special inquiry
may be made by the rejected alien or by any member of the board,
through the commissioner of the port and the Commissioner General
of Immigration to the Secretary of Commerce and Labor, except in
cases of tuberculosis, loathsome or dangerous contagious disease, or
mental or physical disability, as previously provided for, in which
case the decision of the board is final.
Section 26. Any alien who is not admissible because likely to
become a public charge, or because of physical disability other than
tuberculosis or loathsome or dangerous contagious disease, may be
admitted on a suitable bond against becoming a public charge.
Section 39. An Immigration Commission is to be appointed.
Section 40. The establishment of a Division of Information is
authorized. Its duty is to promote a beneficial distribution of aliens
admitted into the United States.
Section 42. Provisions regarding steerage accommodations.[106]
The especially noteworthy features of this act are the following:
further extension of the excluded classes; more stringent provisions
regarding immoral women, and their managers; the fine for bringing
in inadmissible aliens extended to other classes; the beginning of
statistics of departing aliens; appeal not allowed from the decision of
a board of special inquiry in case of mental or physical disability;
Immigration Commission authorized; Division of Information
established.
The only important addition to immigration legislation since this
act is the act of March 26, 1910, by which there were added to the
excluded and deportable classes “persons who are supported by or
receive in full or in part the proceeds of prostitution.” The three-year
limit for deportation was removed as regards sexually immoral
aliens. Closely connected with this phase of the immigration statutes
is a recent act prohibiting the importation from one state to another
of persons for the purpose of prostitution. In accordance with an act
just passed (1913) the business of immigration and naturalization
passes over to the newly created Department of Labor.
We have seen that up to 1882 practically all the federal acts
relating to immigration had to do with the regulation of steerage
conditions. Until the year 1907 these acts, which were encouraging in
tendency, were always considered as a separate body of legislation
from the real immigration laws, which were primarily restrictive in
character. In the act of that year, however, the control of the steerage
was included in the immigration law, where it logically belonged.
There had been one or two important pieces of steerage legislation
passed previous to this time which we have not as yet noticed.
The last important steerage act which has been noted was the act
of 1855. The principal law between that date and 1907 was the act of
1882. “Viewed from the standpoint of its predecessors the passenger
act of 1882 was an excellent measure. Its framers had profited by
observing the results of the legislative experiments of about sixty-two
years. This advantage, together with the marvelous development and
progress in the methods of passenger traffic, enabled the lawmakers
to draft an intelligent and comprehensive bill. By its provisions the
safety and comfort of emigrants were, theoretically at least, assured.
No deck less than 6 feet in height on any vessel was allowed to be
used for passengers. On the main deck and the deck next below 100
cubic feet of air space was allowed each passenger, and on the second
deck below the main deck 120 cubic feet was allowed each person.
Decks other than the three above mentioned were under no
circumstances to be used for passengers. With the development of
shipbuilding, however, other decks were added to ships, and this
provision soon became obsolete. Sufficient berths for all passengers
were to be provided, the dimensions of each berth to be not less than
2 feet in width and 6 feet in length, with suitable partitions dividing
them. The sexes were to be properly separated. The steerage was to
be amply supplied with fresh air by means of modern approved
ventilators. Three cooked meals, consisting of wholesome food, were
to be served regularly each day. Each ship was to have a fully
equipped modern hospital for the use of sick passengers. A
competent physician was to be in attendance and suitable medicines
were to be carried. The ship’s master was authorized to enforce such
rules and regulations as would promote habits of cleanliness and
good health. Dangerous articles, such as highly explosive substances
and powerful acids, were forbidden on board.”[107]
The above act remained in force until 1907, when it was
superseded by Section 42 of the immigration act of that year. By this
law the cubic air space system of the act of 1882 was abandoned in
favor of the superficial area system of preventing overcrowding.
Eighteen clear feet of deck space on the main deck or the deck next
below were to be provided for each passenger, and 20 feet on the
second deck below. If the height between the lower passenger deck
and the one next above was less than 7 feet, there must be 30 clear
feet of deck space per passenger. There was also provision for light
and ventilation. No passengers were to be carried on any other decks
than those mentioned.
This act was unsatisfactory, as there was much uncertainty as to
which was the main deck, inasmuch as ships with as many as eight
decks were carrying immigrants. The British law was superior in this
respect. It specified the lowest passenger deck as the one next below
the water line. All above this were denominated passenger decks.
This law required 18 clear superficial feet for each passenger carried
on the lowest passenger deck, and 15 feet for each passenger on
passenger decks. If the height of the lowest passenger deck was less
than 7 feet, or if it was not properly lighted and ventilated, there
must be 25 feet per passenger, and under similar conditions on
passenger decks, 18 feet. There must be 5 feet of superficial open
deck space for each passenger. In reckoning the space on the lowest
passenger deck and passenger decks the space occupied by the
baggage of passengers, public rooms, lavatories, and bathrooms used
exclusively by steerage passengers might be counted, provided the
actual sleeping space was not less than 15 feet on the lowest and 12
feet on the others. On December 19, 1908, the United States passed a
law making our steerage provisions correspond with the British act,
except that the last provisions are 18 feet and 15 feet respectively in
the United States law.
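The British space rule described above lends itself to a small worked calculation. The following is only an illustrative sketch in Python: the function and parameter names are hypothetical, the deck areas are invented for the example, and the additional requirements as to open deck space and minimum sleeping space are deliberately left out.

```python
# Illustrative sketch of the British per-passenger space rule as summarized
# in the text. Names and example areas are assumptions, not from the statute.

def required_sq_ft(lowest_deck: bool, substandard: bool) -> int:
    """Clear superficial feet required per passenger on one deck."""
    if lowest_deck:
        return 25 if substandard else 18
    return 18 if substandard else 15


def max_passengers(area_sq_ft: float, lowest_deck: bool,
                   height_ft: float, lit_and_ventilated: bool = True) -> int:
    """Largest whole number of passengers the deck area allows."""
    # A deck under 7 feet high, or poorly lighted or ventilated,
    # falls under the larger space requirement.
    substandard = height_ft < 7 or not lit_and_ventilated
    return int(area_sq_ft // required_sq_ft(lowest_deck, substandard))


# Hypothetical 3,600 square foot decks.
print(max_passengers(3600, lowest_deck=True, height_ft=8))     # 200
print(max_passengers(3600, lowest_deck=True, height_ft=6.5))   # 144
print(max_passengers(3600, lowest_deck=False, height_ft=8))    # 240
```

On these assumed figures, a low or ill-ventilated lowest deck carries roughly a quarter fewer passengers than a properly built one of the same area.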
In the practical application of such a complicated set of laws as
these it is inevitable that many questions and uncertainties should
arise. For the guidance of immigration officials in the performance of
their duties, a long list of rules and regulations is prescribed by the
Commissioner General. A few of these, which have an immediate
bearing on the admission of aliens, must be noted. Stowaways are
considered ipso facto inadmissible, and as a rule are not even
examined. Certain border ports are specified on the Canadian and
Mexican borders, and any alien entering at any other port is assumed
to have entered in violation of law. All aliens arriving in Canada,
destined to the United States, are inspected at one of the following
ports: Halifax, Nova Scotia; Quebec and Point Levi, Quebec; St.
John, New Brunswick; Vancouver and Victoria, British Columbia.
The United States maintains inspection stations at these points, and
aliens examined there are given a certificate stating that the alien has
been inspected and is admissible, accompanied by a personal
description for purposes of identification. Special boards of inquiry
are also established in other border cities for the examination of
aliens originally destined for Canada who later desire to be
admitted to the United States within one year after their arrival in
Canada. Aliens entering the United States by Mexican border ports
are, in general, subject to the same inspection as if arriving by a
seaport.
Aliens in transit are examined in the same manner as if desiring to
remain in the United States, and if they are found to belong to the
debarred classes they are refused permission to land. The head tax is
charged on their account, as for other aliens, but it is refunded to the
transportation company if the latter furnishes satisfactory proof that
the alien has passed by a continuous journey through the territory of
the United States, within thirty days, such proof to be furnished
within sixty days after the arrival of the alien.
Throughout the development of this body of laws certain well-
marked tendencies can be traced. In the first place, the criteria of
admission have steadily increased in severity, until now the law
provides for the exclusion of practically every class of applicants who
might fairly be considered undesirable, with the exception, perhaps,
of illiterates. Secondly, we may note a tendency to concentrate all
business connected with the admission of aliens into this country, or
into membership in the nation, in the hands of a single branch of the
federal government, together with the increasing power and importance of this
branch. Thirdly, there is manifest an increasing recognition of the
right of this country to protect itself against unwelcome additions to
its population, not only by refusing them admission, but by expelling
them from the country, if their subsequent conduct proves them
unworthy of retention.
CHAPTER VII
VOLUME AND RACIAL COMPOSITION OF THE
IMMIGRATION STREAM

As regards the volume of the immigration current, the modern
period has witnessed a continuation of the same general process
which has been going on since 1820. The same succession of crests
and depressions in the great wave has continued, the only difference
being that the apex reached a much higher point than ever before.
And, as in other periods, the great determining factor in the volume
of immigration has been the economic situation in this country.
Prosperity has always been attended by large immigration, hard
times by the reverse. As already remarked, the year 1882 was marked
by the largest annual immigration which had hitherto been recorded.
The next low-water mark was reached in the middle nineties,
following the depression of 1893. As the country recovered from this,
immigration began to increase again, and rose almost steadily until
in 1907 it reached the highest record which it has ever attained, a
grand total of 1,285,349 immigrants in one year.[108] The crisis of that
year interrupted the course of affairs, and immigration fell off
sharply, and has not since completely recovered.
There is one matter connected with the volume of immigration
which marks the last few years of the modern period and is of the
greatest importance. This is the provision for estimating the exact
net gain or loss in population each year through immigration
movements. Until very recently the only immigration figures which
were considered worth while were those of arriving aliens. It was
tacitly assumed that our immigrant traffic was a wholly one-sided
one. But gradually people began to realize that there was a large
countercurrent of departing aliens. In the Report of the
Commissioner General of Immigration for 1906 (p. 56) an effort is
made to supply as far as possible these data for the years 1890 to
1906. But in the absence of any legislation requiring shipmasters to
furnish lists of departing passengers, these figures are admittedly
incomplete, and no attempt is made to distinguish aliens from
citizens of the United States. The nearest approach that can be made
to ascertaining the number of departing aliens is to assume that all
the passengers other than cabin belonged to this class. This is
probably not very far from the truth, and taking these figures as a
guide, we can get some idea of how large the outward movement has
been at certain times, particularly during the period of commercial
depression which marked the middle nineties. Thus in 1895 while
there were 258,536 arrivals of immigrant aliens, there were 216,665
departures of the class mentioned, making a net gain of only 41,871;
in 1898 the net gain was only 98,442 against a total immigration of
229,299. Unfortunately, figures are not available for 1896–1897. The
importance of this phase of the subject eventually became so evident
that in the immigration law of 1907 a provision was included
requiring masters of departing vessels to file accurate and detailed
lists of their alien passengers, giving certain important facts
concerning them. Accordingly, in the fiscal year 1908 we have for the
first time complete and accurate data regarding departing aliens.
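The net-gain figures quoted above are simple differences between arrivals and departures, and may be checked with a short sketch in Python. The variable and function names are hypothetical; the 1898 departure figure is not stated in the text and is merely implied by the two figures that are given.

```python
# Check of the net-gain arithmetic quoted in the text.
# Departures are non-cabin passengers, used as a proxy for departing aliens.

def net_gain(arrivals: int, departures: int) -> int:
    """Net addition to population through immigration movements."""
    return arrivals - departures


# 1895: arrivals and departures as quoted.
arrivals_1895 = 258_536
departures_1895 = 216_665
print(net_gain(arrivals_1895, departures_1895))  # 41871

# 1898: the text gives total immigration and net gain only;
# the departures are derived here, not quoted.
arrivals_1898 = 229_299
net_gain_1898 = 98_442
implied_departures_1898 = arrivals_1898 - net_gain_1898
print(implied_departures_1898)  # 130857
```

In 1895, on these figures, departures amounted to more than four fifths of arrivals, which is why arrival counts alone give so misleading a picture.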
In that year another important distinction is made, that between
immigrant and nonimmigrant aliens on the inward passage, and
emigrant and nonemigrant aliens on the outward passage.
Immigrant aliens are those whose place of last permanent residence
was in some foreign country, and who are coming here with the
intention of residing permanently. Nonimmigrant aliens are of two
classes: those whose place of last permanent residence was the
United States, but who have been abroad for a short period of time,
and those whose place of last permanent residence was in a foreign
country, and who are coming to the United States without the
intention of residing permanently, including aliens in transit.
Departing aliens are classified in a corresponding way. Emigrant
aliens are those whose place of last permanent residence was the
United States, and who are going abroad with the intention of
residing there permanently. Nonemigrant aliens are of two classes:
those whose place of last permanent residence was the United States,
