Data Parallel C++: Programming Accelerated Systems Using C++ and SYCL 2nd Edition James Reinders - Quickly download the ebook in PDF format for unlimited reading

The document provides information about various eBooks available for download at ebookmeta.com, including titles related to programming, algorithms, and health research. It highlights 'Data Parallel C++: Programming Accelerated Systems Using C++ and SYCL, Second Edition' by James Reinders as a key title. Additionally, it includes links to other recommended digital products for immediate download.

Uploaded by

susansrajaui
Copyright
© All Rights Reserved
Read Anytime Anywhere Easy Ebook Downloads at ebookmeta.com

Data Parallel C++: Programming Accelerated Systems Using C++ and SYCL 2nd Edition James Reinders

https://ebookmeta.com/product/data-parallel-c-programming-accelerated-systems-using-c-and-sycl-2nd-edition-james-reinders/

Visit and Get More Ebook Downloads Instantly at https://ebookmeta.com


Recommended digital products (PDF, EPUB, MOBI) that
you can download immediately if you are interested.

Mnemonics for Radiologists and FRCR 2B Viva Preparation A Systematic Approach Aug 15 2013 _ 1908911956 _ CRC Press 1st Edition Yoong

https://ebookmeta.com/product/mnemonics-for-radiologists-and-frcr-2b-viva-preparation-a-systematic-approach-aug-15-2013-_-1908911956-_-crc-press-1st-edition-yoong/
ebookmeta.com

Modern Parallel Programming with C++ and Assembly Language Daniel Kusswurm

https://ebookmeta.com/product/modern-parallel-programming-with-c-and-assembly-language-daniel-kusswurm/

ebookmeta.com

Problem Solving in Data Structures Algorithms Using C 2nd Edition Hemant Jain

https://ebookmeta.com/product/problem-solving-in-data-structures-algorithms-using-c-2nd-edition-hemant-jain/

ebookmeta.com

Bane High Heat BBW Mountain Man Instalove 1st Edition Kelsie Calloway Calloway Kelsie

https://ebookmeta.com/product/bane-high-heat-bbw-mountain-man-instalove-1st-edition-kelsie-calloway-calloway-kelsie/

ebookmeta.com
Test-Driven Development with React: Apply Test-Driven Development in Your Applications 1st Edition Qiu

https://ebookmeta.com/product/test-driven-development-with-react-apply-test-driven-development-in-your-applications-1st-edition-qiu/

ebookmeta.com

Principles of General Organic and Biological Chemistry 11th Edition Janice Gorzynski Smith

https://ebookmeta.com/product/principles-of-general-organic-and-biological-chemistry-11th-edition-janice-gorzynski-smith/

ebookmeta.com

Seemings and the Foundations of Justification 1st Edition Blake Mcallister

https://ebookmeta.com/product/seemings-and-the-foundations-of-justification-1st-edition-blake-mcallister/

ebookmeta.com

Delivering on the Promise of Democracy Visual Case Studies in Educational Equity and Transformation 1st Edition Sukhwant Jhaj

https://ebookmeta.com/product/delivering-on-the-promise-of-democracy-visual-case-studies-in-educational-equity-and-transformation-1st-edition-sukhwant-jhaj/
ebookmeta.com

Designing and Conducting Gender Sex and Health Research 1st Edition John L Oliffe Lorraine Greaves

https://ebookmeta.com/product/designing-and-conducting-gender-sex-and-health-research-1st-edition-john-l-oliffe-lorraine-greaves/

ebookmeta.com
Burning the Big House 4th Edition Terence Dooley

https://ebookmeta.com/product/burning-the-big-house-4th-edition-terence-dooley/

ebookmeta.com
Data Parallel C++
Programming Accelerated Systems Using
C++ and SYCL

Second Edition

James Reinders
Ben Ashbaugh
James Brodman
Michael Kinsner
John Pennycook
Xinmin Tian
Foreword by Erik Lindahl, GROMACS and
Stockholm University
Data Parallel C++: Programming Accelerated Systems Using C++ and SYCL, Second Edition
James Reinders, Beaverton, OR, USA
Ben Ashbaugh, Folsom, CA, USA
James Brodman, Marlborough, MA, USA
Michael Kinsner, Halifax, NS, Canada
John Pennycook, San Jose, CA, USA
Xinmin Tian, Fremont, CA, USA

ISBN-13 (pbk): 978-1-4842-9690-5
ISBN-13 (electronic): 978-1-4842-9691-2


https://doi.org/10.1007/978-1-4842-9691-2

Copyright © 2023 by Intel Corporation


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on
microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation,
computer software, or by similar or dissimilar methodology now known or hereafter developed.
Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0
International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate
credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes
were made.
The images or other third party material in this book are included in the book’s Creative Commons license, unless indicated
otherwise in a credit line to the material. If material is not included in the book’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the
copyright holder.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of
a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the
trademark owner, with no intention of infringement of the trademark.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such,
is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Intel, the Intel logo, Intel Optane, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. OpenCL
and the OpenCL logo are trademarks of Apple Inc. in the U.S. and/or other countries. OpenMP and the OpenMP logo are
trademarks of the OpenMP Architecture Review Board in the U.S. and/or other countries. SYCL, the SYCL logo, Khronos and
the Khronos Group logo are trademarks of the Khronos Group Inc. The open source DPC++ compiler is based on a published
Khronos SYCL specification. The current conformance status of SYCL implementations can be found at
https://www.khronos.org/conformance/adopters/conformant-products/sycl.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.
Performance tests are measured using specific computer systems, components, software, operations and functions. Any change
to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you
in fully evaluating your contemplated purchases, including the performance of that product when combined with other
products. For more complete information visit https://www.intel.com/benchmarks. Performance results are based on testing
as of dates shown in configuration and may not reflect all publicly available security updates. See configuration disclosure for
details. No product or component can be absolutely secure. Intel technologies’ features and benefits depend on system
configuration and may require enabled hardware, software or service activation. Performance varies depending on system
configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at
www.intel.com.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the
authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The
publisher makes no warranty, express or implied, with respect to the material contained herein.
Managing Director, Apress Media LLC: Welmoed Spahr
Acquisitions Editor: Susan McDermott
Development Editor: James Markham
Coordinating Editor: Jessica Vakili
Distributed to the book trade worldwide by Springer Science+Business Media New York, 1 NY Plaza, New York, NY 10004.
Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit https://www.springeronline.com.
Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM
Finance Inc). SSBM Finance Inc is a Delaware corporation.
For information on translations, please e-mail booktranslations@springernature.com; for reprint, paperback, or audio rights,
please e-mail bookpermissions@springernature.com.
Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also
available for most titles. For more information, reference our Print and eBook Bulk Sales web page at
https://www.apress.com/bulk-sales.
Any source code or other supplementary material referenced by the author in this book is available to readers on the GitHub
repository: https://github.com/Apress/Data-Parallel-CPP. For more detailed information, please visit
https://www.apress.com/gp/services/source-code.
Paper in this product is recyclable
Table of Contents
About the Authors

Preface

Foreword

Acknowledgments

Chapter 1: Introduction
Read the Book, Not the Spec
SYCL 2020 and DPC++
Why Not CUDA?
Why Standard C++ with SYCL?
Getting a C++ Compiler with SYCL Support
Hello, World! and a SYCL Program Dissection
Queues and Actions
It Is All About Parallelism
Throughput
Latency
Think Parallel
Amdahl and Gustafson
Scaling
Heterogeneous Systems
Data-Parallel Programming
Key Attributes of C++ with SYCL
Single-Source
Host
Devices
Kernel Code
Asynchronous Execution
Race Conditions When We Make a Mistake
Deadlock
C++ Lambda Expressions
Functional Portability and Performance Portability
Concurrency vs. Parallelism
Summary

Chapter 2: Where Code Executes
Single-Source
Host Code
Device Code
Choosing Devices
Method#1: Run on a Device of Any Type
Queues
Binding a Queue to a Device When Any Device Will Do
Method#2: Using a CPU Device for Development, Debugging, and Deployment
Method#3: Using a GPU (or Other Accelerators)
Accelerator Devices
Device Selectors
Method#4: Using Multiple Devices
Method#5: Custom (Very Specific) Device Selection
Selection Based on Device Aspects
Selection Through a Custom Selector
Creating Work on a Device
Introducing the Task Graph
Where Is the Device Code?
Actions
Host tasks
Summary

Chapter 3: Data Management
Introduction
The Data Management Problem
Device Local vs. Device Remote
Managing Multiple Memories
Explicit Data Movement
Implicit Data Movement
Selecting the Right Strategy
USM, Buffers, and Images
Unified Shared Memory
Accessing Memory Through Pointers
USM and Data Movement
Buffers
Creating Buffers
Accessing Buffers
Access Modes
Ordering the Uses of Data
In-order Queues
Out-of-Order Queues
Choosing a Data Management Strategy
Handler Class: Key Members
Summary

Chapter 4: Expressing Parallelism
Parallelism Within Kernels
Loops vs. Kernels
Multidimensional Kernels
Overview of Language Features
Separating Kernels from Host Code
Different Forms of Parallel Kernels
Basic Data-Parallel Kernels
Understanding Basic Data-Parallel Kernels
Writing Basic Data-Parallel Kernels
Details of Basic Data-Parallel Kernels
Explicit ND-Range Kernels
Understanding Explicit ND-Range Parallel Kernels
Writing Explicit ND-Range Data-Parallel Kernels
Details of Explicit ND-Range Data-Parallel Kernels
Mapping Computation to Work-Items
One-to-One Mapping
Many-to-One Mapping
Choosing a Kernel Form
Summary

Chapter 5: Error Handling
Safety First
Types of Errors
Let’s Create Some Errors!
Synchronous Error
Asynchronous Error
Application Error Handling Strategy
Ignoring Error Handling
Synchronous Error Handling
Asynchronous Error Handling
The Asynchronous Handler
Invocation of the Handler
Errors on a Device
Summary

Chapter 6: Unified Shared Memory
Why Should We Use USM?
Allocation Types
Device Allocations
Host Allocations
Shared Allocations
Allocating Memory
What Do We Need to Know?
Multiple Styles
Deallocating Memory
Allocation Example
Data Management
Initialization
Data Movement
Queries
One More Thing
Summary

Chapter 7: Buffers
Buffers
Buffer Creation
What Can We Do with a Buffer?
Accessors
Accessor Creation
What Can We Do with an Accessor?
Summary

Chapter 8: Scheduling Kernels and Data Movement
What Is Graph Scheduling?
How Graphs Work in SYCL
Command Group Actions
How Command Groups Declare Dependences
Examples
When Are the Parts of a Command Group Executed?
Data Movement
Explicit Data Movement
Implicit Data Movement
Synchronizing with the Host
Summary

Chapter 9: Communication and Synchronization
Work-Groups and Work-Items
Building Blocks for Efficient Communication
Synchronization via Barriers
Work-Group Local Memory
Using Work-Group Barriers and Local Memory
Work-Group Barriers and Local Memory in ND-Range Kernels
Sub-Groups
Synchronization via Sub-Group Barriers
Exchanging Data Within a Sub-Group
A Full Sub-Group ND-Range Kernel Example
Group Functions and Group Algorithms
Broadcast
Votes
Shuffles
Summary

Chapter 10: Defining Kernels
Why Three Ways to Represent a Kernel?
Kernels as Lambda Expressions
Elements of a Kernel Lambda Expression
Identifying Kernel Lambda Expressions
Kernels as Named Function Objects
Elements of a Kernel Named Function Object
Kernels in Kernel Bundles
Interoperability with Other APIs
Summary

Chapter 11: Vectors and Math Arrays
The Ambiguity of Vector Types
Our Mental Model for SYCL Vector Types
Math Array (marray)
Vector (vec)
Loads and Stores
Interoperability with Backend-Native Vector Types
Swizzle Operations
How Vector Types Execute
Vectors as Convenience Types
Vectors as SIMD Types
Summary

Chapter 12: Device Information and Kernel Specialization
Is There a GPU Present?
Refining Kernel Code to Be More Prescriptive
How to Enumerate Devices and Capabilities
Aspects
Custom Device Selector
Being Curious: get_info<>
Being More Curious: Detailed Enumeration Code
Very Curious: get_info plus has()
Device Information Descriptors
Device-Specific Kernel Information Descriptors
The Specifics: Those of “Correctness”
Device Queries
Kernel Queries
The Specifics: Those of “Tuning/Optimization”
Device Queries
Kernel Queries
Runtime vs. Compile-Time Properties
Kernel Specialization
Summary

Chapter 13: Practical Tips

Getting the Code Samples and a Compiler
Online Resources
Platform Model
Multiarchitecture Binaries
Compilation Model
Contexts: Important Things to Know
Adding SYCL to Existing C++ Programs
Considerations When Using Multiple Compilers
Debugging
Debugging Deadlock and Other Synchronization Issues
Debugging Kernel Code
Debugging Runtime Failures
Queue Profiling and Resulting Timing Capabilities
Tracing and Profiling Tools Interfaces
Initializing Data and Accessing Kernel Outputs
Multiple Translation Units
Performance Implication of Multiple Translation Units
When Anonymous Lambdas Need Names
Summary


Chapter 14: Common Parallel Patterns

Understanding the Patterns
Map
Stencil
Reduction
Scan
Pack and Unpack
Using Built-In Functions and Libraries
The SYCL Reduction Library
Group Algorithms
Direct Programming
Map
Stencil
Reduction
Scan
Pack and Unpack
Summary
For More Information

Chapter 15: Programming for GPUs

Performance Caveats
How GPUs Work
GPU Building Blocks
Simpler Processors (but More of Them)
Simplified Control Logic (SIMD Instructions)
Switching Work to Hide Latency
Offloading Kernels to GPUs
SYCL Runtime Library
GPU Software Drivers


GPU Hardware
Beware the Cost of Offloading!
GPU Kernel Best Practices
Accessing Global Memory
Accessing Work-Group Local Memory
Avoiding Local Memory Entirely with Sub-Groups
Optimizing Computation Using Small Data Types
Optimizing Math Functions
Specialized Functions and Extensions
Summary
For More Information

Chapter 16: Programming for CPUs

Performance Caveats
The Basics of Multicore CPUs
The Basics of SIMD Hardware
Exploiting Thread-Level Parallelism
Thread Affinity Insight
Be Mindful of First Touch to Memory
SIMD Vectorization on CPU
Ensure SIMD Execution Legality
SIMD Masking and Cost
Avoid Array of Struct for SIMD Efficiency
Data Type Impact on SIMD Efficiency
SIMD Execution Using single_task
Summary


Chapter 17: Programming for FPGAs

Performance Caveats
How to Think About FPGAs
Pipeline Parallelism
Kernels Consume Chip “Area”
When to Use an FPGA
Lots and Lots of Work
Custom Operations or Operation Widths
Scalar Data Flow
Low Latency and Rich Connectivity
Customized Memory Systems
Running on an FPGA
Compile Times
The FPGA Emulator
FPGA Hardware Compilation Occurs “Ahead-of-Time”
Writing Kernels for FPGAs
Exposing Parallelism
Keeping the Pipeline Busy Using ND-Ranges
Pipelines Do Not Mind Data Dependences!
Spatial Pipeline Implementation of a Loop
Loop Initiation Interval
Pipes
Custom Memory Systems
Some Closing Topics
FPGA Building Blocks
Clock Frequency
Summary


Chapter 18: Libraries

Built-In Functions
Use the sycl:: Prefix with Built-In Functions
The C++ Standard Library
oneAPI DPC++ Library (oneDPL)
SYCL Execution Policy
Using oneDPL with Buffers
Using oneDPL with USM
Error Handling with SYCL Execution Policies
Summary

Chapter 19: Memory Model and Atomics

What’s in a Memory Model?
Data Races and Synchronization
Barriers and Fences
Atomic Operations
Memory Ordering
The Memory Model
The memory_order Enumeration Class
The memory_scope Enumeration Class
Querying Device Capabilities
Barriers and Fences
Atomic Operations in SYCL
Using Atomics with Buffers
Using Atomics with Unified Shared Memory
Using Atomics in Real Life
Computing a Histogram
Implementing Device-Wide Synchronization


Summary
For More Information

Chapter 20: Backend Interoperability

What Is Backend Interoperability?
When Is Backend Interoperability Useful?
Adding SYCL to an Existing Codebase
Using Existing Libraries with SYCL
Using Backend Interoperability for Kernels
Interoperability with API-Defined Kernel Objects
Interoperability with Non-SYCL Source Languages
Backend Interoperability Hints and Tips
Choosing a Device for a Specific Backend
Be Careful About Contexts!
Access Low-Level API-Specific Features
Support for Other Backends
Summary

Chapter 21: Migrating CUDA Code

Design Differences Between CUDA and SYCL
Multiple Targets vs. Single Device Targets
Aligning to C++ vs. Extending C++
Terminology Differences Between CUDA and SYCL
Similarities and Differences
Execution Model
Memory Model
Other Differences


Features in CUDA That Aren’t In SYCL… Yet!
Global Variables
Cooperative Groups
Matrix Multiplication Hardware
Porting Tools and Techniques
Migrating Code with dpct and SYCLomatic
Summary
For More Information

Epilogue: Future Direction of SYCL

Index

About the Authors
James Reinders is an Engineer at Intel Corporation with more than four
decades of experience in parallel computing and is an author/coauthor/
editor of more than ten technical books related to parallel programming.
James has a passion for system optimization and teaching. He has had the
great fortune to help make contributions to several of the world’s fastest
computers (#1 on the TOP500 list) as well as many other supercomputers
and software developer tools.

Ben Ashbaugh is a Software Architect at Intel Corporation where he has
worked for over 20 years developing software drivers and compilers for
Intel graphics products. For the past ten years, Ben has focused on parallel
programming models for general-purpose computation on graphics
processors, including SYCL and the DPC++ compiler. Ben is active in the
Khronos SYCL, OpenCL, and SPIR working groups, helping define industry
standards for parallel programming, and he has authored numerous
extensions to expose unique Intel GPU features.

James Brodman is a Principal Engineer at Intel Corporation working on
runtimes and compilers for parallel programming, and he is one of the
architects of DPC++. James has a Ph.D. in Computer Science from the
University of Illinois at Urbana-Champaign.


Michael Kinsner is a Principal Engineer at Intel Corporation developing
parallel programming languages and compilers for a variety of
architectures. Michael contributes extensively to spatial architectures and
programming models and is an Intel representative within The Khronos
Group where he works on the SYCL and OpenCL industry standards for
parallel programming. Mike has a Ph.D. in Computer Engineering from
McMaster University and is passionate about programming models that
cross architectures while still enabling performance.

John Pennycook is a Software Enabling and Optimization Architect
at Intel Corporation, focused on enabling developers to fully utilize
the parallelism available in modern processors. John is experienced
in optimizing and parallelizing applications from a range of scientific
domains, and previously served as Intel’s representative on the steering
committee for the Intel eXtreme Performance User’s Group (IXPUG).
John has a Ph.D. in Computer Science from the University of Warwick.
His research interests are varied, but a recurring theme is the ability to
achieve application “performance portability” across different hardware
architectures.

Xinmin Tian is an Intel Fellow and Compiler Architect at Intel Corporation
and serves as Intel’s representative on the OpenMP Architecture Review
Board (ARB). Xinmin has been driving OpenMP offloading, vectorization,
and parallelization compiler technologies for Intel architectures. His
current focus is on LLVM-based OpenMP offloading, SYCL/DPC++
compiler optimizations for CPUs/GPUs, and tuning HPC/AI application
performance. He has a Ph.D. in Computer Science from Tsinghua
University, holds 27 US patents, has published over 60 technical papers
with more than 1,300 citations of his work, and has coauthored two books that
span his expertise.

Preface
If you are new to parallel programming, that is okay. If you have never
heard of SYCL or the DPC++ compiler, that is also okay.
Compared with programming in CUDA, C++ with SYCL offers
portability beyond NVIDIA, and portability beyond GPUs, plus a tight
alignment to enhance modern C++ as it evolves too. C++ with SYCL offers
these advantages without sacrificing performance.
C++ with SYCL allows us to accelerate our applications by harnessing
the combined capabilities of CPUs, GPUs, FPGAs, and processing devices
of the future without being tied to any one vendor.
SYCL is an industry-driven Khronos Group standard adding
advanced support for data parallelism with C++ to exploit accelerated
(heterogeneous) systems. SYCL provides mechanisms for C++ compilers
that are highly synergistic with C++ and C++ build systems. DPC++ is an
open source compiler project based on LLVM that adds SYCL support.
All examples in this book should work with any C++ compiler supporting
SYCL 2020 including the DPC++ compiler.
If you are a C programmer who is not well versed in C++, you are in
good company. Several of the authors of this book happily share that
they picked up much of C++ by reading books that utilized C++ like this
one. With a little patience, this book should also be approachable by C
programmers with a desire to write modern C++ programs.

Second Edition
With the benefit of feedback from a growing community of SYCL users, we
have been able to add content to help learn SYCL better than ever.


This edition teaches C++ with SYCL 2020. The first edition preceded
the SYCL 2020 specification, which differed only slightly from what the
first edition taught (the most obvious changes for SYCL 2020 in this edition
are the header file location, the device selector syntax, and dropping an
explicit host device).

Important resources for updated SYCL information, including any
known book errata, include the book GitHub
(https://github.com/Apress/data-parallel-CPP), the Khronos Group SYCL
standards website (www.khronos.org/sycl), and a key SYCL
education website (https://sycl.tech).

Chapters 20 and 21 are additions encouraged by readers of the first
edition of this book.
We added Chapter 20 to discuss backend interoperability. One of
the key goals of the SYCL 2020 standard is to enable broad support for
hardware from many vendors with many architectures. This required
expanding beyond the OpenCL-only backend support of SYCL 1.2.1. While
generally “it just works,” Chapter 20 explains this in more detail for those
who find it valuable to understand and interface at this level.
For experienced CUDA programmers, we have added Chapter 21 to
explicitly connect C++ with SYCL concepts to CUDA concepts both in
terms of approach and vocabulary. While the core issues of expressing
heterogeneous parallelism are fundamentally similar, C++ with SYCL offers
many benefits because of its multivendor and multiarchitecture approach.
Chapter 21 is the only place we mention CUDA terminology; the rest of this
book teaches using C++ and SYCL terminology with its open multivendor,
multiarchitecture approaches. In Chapter 21, we strongly encourage
looking at the open source tool “SYCLomatic”
(github.com/oneapi-src/SYCLomatic), which helps automate migration of CUDA code. Because it


is helpful, we recommend it as the preferred first step in migrating code.
Developers using C++ with SYCL have been reporting strong results on
NVIDIA, AMD, and Intel GPUs, both for code ported from CUDA and for code
written originally in C++ with SYCL. The resulting C++ with SYCL
offers portability that is not possible with NVIDIA CUDA.
The evolution of C++, SYCL, and compilers including DPC++
continues. Prospects for the future are discussed in the Epilogue, after
we have taken a journey together to learn how to create programs for
heterogeneous systems using C++ with SYCL.
It is our hope that this book supports and helps grow the SYCL
community and helps promote data-parallel programming in C++
with SYCL.

Structure of This Book


This book takes us on a journey through what we need to know to be an
effective programmer of accelerated/heterogeneous systems using C++
with SYCL.

Chapters 1–4: Lay Foundations


Chapters 1–4 are important to read in order when first approaching C++
with SYCL.
Chapter 1 lays the first foundation by covering core concepts that are
either new or worth refreshing in our minds.
Chapters 2–4 lay a foundation of understanding for data-parallel
programming in C++ with SYCL. When we finish reading Chapters 1–4,
we will have a solid foundation for data-parallel programming in C++.
Chapters 1–4 build on each other and are best read in order.


Chapters 5–12: Build on Foundations


With the foundations established, Chapters 5–12 fill in vital details by
building on each other to some degree while being easy to jump between
as desired. All these chapters should be valuable to all users of C++
with SYCL.

Chapters 13–21: Tips/Advice for SYCL in Practice


These final chapters offer advice and details for specific needs. We
encourage at least skimming them all to find content that is important to
your needs.

Epilogue: Speculating on the Future


The book concludes with an Epilogue that discusses likely and potential
future directions for C++ with SYCL, and the Data Parallel C++ compiler
for SYCL.
We wish you the best as you learn to use C++ with SYCL.

Foreword
SYCL 2020 is a milestone in parallel computing. For the first time we have
a modern, stable, feature-complete, and portable open standard that can
target all types of hardware, and the book you hold in your hand is the
premier resource to learn SYCL 2020.
Computer hardware development is driven by our needs to solve
larger and more complex problems, but those hardware advances are
largely useless unless programmers like you and me have languages that
allow us to implement our ideas and exploit the power available with
reasonable effort. There are numerous examples of amazing hardware,
and the first solutions to use them have often been proprietary since it
saves time not having to bother with committees agreeing on standards.
However, in the history of computing, they have eventually always ended
up as vendor lock-in—unable to compete with open standards that allow
developers to target any hardware and share code—because ultimately the
resources of the worldwide community and ecosystem are far greater than
any individual vendor, not to mention how open software standards drive
hardware competition.
Over the last few years, my team has had the tremendous privilege
of contributing to shaping the emerging SYCL ecosystem through our
development of GROMACS, one of the world’s most widely used scientific
HPC codes. We need our code to run on every supercomputer in the
world as well as our laptops. While we cannot afford to lose performance,
we also depend on being part of a larger community where other teams
invest effort in libraries we depend on, where there are open compilers
available, and where we can recruit talent. Since the first edition of this
book, SYCL has matured into such a community; in addition to several


vendor-provided compilers, we now have a major community-driven
implementation [1] that targets all hardware, and there are thousands of
developers worldwide sharing experiences, contributing to training
events, and participating in forums. The outstanding power of open
source—whether it is an application, a compiler, or an open standard—is
that we can peek under the hood to learn, borrow, and extend. Just as we
repeatedly learn from the code in the Intel-led LLVM implementation [2],
the community-driven implementation from Heidelberg University, and
several other codes, you can use our public repository [3] to compare CUDA
and SYCL implementations in a large production codebase or borrow
solutions for your needs—because when you do so, you are helping to
further extend our community.
Perhaps surprisingly, data-parallel programming as a paradigm is
arguably far easier than classical solutions such as message-passing
communication or explicit multithreading—but it poses special challenges
to those of us who have spent decades in the old paradigms that focus on
hardware and explicit data placement. On a small scale, it was trivial for
us to explicitly decide how data is moved between a handful of processes,
but as the problem scales to thousands of units, it becomes a nightmare to
manage the complexity without introducing bugs or having the hardware
sit idle waiting for data. Data-parallel programming with SYCL solves
this by striking the balance of primarily asking us to explicitly express the
inherent parallelism of our algorithm, but once we have done that, the
compiler and drivers will mostly handle the data locality and scheduling
over tens of thousands of functional units. To be successful in data-parallel
programming, it is important not to think of a computer as a single unit
executing one program, but as a collection of units working independently

[1] Community-driven implementation from Heidelberg University: tinyurl.com/HeidelbergSYCL
[2] DPC++ compiler project: github.com/intel/llvm
[3] GROMACS: gitlab.com/gromacs/gromacs/


to solve parts of a large problem. As long as we can express our problem as
an algorithm where each part does not have dependencies on other parts,
it is in theory straightforward to implement it, for example, as a parallel
for-loop that is executed on a GPU through a device queue. However, for
more practical examples, our problems are frequently not large enough
to use an entire device efficiently, or we depend on performing tens of
thousands of iterations per second where latency in device drivers starts
to be a major bottleneck. While this book is an outstanding introduction to
performance-portable GPU programming, it goes far beyond this to show
how both throughput and latency matter for real-world applications, how
SYCL can be used to exploit unique features of CPUs, GPUs, SIMD
units, and FPGAs, but it also covers the caveats that for good performance
we need to understand and possibly adapt code to the characteristics of
each type of hardware. In doing so, it is not merely a great tutorial on
data-parallel programming, but an authoritative text that anybody interested in
programming modern computer hardware in general should read.
One of SYCL’s key strengths is the close alignment to modern C++.
This can seem daunting at first; C++ is not an easy language to fully master
(I certainly have not), but Reinders and coauthors take our hand and lead
us on a path where we only need to learn a handful of C++ concepts to get
started and be productive in actual data-parallel programming. However,
as we become more experienced, SYCL 2020 allows us to combine this
with the extreme generality of C++17 to write code that can be dynamically
targeted to different devices, or that relies on heterogeneous parallelism that
uses CPU, GPU, and network units in parallel for different tasks. SYCL is
not a separate bolted-on solution to enable accelerators but instead holds
great promise to be the general way we express data parallelism in C++.
The SYCL 2020 standard now includes several features previously only
available as vendor extensions, for example, Unified Shared Memory,
sub-groups, atomic operations, reductions, simpler accessors, and many
other concepts that make code cleaner and facilitate both development
and porting from standard C++17 or CUDA to have your code target


more diverse hardware. This book provides a wonderful and accessible


introduction to all of them, and you will also learn how SYCL is expected to
evolve together with the rapid development C++ is undergoing.
This all sounds great in theory, but how portable is SYCL in practice?
Our application is an example of a codebase that is quite challenging to
optimize since data access patterns are random, the amount of data to
process in each step is limited, we need to achieve thousands of iterations
per second, and we are limited by memory bandwidth as well as by floating-point
and integer operations; it is the extreme opposite of a simple data-parallel
problem. We spent over two decades writing assembly SIMD instructions
and native implementations for several GPU architectures, and our
very first encounters with SYCL involved both pains with adapting to
differences and reporting performance regressions to driver and compiler
developers. However, as of spring 2023, our SYCL kernels can achieve
80–100% of native performance on all GPU architectures not only from a
single codebase but even a single precompiled binary.
SYCL is still young and has a rapidly evolving ecosystem. There are
a few things not yet part of the language, but SYCL is unique as the only
performance-portable standard available that successfully targets all
modern hardware. Whether you are a beginner wanting to learn parallel
programming, an experienced developer interested in data-parallel
programming, or a maintainer needing to port 100,000 lines of proprietary
API code to an open standard, this second edition is the only book you will
need to become part of this community.

Erik Lindahl
Professor of Biophysics
Dept. Biophysics & Biochemistry
Science for Life Laboratory
Stockholm University

Acknowledgments
We have been blessed with an outpouring of community input for this
second edition of our book. Much inspiration came from interactions with
developers as they use SYCL in production, classes, tutorials, workshops,
conferences, and hackathons. SYCL deployments that include NVIDIA
hardware, in particular, have helped us enhance the inclusiveness and
practical tips in our teaching of SYCL in this second edition.
The SYCL community has grown a great deal and consists of
engineers implementing compilers and tools, and a much larger group of
users who adopt SYCL to target hardware of many types and vendors. We
are grateful for their hard work and shared insights.
We thank the Khronos SYCL Working Group that has worked diligently
to produce a highly functional specification. In particular, Ronan Keryell
has been the SYCL specification editor and a longtime vocal advocate
for SYCL.
We are in debt to the numerous people who gave us feedback from
the SYCL community in all these ways. We are also deeply grateful for
those who helped with the first edition a few years ago, many of whom we
named in the acknowledgement of that edition.
The first edition received feedback via GitHub,1 which we did review
but we were not always prompt in acknowledging (imagine six coauthors
all thinking “you did that, right?”). We did benefit a great deal from that
feedback, and we believe we have addressed all the feedback in the
samples and text for this edition. Jay Norwood was the most prolific at
commenting and helping us—a big thank you to Jay from all the authors!

1 github.com/apress/data-parallel-CPP

Other feedback contributors include Oscar Barenys, Marcel Breyer, Jeff
Donner, Laurence Field, Michael Firth, Piotr Fusik, Vincent Mierlak, and
Jason Mooneyham. Regardless of whether we recalled your name here or
not, we thank everyone who has provided feedback and helped refine our
teaching of C++ with SYCL.
For this edition, a handful of volunteers tirelessly read draft
manuscripts and provided insightful feedback for which we are incredibly
grateful. These reviewers include Aharon Abramson, Thomas Applencourt,
Rod Burns, Joe Curley, Jessica Davies, Henry Gabb, Zheming Jin, Rakshith
Krishnappa, Praveen Kundurthy, Tim Lewis, Erik Lindahl, Gregory Lueck,
Tony Mongkolsmai, Ruyman Reyes Castro, Andrew Richards, Sanjiv Shah,
Neil Trevett, and Georg Viehöver.
We all enjoy the support of our family and friends, and we cannot
thank them enough. As coauthors, we have enjoyed working as a team
challenging each other and learning together along the way. We appreciate
our collaboration with the entire Apress team in getting this book
published.
We are sure that there are more than a few people whom we have failed
to mention explicitly who have positively impacted this book project. We
thank all who helped us.
As you read this second edition, please do provide feedback if you find
any way to improve it. Feedback via GitHub can open up a conversation,
and we will update the online errata and book samples as needed.
Thank you all, and we hope you find this book invaluable in your
endeavors.

CHAPTER 1

Introduction
We have undeniably entered the age of accelerated computing. In order to
satisfy the world’s insatiable appetite for more computation, accelerated
computing drives complex simulations, AI, and much more by providing
greater performance and improved power efficiency when compared with
earlier solutions.
In what has been heralded as a “New Golden Age for Computer Architecture,”1
we are faced with enormous opportunity through a rich diversity in compute
devices. We need portable software development capabilities that are
not tied to any single vendor or architecture in order to realize the full
potential for accelerated computing.
SYCL (pronounced sickle) is an industry-driven Khronos Group
standard adding advanced support for data parallelism with C++ to
support accelerated (heterogeneous) systems. SYCL provides mechanisms
for C++ compilers to exploit accelerated (heterogeneous) systems in a way
that is highly synergistic with modern C++ and C++ build systems. SYCL is
not an acronym; SYCL is simply a name.

1 A New Golden Age for Computer Architecture by John L. Hennessy and David
A. Patterson; Communications of the ACM, February 2019, Vol. 62 No. 2,
Pages 48-60.

© Intel Corporation 2023
J. Reinders et al., Data Parallel C++, https://doi.org/10.1007/978-1-4842-9691-2_1
Chapter 1 Introduction

ACCELERATED VS. HETEROGENEOUS

These terms go together. Heterogeneous is a technical description
acknowledging the combination of compute devices that are programmed
differently. Accelerated is the motivation for adding this complexity to systems
and programming. There is no guarantee of acceleration ever; programming
heterogeneous systems will only accelerate our applications when we do it
right. This book helps teach us how to do it right!

Data parallelism in C++ with SYCL provides access to all the compute
devices in a modern accelerated (heterogeneous) system. A single C++
application can use any combination of devices—including GPUs, CPUs,
FPGAs, and application-specific integrated circuits (ASICs)—that are
suitable to the problems at hand. No proprietary, single-vendor solution
can offer us the same level of flexibility.
This book teaches us how to harness accelerated computing through
data-parallel programming using C++ with SYCL and provides practical
advice for balancing application performance, portability across compute
devices, and our own productivity as programmers. This chapter lays
the foundation by covering core concepts, including terminology, which
are critical to have fresh in our minds as we learn how to accelerate C++
programs using data parallelism.

Read the Book, Not the Spec


No one wants to be told “Go read the spec!”—specifications are hard to
read, and the SYCL specification (www.khronos.org/sycl/) is no different.
Like every great language specification, it is full of precision but is light on
motivation, usage, and teaching. This book is a “study guide” to teach C++
with SYCL.

No book can explain everything at once. Therefore, this chapter does
what no other chapter will do: the code examples contain programming
constructs that go unexplained until future chapters. We should not get
hung up on understanding the coding examples completely in Chapter 1;
we can trust that it will get better with each chapter.

SYCL 2020 and DPC++


This book teaches C++ with SYCL 2020. The first edition of this book
preceded the SYCL 2020 specification, so this edition includes updates
such as adjustments in the header file location (sycl instead of CL),
device selector syntax, and removal of an explicit host device.
DPC++ is an open source compiler project based on LLVM. It is
our hope that SYCL will eventually be supported by default in the LLVM
community and that the DPC++ project will help make that happen. The
DPC++ compiler offers broad heterogeneous support that includes GPU,
CPU, and FPGA. All examples in this book work with the DPC++ compiler
and should work with any C++ compiler supporting SYCL 2020.

Important resources for updated SYCL information, including any
known book errata, include the book GitHub
(github.com/Apress/data-parallel-CPP), the Khronos Group SYCL standards
website (www.khronos.org/sycl), and a key SYCL education website
(sycl.tech).

As of publication time, no C++ compiler claims full conformance or
compliance with the SYCL 2020 specification. Nevertheless, the code in
this book works with the DPC++ compiler and should work with other C++
compilers that have most of SYCL 2020 implemented. We use only standard
C++ with SYCL 2020, except for a few DPC++-specific extensions that
are clearly called out: to a small degree in Chapter 17 (Programming for
FPGAs), in Chapter 20 (Backend Interoperability) when connecting to Level
Zero backends, and in the Epilogue when speculating on the future.

Why Not CUDA?


Unlike CUDA, SYCL supports data parallelism in C++ for all vendors and
all types of architectures (not just GPUs). CUDA is focused on NVIDIA
GPU support only, and efforts (such as HIP/ROCm) to reuse it for GPUs
by other vendors have limited ability to succeed despite some solid
success and usefulness. With the explosion of accelerator architectures,
only SYCL offers the support we need for harnessing this diversity and
offering a multivendor/multiarchitecture approach to help with portability
that CUDA does not offer. To more deeply understand this motivation,
we highly recommend reading (or watching the video recording of their
excellent talk) “A New Golden Age for Computer Architecture” by industry
legends John L. Hennessy and David A. Patterson. We consider this a
must-read article.
Chapter 21, in addition to addressing topics useful for migrating code
from CUDA to C++ with SYCL, is valuable for those experienced with
CUDA to bridge some terminology and capability differences. The most
significant capabilities beyond CUDA come from the ability for SYCL to
support multiple vendors, multiple architectures (not just GPUs), and
multiple backends even for the same device. This flexibility answers the
question “Why not CUDA?”
SYCL does not involve any extra overhead compared with CUDA or
HIP. It is not a compatibility layer—it is a generalized approach that is open
to all devices regardless of vendor and architecture while simultaneously
being in sync with modern C++. Like other open multivendor and
multiarchitecture techniques, such as OpenMP and OpenCL, the ultimate
proof is in the implementations including options to access
hardware-specific optimizations when absolutely needed.

Why Standard C++ with SYCL?


As we will point out repeatedly, every program using SYCL is first and
foremost a C++ program. SYCL does not rely on any language changes
to C++. SYCL does take C++ programming places it cannot go without
SYCL. We have no doubt that all programming for accelerated computing
will continue to influence language standards including C++, but we do
not believe the C++ standard should (or will) evolve to displace the need
for SYCL any time soon. SYCL has a rich set of capabilities, covered
throughout this book, that extend C++ through classes and through support
for new compiler capabilities needed to meet today's demand for
multivendor and multiarchitecture support.

Getting a C++ Compiler with SYCL Support


All examples in this book compile and work with all the various
distributions of the DPC++ compiler and should compile with other C++
compilers supporting SYCL (see “SYCL Compilers in Development” at
www.khronos.org/sycl). We are careful to note the very few places where
extensions are used that are DPC++ specific at the time of publication.
The authors recommend the DPC++ compiler for a variety of reasons,
including our close association with the DPC++ compiler. DPC++ is an
open source compiler project to support SYCL. By using LLVM, the DPC++
compiler project has access to backends for numerous devices. This has
already resulted in support for Intel, NVIDIA, and AMD GPUs, numerous
CPUs, and Intel FPGAs. The ability to extend and enhance support openly
for multiple vendors and multiple architectures makes LLVM a great choice
for open source efforts to support SYCL.
Distributions of the DPC++ compiler, augmented with additional tools
and libraries, are available as part of the oneAPI project, a larger effort
to offer broad support for heterogeneous systems that includes libraries,
debuggers, and other tools. The oneAPI tools, including the DPC++
compiler, are freely available (www.oneapi.io/implementations).

1. #include <iostream>
2. #include <sycl/sycl.hpp>
3. using namespace sycl;
4.
5. const std::string secret{
6. "Ifmmp-!xpsme\"\012J(n!tpssz-!Ebwf/!"
7. "J(n!bgsbje!J!dbo(u!ep!uibu/!.!IBM\01"};
8.
9. const auto sz = secret.size();
10.
11. int main() {
12. queue q;
13.
14. char* result = malloc_shared<char>(sz, q);
15. std::memcpy(result, secret.data(), sz);
16.
17. q.parallel_for(sz, [=](auto& i) {
18. result[i] -= 1;
19. }).wait();
20.
21. std::cout << result << "\n";
22. free(result, q);
23. return 0;
24. }

Figure 1-1. Hello data-parallel programming

Hello, World! and a SYCL
Program Dissection
Figure 1-1 shows a sample SYCL program. Compiling and running it
results in the following being printed:
Hello, world! (and some additional text left to experience by running it)
We will completely understand this example by the end of Chapter 4.
Until then, we can observe the single include of <sycl/sycl.hpp> (line 2)
that is needed to define all the SYCL constructs. All SYCL constructs live
inside a namespace called sycl.

• Line 3 lets us avoid writing sycl:: over and over.

• Line 12 instantiates a queue for work requests directed
to a particular device (Chapter 2).

• Line 14 creates an allocation for data shared with the
device (Chapter 3).

• Line 15 copies the secret string into device memory,
where it will be processed by the kernel.

• Line 17 enqueues work to the device (Chapter 4).

• Line 18 is the only line of code that will run on the
device. All other code runs on the host (CPU).

Line 18 is the kernel code that we want to run on devices. That kernel
code decrements a single character. With the power of parallel_for(),
that kernel is run on each character in our secret string in order to decode
it into the result string. There is no ordering of the work required, and it is
run asynchronously relative to the main program once the parallel_for
queues the work. It is critical that there is a wait (line 19) before looking at
the result to be sure that the kernel has completed, since in this example
we are using a convenient feature (Unified Shared Memory, Chapter 6).
Without the wait, the output may occur before all the characters have been
decrypted. There is more to discuss, but that is the job of later chapters.
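
As a rough cross-check of what the kernel computes, the same per-character decrement can be applied serially on the host, with no device involved. This is only a sketch for comparison: the function name decode_on_host is hypothetical and does not appear in the book's samples.

```cpp
#include <string>

// Hypothetical host-only helper (not from the book): applies the same
// per-character step as the kernel body on line 18 of Figure 1-1.
std::string decode_on_host(std::string text) {
  for (char& c : text) {
    c -= 1;  // decrement one character, as each kernel instance does
  }
  return text;
}
```

Calling decode_on_host on the first thirteen characters of the secret string, "Ifmmp-!xpsme\"", yields "Hello, world!". Because each character is handled independently, the loop iterations could run in any order, which is the property that lets parallel_for spread them across a device.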

Queues and Actions


Chapter 2 discusses queues and actions, but we can start with a simple
explanation for now. Queues are the only connection that allows an
application to direct work to be done on a device. There are two types
of actions that can be placed into a queue: (a) code to execute and (b)
memory operations. Code to execute is expressed via either single_task
or parallel_for (used in Figure 1-1). Memory operations perform copy
operations between host and device or fill operations to initialize memory.
We only need to use memory operations if we seek more control than
what is done automatically for us. These are all discussed later in the
book starting with Chapter 2. For now, we should be aware that queues
are the connection that allows us to command a device, and we have
a set of actions available to put in queues to execute code and to move
around data. It is also very important to understand that requested actions
are placed into a queue without waiting. The host, after submitting an
action into a queue, continues to execute the program, while the device
will eventually, and asynchronously, perform the action requested via
the queue.

QUEUES CONNECT US TO DEVICES

We submit actions into queues to request computational work and data
movement.

Actions happen asynchronously.

It Is All About Parallelism


Since programming in C++ for data parallelism is all about parallelism,
let’s start with this critical concept. The goal of parallel programming is
to compute something faster. It turns out there are two aspects to this:
increased throughput and reduced latency.

Throughput
Increasing throughput of a program comes when we get more work done
in a set amount of time. Techniques like pipelining may stretch out the
time necessary to get a single work-item done, to allow overlapping of
work that leads to more work-per-unit-of-time being done. Humans
encounter this often when working together. The very act of sharing work
involves overhead to coordinate that often slows the time to do a single
item. However, the power of multiple people leads to more throughput.
Computers are no different—spreading work to more processing cores
adds overhead to each unit of work that likely results in some delays, but
the goal is to get more total work done because we have more processing
cores working together.
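
A small back-of-the-envelope model can make the pipelining trade-off concrete. The sketch below assumes an idealized pipeline in which every stage takes the same time and a new item can enter as soon as the first stage is free. The function names and the numbers in the note that follows are illustrative assumptions, not measurements.

```cpp
// Time for `items` work-items processed one after another,
// each taking `item_time`.
double serial_time(int items, double item_time) {
  return items * item_time;
}

// Time for `items` work-items in an idealized pipeline of `stages` equal
// stages: the first item needs stages * stage_time, and each later item
// finishes one stage_time after the previous one.
double pipelined_time(int items, int stages, double stage_time) {
  if (items <= 0) return 0.0;
  return (stages + items - 1) * stage_time;
}
```

Suppose an unpipelined item takes 4 time units, while a 4-stage pipeline has stages of 1.25 units each because of coordination overhead. A single item now takes pipelined_time(1, 4, 1.25) = 5 units, which is slower than before, but 100 items take pipelined_time(100, 4, 1.25) = 128.75 units instead of serial_time(100, 4.0) = 400.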

Latency
What if we want to get one thing done faster—for instance, analyzing
a voice command and formulating a response? If we only cared about
throughput, the response time might grow to be unbearable. The concept
of latency reduction requires that we break up an item of work into
pieces that can be tackled in parallel. For throughput, image processing
might assign whole images to different processing units—in this case,
our goal may be optimizing for images per second. For latency, image
processing might assign each pixel within an image to different processing
cores—in this case, our goal may be maximizing pixels per second from a
single image.
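
The difference between the two image-processing strategies is a matter of what gets divided among the processing units. The sketch below uses a hypothetical helper, split_evenly, to divide a count of items into contiguous ranges. The helper name and the image sizes mentioned afterward are assumptions for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: divide `count` items into `parts` contiguous
// [begin, end) ranges whose sizes differ by at most one.
std::vector<std::pair<std::size_t, std::size_t>> split_evenly(
    std::size_t count, std::size_t parts) {
  std::vector<std::pair<std::size_t, std::size_t>> ranges;
  std::size_t begin = 0;
  for (std::size_t p = 0; p < parts; ++p) {
    // The first (count % parts) ranges receive one extra item.
    std::size_t size = count / parts + (p < count % parts ? 1 : 0);
    ranges.emplace_back(begin, begin + size);
    begin += size;
  }
  return ranges;
}
```

For throughput, the items are whole images: split_evenly(10, 4) assigns ranges of 3, 3, 2, and 2 images to four units. For latency, the items are the pixels of one image: split_evenly(1920 * 1080, 4) gives each of four cores 518,400 pixels of the same image.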

Think Parallel
Successful parallel programmers use both techniques in their
programming. This is the beginning of our quest to Think Parallel.
We want to adjust our minds to think first about where parallelism
can be found in our algorithms and applications. We also think about how
different ways of expressing the parallelism affect the performance we
ultimately achieve. That is a lot to take in all at once. The quest to Think
Parallel becomes a lifelong journey for parallel programmers. We can learn
a few tips here.

Amdahl and Gustafson


Amdahl’s Law, stated by the supercomputer pioneer Gene Amdahl in
1967, is a formula to predict the theoretical maximum speed-up when
using multiple processors. Amdahl lamented that the maximum gain from
parallelism is limited to (1/(1-p)) where p is the fraction of the program
that runs in parallel. If we only run two-thirds of our program in parallel,
then the most that program can speed up is a factor of 3. We definitely
need that concept to sink in deeply! This happens because no matter how
fast we make that two-thirds of our program run, the other one-third still
takes the same time to complete. Even if we add 100 GPUs, we will get at most
a factor of 3 increase in performance.
For many years, some viewed this as proof that parallel computing
would not prove fruitful. In 1988, John Gustafson wrote an article titled
“Reevaluating Amdahl’s Law.” He observed that parallelism was not used
to speed up fixed workloads, but it was used to allow work to be scaled
up. Humans experience the same thing. One delivery person cannot
deliver a single package faster with the help of many more people and
trucks. However, a hundred people and trucks can deliver one hundred
packages more quickly than a single driver with a truck. Multiple drivers
will definitely increase throughput and will also generally reduce latency
for package deliveries. Amdahl’s Law tells us that a single driver cannot
deliver one package faster by adding ninety-nine more drivers with their
own trucks. Gustafson noticed the opportunity to deliver one hundred
packages faster with these extra drivers and trucks.
This emphasizes that parallelism is most useful because the size of
the problems we tackle keeps growing year after year. Parallelism would
not be nearly as important to study if year after year we only wanted to run the
same size problems faster. This quest to solve larger and larger problems
fuels our interest in exploiting data parallelism, using C++ with SYCL, for
the future of computing (heterogeneous/accelerated systems).

Christian. Body and soul he is given over to reprobation; and we
have no need to go out of our way to shelter him in any degree from
the laws of his own heretic land: a land which for centuries has
given the true faith up to persecution and injustice of every kind. Let
him take his chance. I ask you to do nothing more. The evidence is
very strong against him. No other person was seen near this
unfortunate young man. But a very short time could have elapsed
after they were remarked together, apparently in high dispute,
before this fatal occurrence took place. Other evidence may appear,
and he may be proved guilty or innocent; but, at all events, he must
be tried, and the time of that trial may be yet remote. The first cases
that will be taken will certainly be those connected with these riots,
and the only direct witness against you will be then in jail."

"But how am I to act in this business?" demanded Sir Arthur
Adelon. "As a magistrate, as the person in whose house both the
dead man and the living were staying, I shall continually be called
upon to share in the different proceedings, and my part will be a
terribly difficult one to play, my friend."

"Not in the least," answered Filmer. "You must refuse to act as a
magistrate, even should you be called upon, alleging your
acquaintance with both parties, and your natural partiality for Mr.
Dudley, on account of old friendship between his father and yourself,
as sufficient excuses. Whatever evidence you give may be highly
favourable to the accused person. The testimony against him will be
strong enough, rest assured of that."

"Then do you really think him guilty?" demanded the baronet,
gazing at the priest, with those doubts which a long acquaintance
with his character had impressed even upon the mind of a man not
very acute.

"Nay, I do not prejudge the question," replied Filmer. "As yet we
have not sufficient grounds to go upon. All I say is, the case of
suspicion is very strong; and what I would advise you to do, under
any circumstances, would be to send immediately for your nearest
neighbour, Mr. Conway, turn over the case to him, and let him judge
whether it be not necessary instantly to issue a warrant for the
apprehension of Mr. Dudley, when he returns. It were better that not
a moment were lost, for although you have probably ridden fast, it
cannot be long ere the person we suspect is here."

"Perhaps he may not return at all," said Sir Arthur. "It is more
than probable that, on foot and unarmed, he has been apprehended
as one of the rioters, but we can send, at all events." And ringing
the bell sharply, he gave the necessary orders.

"But now," continued the baronet, reverting to the topic of
greatest interest in his own mind, as soon as the servant had left the
room, "how am I to act in regard to this attack upon Barhampton?"

"We must see," replied the priest. "Should Norries be dead, or
have made his escape, you must assume a degree of boldness;
acknowledge that your views are the same in regard to general
principles as those of the unfortunate men implicated; but declare
openly that you have always opposed any recourse to physical force
in the assertion of any political opinions whatever, and bring forward
witnesses to prove that you attempted to dissuade them from all
violence, refusing to take any part therein. That will be easily done;
and should any one come forward to state that you were present at
the attack, you can show that you went thither on hearing that it
was about to take place, in order to constrain them to refrain from
executing their intentions by every means in your power."

"But how can I show that?" demanded Sir Arthur.

"We will find a way," replied Filmer; "but that can be discussed to-
morrow. I must now go out to console some of my little flock who
are suffering from affliction. In the mean time you must manage this
examination. The witnesses are the old man at the lodge, your
butler, the head footman, Brown, and the fishermen who are now
waiting in the servants' hall."

As he spoke he moved towards the door. Sir Arthur would fain
have detained him a moment to ask farther questions, but Filmer
laid his hand upon his arm, saying, "Be firm, be firm!" and left him.

CHAPTER XVIII.

At the distance of about a quarter of a mile from Clive Grange
was a group of six or seven cottages, of neat and comfortable
appearance, tenanted by labourers on Mr. Clive's own farm. They
were all respectable, hard-working people; and as Clive himself was
not without his prejudices, especially upon religious matters, he had
contrived that most of those whom he employed should be Roman
Catholics. As there were not many of that church in the part of the
country where he lived, some of these men had come from a
distance. He would not, indeed, refuse a good workman and a man
of high character on account of his being a Protestant, but he had a
natural preference for persons of his own views, and all things equal,
chose them rather than any others. This preference was known far
and wide; and consequently, when any of his distant friends wished
to recommend an honest man of the Romish creed to employment,
where they were certain to be well treated, they wrote to Mr. Clive,
so that he had rarely any difficulty in suiting himself.

In one of these cottages, at a much later hour than usual, a light
was burning on the night of which I have been speaking; and within,
over the smouldering embers of a small wood fire, sat a tall man of
the middle age, with a peculiar deep-set blue eye, fringed with dark
lashes, which is very frequently to be found amongst the Milesian
race. His figure was bent, and his hands stretched out over the
smouldering hearth to gain any little heat that it gave out; and, as
he thus sat, his eyes were bent upon the red sparks amongst the
white ashes, with a grave, contemplative gaze. He seemed dull, and
somewhat melancholy, and from time to time muttered a few words
to himself with the peculiar tone of his countrymen.

"Ay-e!" he said, as something struck him in the half-extinguished
fire, "that one's gone out too. If the priest stays much longer they'll
all be out, one after the other. Well, it's little matter for that; we
must all go out some time or another, and very often when we think
we are burning brightest. That young lad now, I dare say, when he
went out for his walk, never fancied his neck would be broke before
he came home again. Sorrow a bit! He got what he deserved
anyhow, and I'd ha' done it for him if the master hadn't--Hist! That
must be the priest's step coming down the hill. He is the only man
likely to be out so late in this country, and going with such a slow
step, though the lads are having a bit of a shindy to-night they tell
me."

The next moment the latch was lifted, the door opened, and Mr.
Filmer walked in. The labourer instantly rose and placed a wooden
chair for his pastor by the side of the fire, saying, "Good night, your
reverence! It's mighty cold this afternoon."

"I don't find it so," answered Filmer; "but I dare say you do,
sitting all alone here, with but a little spark like that. I was afraid you
would get tired of waiting, and go to bed. I am much obliged to you
for sitting up as I told you."

"Oh! in course I did as your reverence said," answered Daniel
Connor. "I always obey my priest."
"That's right, Dan," answered Mr. Filmer. "Now I have come to tell
you what I want you to do, like a good lad."

"Anything your reverence says, I am quite ready to do," replied
the Irishman. "I kept the matter quite quiet as you said, and not a
bare word about it passed my lips to any of the servants, for I am
not going to say anything that can hurt the master, for a better
never lived than he."

"No, Dan," answered the priest; "but I'll tell you what you must
do, you must say a word or two to serve him." And Filmer fixed his
eyes keenly upon the man's face, which brightened up in a moment
with a very shrewd and merry smile, as he replied, "That I'll do with
all my heart, your reverence. It's but the telling me what to say and
I'll say it."

"Well then, you see, Dan," continued Filmer, "this is likely to be a
bad business for Mr. Clive, if we do not manage very skilfully. He is
somewhat obstinate himself, and might with difficulty be persuaded
to take the line of defence we want, and which indeed is necessary
to his own safety. Now the first thing that will take place here is the
coroner's inquest."

"Ay! I suppose so," said Connor; "but they shan't get anything out
of me there, I can answer for it. I can be as blind as a mole when I
like, and as deaf too."

"But you must be somewhat more, Dan," was the priest's reply.
"You see, if suspicion fixes to no one, and the jury bring in a verdict
of wilful murder against some person or persons unknown, the
magistrates will never leave inquiring into the matter till they fix it
upon your poor master. What we must do must be to turn the first
suspicions upon some one else, so as to keep Mr. Clive free of them
altogether, and then he will be safe enough."

"Won't that be something very like murder, your reverence?"
asked Connor, abruptly, with a very grave face. "I never did the like
of that, and I think it's a sin, is it not?"

"The sin be upon me," answered Filmer, sternly. "Cannot I absolve
you, Daniel Connor, for that which I bid you do? Are you going to
turn heretic too? Do you doubt that the church has power to absolve
you from your sins, or that where she points out the course to you
the end does not justify the means?"

"Oh, no! the blessed saints forbid!" exclaimed Connor, eagerly. "I
don't doubt a word of it; I am quite sure your reverence is right; I
was only just asking you, like!"

"Oh! if that's all," answered Mr. Filmer, "and you are not beginning
to feel scandalous doubts from living so long amongst a number of
heretics all about, I will answer your question plainly. It is not at all
like murder, nor will there be any sin in it. The person who is likely to
be suspected will be able easily to clear himself in the end; so that
he runs no risk of anything but a short imprisonment, which may
perhaps turn to the good of his soul, for I shall not fail to visit him,
and show him the way to the true light. But in the mean time, Mr.
Clive will be saved from all danger; and if you look at the matter as
a true son of the church, you will see that there is no choice
between a believer like Mr. Clive and an obstinate heretic and
unbeliever like this other man."

"Oh! if it is a heretic!" exclaimed Connor, with a laugh, "that quite
alters the matter; I didn't know he was a heretic."

"You do not suppose, I hope," replied Mr. Filmer, "that I would
have proposed such a thing if he was not. All my children are equally
dear to me, be they high or low, and I would not peril one to save
another."

"Well, your reverence, I am quite ready to do whatever you say,"
answered Connor; "and if you just give me a thought of the right
way I'll walk along it as straight as a line."

"The case is this, then," rejoined the priest; "there was a quarrel
between this young lord and a Mr. Dudley, which went on more or
less through the whole of this day. Dudley went out about eight
o'clock, and Lord Hadley followed him and overtook him, and they
went on quarrelling by the way. Very soon after that the young lord
met with his death. Now men will naturally think that Mr. Dudley
killed him, for no one but you and your master and Miss Clive saw
him after, till he was speechless. What you must do then is this:--
when you hear that the coroner's inquest is sitting, you must come
up and offer to give evidence; and you must tell them exactly where
you were standing when the young lord came up to the top of the
cliff; and then you must say that you saw a man come up to him,
and a quarrel take place, and two or three blows struck, and the
unhappy lad pitched over the cliff."

"And not a word about Miss Helen?" said the man.

"Not a word," answered Filmer. "Keep yourself solely to the fact of
having seen a man of gentlemanly appearance----"

"Oh! he is a gentleman, every inch of him," exclaimed Connor.
"No doubt about that, your reverence."

"So you can state," continued the priest; "but take care not to
enter too much into detail. Say you saw him but indistinctly."

"That's true enough," cried the labourer; "for it was a darkish
night, and I was low down in the glen and he high up on the side of
the hill, so that I caught but a glimmer of him, as it were. But it was
the master, notwithstanding, that I am quite sure of, or else the devil
in his likeness. But, by the blessed saints! I do not think it could be
the devil either, for he did what any man would have done in his
place, and what I should have done in another minute if he hadn't
come up, for I would not have stood by to see the young lady ill-
treated, no how."

"Doubtless not," answered the priest; "and it would be hard that
the life of such a man should be sacrificed for merely defending his
own child."

"Oh, no! that shall never be," answered Connor, "if my word can
stop it; and so, father," he continued, with a shrewd look, "I suppose
that the best thing I can do is, if I am asked any questions, to say
that I didn't rightly see the gentleman that did it; but that he looked
like a real gentleman, and may be about the height of this Mr.
Dudley. I saw him twice at the farmhouse, and if he is in the room, I
can point him out as being about the tallness of the man I saw; and
that's not a lie either, for they are much alike, in length at least.
Neither one nor the other stands much under six feet. I'd better not
swear to him, however, for that would be bad work."

"By no means," answered the priest. "Keep to mere general facts;
that can but cause suspicion. I wish not to injure the young man,
but merely to turn suspicion upon him rather than Mr. Clive; and by
so doing, to give even Mr. Dudley himself a sort of involuntary
penance, which may soften an obdurate heart towards the church
which his fathers foolishly abandoned, and leave him one more
chance of salvation, if he chooses to accept of it. It is a hard thing,
Daniel Connor, to remain for many thousands of years in the flames
of purgatory, where every moment is marked and prolonged by
torture indescribable, instead of entering into eternal beatitude,
where all sense of time is lost in inexpressible joy from everlasting to
everlasting; but it is a still harder thing to be doomed in hell to
eternal punishment, where the whole wrath and indignation of God
is poured out upon the head of the unrepenting and the obstinate
for ever and ever."

"It is mighty hard, indeed!" answered the labourer, making the
sign of the cross. "The Blessed Virgin keep us all from such luck as
that!"

"It is from that I wish to save him," rejoined Mr. Filmer; "but his
heart must first be humbled, for you know very well, Daniel, that
pride is the source of unbelief in the minds of all these heretics.
They judge their own opinions to be far better than the dogmas of
the church, the decisions of councils, or the exposition of the
fathers; and by the same sin which caused the fall of the angels,
they have also fallen from the faith. Let no true son of the church
follow their bad example; but knowing that all things are a matter of
faith, and that the church is the interpreter mentioned in Scripture,
submit their human and fallible reason implicitly to that high and
holy authority which is vested in the successor of the Apostle and
the Councils of the Church, where they will find the only infallible
guide."

"Oh! but I'll do that, certainly," replied Connor, eagerly; and yet a
shade of doubt seemed to hang upon him, for he added, the
moment after, "But you know, your reverence, that when they swear
me they will make me swear to tell the whole truth, and if I do not
say that I know it was Mr. Clive, it will be false swearing."

"Heed not that," answered Filmer, with a frown. "Have I not told
you that I will absolve you, and do absolve you? Besides, how can
you swear to that which you only believe, but do not exactly know?
You told me this evening, up at the hall, that you did not see your
master's face when he struck the blow."

"Ah! but I saw his face well enough when he was going up,"
replied the labourer.

"That does not prove that he was the same who did the deed,"
said Filmer. "Another might have suddenly come there, without your
perceiving how."

"He was mighty like the master, any how," said the man, in a low
tone; "but I'll say just what your reverence bids me."

"Do so," answered Filmer, turning to leave the cottage; "the
church speaks by my voice, and accursed be all who disobey her!"

The stern earnestness with which he spoke; the undoubting
confidence which his words and looks displayed in his power, as a
priest of that church which pretends to hold the ultimate fate of all
beings in its hands; his own apparent faith in that vast and
blasphemous pretension; had their full effect upon his auditor, who,
though a good man, a shrewd man, and not altogether an
unenlightened man, had sucked in such doctrines with his mother's
milk, so that they became, as it were, a part of his very nature. "To
be sure I will obey," said Connor; "it is no sin of mine if any harm
comes of it. That's the priest's affair, any how." And he retired to his
bed.

CHAPTER XIX.

Father Peter turned away to the right, and walked on; for he had
yet work to do, and a somewhat different part to play before the
night was done. The versatility of the genius of the Roman church is
one of its most dangerous qualities. The principle that the end
justifies the means, makes it seem right to those who hold such a
doctrine, to 'be all things to all men,' in a very different sense from
that of the apostle. Five minutes brought Mr. Filmer to the door of
the Grange, and he looked over that side of the house for a light,
but in vain. One of the large dogs came and fawned upon him, and
all the rest were silent; for it is wonderful how soon and easily he
accustomed all creatures to his influence. His slow, quiet, yet firm
footfall was known amongst those animals as well as their master's
or Edgar Adelon's, and at two or three hundred yards they had
recognised it.

After a moment's consideration, Filmer rang the bell gently, and
the next instant Clive himself appeared with a light in his hand. He
was fully dressed, and his face was grave and composed. "Ah,
father!" he said, as soon as he perceived who his visitor was, "this is
kind of you. Come in. Helen has not gone to bed yet."

"I am glad to hear it, my son," replied Filmer, "for I want to speak
a few words with you both." Thus saying, he walked on before Mr.
Clive into the room where Helen Clive usually sat. He found her with
her eyes no longer tearful, but red with weeping; and seating
himself with a kindly manner beside her, he said, "Grieve not, my
dear child, whatever has happened. There is consolation for all who
believe."

"But you know not yet, father, what has happened," answered
Helen, with a glance at her father: "you will know soon, however."

"I do know what has happened, Helen," said the priest; "though
not all the particulars; and I have come down at once to give you
comfort and advice. Tell me, my son, how did this sad event occur?"

"It is soon rumoured, it would seem, then," observed Clive, in a
gloomy tone. "I told you, Helen, that concealment was hopeless,
though we thought no eye saw it but our own, and that of Him who
saw all, and would judge the provocation as well as the
punishment."

"Concealment is not hopeless, my son," replied Filmer, "if
concealment should be needful, as I fear it is. Only one person saw
you, and he came at once to tell me, and bring me down to comfort
you; for he is a faithful child of our holy mother the church, and will
betray no man. But tell me all, Clive. Am I not your friend as well as
your pastor?"

"Tell him, Helen--tell the good father," said Clive, seating himself
at the table, and leaning his head upon his hand. "I have no heart to
speak of it."

The priest turned his eyes to Helen, who immediately took up the
tale which her father was unwilling to tell. "I believe I am myself to
blame," she said, in a low, sweet tone; "though God knows I
thought not of what would follow when I went out. But I must tell
you why I did so. My father and I had been talking all the evening of
the wild and troubled state of the country, and of what was likely to
take place at Barhampton tonight."

"It has taken place," replied Father Filmer; "the magistrates were
prepared for the rioters; the troops have been in amongst the
people, and many a precious life has been lost."

"It was what we feared," continued Helen, sadly. "Alas! that men
will do such wild and lawless things. But about that very tumult my
father was anxious and uneasy, and towards half-past six he went
out to see if he could meet my uncle Norries as he went, and at all
events to look out from the top of the downs towards Barhampton.
He promised me that he would on no account go farther than the
old wall, and that he would be back in half an hour. But more than
an hour passed, and I grew frightened, till at last I sent up Daniel
Connor to see if he could find my father. He seemed long, though
perhaps he was not, and I then resolved to go myself. I had no fear
at all; for I had never heard of Lord Hadley being out at night, and I
thought he would be at the dinner-table, and I quite safe--safer,
indeed, than in the day. I was only anxious for my father, and for
him I was very anxious. However, I walked on fast, and soon came
to the downs, but I could see no one, and taking the slanting path
up the slope, I came just to the edge of the cliff, and looked out
over the sea to Barhampton Head. There was nothing to be seen
there, and only a light in a ship at sea. That made me more
frightened than ever, for I had felt sure that I should find my father
there; and thinking that he might have sat down somewhere to wait,
I called him aloud, to beg he would come home. There was no
answer, but I heard a step coming up the path which runs between
the two slopes, and then goes down over the lower broken part of
the cliff to the sea-shore; and feeling sure that it was either my
father, or Connor, or one of the boatmen, who would not have hurt
me for the world, I was just turning to go down that way when Lord
Hadley sprang up the bank, and caught hold of me by the hand. I
besought him to let me go, and then I was very frightened indeed,
so that I hardly knew, or know, what I said or did. All I am sure of is
that he tried to persuade me to go away with him to France; and he
told me there was a ship for that country out there at sea, and its
boat with the boatmen down upon the shore, for he had spoken to
them in the morning. He said a great deal that I forget, telling me
that he would marry me as soon as we arrived in France; but I was
very angry--too angry, indeed--and what I said in reply seemed to
make him quite furious, for he swore that I should go, with a terrible
oath. I tried to get away, but he kept hold of my hand, and threw his
other arm round me, and was dragging me away down the path
towards the sea-shore, when suddenly my father came up and
struck him. I had not been able to resist much, on account of my
broken arm, but the moment my father came up he let me go, and
returned the blow he had received. We were then close upon the
edge of the cliff, and there is, if you recollect, a low railing, where
the path begins to descend. My father struck him again and again,
and at last he fell back against the railing, which broke, I think,
under his weight, and oh! father, I saw him fall headlong over the
cliff. I thought I should have died at that moment, and before I
recovered myself my father had taken me by the hand and was
leading me away. When we had got a hundred yards or two, I
stopped, and asked if it would not be better to go or send down to
the sea-shore, to see if some help could not be rendered to him. My
father said he had heard the boatmen come to assist him, and that
was enough."

Clive had covered his eyes with his hand while Helen spoke; but
at her last words he looked up, saying, in a stern tone, "Quite
enough! He well deserved what he has met with. I did not intend it,
it is true; but whether he be dead or living, he has only had the
chastisement he merited. I had heard but an hour or two before all
his base conduct to this dear child--I had heard that he had
outraged, insulted, persecuted her; and although I had promised
Norries not to kill him, yet I had resolved, the first time I met with
him, to flay him alive with my horsewhip. I found him again insulting
her; and can any man say I did wrong to punish the base villain on
the spot? I regret it not; I would do it again, be the consequences
what they may; and so I will tell judge and jury whenever I am
called upon to speak."

"I trust that may never be, my son," replied the priest, looking at
him with an expression of melancholy interest; "and I doubt not at
all that, if you follow the advice which I will give you, suspicion will
never even attach to you."

"I shall be very happy, father, to hear your advice," answered
Clive; "but I have no great fears of any evil consequences. People
cannot blame me for striking a man who was insulting and seeking
to wrong my child. I did but defend my own blood and her honour,
and there is no crime in that."

"People often make a crime where there is none, Clive," answered
Mr. Filmer. "This young man is dead, and you must recollect that he
was a peer of England."

"That makes no difference," exclaimed Clive. "Thank God we do
not live in a land where the peer can do wrong any more than the
peasant! I am sorry he is dead, for I did not intend to kill him; but
he well deserved his death, and his station makes no difference."

"None in the eye of the law," replied Mr. Filmer, gravely; "but it
may make much in the ear of a jury. I know these things well, Clive;
and depend upon it, that if this matter should come before a court
of justice at the present time, especially when such wild acts have
been committed by the people, you are lost. In the first place, you
cannot prove the very defence you make----"

"Why, my child was there, and saw it all!" cried Clive, interrupting
him.

"Her evidence would go for very little," answered the priest; "and
as I know you would not deny having done it, your own candour
would ruin you. The best view that a jury would take of your case,
even supposing them not to be worked upon by the rank of the dead
man, could only produce a verdict of manslaughter, which would
send you for life to a penal colony, to labour like a slave, perhaps in
chains."

Clive started, and gazed anxiously in his face, as if that view of
the case were new to him. "Better die than that!" he said; "better
die than that!"

"Assuredly," replied Mr. Filmer. "But why should you run the risk of
either? I tell you, if you will follow my advice, you shall pass without
suspicion." But Clive waved his hand almost impatiently, saying,
"Impossible, father, impossible! I am not a man who can set a guard
upon his lips; and I should say things from time to time which would
soon lead men to see and know who it was that did it. I could not
converse with any of my neighbours here without betraying myself."

"Then you must go away for a time," answered Filmer. "That was
the very advice I was going to give you. If you act with decision, and
leave the country for a short time, I will be answerable for your
remaining free from even a doubt."

"The very way to bring doubt upon myself," answered Clive, with
a short, bitter laugh. "Would not every one ask why Clive ran away?"

"The answer would then be simple," said the priest, "namely, that
he went, probably, because he had engaged with his brother-in-law,
Norries, in these rash schemes against the government which have
been so signally frustrated this night at Barhampton."

"One crime instead of another!" answered Clive, gloomily, bending
down his brow upon his hands again.

"With this difference," continued Mr. Filmer, "that the one will be
soon and easily pardoned, the other never; that for the one you
cannot be pursued into another land, that for the other you would
be pursued and taken; that the one brings no disgrace upon your
name, that the other blasts you as a felon, leaves a stain upon your
child, deprives her of a parent, ruins her happiness for ever."

"Oh fly, father, fly!" cried Helen. "Save yourself from such a
horrible fate!"

"What! and leave you here unprotected!" exclaimed Clive.

"Oh no! let me go with you!" cried Helen.

"Of course," said the priest. "You cannot, and you must not go
alone. Take Helen with you, and be sure that her devotion towards
you will but increase and strengthen that strong affection which she
has inspired in one worthy of her, and of whom she is worthy. I have
promised you, Clive, or rather I should say, I have assured you, that
your daughter shall be the wife of him she loves, ay, with his father's
full consent. If you follow my advice, it shall be so; but do not
suppose that Sir Arthur would ever suffer his son to marry the
daughter of a convict. As it is, he knows that your blood is as good
as his own, and that the only real difference is in fortune; but with a
tainted name the case would be very different. There would be an
insurmountable bar against their union, and you would make her
whole life wretched, as well as cast away your own happiness for
ever."

"But how can I fly?" asked Clive. "The whole thing will be known
to-morrow, and ere I reached London I should be pursued and
taken."

"There is a shorter way than that," answered Filmer, "and one
that cannot fail."

"The French ship!" cried Helen, with a look of joy.

"Even so," rejoined the priest; "she will sail in a few hours. You
have nothing to do but send down what things you need as fast as
possible, get one of the boats to row you out, embark, and you are
safe. I will give you letters to a friend in Brittany, who will show you
all kindness, and you can remain there at peace till I tell you that
you may safely return."

Clive paused, and seemed to hesitate for a moment or two; but
Helen gazed imploringly in his face, and at length he threw his arms
around her, saying, "I will go, my child; I have no right to make you
wretched also. Were it for myself alone, nothing should make me
run away; but now nothing must induce me to sacrifice you. Go,
Helen; get ready quickly. Perhaps they may think that I have had
some share in this tumult, and suspicion pass away in that manner."

"Undoubtedly they will," rejoined Mr. Filmer; "and I will take care
to give suspicion that direction. Be quick, Helen: but do you not
need some one to aid you?"

"I will get the girl Margaret," said Helen Clive, "for I am very
helpless." And closing the door, she departed.

"What shall I do with the farm?" inquired Clive, as soon as she
was gone. "I fear everything will go to ruin."

"Not so, not so," answered Mr. Filmer, cheerfully. "I will see that it
is well attended to; and though, perhaps, something may go wrong,
against which nothing but the owner's eye can secure, yet nothing
like ruin shall take place. And now, hasten away, Clive, and make
your own preparations. No time is to be lost; for if the people on
board the ship learn that the attack upon Barhampton has failed,
they may perhaps put to sea sooner than the hour they had
appointed. I will write the letter while you are getting ready, and I
will go down with you to the beach, and see you off."

About three quarters of an hour passed in some hurry and
confusion, ere Clive and his daughter were prepared to set out. The
priest's letter was written and sealed; a man was called up to wheel
some boxes and trunks down to the shore; and various orders and
directions were given for the management of the farm during Clive's
absence. The servants seemed astonished, but asked no questions;
and Mr. Filmer skilfully let drop some words which, when
remembered at an after period, might connect the flight of Mr. Clive
with the mad attempt upon the town of Barhampton. When all was
completed, they set forth on foot, passing through the narrow lanes
in the neighbourhood of the house, till they reached and crossed the
high road, and then, following one of the little dells through the
downs, descended by a somewhat rugged path to the sea-side.
Some of the boatmen were already up, preparing to put to sea; and
as Clive had often been a friend to all of them, no difficulty was
made in fulfilling his desire. The sea was as calm as a small lake;
and though the water was too low to launch one of their large boats
easily, yet a small one was pushed over the sands, and Helen and
her father stood beside it, ready to embark, when a quick step,
running over the beach, was heard, and Mr. Filmer exclaimed,
"Quick, quick, into the boat, and put off!"

"That is Edgar's foot," said Helen, hanging back. "Oh, let me wait,
and bid him adieu! I know it is Edgar's foot!"

"The ear of love is quick," said Mr. Filmer. "I did not recognise it;"
and in another moment Edgar Adelon stood beside them.

"I have been to the house," he said, "and they told me where to
seek you."

"We are forced to go away for a time by some unpleasant
circumstances, Mr. Adelon," said Clive, gravely.

"I know--I know it all," answered Edgar, quickly. "I watched the
whole attack from the hill. It was a strange, ghastly sight, and I will
not stop you, Mr. Clive, for it would be ruin to stay; but let me speak
one word to dear Helen--but one word, and I will not keep you."

The father made no opposition; he knew what it was to love well,
and he would not withhold the small drop of consolation from the
bitter cup of parting. Edgar drew the fair girl a few steps aside, and
they spoke together earnestly for a few minutes. He then pressed
her hand affectionately in his, and each repeated "For ever!" Then
leading her back towards the boat, against the sides of which the
water was now rising, he shook Clive's hand warmly, saying, "God
bless and protect you! Let me put her in the boat." And before any
one could answer, he had lifted Helen tenderly in his arms, walked
with her into the shallow water, and placed her in the little bark.
Clive followed, after another word or two with Mr. Filmer; the
boatmen pushed off, and the prow went glittering through the
waves. Edgar Adelon stood and gazed, till Mr. Filmer touched him on
the arm, saying, "Come, my son;" and then, with a deep sigh, the
young man followed him towards the cliffs.

"I must go back to the Grange for my horse," said Edgar, as the
priest was turning along the high road towards Brandon.

"Better send for it," said Mr. Filmer. "Your father has returned, and
may inquire for you."

"It is strange," said Edgar, following him. "I could have sworn I
saw his tall bay hunter among the people at Barhampton."

"You might well be mistaken," answered Mr. Filmer; "but whatever
you saw, Edgar, take my advice, and say to no one that you saw
anything--no, not to Eda."

Edgar did not reply, and the rest of their walk passed in silence till
they reached the gates of the park. They were open, and a man was
standing at the lodge door, with whom the priest paused to speak
for an instant, while Edgar, at his request, walked on. Mr. Filmer
overtook the young man ere he had gone a hundred yards, and as
they approached the house, he said, "You had better go straight to
your room, and to bed, Edgar. Unpleasant things have happened.
Eda has retired, your father has another magistrate with him, and
neither your presence nor mine will be agreeable."

"To my own room, certainly," answered Edgar Adelon; "but not to
bed, nor to sleep, father. I have need of thought more than rest;"
and when the door was opened, he passed straight through the hall,
taking a light from the servant, and mounting the stairs towards his
own room.

CHAPTER XX.

We must now return for a short time to Mr. Dudley, having
brought up many of the other personages connected with this tale
nearly to the same point at which we last left himself. As soon as he
had entered the lodge in the custody of the two constables, he
demanded in a calm tone to see their warrant, entertaining but little
doubt that he had been apprehended for taking some share in the
riots of which he had been a witness, and that the ignorance of the
men who held him in custody had occasioned the use of such very
vague and unsatisfactory terms as 'murder or manslaughter, as the
case may be.' What was his astonishment, however, when he read
as follows:--

"To the Constable of the Hundred of ----, in the County of ----,
and all the other Peace Officers of the same County.

"Forasmuch as Patrick Ferrars, of the parish of Brandon, in the
said county, servant, hath this day made information before me,
Stephen Conway, Esquire, one of her Majesty's justices of the peace,
in and for the said county, that he hath just cause to suspect, and
doth suspect, that Edward Dudley, Esquire, on the ---- day of ----, in
the year of our Lord 18--, at or near the place called Clive Down, in
the said parish of Brandon, in the said county, feloniously, wilfully,
and of his malice aforethought, did kill and murder Henry Lord
Hadley, by striking him sundry blows, and throwing him over the cliff
at the said place, by which the said Lord Hadley instantly died: these
are therefore to command you, or one of you, in her Majesty's
name, forthwith to apprehend and bring before me, or some other
of her Majesty's justices of the peace, in and for the said county, the
body of the said Edward Dudley, to answer unto the said charge, and
be farther dealt with according to law. Herein fail not."

"Good heaven!" he exclaimed, in a tone of astonishment, which
could not be assumed; "do you mean to say that Lord Hadley has
been killed?"

"Come, come, master, that won't do," said the dull brute into
whose hands he had fallen. "You know all about it, I dare say. You
must march into that 'ere room till to-morrow morning, for there's no
use in taking you twenty miles to the jail, to bring you back again
to-morrow to the crowner's 'quest."

It was with great difficulty that Dudley restrained his temper. The
charge at first sight seemed to him ridiculous, and he would have
scoffed at it, if horror at the fate of his unhappy pupil had not
occupied his mind so completely that no light thought could find
place.

"I ask you civilly, sir," he said, moving into the room pointed out,
closely followed by the constables, "to give me some information in
regard to facts which I must know to-morrow morning, and in which
I am deeply interested. If you are so discourteous as to refuse me
an answer, I cannot force you; but at the same time I suppose there
is nobody on earth but yourself who would think of denying me
some information respecting a friend who, I gather from your
warrant, has been killed."

"Very like a friend to pitch him over the cliff!" answered the
constable. "Howsumdever, the magistrates know all about it, and
you had better wait and talk to them, for if you talk more to me I
shall send down for the handcuffs: a fool I was for not bringing them
with me. We shall sit up with ye by turns, for I am not going to let
ye get off, master, you may depend upon it."

Dudley only replied by a contemptuous smile, and, seating himself
in a chair, he gave himself up to thought, while the one constable
took a place opposite, and the other retired and locked the door. For
nearly two hours Dudley remained meditating over the strange turn
which had taken place in his fate; and as he reflected upon various
circumstances which had occurred during the evening, his situation
began to assume a more serious aspect than it had at first
presented. Not that he supposed, for one moment, he was in the
slightest danger, for his consciousness of innocence was too great to
admit of his believing that, when his whole conduct was explained,
even a suspicion would rest upon him; but he recollected the violent
dispute which he had had with Lord Hadley in the morning, in the
presence of several witnesses, and also called to mind that when he
had gone out after dinner, in order to fulfil his promises to Eda, he
had been followed and overtaken by Lord Hadley, and that the first
part, at least, of their conversation had been carried on in a sharp
and angry tone. He remembered, too, that they had met several
people, and that though in the end the young nobleman had seemed
somewhat touched by his remonstrances, and surprised and vexed
at his decided resignation of all farther responsibility regarding his
conduct, no one had witnessed the more moderate and kindly
manner in which they had parted, or could prove that they had
parted at all before the fatal occurrence of which he had such vague
information. The attempt to extract anything more from the
constable he saw would be in vain, though he thirsted for
intelligence; and his thoughts, after dwelling for some time upon his
own case, naturally turned to the unhappy youth who had been cut
off at so early a period, in the midst of a career of folly and vice. He
could not help sighing over such a result; for notwithstanding
headstrong passions, and a certain degree of weakness of character,
which would have prevented Lord Hadley from ever becoming a
great man, Dudley had perceived some traits of goodness in his
nature, which, under right direction, either by the care of wise and
prudent friends, or by the chastening rod of adversity, might have
been so guided as to render him an estimable and useful member of
society. His mind reverted to his own young days, and he recollected
wild schemes, rash enterprises, some faults and follies which he now
greatly regretted; and he thought, "If I had gone on, the pampered
child of prosperity, I might perhaps have been like him." He did
himself injustice, it is true, but still the fancy was a natural one; and
he felt, at least, that in his case 'the uses of adversity had been
sweet.'

The body and the mind are alternately slaves to each other. When
stimulated to strong exertion, the mind conquers the body; when
oppressed with fatigue or sickness, the body conquers the mind; but
the powers of both seem sometimes worn out together, and then
sleep is the only resource: that heavy, overpowering sleep, the
temporary death of all the faculties; when no memory of the past,
no knowledge of the present, no expectation of the future, comes in
dreams to rouse even fancy from the benumbing influence that
overshadows us. Such was the case with Dudley at the end of those
two hours. He had gone out early in the morning in the pursuit of
healthful exercise; but in the course of his ramble with Edgar
Adelon, subjects had arisen which moved him deeply. His young
companion, with all the warm enthusiasm and confidence of his
nature, had poured forth to him all the stores of grief, anxiety, and
indignation, which had been accumulating in silence and in secret
since first he had become aware of Lord Hadley's pursuit of Helen;
and Dudley, entering warmly into his feelings, had chosen his course
at once. He had determined to speak decidedly to his pupil; to place
before his eyes the scandal and the wickedness of that which he was
engaged in; to demand that it should either cease at once, or he
quit Brandon; and in case he refused, to resign all farther control
over him, and instantly to make the young peer's relations in London
aware of the fact and the cause. Then had come the fierce and
angry discussion with Lord Hadley, followed by an agitating
conversation with Eda; another dispute with his pupil, perhaps more
painful than the first; the hurried and anxious walk to Barhampton,
and the troubled scene which had taken place there. He was
exhausted, mentally and corporeally; and at the end of two hours he
slept, leaning his head upon his folded arms, and remaining so still
and silent, that it seemed as if death rather than slumber possessed
him. His sleep lasted long, too, and he was aroused only by some
one shaking him roughly by the shoulder on the following morning.
Dudley started up, and wondered where he was; but gradually a
recollection of all the facts returned; and the man's words: "Come,
master, the crowner is sitting," required no explanation.

Somewhat to Dudley's surprise, when he reached the door of the
lodge, he found the carriage of Sir Arthur Adelon waiting for him;
and entering with one constable, while the other took his seat upon
the box, he was driven up the avenue to Brandon House. The
servants at the door showed no signs of want of respect, and he was
immediately conducted between his two captors into the library,
where he found a number of persons assembled in a confused mass
at the end of the room, and the coroner's jury seated round the
large table, near the windows. In the centre was a portly man in a
white waistcoat, with a pompous, wine-empurpled face, and an
exceedingly bald head, whom he concluded rightly to be the coroner.
Several magistrates were also in the room, amongst whom were two
persons with whom he had dined at the table of Sir Arthur Adelon a
few days before; but Dudley looked in vain for the baronet himself,
or for any well-known and friendly face. He wanted no support, it is
true; for he was not timid by nature, and he was conscious of
innocence; but yet he would have felt well pleased to have had
friends around him. One of the magistrates shook hands with him,
however, and the other bowed; while some people near the coroner
whispered to that officer, whose eyes were instantly fixed upon the
new comer.

"Mr. Edward Dudley, I believe," he said, aloud; and when Dudley
signified that it was so by bending his head, the other continued:
"Although not strictly necessary, sir, inasmuch as this is an inquest
for the purpose of ascertaining how a certain person met with his
death, and we consequently as yet know nothing of accused or
accusers, yet, as I have been given to understand that a warrant has
been issued for your apprehension under the hand of my worshipful
friend, Mr. Conway, I have thought it best that you should be
present, in order that you should watch proceedings in which you
are deeply interested. You will remark that it is not necessary for you
to say anything upon this occasion, and to do so or not must be left
to your own discretion."

"I thank you for your caution, sir," replied Dudley; "although,
having been bred to the bar, it was not so necessary in my case as it
might be in some. I have no knowledge of the circumstances which
have caused any suspicion to fall upon me, and shall hear with
interest the evidence which may be given regarding facts that I am
utterly unacquainted with."

"Ahem!" said the coroner. "We will now hear the witnesses in the
natural order, gentlemen of the jury. By the natural order, I mean the
order in which the facts connected with the discovery happened. Our
first question will be, where and how the body was found; next,
whose the body is--for you will remark, gentlemen of the jury, that
at the present moment all we know is, that the body of a dead man
has been found under exceedingly suspicious circumstances, and we
must have it identified; then we must inquire how he came by his
death. If the person who first found the corpse is in court, let him
stand forward."

A man of somewhat more than six feet high, in a round jacket
and oilskin hat, advanced to the table, and gave his evidence in a
very clear and intelligent manner, saying, "I was standing out upon
the sand last night, near upon low water----"

"Where at?" asked the coroner. "Pray describe the place as
accurately as possible."

"Why, it was just between Gullpoint and our cottages at St.
Martin's," replied the boatman; "and the hour might be about eight,
or near it. The water was not quite out, so it must have been about
eight. I was standing looking out after the French brig, which had
been making signals like, with lights of different colours, which I did
not understand, when all in a minute I heard some one give a sort
of loud cry, just as if they had been hurt or frightened. It came from
the land, and I heard it quite plain, for the wind set off shore, and
turning round, I looked up in the way that the sound seemed to
come from----"

"Was it moonlight?" asked the coroner.

"Lord bless you, no, sir!" replied the boatman; "but the night was
not very dark, for that matter. However, as I turned, I heard a bit of
a row at the top of the cliff, and I could see two men standing up
there close together, one a tall man, t'other a little shorter; and the
tall one hit the other twice or three times, and then down he came. I
could see him fall back, but after that I lost him, for you see, sir, as
he tumbled down the cliff, it was darker there. When they were a-
top, they had got the sky behind them; but when he fell, he got into
the gloom, and I saw no more of him, till hearing a cry almost like
that of a gull, only louder, I ran up as hard as I could. As I came
over the shingle near the cliff, I heard a groan or two, and just
