Data Management at Scale Piethein Strengholt 2024 scribd download


Data Management at Scale
Modern Data Architecture with Data Mesh and Data Fabric

SECOND EDITION

Piethein Strengholt
Data Management at Scale
by Piethein Strengholt
Copyright © 2023 Piethein Strengholt. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North,
Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales
promotional use. Online editions are also available for most titles
(http://oreilly.com). For more information, contact our
corporate/institutional sales department: 800-998-9938 or
corporate@oreilly.com.

Acquisitions Editor: Michelle Smith

Development Editor: Shira Evans

Production Editor: Katherine Tozer

Copyeditor: Rachel Head

Proofreader: Piper Editorial Consulting, LLC

Indexer: nSight, Inc.

Interior Designer: David Futato

Cover Designer: Karen Montgomery

Illustrator: Kate Dullea

April 2023: Second Edition

Revision History for the Second Edition
2023-04-10: First Release

See https://oreilly.com/catalog/errata.csp?isbn=9781098138868 for
release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc.
Data Management at Scale, the cover image, and related trade dress
are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the author, and do not
represent the publisher’s views. While the publisher and the author
have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and
the author disclaim all responsibility for errors or omissions,
including without limitation responsibility for damages resulting from
the use of or reliance on this work. Use of the information and
instructions contained in this work is at your own risk. If any code
samples or other technology this work contains or describes is
subject to open source licenses or the intellectual property rights of
others, it is your responsibility to ensure that your use thereof
complies with such licenses and/or rights.
This work is part of a collaboration between O’Reilly and Microsoft.
See our statement of editorial independence.
978-1-098-13886-8
[LSI]
Foreword
Whenever we talk about software, we inevitably end up talking
about data—how much there is, where it all lives, what it means,
where it came from or needs to go, and what happens when it
changes. These questions have stuck with us over the years, while
the technology we use to manage our data has changed rapidly.
Today’s databases provide instantaneous access to vast online
datasets; analytics systems answer complex, probing questions;
event-streaming platforms not only connect different applications but
also provide storage, query processing, and built-in data
management tools.
As these technologies have evolved, so have the expectations of our
users. A user is often connected to many different backend systems,
located in different parts of a company, as they switch from mobile
to desktop to call center, change location, or move from one
application to another. All the while, they expect a seamless and
real-time experience. I think the implications of this are far greater
than many may realize. The challenge involves a large estate of
software, data, and people that must appear—at least to our users—
to be a single joined-up unit.
Managing company-wide systems like this has always been a dark
art, something I got a feeling for when I helped build the
infrastructure that backs LinkedIn. All of LinkedIn’s data is generated
continuously, 24 hours a day, by processes that never stop. But
when I first arrived at the company, the infrastructure for harnessing
that data was often limited to big, slow, batch data dumps at the
end of the day and simplistic lookups, jerry-rigged together with
homegrown data feeds. The concept of “end-of-the-day batch
processing” seemed to me to be some legacy of a bygone era of
punch cards and mainframes. Indeed, for a global business, the day
doesn’t end.
As LinkedIn grew, it too became a sprawling software estate, and it
was clear to me that there was no off-the-shelf solution for this kind
of problem. Furthermore, having built the NoSQL databases that
powered LinkedIn’s website, I knew that there was an emerging
renaissance of distributed systems techniques, which meant
solutions could be built that weren’t possible before. This led to
Apache Kafka, which combined scalable messaging, storage, and
processing over the profile updates, page visits, payments, and
other event streams that sat at the core of LinkedIn.
While Kafka streamlined LinkedIn’s dataflows, it also affected the
way applications were built. Like many Silicon Valley firms at the turn
of the last decade, we had been experimenting with microservices,
and it took several iterations to come up with something that was
both functional and stable. This problem was as much about data
and people as it was about software: a complex, interconnected
system that had to evolve as the company grew. Handling a problem
this big required a new kind of technology, but it also needed a new
skill set to go with it.
Of course, there was no manual for navigating this problem back
then. We worked it out as we went along, but this book may well
have been the missing manual we needed. In it, Piethein provides a
comprehensive strategy for managing data not simply in a solitary
database or application but across the many databases, applications,
microservices, storage layers, and all other types of software that
make up today’s technology landscapes.
He also takes an opinionated view, with an architecture to match,
grounded in a well-thought-out set of principles. These help to
bound the decision space with logical guardrails, inside of which a
host of practical solutions should fit. I think this approach will be
very valuable to architects and engineers as they map their own
problem domain to the trade-offs described in this book. Indeed,
Piethein takes you on a journey that goes beyond data and
applications into the rich fabric of interactions that bind entire
companies together.
Jay Kreps
Cofounder and CEO at Confluent
Preface

Data management is an emerging and disruptive subject.
Datafication is everywhere. This transformation is happening all
around us: in smartphones, TV devices, ereaders, industrial
machines, self-driving cars, robots, and so on. It’s changing our lives
at an accelerating speed.
As the amount of data generated skyrockets, so does its complexity.
Disruptive trends like cloudification, API and ecosystem connectivity,
microservices, open data, software as a service (SaaS), and new
software delivery models have a tremendous effect on data
management. In parallel, we see an enormous number of new
applications transforming our businesses. All these trends are
fragmenting the data landscape. As a result, we are seeing more
point-to-point interfaces, endless discussions about data quality and
ownership, and plenty of ethical and legal dilemmas regarding
privacy, safety, and security. Agility, long-term stability, and clear
data governance compete with the need to develop new business
cases swiftly. We sorely need a clear vision for the future of data
management.
This book’s perspective on data management is informed by my
personal experience driving the data architecture agenda for a large
enterprise as chief data architect. Executing that role showed me
clearly the impact a good data strategy can have on a large
organization. After leaving that company, I started working as the
chief data officer for Microsoft Netherlands. In this exciting new
position, I’ve worked with over 50 large customers discussing and
attempting to come up with a perfect data solution. Here are some
of the common threads I’ve identified across all enterprises:
• An overarching data strategy is often missing or not connected
  to the business objectives. Discussions about data management
  mostly pivot to technology trends and engineering discussions.
  What is needed is business engagement: a good strategy and a
  well-thought-out data management and analysis plan that
  includes tangible value in the form of business use cases. To
  make my point: the focus must be put on usage and turning
  data into business value.
• Enterprises have difficulties in interpreting new concepts like the
  data mesh and data fabric, because pragmatic guidance and
  experiences from the field are missing. In addition to that, the
  data mesh fully embraces a decentralized approach, which is a
  transformational change not only for the data architecture and
  technology, but even more so for organization and processes.
  This means the transformation cannot be led by IT alone; it’s a
  business transformation as well.
• Enterprises find it difficult to comprehend the latest technology
  trends. They’re unable to interpret nuances or make pragmatic
  choices.
• Enterprises struggle to get started: large ambitions often end
  with limited action; the execution plan and architecture remain
  too high-level, too conceptual; top-down commitment from
  leadership is missing.

These experiences and my observations across a range of
enterprises inspired me to write this second edition of Data
Management at Scale. You may wonder why this book is worth
reading over the first edition. Let’s take a closer look.
Why I Wrote This Book and Why Now
The first edition was founded on the experience I gained while
working at ABN AMRO as chief data architect.1 In that role, my team
and I practiced the approach of federation: shifting activities and
responsibilities in response to the need for a faster pace of change.
We used governance for balancing the imperatives of centralization
and decentralization. This shift was supported by a central data team
that started to develop platforms for empowering business units to
meet their goals. With platforms, we introduced self-service and
aligned analysts to domains, supporting them in implementing their
use cases. We experimented with domain-driven design and
eventually switched to business architecture for managing the
architectural landscape as a whole. I used all these experiences as
input for writing the first edition.
The term data mesh as a description of a sociotechnical approach to
using data at large was coined at around the time the manuscript for
the first edition was being finalized. When Zhamak Dehghani’s article
describing the concept appeared on Martin Fowler’s website, it
revealed concrete names for concepts we’d already been using at
ABN AMRO for many years. These names became industry terms,
and the concept quickly began to resonate with large organizations
as a solution to the friction enterprises encounter when scaling up.
So, why write a second edition? To start with, it was the data mesh
concept. I love the ideas of bringing data management and software
architecture closer together and businesses taking ownership of their
data, but I firmly believe that, with all the fuss, a more nuanced
view is needed.
In my previous role as an enterprise architect, we had hundreds of
application teams, thousands of services, and many large legacy
applications to manage. In such situations, you approach complexity
differently. With the data mesh architecture, artist, song, and playlist
are often used as data domain examples. This approach of
decomposing data into fine-grained domains might work well when
designing microservices, but it isn’t well suited to (re)structuring
large data landscapes. A different viewpoint is needed for scale.
Next, a more nuanced and pragmatic view of data products is
needed. There are good reasons why data must be managed
holistically and end-to-end. Enterprises have reusability and
consistency concerns. They’re forced by regulation to conform to the
same dimensions for group reporting, accounting, financial
reporting, and auditing and risk management. I know this might
sound controversial, but I can’t advocate managing a data product
as a container: something that packages data, metadata,
code, and infrastructure all together in an architecture as tiny as a
microservice. This doesn’t reflect how today’s big data platforms
work. Finally, the data mesh story isn’t complete: it focuses only on
data that is used for analytical purposes, not operational purposes; it
omits master data management;2 the consumer side must be
complemented with an intelligent data fabric; and it doesn’t provide
much data modeling guidance for building data products.
Another incentive for publishing a second edition was concern
about the book’s practicality. The first version was perceived by
various readers as too abstract. Some critical reviewers even left
comments questioning my hands-on experience. In this second
edition I’ve worked hard to address these concerns, providing many
real-world examples and concrete solution diagrams. From time to
time, I also refer to blog posts that I’ve written about how to
implement designs. One final note on this: there are a large number
of very complex topics to cover, which are also highly context-
sensitive. It would be impossible to provide examples of everything
in a single volume, so I’ve had to use some discretion.
I’m excited to share my thoughts on best practices and observations
from the field, and I hope this book inspires you. Reflecting on my
time working at ABN AMRO, there are lots of good lessons to be
taken from other enterprises. I’ve seen a lot of good approaches.
There’s no right or wrong when building good data architecture; it’s
all about making the right trade-offs and discovering what works
best for your situation.
If you’ve already read the first edition, you should find this one
significantly different and much improved. Structurally it’s more or
less the same, but every chapter has been revised and enhanced. All
the diagrams have also been revised, new content has been added,
and it’s much more practical. Within each chapter you’ll find many
tips, starting points, and references to helpful articles.
Who Is This Book For?
This book is intended for large enterprises, though smaller
organizations may find much of value in it. It’s geared toward:
Executives and architects
Chief data officers, chief technology officers, chief architects,
enterprise architects, and lead data architects

Analytics teams
Data scientists, data engineers, data analysts, and heads of
analytics

Development teams
Data engineers, data scientists, business intelligence engineers,
data modelers and designers, and other data professionals

Compliance and governance teams
Chief information security officers, data protection officers,
information security analysts, regulatory compliance heads, data
stewards, and business analysts

How to Read or Use This Book

It’s important to say up front that this book touches upon a lot of
complex topics that are often interrelated or intertwined with other
subjects. So we’ll be hopping between different technologies,
business methods, frameworks, and architecture patterns. From time
to time I bring in my own operational experience when
implementing different architectures, so we’ll be working at different
levels of abstraction. To describe the journey through the book, I’ll
use the analogy of a helicopter ride.
We’ll start with a zoomed-out view, looking at data management,
data strategy, and data architecture at an abstract and higher level.
From this helicopter view, we’ll start to zoom in and first explore
what data domains and landing zones are. We’ll then fly to the
source system side of our landscape, in which applications are
managed and data is created, and circle until we have covered most
of the areas of data management. Then we’ll fly over to the
consumer side of the landscape and start learning about the
dynamics there. After that, we’ll bring everything we’ve covered
together by putting things into practice.
To help you navigate through the book, the following table gives a
high-level overview of which subjects will be intensively discussed in
each chapter.
Table P-1. Key topics in each chapter

Ch. 1 Ch. 2 Ch. 3 Ch. 4
Data management x
Data strategy x x x
Data architecture x x
Data integration x
Data modeling x
Data governance
Data security
Data quality x
Metadata management
MDM
Business intelligence
