Mastering PostgreSQL 15
Hans-Jürgen Schönig
BIRMINGHAM—MUMBAI
Mastering PostgreSQL 15
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any
form or by any means, without the prior written permission of the publisher, except in the case of brief quotations
embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented.
However, the information contained in this book is sold without warranty, either express or implied. Neither the
author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged
to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products
mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the
accuracy of this information.
ISBN 978-1-80324-834-9
www.packtpub.com
Contributors
I would like to thank my colleague Şeyma Mintaş who helped me review the book.
Marcelo Diaz is a software engineer with more than 15 years of experience, with a special focus on
PostgreSQL. He is passionate about open source software and has promoted its application in critical
and high-demand environments, working as a software developer and consultant for both private and
public companies. He currently works very happily at Cybertec and as a technical reviewer for Packt
Publishing. He enjoys spending his leisure time with his daughter, Malvina, and his wife, Romina.
He also likes playing football.
Dinesh Kumar Chemuduru works as a principal architect (OSS) at Tessell Inc. He has been working
with PostgreSQL since 2011 and has also worked as a consultant at AWS. He is also an author of, and
contributor to, a few popular open source solutions. He co-authored PostgreSQL High Performance
Cookbook 9.6, which was released in 2016. He loves to code in Dart, Go, Angular, and C++, and to
deploy his work on Kubernetes.
Thanks and love to my wife, Manoja Reddy, and my kids, Yashvi and Isha.
Table of Contents

Preface

1. PostgreSQL 15 Overview
   Making use of DBA-related features
      Removing support for old pg_dump
      Deprecating Python 2
      Fixing the public schema
      Adding pre-defined roles
      Adding permissions to variables
      Improving pg_stat_statements
      New wait events
      Adding logging functionality
   Understanding developer-related features
      Security invoker views
      ICU locales
      Better numeric
      Handling ON DELETE
      Working around NULL and UNIQUE
      Adding the MERGE command to PostgreSQL
   Using performance-related features
      Adding multiple compression algorithms
      Handling parallel queries more efficiently
      Improved statistics handling
      Prefetching during WAL recovery
   Additional replication features
      Two-phase commit for logical decoding
      Adding row and column filtering
      Improving ALTER SUBSCRIPTION
      Supporting compressed base backups
      Introducing archiving libraries
   Summary

2. Understanding Transactions and Locking
   Working with PostgreSQL transactions
      Handling errors inside a transaction
      Making use of SAVEPOINT
      Transactional DDLs
   Understanding basic locking
      Avoiding typical mistakes and explicit locking

3. Making Use of Indexes
   Understanding simple queries and the cost model
      Making use of EXPLAIN
      Digging into the PostgreSQL cost model
      Deploying simple indexes
      Making use of sorted output
      Using bitmap scans effectively
      Using indexes in an intelligent way
      Understanding index de-duplication
   Improving speed using clustered tables
      Clustering tables
      Making use of index-only scans
   Understanding PostgreSQL index types
      Hash indexes
      GiST indexes
      GIN indexes
      SP-GiST indexes
      BRINs
      Adding additional indexes
   Achieving better answers with fuzzy searching
      Taking advantage of pg_trgm
      Speeding up LIKE queries
      Handling regular expressions

4. Handling Advanced SQL
   Supporting range types
      Querying ranges efficiently
      Handling multirange types
      When to use range types
   Introducing grouping sets
      Loading some sample data
      Applying grouping sets
      Investigating performance
      Combining grouping sets with the FILTER clause
   Making use of ordered sets
   Understanding hypothetical aggregates
   Utilizing windowing functions and analytics
      Partitioning data
      Ordering data inside a window
      Using sliding windows
      Abstracting window clauses
      Using on-board windowing functions
   Writing your own aggregates
      Creating simple aggregates
      Adding support for parallel queries
      Improving efficiency
      Writing hypothetical aggregates
   Handling recursions
      UNION versus UNION ALL
      Inspecting a practical example
   Working with JSON and JSONB
      Displaying and creating JSON documents
      Turning JSON documents into rows
      Accessing a JSON document
   Summary

5. Log Files and System Statistics
   Gathering runtime statistics
      Working with PostgreSQL system views
   Creating log files
      Configuring the postgresql.conf file
   Summary
   Questions

6. Optimizing Queries for Good Performance
   Learning what the optimizer does
      A practical example – how the query optimizer handles a sample query
   Understanding execution plans
      Approaching plans systematically
      Spotting problems

7. Writing Stored Procedures
   Understanding stored procedure languages
      Understanding the fundamentals of stored procedures versus functions
      The anatomy of a function
   Exploring various stored procedure languages
      Introducing PL/pgSQL
      Writing stored procedures in PL/pgSQL
      Introducing PL/Perl
      Introducing PL/Python
   Improving functions
      Reducing the number of function calls
   Using functions for various purposes
   Summary
   Questions

8. Managing PostgreSQL Security
   Managing network security
      Understanding bind addresses and connections
      Managing the pg_hba.conf file
      Handling instance-level security
      Defining database-level security

9. Handling Backup and Recovery
   Performing simple dumps
      Running pg_dump
      Passing passwords and connection information
      Extracting subsets of data
   Replaying backups
   Handling global data
   Summary
   Questions

10. Making Sense of Backups and Replication
   Understanding the transaction log
      Looking at the transaction log
      Understanding checkpoints
      Optimizing the transaction log
   Performing failovers and understanding timelines
      Managing conflicts
      Making replication more reliable

11. Deciding on Useful Extensions
   Understanding how extensions work
      Checking for available extensions
   Making use of contrib modules
      Using the adminpack module
      Applying bloom filters
      Deploying btree_gist and btree_gin
      dblink – considering phasing out
      Fetching files with file_fdw
      Inspecting storage using pageinspect
      Investigating caching with pg_buffercache
      Encrypting data with pgcrypto
      Prewarming caches with pg_prewarm
      Inspecting performance with pg_stat_statements
      Inspecting storage with pgstattuple
      Fuzzy searching with pg_trgm
      Connecting to remote servers using postgres_fdw
   Other useful extensions
   Summary

12. Troubleshooting PostgreSQL
   Approaching an unknown database
      Inspecting pg_stat_activity
      Querying pg_stat_activity
   Checking for slow queries
      Inspecting individual queries
      Digging deeper with perf
   Inspecting the log
   Checking for missing indexes
   Checking for memory and I/O
   Understanding noteworthy error scenarios
      Facing clog corruption
      Understanding checkpoint messages
      Managing corrupted data pages
      Careless connection management
      Fighting table bloat
   Summary
   Questions

13. Migrating to PostgreSQL
   Migrating SQL statements to PostgreSQL
      Using LATERAL joins
      Using grouping sets
      Using the WITH clause – common table expressions
      Using the WITH RECURSIVE clause
      Using the FILTER clause
      Using windowing functions
      Using ordered sets – the WITHIN GROUP clause
      Using the TABLESAMPLE clause
      Using limit/offset
      Using the OFFSET clause
      Using temporal tables
      Matching patterns in time series
   Moving from Oracle to PostgreSQL
      Using the oracle_fdw extension to move data
      Using ora_migrator for fast migration
      CYBERTEC Migrator – migration for the “big boys”
      Using Ora2Pg to migrate from Oracle
      Common pitfalls
   Summary

Index
Chapter 6, Optimizing Queries for Good Performance, is all about good query performance and
outlines optimization techniques that are essential to bringing your database up to speed to handle
even bigger workloads.
Chapter 7, Writing Stored Procedures, introduces you to the concept of server-side code such as functions,
stored procedures, and a lot more. You will learn how to write triggers and dive into server-side logic.
Chapter 8, Managing PostgreSQL Security, helps you to make your database more secure, and explains
what can be done to ensure safety and data protection at all levels.
Chapter 9, Handling Backup and Recovery, helps you to make copies of your database to protect yourself
against crashes and database failure.
Chapter 10, Making Sense of Backups and Replication, follows up on backups and recovery and explains
additional techniques, such as streaming replication, clustering, and a lot more. It covers the most
advanced topics.
Chapter 11, Deciding on Useful Extensions, explores extensions and additional useful features that can
be added to PostgreSQL.
Chapter 12, Troubleshooting PostgreSQL, completes the circle of topics and explains what can be done
if things don’t work as expected. You will learn how to find the most common issues and understand
how problems can be fixed.
Chapter 13, Migrating to PostgreSQL, teaches you how to move your databases to PostgreSQL efficiently
and quickly. It covers the most common database systems people will migrate from.
Note:
Some parts of Chapters 8, 9, 10, 11, 12, and 13 are mostly dedicated to Unix/Linux and macOS
users; the rest runs fine on Windows.
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file
extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “You
cannot run it inside a SELECT statement. Instead, you have to invoke CALL. The following listing
shows the syntax of the CALL command.”
A block of code is set as follows:
test=# \h CALL
Command: CALL
Description: invoke a procedure
Syntax:
CALL name ( [ argument ] [, ...] )
URL: https://www.postgresql.org/docs/15/sql-call.html
When we wish to draw your attention to a particular part of a code block, the relevant lines or items
are set in bold:
# - Connection Settings -
# listen_addresses = 'localhost'
# what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance,
words in menus or dialog boxes appear in bold. Here is an example: “Select System info from the
Administration panel.”
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at customercare@
packtpub.com and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen.
If you have found a mistake in this book, we would be grateful if you would report this to us. Please
visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would
be grateful if you would provide us with the location address or website name. Please contact us at
copyright@packtpub.com with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you
are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Download a free PDF copy of this book
https://packt.link/free-ebook/9781803248349
PostgreSQL 15 Overview
In this chapter, you will learn about the following topics:
• DBA-related features
• Developer-related features
• Performance-related features
• Additional replication features
Of course, there is always more stuff. However, let us focus on the most important changes affecting
most users.
Removing support for old pg_dump
In PostgreSQL 15, pg_dump no longer supports dumping from servers older than version 9.2.
Considering that PostgreSQL 9.2.0 was released to the PostgreSQL community FTP server on September
10, 2012, most people should have gotten rid of their PostgreSQL 9.1 (and older) systems by now.
If you have not been able to upgrade since then, we highly recommend doing so. It is still possible
to upgrade from such an old version to PostgreSQL 15; however, you will need an intermediate step,
which means running pg_dump twice.
Deprecating Python 2
PostgreSQL allows developers to write stored procedures in various languages. This includes Python
but is not limited to it. The trouble is that Python 2.x has been deprecated for a long time already.
Starting with version 15, the PostgreSQL community has also dropped support for PL/Python2U and
only supports version 3 from now on.
This means that all code that is still in Python 2 should be moved to Python 3 in order to function properly.
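If you still rely on PL/Python, make sure the Python 3 flavor of the extension is in use. The following
is a minimal sketch (the function name py_add is just an example) showing a PL/Python 3 function:
test=# CREATE EXTENSION IF NOT EXISTS plpython3u;
CREATE EXTENSION
test=# CREATE OR REPLACE FUNCTION py_add(a int, b int)
           RETURNS int AS
$$
    # ordinary Python 3 code running inside the database
    return a + b
$$ LANGUAGE plpython3u;
CREATE FUNCTION
test=# SELECT py_add(1, 2);
 py_add
--------
      3
(1 row)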
Fixing the public schema
Traditionally, every user was allowed to create objects in the public schema, and locking this down
was rarely done, which caused security issues people were generally not aware of. With the introduction
of PostgreSQL 15, the situation has changed. The public schema is, from now on, no longer writable by
the general public: the CREATE permission has to be granted explicitly, just as for any other schema.
The new behavior makes applications a lot safer and ensures that permissions are not in place accidentally.
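To restore the old behavior for a specific user, the required permission can simply be granted. The
following is a minimal sketch; some_user is a placeholder for a role on your system:
test=# GRANT CREATE ON SCHEMA public TO some_user;
GRANT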
Adding pre-defined roles
PostgreSQL 15 provides a set of pre-defined roles. The following listing shows the roles that are available:
---------------------------
pg_database_owner
pg_read_all_data
pg_write_all_data
pg_monitor
pg_read_all_settings
pg_read_all_stats
pg_stat_scan_tables
pg_read_server_files
pg_write_server_files
pg_execute_server_program
pg_signal_backend
pg_checkpoint
(12 rows)
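The listing above can be produced with a simple query against the system catalog. The exact statement
used for the listing is not shown here; the following is merely a sketch:
test=# SELECT rolname FROM pg_roles WHERE rolname ~ '^pg_';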
With the introduction of PostgreSQL 15, a new role has been added, pg_checkpoint, which allows
users to manually run checkpoints if needed.
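Here is a short, hedged example of how this could be used (some_user is a placeholder role):
test=# GRANT pg_checkpoint TO some_user;
GRANT ROLE
Once connected as some_user, a manual checkpoint can be triggered without superuser privileges:
test=# CHECKPOINT;
CHECKPOINT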
Adding permissions to variables
PostgreSQL 15 also allows permissions to be granted on individual configuration variables. This new
feature allows administrators to control who may change which settings and to prohibit bad parameter
settings that can compromise the availability of the entire server.
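The following sketch shows what such grants can look like (some_user is a placeholder, and the
parameters are arbitrary examples):
test=# GRANT SET ON PARAMETER log_min_duration_statement TO some_user;
GRANT
test=# GRANT ALTER SYSTEM ON PARAMETER max_wal_size TO some_user;
GRANT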
Improving pg_stat_statements
Every new version also provides us with some improvements related to pg_stat_statements,
which, in my judgment, is the key to good performance. Consider the following code snippet:
test=# \d pg_stat_statements
View "public.pg_stat_statements"
Column | Type | ...
------------------------+------------------+ ...
userid | oid | ...
...
jit_functions | bigint | ...
The module is now able to display information about the JIT compilation process and helps to detect
JIT-related performance problems. Those problems are not too frequent – however, it can happen that
once in a while, a JIT compilation process takes too long. This is especially true if you are running a
query containing hundreds of columns.
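Assuming the pg_stat_statements extension is installed, the new JIT-related columns can be inspected
directly. The following query is just a sketch for spotting statements that spend a lot of time on
JIT compilation:
test=# SELECT query, jit_functions, jit_generation_time
           FROM pg_stat_statements
           ORDER BY jit_generation_time DESC
           LIMIT 3;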
New wait events
PostgreSQL 15 also introduces additional wait events related to archiving and recovery, including
the following:
• ArchiveCommand
• ArchiveCleanupCommand
• RestoreCommand
• RecoveryEndCommand
Those events complement the existing wait event infrastructure and give some insights into
replication-related issues.
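These wait events show up in pg_stat_activity just like any other wait event. A simple way to take
a look is sketched below:
test=# SELECT pid, wait_event_type, wait_event, backend_type
           FROM pg_stat_activity
           WHERE wait_event IS NOT NULL;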
Adding logging functionality
PostgreSQL 15 can write its log in JSON format (jsonlog). Reading a tightly packed file containing
millions of JSON documents is not really user-friendly, so I recommend using a tool such as jq to
make this stream more readable and easier to process:
"txid": 0,
"error_severity": "LOG",
"message": "ending log output to stderr",
"hint": "Future log output will go to log destination
\"jsonlog\".",
"backend_type": "postmaster",
"query_id": 0
}
{
"timestamp": "2022-11-04 08:50:59.000 CET",
"pid": 32183,
"session_id": "6364c462.7db7",
"line_num": 2,
"session_start": "2022-11-04 08:50:58 CET",
"txid": 0,
"error_severity": "LOG",
"message": "starting PostgreSQL 15.0 on x86_64-apple-
darwin21.6.0, compiled by Apple clang version
13.1.6 (clang-1316.0.21.2.5), 64-bit",
"backend_type": "postmaster",
"query_id": 0
}
{
"timestamp": "2022-11-04 08:50:59.006 CET",
"pid": 32183,
"session_id": "6364c462.7db7",
"line_num": 3,
"session_start": "2022-11-04 08:50:58 CET",
"txid": 0,
"error_severity": "LOG",
"message": "listening on IPv6 address \"::1\", port 5432",
"backend_type": "postmaster",
"query_id": 0
}
...
In general, it is recommended not to use JSON logs excessively, as they occupy a fair amount of space.
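To experiment with this feature, JSON logging can be enabled directly from SQL. This is only a sketch
and assumes that the logging collector is active (logging_collector = on, which requires a restart):
test=# ALTER SYSTEM SET log_destination = 'jsonlog';
ALTER SYSTEM
test=# SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)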