Latest Blog Posts

Understand PostgreSQL’s Portal – Executor vs Process Utility
Posted by cary huang in Highgo Software on 2024-05-17 at 22:22

Introduction

When you send a query to PostgreSQL, it normally goes through several stages of query processing and returns the results at the end. These stages are known as:

  • Parse
  • Analyze
  • Rewrite
  • Plan
  • Execute

I wrote another blog that briefly explains the responsibility of each query processing stage; you can find it here. There is also one that focuses primarily on the planner module, which you can find here. In this blog, we will focus on the executor part of query processing, which is the entry point to “communicate” with other PostgreSQL internal modules to collect the correct data for the query.

In fact, executor may not be the best term to describe this stage. In the PostgreSQL source code, it is actually “packaged” in another object called a “portal”, which has two potential paths to take: the “executor” path or the “process utility” path. Very often, we simply call this stage the “executor” rather than the “portal”, because the executor handles most of the DML query types – SELECT, INSERT, UPDATE, DELETE, etc. – and it has to process them according to the query plan created by the “planner” module.

Process utility, on the other hand, does not have a plan from the “planner” to process; it handles DDL and other non-DML queries such as VACUUM, CREATE, PREPARE, etc.
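
A quick way to see the difference from psql (a minimal illustration, not code from the PostgreSQL source) is that EXPLAIN only accepts plannable statements, because utility statements have no plan for the executor to run:

-- DML such as SELECT is planned, so the portal takes the executor path;
-- EXPLAIN can display the plan the executor will run.
EXPLAIN SELECT relname FROM pg_class WHERE relkind = 'r';

-- Utility statements have no plan, so the portal routes them to
-- ProcessUtility instead (EXPLAIN VACUUM is not even accepted).
VACUUM;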

The portal (or executor) is not as complicated as the planner in terms of the logic involved, but it does involve a lot of other modules in PostgreSQL that cooperate to produce the right output. Let’s take a look.

Where it all starts

The function exec_simple_query() in postgres.c is where the query processing stages take place. We will focus on what happens after PortalStart().

Relationship Between Portal, Executor and ProcessUtility

As mentioned above, the portal is an object that encapsulates the “executor” and “process utility” paths, where:

  • executor is responsible for executing the “plan” created by planner – SELECT, UPDATE, INSERT…etc
  • process utility is responsible fo
[...]

Understanding Synchronous and Asynchronous Replication in PostgreSQL – What is Best for You?
Posted by semab tariq in Stormatics on 2024-05-17 at 05:17

Learn about synchronous and asynchronous replication in PostgreSQL, including how they work and when to use each.

The post Understanding Synchronous and Asynchronous Replication in PostgreSQL – What is Best for You? appeared first on Stormatics.

Geographically distributed Postgres for multi-tenant applications
Posted by Tudor Golubenco on 2024-05-16 at 00:00
Documenting a pattern for making multi-tenant applications global by distributing the data, using only standard PostgreSQL functionality.

PGConf.DE 2024 - Review
Posted by Andreas Scherbaum on 2024-05-15 at 22:55

PGConf.DE 2024 was last month, the feedback is now closed - time for a review.

Removing Bloat with pg_repack Extension
Posted by muhammad ali in Stormatics on 2024-05-15 at 06:55

PostgreSQL's pg_repack extension offers a solution to database maintenance challenges by optimizing table storage without disrupting operations. It tackles issues like bloat and fragmentation efficiently. Learn about its operation, options, and restrictions for effective database management.

The post Removing Bloat with pg_repack Extension appeared first on Stormatics.

Git Filters
Posted by Christoph Berg in Cybertec on 2024-05-14 at 07:13

Nowadays, most Debian packages are maintained in Git repositories, with the majority hosted on Debian's GitLab instance at salsa.debian.org. Debian is running a "vcswatch" service that keeps track of the status of all packaging repositories that have a Vcs-Git (and other VCSes) header set. This status is stored in a PostgreSQL database, which is then utilized by Debian's package maintenance infrastructure for various purposes. One such application is to identify repositories that may require a package upload to release pending changes.

Naturally, this amount of Git data required several expansions of the scratch partition on qa.debian.org in the past, up to 300 GB in the last iteration. Attempts to reduce that size using shallow clones (git clone --depth=50) resulted in only marginal space savings of a few percent. Running git gc on all repos helps somewhat, but is a tedious task. As Debian is growing, so are the repos both in size and number. I ended up blocking all repos with checkouts larger than a gigabyte, and even then, the only solutions were again either expanding the disk space or lowering the blocking threshold.

Since we only require minimal information from the repositories, specifically the content of debian/changelog and a few other files from debian/, along with the number of commits since the last tag on the packaging branch - it made sense to try obtaining this information without fetching a full repo clone. The question of whether we could retrieve this solely using the GitLab API at salsa.debian.org was pondered but never answered. But then, in #1032623, Gábor Németh suggested the use of git clone --filter blob:none. This suggestion remained unattended in the bug report for almost a year until the next "disk full" event prompted me to give it a try.

The blob:none filter makes git clone omit all files, fetching only commit and tree information. Any blob (file content) needed at git run time is transparently fetched from the upstream repository and stored locally. It turned out to be

[...]

When to split patches for PostgreSQL
Posted by Peter Eisentraut in EDB on 2024-05-14 at 04:00

In my previous article on how to submit patches by email for PostgreSQL, I skipped over whether patches should be split up. Let’s discuss that now.

(See the previous article, as well as general Git documentation, for the technical details of how to split up a patch. Here, I’m only talking about why.)

What I’m talking about here specifically is where instead of attaching one patch to an email to pgsql-hackers, you attach a sequence of patch files like 0001-first.patch, 0002-second.patch, etc. (see previous article for the correct naming).

What follows is my opinion, based on what I like to see during patch reviews, and how I tend to prepare my patch submissions. Maybe it’s all wrong and others hate it and wish I’d stop it. Feedback welcome. But anyway.

The foremost principle is, each patch in a series should make sense on its own, and it should move the source tree from one working state to another working state. (Think of the “C” in ACID.) When I review a patch series, I review each patch separately, and I also run each patch (incrementally on top of its predecessors) through the test suites. If that fails, meh.

This principle can be applied and interpreted in different ways, and that’s fine. Let’s go through a few different cases:

[I will show some examples. In order not to pick on anyone else, I’m only picking examples that I authored or I’m otherwise closely involved in.]

  1. Simplest case: You send the whole thing as one patch. Nothing wrong with that, let’s be clear. Start here if you don’t know better or you’re not a Git wizard or none of the other cases apply. There appears to be a natural limit on how big a patch can be processed successfully, but that’s not a hard rule, just indirect advice. Keep your patch free of unrelated cleanup. If you have cleanup, see below.

    [Example: Here is the first patch submission for a potential new feature. It’s just a big patch for now, early in development, maybe we’ll find ways to organize it differently later. But f

[...]

Extension Summit Topic Review
Posted by David Wheeler in Tembo on 2024-05-13 at 19:12

Boy howdy that went fast.

This Wednesday, May 15, the final Postgres extension ecosystem mini-summit will review topics covered in previous Mini-Summits, various Planet PostgreSQL posts, the #extensions channel on the Postgres Slack and the Postgres Discord. Following a brief description of each, we’ll determine how to reduce the list to the most important topics to take on at the Extension Ecosystem Summit at PGConf.dev in Vancouver on May 28. I’ll post a summary later this week along with details for how to participate in the selection process.

In the meantime, here’s the list as of today:

  • Metadata:
    • Third-party dependencies
    • Types of extensions
    • Taxonomies
    • System requirements (OS, version, CPU, etc.)
    • Categorization
    • Versioning
  • Registry:
    • Identity, namespacing, and uniqueness
    • Distributed vs. centralized publishing
    • Binary packaging and distribution patterns
    • Federated distribution
    • Services and tools to improve or build
    • Stats, Reports, Badging: (stars, reviews, comments, build & test matrices, etc.)
  • Packaging:
    • Formats (e.g., tarball, OCI, RPM, wheel, etc.)
    • Include dynamic libs in binary packaging format? (precedent: Python wheel)
    • Build farming
    • Platforms, architectures, and OSes
    • Security, trust, and verification
  • Developer:
    • Extension developer tools
    • Improving the release process
    • Build pipelines: Supporting PGXS, pgrx, Rust, Go, Python, Ruby, Perl, and more
  • Community:
    • Community integration: identity, infrast
[...]

Use of “pgtt” Extension in Self Managed vs. Cloud Products
Posted by Shane Borden on 2024-05-13 at 18:40

There are several benefits to using temp tables in SQL DML operations. Given my long relationship and work on Oracle databases, I am highly familiar with the Oracle style “Global Temporary Tables” and have used them to solve many performance issues when complex joins are involved. That’s why I was excited when Gilles Darold released the extension “pgtt” for PostgreSQL. I was even more excited when Google released the extension for use in the Google CloudSQL for Postgres and AlloyDB for Postgres products.

In my opinion, native Postgres temporary tables leave a lot to be desired. For example, each table has to be created upon use. Nothing is persistent. This takes time and leads to Postgres catalog bloat (looking to do a post on this soon). That being said, in self managed instances, ‘pgtt’ is very easy to use. You can load the extension either at connection time by user, or for anyone that connects to a database. For example (make sure the extension is created in the database before setting these):

alter role test_user SET session_preload_libraries to 'pgtt';

alter database test_pgtt SET session_preload_libraries = 'pgtt';

This will, by default, use the ‘pgtt’ version of temporary tables instead of the default temporary tables Postgres offers. You can then create ‘pgtt’ style tables and refer to them via the pgtt_schema with no further modifications.
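
For illustration, here is a hedged sketch of what that can look like in a session where the library is loaded (the table name is invented; check the pgtt documentation for the exact syntax it accepts):

-- With pgtt loaded, the (otherwise ignored) GLOBAL keyword marks the table
-- as a persistent global temporary table stored under pgtt_schema.
CREATE /*GLOBAL*/ TEMPORARY TABLE demo_gtt (
    id   bigint,
    note text
) ON COMMIT PRESERVE ROWS;

-- The definition persists across sessions; each session sees only its own rows.
INSERT INTO demo_gtt VALUES (1, 'scratch data');
SELECT * FROM demo_gtt;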

To revert back in the session, you must “disconnect” the session and alter the user / database to revert the behavior to “old” style temporary tables. To my knowledge, if the library is enabled this way there is no way to use both styles of temporary tables at the same time.

alter role test_user RESET session_preload_libraries;

alter database pgtt_test RESET session_preload_libraries;

The Set Up…..

/* set flag "pgtt.enabled" on alloydb / cloudsql postgres console  to on */

/* create the extension in every database you need to use global temp tables */
CREATE EXTENSION pgtt;
LOAD 'pgtt';

/* Additional e
[...]

Conversion Chronicles – Invisible Indexes in Oracle to PostgreSQL Migration.
Posted by Deepak Mahto on 2024-05-13 at 15:22

Introduction – Invisible or Hidden Index.

Any database conversion offers valuable insights into how things work across platforms or how to achieve similar functionality in the target database. Invisible indexes are one such feature that is often requested in PostgreSQL when migrating away from Oracle, due to their use case of creating new indexes as invisible or marking existing ones as invisible for the optimizer.

Making indexes invisible can be beneficial for several use cases, such as:

  1. Creating invisible indexes allows for simulating and measuring impacts before actually dropping duplicate or unused indexes.
  2. Temporarily marking an index as invisible can influence the execution plan to avoid using it as an index access path option.

Oracle – Invisible Index

Before delving into the options available to achieve similar features in PostgreSQL, let’s first understand how it would work in Oracle with a sample table.

Sample Oracle examples

create table test_invisible as
select level as col1, level as col2, 
rpad('a',level,'x') as col3
from dual
connect by rownum < 501;

create index test_invisible_idx1 on 
test_invisible(col1);

create index test_invisible_idx2 on 
test_invisible(col1,col2);

explain plan for 
select /*+ NO_PARALLEL */ * from test_invisible 
where col1 = 1;

select * from table(dbms_xplan.display);

In our scenario, let’s assume we want to eliminate a single column index (test_invisible_idx1). We’ll alter the index to be invisible and then verify the execution plan to assess the impact on performance.

alter index test_invisible_idx1 invisible;

explain plan for 
select /*+ NO_PARALLEL */ * from test_invisible where col1 = 1;

select * from table(dbms_xplan.display);

In Oracle, making any index invisible has a global effect—it’s not limited to the current session or specific users.

PostgreSQL – Invisible Index (hypopg Extension)

Making indexes invisible isn’t officially supported as part of the ALTER INDEX command in vanilla PostgreSQL. However, leveraging the Ext

[...]
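
The excerpt above is cut off; for context, recent hypopg releases (1.4 and later) provide index-hiding functions, so a hedged sketch of the equivalent PostgreSQL workflow could look like this (assuming a PostgreSQL copy of the sample table and its indexes):

-- Hedged sketch, assuming hypopg >= 1.4 with its index-hiding functions.
CREATE EXTENSION IF NOT EXISTS hypopg;

-- Hide the index from the planner for the current session only.
SELECT hypopg_hide_index('test_invisible_idx1'::regclass);

-- Confirm the plan no longer uses it, then make it visible again.
EXPLAIN SELECT * FROM test_invisible WHERE col1 = 1;
SELECT hypopg_unhide_index('test_invisible_idx1'::regclass);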

Advanced psql scripting with coproc
Posted by Daniel Vérité on 2024-05-13 at 15:10
In this post, let’s see how to control psql from outside with the coproc functionality in bash.

Philip Marks
Posted by Andreas 'ads' Scherbaum on 2024-05-13 at 14:00
PostgreSQL Person of the Week Interview with Philip Marks: London boy, life long Arsenal fan! Married, two kids, well they’re both in their 20s now! Now currently living on the Welsh English borders in the hills, thank you broadband for finally getting there.

PGSQL Phriday #016: Query Tuning
Posted by gabrielle roth on 2024-05-11 at 02:30
For PGSQL Phriday #016, Ryan’s asked us to “…discuss your process for tuning difficult queries. Specifically, try to focus on that one problematic query that really challenged you and you always use it as an example when helping or teaching others your methods.” Here are the generic steps I take, mostly in order. After that […]

Deep Dive into PostgREST - Time Off Manager (Part 1)
Posted by Radim Marek on 2024-05-11 at 00:00

The primary motivation behind boringSQL is to explore the robust world of SQL and the PostgreSQL ecosystem, demonstrating how these "boring" tools can cut through the ever-increasing noise and complexity of modern software development. In this series, I'll guide you through building a simple yet fully functional application—a Time Off Manager. The goal of this project is not only to demonstrate practical database/SQL approaches but also to provide a complete, extendable solution that you can immediately build upon. Each part of this series will deliver a self-contained application, setting the stage for introducing more complex functionalities and practices.

The first part of this guide focuses mainly on the application logic and exposing raw data using postgREST, allowing you to grasp the concept.

Requirements

This tutorial assumes you can install PostgreSQL, connect to it using your preferred DB client, have permissions to create a new database, and understand the basics of schema operations and SQL. Similarly, you should be able to follow the installation instructions for postgREST.

The complete source code for the guide is available at GitHub.

The Time Off Manager

When choosing a sample application for this tutorial, I was torn between the popular but simple TodoMVC and a slightly more complex application. In the end, a sample like Time Off Manager provides a much richer database schema, going beyond the basics and offering a solution closer to real-life systems.

Time Off Manager is also an excellent example that combines simple business logic, workflows, and practicality. The main requirements are:

  • Maintaining the list of users (employees) with their respective managers
  • Allowing the maintenance of the time off balance, with an audit history
  • Introducing an approval workflow and automation

While the application is intended for learning purposes, the database and API can be easily exposed and built upon.
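
As a rough, hand-written illustration of those three requirements (every name below is invented and is not taken from the series' actual schema), the shape of the data could look something like this:

-- Illustrative only; not the schema used in the series.
CREATE TABLE employees (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    full_name  text NOT NULL,
    manager_id bigint REFERENCES employees (id)
);

CREATE TABLE time_off_balances (
    employee_id bigint PRIMARY KEY REFERENCES employees (id),
    days_left   numeric NOT NULL DEFAULT 0
);

CREATE TABLE time_off_requests (
    id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    employee_id bigint NOT NULL REFERENCES employees (id),
    starts_on   date NOT NULL,
    ends_on     date NOT NULL,
    status      text NOT NULL DEFAULT 'pending'  -- approval workflow state
);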

The Database

To get started, you will need to

[...]

PG Phriday: On the Move
Posted by Shaun M. Thomas in Tembo on 2024-05-10 at 20:13

Sometimes when we have an existing database and schema full of tables, there comes a time when we need to perform a migration. Maybe it’s because we want to move to or from a cloud service. Perhaps we have a small database and want to do a quick version upgrade via dump / restore. Whatever the reason, we may decide to clean up some technical debt while we’re making the transition.

Many Postgres experts recommend against creating objects in the public schema. This is the default schema that exists in nearly all Postgres databases, and there are often implicit grants that could make our objects available in unexpected scenarios. It’s also a cluttered namespace if all tables, views, functions, etc., are created there by default. Using it is sloppy and makes future data or operation segregation much more difficult.

So how can we move a bunch of existing stuff out of the public schema safely?
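
As a minimal, hedged sketch of the kind of move the post explores (object names are placeholders), the core building block is ALTER ... SET SCHEMA:

-- Create a dedicated schema and relocate objects out of public.
CREATE SCHEMA app;
ALTER TABLE public.orders SET SCHEMA app;
ALTER VIEW public.order_totals SET SCHEMA app;
ALTER FUNCTION public.refresh_totals() SET SCHEMA app;

-- Make the new schema the default for name resolution.
SET search_path = app, public;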

PGXN Language Poll Result
Posted by David Wheeler in Tembo on 2024-05-10 at 18:13

Back on March 28, I asked the Postgres community whether new services for PGXN v2 should be written in Go, Rust, or “some of each”. I went so far as to create a poll, which ran through April 12. A month later you might reasonably be wondering what became of it. Has David been refusing to face reality and accept the results?

The answer is “no”. Or at least I don’t think so. Who among us really knows ourselves? The poll has shown the results since it closed, but I suspect few have looked. So here they are:

Candidate               Votes   % All Votes
🦀 Rust                   102         60.4%
🐿️ Go                      53         31.4%
🐿️ + 🦀 Some of each        13          7.7%

🦀 Rust is the clear winner.

I don’t know whether some Rust brigade descended upon the poll, but the truth is that the outcome was blindingly apparent within a day of posting the poll. So much so that I decided to get ahead of things and try writing a pgrx extension. I released jsonschema on PGXN on April 30. Turned out to be kind of fun, and the pgrx developers kindly answered all my questions and even made a new release to simplify integration testing, now included in the pgrx-build-test utility in the pgxn-tools Docker image.

But I digress. As a result of this poll and chatting with various holders of stakes at work and haunting the #extensions Slack channel, I plan to use Rust for all new PGXN projects — unless there is an overwhelmingly compelling reason to use something else for a specific use case.

W

[...]

Back from PGConf Belgium 2024
Posted by Stefan Fercot in Data Egret on 2024-05-10 at 15:32

Last Tuesday (May 7, 2024), PGConf.BE took place in Leuven, Belgium. Living in Belgium myself, going to tech conferences in my own country is always a little bit special (and so much fun!).
While this was only the 4th time the event was (very successfully) organized, and despite attendance not being very high, this cozy conference still manages to have a great line-up of talks every year!
(Previous years’ schedules for reference: 2019, 2022, 2023)

As the opening talk, Pavlo Golub (Cybertec) told us about how they came to create the pg_timetable job scheduler for PostgreSQL. Briefly said, using the system crontab doesn’t provide knowledge of database specifics, and a scheduler running directly inside PostgreSQL wouldn’t easily allow performing system tasks. So they came up with this new tool, written in Go, to support a variety of different platforms. Finally, Pavlo ended his talk with a quick demo of how to use the tool. Quite an interesting tool IMHO.

Then Tomas Vondra (EDB), a very famous PostgreSQL hacker and committer, told us about performance cliffs: abrupt changes in performance for a given query after only a small change in data and/or query parameters. For example, even if a query plan might look good, we could hit a performance cliff at runtime because of resource exhaustion: an in-memory sort spilling to disk because the amount of data to sort doesn’t fit in the allocated memory. In this case, it is easy to understand how changing the selectivity of a query (the estimated portion of data to retrieve) is going to affect the query runtime but not the query planning. That’s also why, unfortunately, cost-based planning is often hard to correlate with query duration.
DBAs usually mitigate this specific example by raising work_mem or moving temp files to “in-memory” disks, but that doesn’t solve the underlying problem.
The conference being held in a University, Tomas also wanted to add an academic side to his talk by raising some thoughts and possible improvements to research and invest

[...]

Understanding Multi-Master Architecture in PostgreSQL: Benefits and Trade-offs
Posted by semab tariq in Stormatics on 2024-05-10 at 04:32

Multi-master architecture has gained significant traction in the world of database management, offering a solution to traditional limitations in scalability, fault tolerance, and high availability. By allowing multiple nodes to operate as master, this architecture promises a more flexible and robust database system. However, along with these benefits come certain challenges, including data consistency, resource […]

The post Understanding Multi-Master Architecture in PostgreSQL: Benefits and Trade-offs appeared first on Stormatics.

Postgres 17 Release Notes
Posted by Bruce Momjian in EDB on 2024-05-10 at 00:00

I have just completed the first draft of the Postgres 17 release notes. It includes developer community feedback but still needs more XML markup and links.

The release note feature count is 188. The most pleasant surprise for me was the large number of optimizer improvements. Postgres 17 Beta 1 should be released soon. The final Postgres 17 release is planned for September/October of this year.

Time Partitioning and Custom Time Intervals in Postgres with pg_partman
Posted by Keith Fiske in Crunchy Data on 2024-05-09 at 17:00

Whether you are managing a large table or setting up automatic archiving, time based partitioning in Postgres is incredibly powerful. pg_partman’s newest versions support a huge variety of custom time intervals. Marco just published a post on using pg_partman with our new database product for doing analytics with Postgres, Crunchy Bridge for Analytics. So I thought this would be a great time to review the basic and complex options for time-based partitioning.

Time partitioning intervals

When I first started designing pg_partman for time-based partitioning, it only had preset intervals that users could choose. Currently, pg_partman supports all of Postgres’ time-based interval values. The partitioning interval is set during the initial parent creation, in the p_interval field.

SELECT partman.create_parent(
    p_parent_table := 'partman_test.time_taptest_table'
    , p_control := 'col3'
    , p_interval := '1 day'
    , p_template_table := 'partman_test.time_taptest_table_template'
);

Additional examples like:

p_interval := '1 month'
p_interval := '1 year'

Even with allowing these intervals, some common intervals used in business, like weekly and quarterly, can be a little tricky. But thankfully pg_partman still has options to make these intervals easy and now even more flexible. So let’s dig into these examples.

Weekly partitioning

Weekly partitioning was and still is a fairly popular partitioning interval. When I started working on it, I’d thankfully found the ISO week date standard to allow me to tackle the more difficult issues of handling weeks (leap years, starting days, 53-week years) when I’d wanted to label the children with the week number. However, with declarative partitioning I found an opportunity to allow this to be more flexible when redesigning things for version 5 of partman. While the result did get rid of the nice weekly numbering pattern I had liked for this interval (IYYYwIW which came out to something like “2024w15”), the new method lets people start their wee

[...]

PostgreSQL Minor version updates announced – May 2024
Posted by Avi Vallarapu on 2024-05-09 at 13:12

The PostgreSQL Global Development Group has announced minor version updates to all supported PostgreSQL versions: 16, 15, 14, 13, and 12. These minor version updates introduce over 55 bug fixes that have been reported in recent months, reflecting PostgreSQL’s commitment to stability and performance. You might have seen our summary of PostgreSQL in 2023, […]

The post PostgreSQL Minor version updates announced – May 2024 appeared first on HexaCluster.

Understanding PostgreSQL Fill Factor: Benefits, Drawbacks, and Best Practices
Posted by muhammad ali in Stormatics on 2024-05-09 at 11:22

Learn how to optimize fill factor to boost database performance, minimize fragmentation, and balance storage needs.

The post Understanding PostgreSQL Fill Factor: Benefits, Drawbacks, and Best Practices appeared first on Stormatics.

I will be speaking at POSETTE: An Event for Postgres!
Posted by Henrietta Dombrovskaya on 2024-05-08 at 23:39

Tuning Parameters vs. Tuning Queries is yet another talk that “almost made it” to a couple of conferences but ended up being on the reserve list, and I never got to present it anywhere except for Chicago PUG. And one more time, my friends at Microsoft are giving me a chance :)

I used to be skeptical about online conferences because I had a couple of disastrous experiences, but not with that team! This year, I am happy to do it again! On June 11, my talk, Tuning Parameters in Postgres vs. Tuning Your Queries, will be streamed at POSETTE 2024. Tune in on June 11 at 8:30 AM CT!

Conversion Chronicles – Mapping Oracle Number Data types using AWS SCT and Ora2pg.
Posted by Deepak Mahto on 2024-05-08 at 21:31

Mapping Oracle Number data types using any conversion tool has always been a topic of conversation that I’ve engaged in with my customers and partners. Schema conversion and data type conversion are among the first tasks we undertake once migration is initiated post-assessment. I have previously collaborated on and contributed a detailed article on number data type mapping and its significance; feel free to give it a read: [1], [2] and [3]

In this blog, we’ll delve into how two of my favorite conversion tools, the AWS Schema Conversion Tool and Ora2pg, facilitate and support Number Data type mapping in Schema Conversion. We’ll explore some of the settings that help us enable optimal decision-making based on number data type definitions and constraints.

Sample Oracle Table and Dataset

Let’s consider the following sample Oracle table with various variants of NUMBER data types. We’ll perform a conversion on it using AWS SCT and Ora2pg.

 CREATE TABLE "SAMPLE_NUMBER_DATA_TYPE" 
   (  
  "COL1_NUMBER_INT" NUMBER, 
  "COL2_NUMBER_DECIMAL" NUMBER, 
  "COL3_NUMBER_HIGH_DECIMAL" NUMBER, 
  "COL4" NUMBER(5,0), 
  "COL5" NUMBER(12,0), 
  "COL6" NUMBER(32,0), Sc
  "COL7" NUMBER(12,2),
  "COL8" NUMBER(24,5),
  "COL9" NUMBER(32,10)
);

ALTER TABLE "SAMPLE_NUMBER_DATA_TYPE" ADD CONSTRAINT PK_SAMPLE1 PRIMARY KEY("COL1_NUMBER_INT");

INSERT INTO "SAMPLE_NUMBER_DATA_TYPE"
SELECT ROWNUM,
       ROWNUM / 2,
       ROWNUM * ROWNUM / ( ROWNUM * ROWNUM * ROWNUM * ROWNUM ),
       ROWNUM * 2,
       ROWNUM * ROWNUM * ROWNUM,
       ROWNUM * ROWNUM * ROWNUM * ROWNUM,
       ROWNUM / 2,
       ROWNUM / 3,
       Round(ROWNUM * ROWNUM * ROWNUM / ( ROWNUM * 0.33 ), 10)
FROM   dual
CONNECT BY ROWNUM < 501;
COMMIT;


SELECT * FROM "SAMPLE_NUMBER_DATA_TYPE";

In the sample table, we’ve incorporated various variants of Number data types, and we’ll walk through how to map them to appropriate PostgreSQL Numeric types.
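
For orientation, here is a hand-written sketch of one plausible PostgreSQL mapping for this table; it is not the actual output of AWS SCT or Ora2pg, whose choices depend on the settings discussed next:

-- Hedged sketch of one possible mapping; not tool-generated output.
CREATE TABLE sample_number_data_type (
    col1_number_int          numeric PRIMARY KEY, -- unconstrained NUMBER
    col2_number_decimal      numeric,
    col3_number_high_decimal numeric,
    col4                     integer,             -- NUMBER(5,0) fits in int4
    col5                     bigint,              -- NUMBER(12,0) fits in int8
    col6                     numeric(32,0),       -- too wide for bigint
    col7                     numeric(12,2),
    col8                     numeric(24,5),
    col9                     numeric(32,10)
);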

AWS Schema Conversion Tool – Number Data Type Mapping

The AWS Schema Conversion Tool is a matu

[...]

Troubleshooting PostgreSQL on Kubernetes With Coroot
Posted by Sergey Pronin in Percona on 2024-05-08 at 18:11
Coroot, an open source observability tool powered by eBPF, went generally available with version 1.0 last week. As this tool is cloud-native, we were curious to know how it can help troubleshoot databases on Kubernetes. In this blog post, we will see how to quickly debug PostgreSQL with Coroot and Percona Operator for PostgreSQL. Prepare / Install Coroot: The easiest […]

Mini Summit Five
Posted by David Wheeler in Tembo on 2024-05-07 at 22:12

The video for Yurri Rashkovskii’s presentation at the fifth Postgres Extension Ecosystem Mini-Summit last week is up. Links:

Here’s my interpolation of YouTube’s auto-generated transcript, interspersed with chat activity.

Introduction

Presentation

  • Yurri: Today I’m going to be talking about universally buildable extensions. This is going to be a shorter presentation, but the point of it is to create some ideas, perhaps some takeaways, and actually provoke a conversation during the call. It would be really amazing to explore what others think, um, so without further ado…

  • I’m with Omnigres, where we’re building a lot of extensions. Often they push the envelope of what extensions are supposed to do. For example, one of our first extensions is an HTTP server that embeds a web server inside of Postgres. We had to do a lot of unconventional things. We have other extensions uniquely positioned to work both on developer machines and production machines — because we serve the developers and devops market.

  • The point of Omnigres is turning Postgres into an application runtime — or an application server — so we really care how extensions get adopted. When we think about application developers, they need to be able to use extensions while they’re developing, not just in production or on some remote server. They need extensions to work on their machine.

  • The thing is, not everybody is using Linux. Other people use macOS and Windows, and we have to account for that. There are many interesting problems associated with things like dependencies.

  • So there’s a very common approach used by those who try to orches

[...]

Using Automatic DDL Replication with pgEdge Distributed PostgreSQL
Posted by Cady Motyka in pgEdge on 2024-05-07 at 20:24

Until now, managing and updating Data Definition Language (DDL) schemas in PostgreSQL was a time-consuming, manual effort and especially difficult in a distributed environment.  With the introduction of automatic DDL replication, pgEdge makes it easier and faster to deliver distributed applications built on open, standards-based PostgreSQL.  In this blog, we delve into what Automatic DDL replication entails and how it can be leveraged effectively with pgEdge.

Understanding Automatic DDL Replication

First, let's unpack what DDL commands entail. DDL, or Data Definition Language, encompasses those statements responsible for creating and modifying database objects within PostgreSQL, including tables, indexes, views, and constraints. Commands such as CREATE, ALTER, and DROP fall into this category. On the other hand, Data Manipulation Language (DML) commands focus on operations performed on the data within these objects, such as SELECT, INSERT, UPDATE, and DELETE. Traditionally, logical replication in PostgreSQL primarily facilitated the replication of DML statements, leaving DDL commands to require manual intervention. This meant that any modifications to the database schema required manual execution of DDL commands on each node in the cluster—a cumbersome and error-prone process.

Configuring Automatic DDL Replication in pgEdge

With pgEdge automatic DDL replication, you can update the database schema on a single node, and the changes are seamlessly propagated to other nodes within the cluster. This significantly simplifies the management of distributed PostgreSQL applications, eliminating the need for manual intervention across multiple nodes. Implementing automatic DDL replication in pgEdge is a straightforward process. Before enabling auto-DDL, you should ensure that the schema on each node within the cluster matches precisely. Then, enable auto-DDL replication after installing pgEdge Platform but before creating tables in a replicated database.  Ensure that the same users and roles exist on all nodes within [...]

cursor_tuple_fraction and the PostgreSQL JDBC driver
Posted by Laurenz Albe in Cybertec on 2024-05-07 at 09:15

This article is a kind of failure story. I originally set out to write a piece about the PostgreSQL parameter cursor_tuple_fraction, only to discover that my boss had already covered that topic to some extent. Then I thought that I could investigate the parameter’s effect on the JDBC driver. That led me to an embarrassingly long wrestle with auto_explain and an eventual failure to find anything. But the world is littered with success stories, so why not share some failure? And there are one or two interesting insights in store that might make the article worth your while.

About cursor_tuple_fraction

In a database, cursors are the canonical way of iterating through result sets. Cursors avoid the need to read the complete result set at once, which reduces the memory consumption. In PostgreSQL you can use the non-standard statement DECLARE to explicitly create a cursor, but usually you use client API functions or PL/pgSQL FOR loops that create PostgreSQL cursors “behind the scene”.

Transferring result rows to the client takes some time, and processing on the client side takes even more time. If an application processes a query result in chunks using a cursor, it can take a while until it reaches the end of the result set. Therefore, it is most important to get the first result rows as quickly as possible when you read the result set using a cursor. The total execution time is less important. This is where the PostgreSQL parameter cursor_tuple_fraction comes into play: for queries executed using a cursor, it tells the optimizer to prefer execution plans that produce the first result rows quickly, at the expense of the total query execution time. Hans’ article shows you a good example of what cursor_tuple_fraction does.
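
As a minimal sketch (the table name is a placeholder), the parameter is easiest to observe by declaring a cursor explicitly:

-- The default of 0.1 biases the planner toward fast startup for cursors.
SET cursor_tuple_fraction = 0.1;

BEGIN;
DECLARE big_scan CURSOR FOR
    SELECT * FROM measurements ORDER BY created_at;
FETCH 50 FROM big_scan;   -- only the first rows are needed quickly
CLOSE big_scan;
COMMIT;

-- A value of 1.0 optimizes for total execution time instead.
SET cursor_tuple_fraction = 1.0;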

Another use case for cursor_tuple_fraction is if you use a cursor, but want to optimize the total execution time of the query. This is usually the case if you declare a cursor WITH HOLD, so that it can outlive a database transaction. The result

[...]

Nazir Bilal Yavuz
Posted by Andreas 'ads' Scherbaum on 2024-05-06 at 14:00
PostgreSQL Person of the Week Interview with Nazir Bilal Yavuz: I am Nazir Bilal Yavuz, open source PostgreSQL developer working at Microsoft. I live in Turkey. I spend my free time by doing sports, playing games and football, watching tv series and movies.

The Notifier Pattern for Applications That Use Postgres
Posted by Brandur Leach on 2024-05-06 at 05:54

Listen/notify in Postgres is an incredible feature that makes itself useful in all kinds of situations. I’ve been using it a long time, started taking it for granted long ago, and was somewhat shocked recently looking into MySQL and SQLite to learn that even in 2024, no equivalent exists.

In a basic sense, listen/notify is such a simple concept that it needs little explanation. Clients subscribe on topics and other clients can send on topics, passing a message to each subscribed client. The idea takes only three seconds to demonstrate using nothing more than a psql shell:

=# LISTEN test_topic;
LISTEN
Time: 2.828 ms

=# SELECT pg_notify('test_topic', 'test_message');
 pg_notify
-----------

(1 row)

Time: 17.892 ms
Asynchronous notification "test_topic" with payload "test_message" received from server process with PID 98481.

But despite listen/notify’s relative simplicity, when it comes to applications built on top of Postgres, it’s common to use it less than optimally, eating through scarce Postgres connections and with little regard to failure cases.


Here’s where the notifier pattern for Postgres comes in. It’s an extremely simple idea, but in my experience, one that’s rarely seen in practice. Let’s start with these axioms:

  • LISTENs are affixed to specific connections. After listening, the original connection must still be available somewhere to successfully receive messages.

  • There may be many components within an application that’d like to listen on topics for completely orthogonal uses.

  • Despite optimizations over the years, connections in Postgres are still somewhat of a precious, limited resource, and should be conserved. We’d like to minimize the number of them required for listen/notify use.

  • A single connection can listen on any number of topics.

With those stated, we can explain the role of the notifier. Its job is to hold a single Postgres connection per process, allow other components in the same

[...]
