Tag Archives: GeminiAI

Understanding High Water Mark Locking Issues in PostgreSQL Vacuums

I recently had a customer that wanted to leverage read replicas to ensure that their read queries were not going to impeded with work being done on the primary instance and also required an SLA of at worst a few seconds. Ultimately they weren’t meeting the SLA and my colleagues and I were asked to look at what was going on.

The first thing we came to understand is that the pattern of work on the primary is a somewhat frequent large DELETE statement followed by a data refresh accomplished by a COPY from STDIN command against a partitioned table with 16 hash partitions.

The problem being observed was that periodically the SELECTs occurring on the read replica would time out and not meet the SLA. Upon investigation, we found that the “startup” process on the read replica would periodically request an “exclusive lock” on some random partition. This exclusive lock would block the SELECT (which is partition unaware) and then cause the timeout. But what is causing the timeout?

After spending some time investigating, the team was able to correlate the exclusive lock with a routine “autovacuum” occurring on the primary. But why was it locking? After inspection of the WAL, it turns out that it the issue was due to a step in the vacuum process whereby it tries to return free pages at the end of the table back to the OS, truncation of the High Water Mark (HWM). Essentially the lock is requested on the primary and then transmitted to the replica via the WAL so that the tables can be kept consistent.

To confirm that it was in fact the step in VACUUM that truncates the HWM, we decided to alter each partition of the table to allow VACUUM to skip that step:

ALTER TABLE [table name / partition name] SET (vacuum_truncate = false);

After letting this run for 24 hours, we in fact saw no further blocking locks causing SLA misses on the replicas. Should we worry about shrinking the High Water Mark (HWM)? Well as with everything in IT, it depends. Other DBMS engines like Oracle do not shrink the High Water Mark (HWM), typically maintenance operations such as DBMS_REDEF or ALTER TABLE … SHRINK SPACE / SHRINK SPACE COMPACT deal with that. So now that we are talking about PostgreSQL do we need to worry about it? This is where the pg_freespacemap extension can help. We can use this extension and a script to check to see if in fact the High Water Mark (HWM) is growing or staying put. If it is growing, we can just execute a regular VACUUM with an additional option called TRUNCATE to handle it:

VACUUM (verbose, truncate true) [schema].[table name];

When you do this, you will see one additional message in the VACUUM output signifying that the VACUUM truncated the High Water Mark (HWM):

INFO:  table "large_table": truncated 302534 to 302233 pages

As I stated earlier, we can use pg_freespacemap to see if we actually need to worry about the High Water Mark (HWM) growing. I could have taken a lot of time to write a script to figure it out, but instead, I enlisted Google Gemini to see what it would come up with. After a few iterations, the output was nearly perfect!

CREATE EXTENSION pg_freespacemap;

CREATE OR REPLACE FUNCTION show_empty_pages(p_table_name TEXT)
RETURNS VOID AS $$
DECLARE
    -- Core processing variables
    table_oid_regclass  REGCLASS;
    block_size          BIGINT;
    fsm_granularity     BIGINT;
    max_fsm_free_space  BIGINT;
    total_pages         BIGINT;
    high_water_mark     BIGINT := 0;

    -- Variables for the final summary
    first_empty_block   BIGINT;
    free_pages_at_end   BIGINT;
    free_space_at_end   TEXT;
BEGIN
    -- Setup
    table_oid_regclass := p_table_name::regclass;
    block_size  := current_setting('block_size')::bigint;
    SELECT relpages INTO total_pages FROM pg_class WHERE oid = table_oid_regclass;
    fsm_granularity    := block_size / 256;
    max_fsm_free_space := floor((block_size - 24) / fsm_granularity) * fsm_granularity;

    --------------------------------------------------------------------------------
    -- PASS 1: FIND THE HIGH-WATER MARK (last page with data)
    --------------------------------------------------------------------------------
    FOR i IN REVERSE (total_pages - 1)..0 LOOP
        IF pg_freespace(table_oid_regclass, i) < max_fsm_free_space THEN
            high_water_mark := i;
            EXIT;
        END IF;
    END LOOP;

    --------------------------------------------------------------------------------
    -- FINAL STEP: CALCULATE AND RAISE THE SUMMARY NOTICE
    --------------------------------------------------------------------------------
    first_empty_block := high_water_mark + 1;
    free_pages_at_end := total_pages - first_empty_block;
    IF free_pages_at_end < 0 THEN
        free_pages_at_end := 0;
    END IF;
    free_space_at_end := pg_size_pretty(free_pages_at_end * block_size);

    RAISE NOTICE '-------------------------------------------------------------';
    RAISE NOTICE 'Summary for table: %', p_table_name;
    RAISE NOTICE '-------------------------------------------------------------';
    RAISE NOTICE 'The High Water Mark (HWM) is at page: %', total_pages;
    IF total_pages <> first_empty_block THEN
    	RAISE NOTICE 'First potentially empty page is at: %', first_empty_block;
    	RAISE NOTICE 'Total Pages in Table: %', total_pages;
    	RAISE NOTICE 'Number of potentially truncatable pages at the end: %', free_pages_at_end;
    	RAISE NOTICE 'Amount of free space at the end of the table: %', free_space_at_end;
    ELSE
    	RAISE NOTICE 'There are no empty pages to truncate';
    END IF;
    RAISE NOTICE '-------------------------------------------------------------';
END;
$$ LANGUAGE plpgsql;

This handy script could be periodically executed to check the High Water Mark (HWM) and will produce the following output:

(postgres@10.3.1.17:5432) [postgres] > SELECT * FROM show_empty_pages('public.large_table');
NOTICE:  -------------------------------------------------------------
NOTICE:  Summary for table: public.large_table
NOTICE:  -------------------------------------------------------------
NOTICE:  The High Water Mark (HWM) is at page: 302534
NOTICE:  First potentially empty page is at: 302233
NOTICE:  Total Pages in Table: 302534
NOTICE:  Number of potentially truncatable pages at the end: 301
NOTICE:  Amount of free space at the end of the table: 2408 kB
NOTICE:  -------------------------------------------------------------

If there is no freespace after the last full block the output will look like this:

NOTICE:  -------------------------------------------------------------
NOTICE:  Summary for table: public.large_table
NOTICE:  -------------------------------------------------------------
NOTICE:  The High Water Mark (HWM) is at page: 302233
NOTICE:  There are no empty pages to truncate
NOTICE:  -------------------------------------------------------------

So while there is no right answer on how to deal with this, ensure you know the implications of each step in the process. In this case, we have decided to turn the “vacuum_truncation” option to false, but maybe another option might be to tune vacuum in another way such as either making it more or less frequent. Always evaluate your own situation, but in any case it’s always good to know what happens in your database when certain commands are executed.

Enjoy!