Do PostgreSQL Sub-Transactions Hurt Performance?

The short answer is always “maybe”. However, in the following post, I hope to demonstrate what creates a sub-transaction and what happens to overall transaction ID utilization when sub-transactions are invoked. I will also show how performance is affected when many connections are creating and consuming sub-transactions at the same time.

First, it is important to understand which statements will consume a transaction ID and which ones are more expensive than others (a short SQL illustration follows this list):

  • Calling Nested Procedures: 🟢 Free. No new XIDs are used. They share the parent’s transaction.
  • BEGIN…END (No Exception Block): 🟢 Free. Just organization.
  • COMMIT: 🟡 Expensive. Burns a top-level Transaction ID (a finite resource whose consumption drives vacuum freeze work).
  • EXCEPTION: 🔴 Dangerous. Creates a subtransaction (performance killer).
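
To make the last two rows concrete: an explicit SAVEPOINT is the SQL-level equivalent of a PL/pgSQL EXCEPTION block. Each one opens a subtransaction, and that subtransaction consumes its own XID as soon as it modifies data. A minimal sketch (the table name demo is a throwaway assumption):

CREATE TABLE demo (id int);        -- hypothetical scratch table

BEGIN;
INSERT INTO demo VALUES (1);       -- the top-level transaction is assigned an XID here
SAVEPOINT sp1;                     -- opens a subtransaction (same thing an EXCEPTION block does)
INSERT INTO demo VALUES (2);       -- the subtransaction is assigned its own XID
ROLLBACK TO SAVEPOINT sp1;         -- only the subtransaction's work is undone
COMMIT;                            -- row 1 survives; row 2 was rolled back

Rolling back to the savepoint discards only the subtransaction’s changes while the outer transaction keeps going, which is exactly the semantics an EXCEPTION handler relies on.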

So why is misplaced EXCEPTION logic a potential performance killer? PostgreSQL is optimized to handle a small number of open subtransactions very efficiently. Each backend process has a fixed-size array in shared memory (part of the PGPROC structure) that can hold up to 64 open subtransaction IDs (XIDs) for the current top-level transaction. As long as your nesting stays at or below those 64 slots, PostgreSQL manages everything in this fast, per-backend array and never needs to touch the subtransaction SLRU (Simple Least Recently Used) subsystem, which is what pg_stat_slru tracks. The problem is that subtransaction usage can get out of hand rather quickly if you are not paying attention to your application flow. Once PostgreSQL runs out of fast in-memory slots (the PGPROC array) and spills tracking data to the slower SLRU cache (pg_subtrans), performance degrades non-linearly (often 50x–100x slower) and causes global locking contention that can freeze other users.
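
If you want to check whether a workload is already leaning on pg_subtrans, the pg_stat_slru view (PostgreSQL 13 and later) can be queried directly. Note that the cache name shows up as ‘Subtrans’ on versions 13–16 and ‘subtransaction’ on newer releases, so a pattern match is the safer filter. A rough sketch:

-- Reads/writes here mean subtransaction tracking has spilled out of the
-- per-backend PGPROC array and onto the pg_subtrans SLRU.
SELECT name, blks_hit, blks_read, blks_written, blks_zeroed
FROM pg_stat_slru
WHERE name ILIKE '%subtrans%';

-- Counters accumulate since the last reset; clear them before a test run with:
-- SELECT pg_stat_reset_slru('Subtrans');   -- use 'subtransaction' on newer versions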

Another byproduct of utilizing too many sub-transactions is the additional WAL that gets generated. All of these effects can be demonstrated with a simple block of code:

------------------------------------------------------------------
-- 1. PREP: Create the helper function
------------------------------------------------------------------
-- Recursively nests `depth` subtransactions (one per EXCEPTION block),
-- then performs a single insert at the bottom of the stack.
CREATE OR REPLACE FUNCTION generate_subtrans_load(p_id int, depth int)
RETURNS void AS $$
BEGIN
    IF depth > 0 THEN
        BEGIN
            -- Each BEGIN ... EXCEPTION ... END block opens one subtransaction.
            PERFORM generate_subtrans_load(p_id, depth - 1);
        EXCEPTION WHEN OTHERS THEN NULL;
        END;
    ELSE
        INSERT INTO penalty_test VALUES (p_id, 'Deep Stack');
    END IF;
END;
$$ LANGUAGE plpgsql;

------------------------------------------------------------------
-- 2. THE COMPLETE TEST SUITE
------------------------------------------------------------------
DO $$ 
DECLARE
    -- Timing Variables
    v_start_ts  timestamp;
    v_t_base    interval; v_t_single interval; 
    v_t_48      interval; v_t_128    interval;

    -- Row Verification
    v_rows_base int;      v_rows_single int;
    v_rows_48   int;      v_rows_128    int;

    -- WAL Variables
    v_start_lsn pg_lsn;   v_end_lsn  pg_lsn;
    v_wal_base  numeric;  v_wal_single numeric;
    v_wal_48    numeric;  v_wal_128    numeric;

    -- XID Variables
    v_start_xid xid8;     v_end_xid  xid8;
    v_xid_base  bigint;   v_xid_single bigint;
    v_xid_48    bigint;   v_xid_128    bigint;

    i int;
    v_target_rows int := 5000; 
    
    -- Helper calc vars
    v_us_48 numeric;
    v_us_128 numeric;
BEGIN
    ---------------------------------------------------
    -- A. SETUP
    ---------------------------------------------------
    DROP TABLE IF EXISTS penalty_test;
    -- Explicitly preserve rows so COMMIT doesn't empty the table
    CREATE TEMP TABLE penalty_test (id int, val text) ON COMMIT PRESERVE ROWS;

    ---------------------------------------------------
    -- B. TEST 1: BASELINE (Standard Loop)
    ---------------------------------------------------
    -- 1. Force fresh start
    COMMIT; 
    v_start_ts  := clock_timestamp();
    v_start_lsn := pg_current_wal_insert_lsn();
    v_start_xid := pg_snapshot_xmax(pg_current_snapshot());

    FOR i IN 1..v_target_rows LOOP
        INSERT INTO penalty_test VALUES (i, 'Standard');
    END LOOP;
    
    -- 2. Force snapshot refresh to see XID consumption
    COMMIT; 
    v_end_lsn := pg_current_wal_insert_lsn();
    v_end_xid := pg_snapshot_xmax(pg_current_snapshot());
    v_t_base  := clock_timestamp() - v_start_ts;
    
    SELECT count(*) INTO v_rows_base FROM penalty_test;
    v_wal_base := v_end_lsn - v_start_lsn;
    v_xid_base := (v_end_xid::text::bigint - v_start_xid::text::bigint);

    ---------------------------------------------------
    -- C. TEST 2: SINGLE TRAP (Depth 1)
    ---------------------------------------------------
    TRUNCATE TABLE penalty_test;
    COMMIT; -- Clear stats
    
    v_start_ts  := clock_timestamp();
    v_start_lsn := pg_current_wal_insert_lsn();
    v_start_xid := pg_snapshot_xmax(pg_current_snapshot());

    FOR i IN 1..v_target_rows LOOP
        BEGIN
            INSERT INTO penalty_test VALUES (i, 'Single Trap');
        EXCEPTION WHEN OTHERS THEN NULL;
        END;
    END LOOP;
    
    COMMIT; -- Force refresh
    v_end_lsn := pg_current_wal_insert_lsn();
    v_end_xid := pg_snapshot_xmax(pg_current_snapshot());
    v_t_single := clock_timestamp() - v_start_ts;

    SELECT count(*) INTO v_rows_single FROM penalty_test;
    v_wal_single := v_end_lsn - v_start_lsn;
    v_xid_single := (v_end_xid::text::bigint - v_start_xid::text::bigint);

    ---------------------------------------------------
    -- D. TEST 3: SAFE ZONE (Depth 48)
    ---------------------------------------------------
    TRUNCATE TABLE penalty_test;
    COMMIT;
    
    v_start_ts  := clock_timestamp();
    v_start_lsn := pg_current_wal_insert_lsn();
    v_start_xid := pg_snapshot_xmax(pg_current_snapshot());

    FOR i IN 1..v_target_rows LOOP
        PERFORM generate_subtrans_load(i, 48);
    END LOOP;
    
    COMMIT;
    v_end_lsn := pg_current_wal_insert_lsn();
    v_end_xid := pg_snapshot_xmax(pg_current_snapshot());
    v_t_48    := clock_timestamp() - v_start_ts;

    SELECT count(*) INTO v_rows_48 FROM penalty_test;
    v_wal_48 := v_end_lsn - v_start_lsn;
    v_xid_48 := (v_end_xid::text::bigint - v_start_xid::text::bigint);

    ---------------------------------------------------
    -- E. TEST 4: OVERFLOW ZONE (Depth 128)
    ---------------------------------------------------
    TRUNCATE TABLE penalty_test;
    COMMIT;
    
    v_start_ts  := clock_timestamp();
    v_start_lsn := pg_current_wal_insert_lsn();
    v_start_xid := pg_snapshot_xmax(pg_current_snapshot());

    FOR i IN 1..v_target_rows LOOP
        PERFORM generate_subtrans_load(i, 128);
    END LOOP;
    
    COMMIT;
    v_end_lsn := pg_current_wal_insert_lsn();
    v_end_xid := pg_snapshot_xmax(pg_current_snapshot());
    v_t_128   := clock_timestamp() - v_start_ts;

    SELECT count(*) INTO v_rows_128 FROM penalty_test;
    v_wal_128 := v_end_lsn - v_start_lsn;
    v_xid_128 := (v_end_xid::text::bigint - v_start_xid::text::bigint);

    ---------------------------------------------------
    -- F. THE REPORT
    ---------------------------------------------------
    RAISE NOTICE '===================================================';
    RAISE NOTICE '    POSTGRESQL SUBTRANSACTION IMPACT REPORT      ';
    RAISE NOTICE '===================================================';
    RAISE NOTICE 'Target Rows: %', v_target_rows;
    RAISE NOTICE '---------------------------------------------------';
    
    RAISE NOTICE 'METRIC 1: EXECUTION TIME & VERIFICATION';
    RAISE NOTICE '  1. Baseline:         % (Rows: %)', v_t_base, v_rows_base;
    RAISE NOTICE '  2. Single Exception: % (Rows: %)', v_t_single, v_rows_single;
    RAISE NOTICE '  3. Safe Zone (48):   % (Rows: %)', v_t_48, v_rows_48;
    RAISE NOTICE '  4. Overflow (128):   % (Rows: %)', v_t_128, v_rows_128;
    RAISE NOTICE '---------------------------------------------------';

    v_us_48  := extract(epoch from v_t_48) * 1000000;
    v_us_128 := extract(epoch from v_t_128) * 1000000;

    RAISE NOTICE 'METRIC 2: AVERAGE COST PER SUBTRANSACTION';
    RAISE NOTICE '   - Safe Zone (48):     % us per subtrans', round(v_us_48 / (v_target_rows * 48), 2);
    RAISE NOTICE '   - Overflow Zone (128): % us per subtrans', round(v_us_128 / (v_target_rows * 128), 2);
    
    IF (v_us_128 / (v_target_rows * 128)) > (v_us_48 / (v_target_rows * 48)) THEN
        RAISE NOTICE '   -> RESULT: Overflow subtransactions were % %% slower per unit.', 
            round( ( ((v_us_128 / (v_target_rows * 128)) - (v_us_48 / (v_target_rows * 48))) / (v_us_48 / (v_target_rows * 48)) * 100)::numeric, 1);
    ELSE
         RAISE NOTICE '   -> RESULT: Overhead appears linear.';
    END IF;
    RAISE NOTICE '---------------------------------------------------';
    
    RAISE NOTICE 'METRIC 3: DISK USAGE (WAL WRITTEN)';
    RAISE NOTICE '  1. Baseline:         % bytes', v_wal_base;
    RAISE NOTICE '  2. Single Exception: % bytes', v_wal_single;
    RAISE NOTICE '  3. Safe Zone (48):   % bytes', v_wal_48;
    RAISE NOTICE '  4. Overflow (128):   % bytes', v_wal_128;
    RAISE NOTICE '---------------------------------------------------';

    RAISE NOTICE 'METRIC 4: TRANSACTION ID CONSUMPTION';
    RAISE NOTICE '  1. Baseline:         % XIDs', v_xid_base;
    RAISE NOTICE '  2. Single Exception: % XIDs', v_xid_single;
    RAISE NOTICE '  3. Safe Zone (48):   % XIDs', v_xid_48;
    RAISE NOTICE '  4. Overflow (128):   % XIDs', v_xid_128;
    
    RAISE NOTICE '===================================================';
END $$;

The code block above shows 4 different scenarios:

  • Baseline: A simple loop that inserts a number of rows with no exception logic
  • Single Exception (Depth 1): The same insert loop, but with each insert wrapped in an exception block
  • 48 Sub-transaction Loop (Depth 48): A function call that nests 48 subtransactions per row (well below the 64-slot limit) before completing the same insert
  • 128 Sub-transaction Loop (Depth 128): A function call that nests 128 subtransactions per row (well above the 64-slot limit) before completing the same insert

What you will see is that, when run by a single user, the impact is not terribly great. Beyond consuming more transaction IDs (XIDs) and generating far more WAL, the system and performance impact does not appear to be a big deal. Here are the results for a 5000-row insert:

NOTICE:  ===================================================
NOTICE:      POSTGRESQL SUBTRANSACTION IMPACT REPORT
NOTICE:  ===================================================
NOTICE:  Target Rows: 5000
NOTICE:  ---------------------------------------------------
NOTICE:  METRIC 1: EXECUTION TIME & VERIFICATION
NOTICE:    1. Baseline:         00:00:00.020292 (Rows: 5000)
NOTICE:    2. Single Exception: 00:00:00.033419 (Rows: 5000)
NOTICE:    3. Safe Zone (48):   00:00:01.414132 (Rows: 5000)
NOTICE:    4. Overflow (128):   00:00:03.817267 (Rows: 5000)
NOTICE:  ---------------------------------------------------
NOTICE:  METRIC 2: AVERAGE COST PER SUBTRANSACTION
NOTICE:     - Safe Zone (48):     5.89 us per subtrans
NOTICE:     - Overflow Zone (128): 5.96 us per subtrans
NOTICE:     -> RESULT: Overflow subtransactions were 1.2 % slower per unit.
NOTICE:  ---------------------------------------------------
NOTICE:  METRIC 3: DISK USAGE (WAL WRITTEN)
NOTICE:    1. Baseline:         48 bytes
NOTICE:    2. Single Exception: 43936 bytes
NOTICE:    3. Safe Zone (48):   2076216 bytes
NOTICE:    4. Overflow (128):   5536832 bytes
NOTICE:  ---------------------------------------------------
NOTICE:  METRIC 4: TRANSACTION ID CONSUMPTION
NOTICE:    1. Baseline:         1 XIDs
NOTICE:    2. Single Exception: 5001 XIDs
NOTICE:    3. Safe Zone (48):   240001 XIDs
NOTICE:    4. Overflow (128):   640002 XIDs
NOTICE:  ===================================================
DO
Time: 5294.689 ms (00:05.295)

As you can see in the results above, each scenario burns more XIDs and creates more WAL than the last, because the additional transaction tracking must be preserved in case the WAL ever needs to be replayed. Where the real killer comes in is when multiple connections all begin to execute this same code. When I ran a synthetic test using Apache JMeter, the results were striking. The test was completed on a 4 vCPU C4A AlloyDB instance:

Test                     | Average Time Per Exec | Min Exec Time | Max Exec Time
1 User x 50 iterations   | 2137 ms               | 2069 ms       | 3477 ms
4 Users x 50 iterations  | 3171 ms               | 2376 ms       | 4080 ms
10 Users x 50 iterations | 9041 ms               | 4018 ms       | 47880 ms

As more users are added, execution time increases because of the work required to manage the XID space and write the additional WAL. Some of the increase is also due to longer connection times caused by pressure on the XID space. When the same test was executed with a Managed Connection Pooler, times came down a bit:

Test                     | Average Time Per Exec | Min Exec Time | Max Exec Time
10 Users x 50 iterations | 6399 ms               | 2690 ms       | 10624 ms

Now you ask: how can I monitor for this performance impact? The answer is that it depends. Wide variation in execution time depending on database load may be one indication. Another may be activity in the pg_stat_slru view (although, depending on the available RAM, metrics may never show up there), and a final indication will always be WAL usage. In summary (a sample pg_stat_statements query follows the table):

Metric         | What it tells you    | What it hides
Execution Time | “My query is slow.”  | It doesn’t explain why (CPU vs. lock vs. I/O). Additional investigation may be required.
pg_stat_slru   | “Disk is thrashing.” | It reads 0 if you have enough RAM, hiding the fact that your CPU is burning up managing memory locks.
WAL Volume     | The real truth.      | It proves you are writing massive amounts of metadata (savepoint markers) to disk, even if the data volume is small.
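
If the pg_stat_statements extension is installed, its per-statement WAL counters (available since PostgreSQL 13) make the WAL Volume row above actionable. A possible starting query, not a definitive recipe:

-- Statements generating the most WAL; a high wal_bytes-per-call figure on an
-- otherwise small write is a hint that savepoint/subtransaction metadata is inflating it.
SELECT queryid,
       calls,
       wal_records,
       wal_bytes,
       round(wal_bytes::numeric / NULLIF(calls, 0), 1) AS wal_bytes_per_call
FROM pg_stat_statements
ORDER BY wal_bytes DESC
LIMIT 10;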

When considering the three scenarios:

  • Tier 1: Standard Loop (Baseline)
    • Mechanism: One single transaction for the whole batch.
    • Overhead: Near zero.
    • Verdict: 🟢 Safe. This is how Postgres is designed to work.
  • Tier 2: The “Safety Trap” (Exception Block)
    • Mechanism: Uses BEGIN…EXCEPTION inside the loop.
    • Hidden Cost: Every single iteration creates a Subtransaction. This burns a Subtransaction ID and forces a WAL write to create a “savepoint” it can roll back to.
    • Verdict: 🟡 Risky. It is 3x–5x slower and generates massive Write-Ahead Log (WAL) bloat, even for successful inserts.
  • Tier 3: The “Overflow” Disaster (Depth > 64)
    • Mechanism: Nesting subtransactions deeper than 64 layers (or having >64 active savepoints).
    • The Cliff: PostgreSQL runs out of fast RAM slots (PGPROC array) and must spill tracking data to the slow SLRU cache (pg_subtrans).
    • Verdict: 🔴 Catastrophic. Performance degrades non-linearly (often 50x–100x slower) and causes global locking contention that can freeze other users (a query to spot this contention follows this list).
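
The “global locking contention” from Tier 3 shows up in pg_stat_activity as LWLock waits. On PostgreSQL 13 and later the relevant wait events are named SubtransSLRU and SubtransBuffer (older releases reported SubtransControlLock instead); a quick check might look like this:

-- Sessions currently stuck on subtransaction SLRU access or I/O.
SELECT pid, state, wait_event_type, wait_event, left(query, 60) AS query
FROM pg_stat_activity
WHERE wait_event IN ('SubtransSLRU', 'SubtransBuffer');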

Final Recommendation:
If you need to handle errors in a bulk load (e.g., “Insert 10,000 rows, skip the ones that fail”):

  • DO validate data before the insert, filtering out bad rows in the application layer or a staging table
  • Do NOT wrap every insert in an EXCEPTION block, i.e. inside a loop
  • DO use EXCEPTION logic purposefully and avoid catch-all handlers like WHEN OTHERS
  • DO use INSERT … ON CONFLICT DO NOTHING when the expected error is a unique-constraint violation (see the sketch after this list)
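
As a sketch of that last recommendation, here is the set-based alternative to a per-row EXCEPTION loop. The table and column names are hypothetical; the point is that duplicate-key failures are absorbed by ON CONFLICT without ever opening a subtransaction:

-- Hypothetical target and staging tables for a bulk load.
CREATE TABLE IF NOT EXISTS target_accounts (id int PRIMARY KEY, val text);
CREATE TEMP TABLE staging_accounts (id int, val text);

-- ... populate staging_accounts with COPY or application inserts ...

-- One statement, one transaction, zero subtransactions:
INSERT INTO target_accounts (id, val)
SELECT id, val
FROM staging_accounts
ON CONFLICT (id) DO NOTHING;   -- rows with duplicate ids are silently skipped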

Would you like to read more and get additional perspective?
https://postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful

Hope this helps you understand the effect of subtransactions in PostgreSQL!
