Tuning the PostgreSQL “random_page_cost” Parameter

In my previous post, “Three Configuration Parameters for PostgreSQL That Are Worth Further Investigation!”, I introduced the three PostgreSQL parameters that I feel are the most “neglected” and present the most opportunity for performance tuning in a PostgreSQL instance. Parts 1 and 2 covered “work_mem” and “effective_io_concurrency”, so in this final part of the series, I would like to demonstrate tuning the “random_page_cost” parameter.

Because PostgreSQL can be installed on many different types of systems, the default for this parameter assumes a system that is likely among the least performant: one with low CPU power and a less-than-ideal disk subsystem. The setting can also be overridden at the individual object level, but that can become a management nightmare, so I would recommend against it. A good explanation of the parameter exists here, and for most CloudSQL instances it should likely be set lower than the default, because random page reads are expected to be less expensive on the types of I/O subsystems that are present in today’s cloud environments.
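
For reference, here is a minimal sketch of the scopes at which the parameter can be set; the database name “mydb” and the tablespace name “fast_ssd” are hypothetical:

-- Show the value in effect for the current session.
SHOW random_page_cost;

-- Override for the current session only (handy for plan testing).
SET random_page_cost = 2;

-- Override for all new connections to one database.
ALTER DATABASE mydb SET random_page_cost = 2;

-- Override per tablespace, which is how the object-level override
-- mentioned above is implemented ("fast_ssd" is hypothetical).
ALTER TABLESPACE fast_ssd SET (random_page_cost = 1.1);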

For those of you who come from an Oracle background, this parameter is very much like the “OPTIMIZER_INDEX_COST_ADJ” parameter that we used to manipulate in older Oracle versions. To refresh your memory on that parameter, you can see the 19c explanation here.

As a simple example of how the query plan can change for a simple SQL statement, I will first show the plan with the default setting of 4. While it uses an index, the access path could be better:

-- Disable parallel plans so any change comes only from random_page_cost.
set max_parallel_workers_per_gather = 0;
set session random_page_cost to 4;

explain (analyze, verbose, costs, settings, buffers, wal, timing, summary, format text)
SELECT c.c_name, c.c_acctbal, sum(o.o_totalprice)
FROM orders o
JOIN customer c ON (c.c_custkey = o.o_custkey)
WHERE c.c_custkey = 30003 and o.o_orderstatus = 'O'
GROUP BY c.c_name, c.c_acctbal;

                                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=76.28..76.37 rows=1 width=57) (actual time=0.034..0.035 rows=0 loops=1)
   Output: c.c_name, c.c_acctbal, sum(o.o_totalprice)
   Group Key: c.c_name, c.c_acctbal
   Buffers: shared hit=7
   ->  Sort  (cost=76.28..76.30 rows=8 width=33) (actual time=0.033..0.034 rows=0 loops=1)
         Output: c.c_name, c.c_acctbal, o.o_totalprice
         Sort Key: c.c_name, c.c_acctbal
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=7
         ->  Nested Loop  (cost=4.97..76.16 rows=8 width=33) (actual time=0.027..0.028 rows=0 loops=1)
               Output: c.c_name, c.c_acctbal, o.o_totalprice
               Buffers: shared hit=7
               ->  Index Scan using customer_pk on public.customer c  (cost=0.42..8.44 rows=1 width=31) (actual time=0.014..0.015 rows=1 loops=1)
                     Output: c.c_custkey, c.c_mktsegment, c.c_nationkey, c.c_name, c.c_address, c.c_phone, c.c_acctbal, c.c_comment
                     Index Cond: (c.c_custkey = '30003'::numeric)
                     Buffers: shared hit=4
               ->  Bitmap Heap Scan on public.orders o  (cost=4.55..67.64 rows=8 width=14) (actual time=0.009..0.009 rows=0 loops=1)
                     Output: o.o_orderdate, o.o_orderkey, o.o_custkey, o.o_orderpriority, o.o_shippriority, o.o_clerk, o.o_orderstatus, o.o_totalprice, o.o_comment
                     Recheck Cond: (o.o_custkey = '30003'::numeric)
                     Filter: (o.o_orderstatus = 'O'::bpchar)
                     Buffers: shared hit=3
                     ->  Bitmap Index Scan on order_customer_fkidx  (cost=0.00..4.55 rows=16 width=0) (actual time=0.008..0.008 rows=0 loops=1)
                           Index Cond: (o.o_custkey = '30003'::numeric)
                           Buffers: shared hit=3
 Settings: effective_cache_size = '3053008kB', effective_io_concurrency = '10', max_parallel_workers_per_gather = '0', work_mem = '512MB'
 Query Identifier: 7272380376793434809
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.234 ms
 Execution Time: 0.076 ms

And now, with the setting changed to 2, we get a different access path:

set max_parallel_workers_per_gather = 0;
set session random_page_cost to 2;

explain (analyze, verbose, costs, settings, buffers, wal, timing, summary, format text)
SELECT c.c_name, c.c_acctbal, sum(o.o_totalprice)
FROM orders o
JOIN customer c ON (c.c_custkey = o.o_custkey)
WHERE c.c_custkey = 30003 and o.o_orderstatus = 'O'
GROUP BY c.c_name, c.c_acctbal;

                                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=39.38..39.48 rows=1 width=57) (actual time=0.027..0.028 rows=0 loops=1)
   Output: c.c_name, c.c_acctbal, sum(o.o_totalprice)
   Group Key: c.c_name, c.c_acctbal
   Buffers: shared hit=7
   ->  Sort  (cost=39.38..39.40 rows=8 width=33) (actual time=0.026..0.027 rows=0 loops=1)
         Output: c.c_name, c.c_acctbal, o.o_totalprice
         Sort Key: c.c_name, c.c_acctbal
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=7
         ->  Nested Loop  (cost=0.85..39.26 rows=8 width=33) (actual time=0.021..0.022 rows=0 loops=1)
               Output: c.c_name, c.c_acctbal, o.o_totalprice
               Buffers: shared hit=7
               ->  Index Scan using customer_pk on public.customer c  (cost=0.42..4.44 rows=1 width=31) (actual time=0.012..0.012 rows=1 loops=1)
                     Output: c.c_custkey, c.c_mktsegment, c.c_nationkey, c.c_name, c.c_address, c.c_phone, c.c_acctbal, c.c_comment
                     Index Cond: (c.c_custkey = '30003'::numeric)
                     Buffers: shared hit=4
               ->  Index Scan using order_customer_fkidx on public.orders o  (cost=0.43..34.75 rows=8 width=14) (actual time=0.008..0.008 rows=0 loops=1)
                     Output: o.o_orderdate, o.o_orderkey, o.o_custkey, o.o_orderpriority, o.o_shippriority, o.o_clerk, o.o_orderstatus, o.o_totalprice, o.o_comment
                     Index Cond: (o.o_custkey = '30003'::numeric)
                     Filter: (o.o_orderstatus = 'O'::bpchar)
                     Buffers: shared hit=3
 Settings: effective_cache_size = '3053008kB', effective_io_concurrency = '10', max_parallel_workers_per_gather = '0', random_page_cost = '2', work_mem = '512MB'
 Query Identifier: 7272380376793434809
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.199 ms
 Execution Time: 0.064 ms

Now, for this simple example, the execution time isn’t vastly different because an index is being used in both cases. However, in cases where the parameter adjustment allows an index scan to replace a sequential scan, you will really see the benefit.
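
If you want to reproduce that effect, one approach is to compare plans for a less selective predicate at both settings and watch for the scan type to flip. A minimal sketch against the same “orders” table; whether the plan actually changes depends on your data distribution and statistics:

set session random_page_cost to 4;
explain SELECT count(*) FROM orders WHERE o_custkey < 1000;

set session random_page_cost to 2;
explain SELECT count(*) FROM orders WHERE o_custkey < 1000;

At the higher setting the planner may prefer a sequential scan; at the lower setting the index on “o_custkey” becomes comparatively cheaper and may win.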

Ultimately, there are other parameters that may benefit from adjustment, such as the “cpu_*” parameters; however, those require much more testing and experimentation than “random_page_cost” does, especially if your system runs on SSDs, as most Google CloudSQL for PostgreSQL instances do, or on Google AlloyDB, where the I/O subsystem is built specifically for the implementation. If you use either of these platforms, I would strongly consider lowering this parameter from the default of 4 to at least 2, and perhaps even to 1.1, depending on the shape you have chosen and the I/O limits served by each shape.
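
On a self-managed instance, a minimal sketch for making the change persistent rather than per-session looks like this (managed platforms such as CloudSQL and AlloyDB expose server parameters through their own flag mechanisms, so use those there):

-- Persist the value in postgresql.auto.conf (requires superuser).
ALTER SYSTEM SET random_page_cost = 1.1;

-- Reload the configuration; no restart is needed for this parameter.
SELECT pg_reload_conf();

-- Confirm the value now in effect.
SHOW random_page_cost;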

Enjoy!
