Why Isn’t My Query Benefiting from Partition Pruning?

BLOGGERS NOTE… The original posting had verbiage where it wasn’t clear that I was referring to the planner not partition pruning vs the execution not partition pruning. Apologies for the error. The below post has been corrected:

Recently I had a customer come to me with a poorly performing query that had a perfectly fine execution time on Oracle, but once migrated to Postgres the query was slow. Upon investigation, it was found that the planning portion of the query was not pruning partitions causing a much longer planning time. That said even though the planning portion was not accomplishing partition pruning, the execution portion was (This is indicated by the portion of the explain plan notating “never executed”) .

The query had “WHERE” and “JOIN” clauses which explicitly specified partition keys and were joined on the partition key, so why wasn’t pruning happening during planning time?

The setup for the test is at the bottom of the blog post: Jump to Test Case Setup

Original Query

The tables in question are all partitioned by “hire_date” as as you can see in the “WHERE” clause below, the planner should project partition pruning because the “hire_date” from the driving table is equal to that of the joined table, yet the planner did not prune partitions:

WITH top_emp AS (
   SELECT 
       DISTINCT
       id,
       first_name,
       last_name,
       hire_date,
       birth_date
   FROM
       employees.employee_part emp
       INNER JOIN employees.salary sal on (sal.employee_id = emp.id)
   WHERE
       emp.hire_date = '1985-01-01'::date
       and emp.gender = 'M'
       and emp.birth_date between '1957-01-01'::date and '1970-01-01'::date
       and sal.amount > 50000
)
SELECT
    dept.employee_id,
    dept.department_id,
    dept.hire_date
FROM
    top_emp emp1,
    employees.department_employee_part dept
WHERE
    emp1.hire_date = dept.hire_date and
    emp1.id = dept.employee_id;

/* Explain Plan */
 Nested Loop  (cost=1467.16..2646.20 rows=307 width=17) (actual time=4.565..4.613 rows=4 loops=1)
   Output: dept.employee_id, dept.department_id, dept.hire_date
   Buffers: shared hit=330
   ->  Unique  (cost=1466.87..1471.11 rows=37 width=31) (actual time=4.537..4.560 rows=4 loops=1)
         Output: emp.id, emp.first_name, emp.last_name, emp.hire_date, emp.birth_date
         Buffers: shared hit=318
         ->  Sort  (cost=1466.87..1467.72 rows=339 width=31) (actual time=4.534..4.539 rows=69 loops=1)
               Output: emp.id, emp.first_name, emp.last_name, emp.hire_date, emp.birth_date
               Sort Key: emp.id, emp.first_name, emp.last_name, emp.birth_date
               Sort Method: quicksort  Memory: 30kB
               Buffers: shared hit=318
               ->  Nested Loop  (cost=0.43..1452.62 rows=339 width=31) (actual time=1.524..4.450 rows=69 loops=1)
                     Output: emp.id, emp.first_name, emp.last_name, emp.hire_date, emp.birth_date
                     Buffers: shared hit=309
                     ->  Seq Scan on employees.employee_part_p1985 emp  (cost=0.00..999.32 rows=37 width=31) (actual time=1.500..4.380 rows=4 loops=1)
                           Output: emp.id, emp.first_name, emp.last_name, emp.hire_date, emp.birth_date
                           Filter: ((emp.birth_date >= '1957-01-01'::date) AND (emp.birth_date <= '1970-01-01'::date) AND (emp.hire_date = '1985-01-01'::date) AND (emp.gender = 'M'::employee_gender))
                           Rows Removed by Filter: 35312
                           Buffers: shared hit=293
                     ->  Index Scan using idx_16991_primary on employees.salary sal  (cost=0.43..12.16 rows=9 width=8) (actual time=0.010..0.014 rows=17 loops=4)
                           Output: sal.employee_id, sal.amount, sal.from_date, sal.to_date
                           Index Cond: (sal.employee_id = emp.id)
                           Filter: (sal.amount > 50000)
                           Rows Removed by Filter: 1
                           Buffers: shared hit=16
   ->  Append  (cost=0.29..31.59 rows=16 width=17) (actual time=0.010..0.011 rows=1 loops=4)
         Buffers: shared hit=12
         ->  Index Scan using contract_employees_p1985_pkey on employees.department_employee_part_p1985 dept_1  (cost=0.29..2.33 rows=1 width=17) (actual time=0.006..0.006 rows=1 loops=4)
               Output: dept_1.employee_id, dept_1.department_id, dept_1.hire_date
               Index Cond: (dept_1.employee_id = emp.id)
               Filter: (emp.hire_date = dept_1.hire_date)
               Buffers: shared hit=12
         ->  Index Scan using contract_employees_p1986_pkey on employees.department_employee_part_p1986 dept_2  (cost=0.29..2.36 rows=1 width=17) (never executed)
               Output: dept_2.employee_id, dept_2.department_id, dept_2.hire_date
               Index Cond: (dept_2.employee_id = emp.id)
               Filter: (emp.hire_date = dept_2.hire_date)
         ->  Index Scan using contract_employees_p1987_pkey on employees.department_employee_part_p1987 dept_3  (cost=0.29..2.33 rows=1 width=17) (never executed)
               Output: dept_3.employee_id, dept_3.department_id, dept_3.hire_date
               Index Cond: (dept_3.employee_id = emp.id)
               Filter: (emp.hire_date = dept_3.hire_date)

......... LOTS OF PARTITIONS ........

         ->  Index Scan using contract_employees_p2000_pkey on employees.department_employee_part_p2000 dept_16  (cost=0.14..0.24 rows=1 width=17) (never executed)
               Output: dept_16.employee_id, dept_16.department_id, dept_16.hire_date
               Index Cond: (dept_16.employee_id = emp.id)
               Filter: (emp.hire_date = dept_16.hire_date)
 Settings: effective_cache_size = '3256704kB', effective_io_concurrency = '128', random_page_cost = '1.1', search_path = '"$user", public, employees', temp_buffers = '128MB', work_mem = '256MB'
 Query Identifier: -1847211568235447798
 Planning:
   Buffers: shared hit=1313
 Planning Time: 4.111 ms
 Execution Time: 5.176 ms

So if the partition keys are equal and the tables are partitioned the exact same way, then why is the planner not projecting partition pruning? The answer can be found in the Postgres Listserv by clicking here.

The planner can only push quals down into a subquery, it cannot pull
quals from a subquery into the outer query.

So I want (and need the planner to partition prune) so how can I get it? Both alternatives require re-writing the query although one is a little less invasive than the other. If you know that the partition keys are exactly equal and there can only be one value, then you can modify the the where clause to have an additional entry like this (notice the additional entry in the lower “where” clause):

WITH top_emp AS (
   SELECT 
       DISTINCT
       id,
       first_name,
       last_name,
       hire_date,
       birth_date
   FROM
       employees.employee_part emp
       INNER JOIN employees.salary sal on (sal.employee_id = emp.id)
   WHERE
       emp.hire_date = '1985-01-01'::date
       and emp.gender = 'M'
       and emp.birth_date between '1957-01-01'::date and '1970-01-01'::date
       and sal.amount > 50000
)
SELECT
    dept.employee_id,
    dept.department_id,
    dept.hire_date
FROM
    top_emp emp1,
    employees.department_employee_part dept
WHERE
    emp1.hire_date = dept.hire_date and
    dept.hire_date = '1985-01-01'::date and
    emp1.id = dept.employee_id;

Or, you could totally re-write the query like this so, just in case you could match multiple partitions in the other table, that situation is handled:

SELECT
   employee_id,
   department_id,
   hire_date
FROM (
   SELECT 
       dept.employee_id,
       dept.department_id,
       dept.hire_date,
       count(1)
   FROM
       employees.employee_part emp
       INNER JOIN employees.salary sal on (sal.employee_id = emp.id)
       JOIN employees.department_employee_part dept on (emp.hire_date = dept.hire_date and emp.id = dept.employee_id)  
   WHERE
       emp.hire_date = '1985-01-01'::date
       and emp.gender = 'M'
       and emp.birth_date between '1957-01-01'::date and '1970-01-01'::date
       and sal.amount > 50000
   GROUP BY
       dept.employee_id,
       dept.department_id,
       dept.hire_date) a;

In this case, the following plan is used and partition pruning occurs. Both planning and execution time are significantly less as well:

 Subquery Scan on a  (cost=1273.80..1273.83 rows=1 width=17) (actual time=2.629..2.641 rows=4 loops=1)
   Output: a.employee_id, a.department_id, a.hire_date
   Buffers: shared hit=331
   ->  GroupAggregate  (cost=1273.80..1273.82 rows=1 width=25) (actual time=2.628..2.639 rows=4 loops=1)
         Output: dept.employee_id, dept.department_id, dept.hire_date, NULL::bigint
         Group Key: dept.employee_id, dept.department_id, dept.hire_date
         Buffers: shared hit=331
         ->  Sort  (cost=1273.80..1273.80 rows=1 width=17) (actual time=2.620..2.624 rows=69 loops=1)
               Output: dept.employee_id, dept.department_id, dept.hire_date
               Sort Key: dept.employee_id, dept.department_id
               Sort Method: quicksort  Memory: 29kB
               Buffers: shared hit=331
               ->  Nested Loop  (cost=0.74..1273.79 rows=1 width=17) (actual time=0.836..2.604 rows=69 loops=1)
                     Output: dept.employee_id, dept.department_id, dept.hire_date
                     Buffers: shared hit=331
                     ->  Nested Loop  (cost=0.30..967.66 rows=43 width=25) (actual time=0.828..2.557 rows=4 loops=1)
                           Output: emp.id, dept.employee_id, dept.department_id, dept.hire_date
                           Buffers: shared hit=315
                           ->  Seq Scan on employees.department_employee_part_p1985 dept  (cost=0.00..776.50 rows=109 width=17) (actual time=0.801..2.515 rows=9 loops=1)
                                 Output: dept.employee_id, dept.department_id, dept.hire_date
                                 Filter: (dept.hire_date = '1985-01-01'::date)
                                 Rows Removed by Filter: 39071
                                 Buffers: shared hit=288
                           ->  Memoize  (cost=0.30..2.03 rows=1 width=12) (actual time=0.004..0.004 max time=0.020 rows=0 loops=9)
                                 Output: emp.id, emp.hire_date
                                 Cache Key: dept.employee_id
                                 Cache Mode: logical
                                 Hits: 0  Misses: 9  Evictions: 0  Overflows: 0  Memory Usage: 1kB
                                 Buffers: shared hit=27
                                 ->  Index Scan using employee_part_p1985_pkey on employees.employee_part_p1985 emp  (cost=0.29..2.02 rows=1 width=12) (actual time=0.002..0.003 max time=0.012 rows=0 loops=9)
                                       Output: emp.id, emp.hire_date
                                       Index Cond: (emp.id = dept.employee_id)
                                       Filter: ((emp.birth_date >= '1957-01-01'::date) AND (emp.birth_date <= '1970-01-01'::date) AND (emp.hire_date = '1985-01-01'::date) AND (emp.gender = 'M'::employees.employee_gender))
                                       Rows Removed by Filter: 1
                                       Buffers: shared hit=27
                     ->  Memoize  (cost=0.44..8.11 rows=9 width=8) (actual time=0.004..0.009 max time=0.014 rows=17 loops=4)
                           Output: sal.employee_id
                           Cache Key: emp.id
                           Cache Mode: logical
                           Hits: 0  Misses: 4  Evictions: 0  Overflows: 0  Memory Usage: 3kB
                           Buffers: shared hit=16
                           ->  Index Scan using idx_16991_primary on employees.salary sal  (cost=0.43..8.10 rows=9 width=8) (actual time=0.003..0.006 max time=0.009 rows=17 loops=4)
                                 Output: sal.employee_id
                                 Index Cond: (sal.employee_id = emp.id)
                                 Filter: (sal.amount > 50000)
                                 Rows Removed by Filter: 1
                                 Buffers: shared hit=16
 Settings: effective_cache_size = '9830040kB', random_page_cost = '1.1', work_mem = '256MB'
 Query Identifier: 2871877186944609726
 Planning:
   Buffers: shared hit=26
 Planning Time: 0.460 ms
 Execution Time: 2.707 ms

The Verdict

So as you can see it is very important to get the planner to project what you want. While ultimately the execution phase was not fully executed, it still had to look at the stats of the partition to decide not to execute it resulting in some wasted time. Also, keep in mind that this test case is a very simple one. The customer issue I worked on had a lot more partitions and a lot more data so the planning time being wasted was much higher resulting in a much higher impact.

Test Case Setup

To create a test case for this example, I used the employee database located on this Github site:
https://github.com/h8/employees-database

I then created some partitioned versions of the tables from the sample database as shown below:

-- Table: test_schema.contract_employees

CREATE TABLE employees.department_employee_part (
	employee_id int8 NOT NULL,
	department_id bpchar(4) NOT NULL,
	from_date date NOT NULL,
	to_date date NOT NULL,
    hire_date date NOT NULL
)  PARTITION BY RANGE (hire_date);

-- Partitions SQL

CREATE TABLE employees.department_employee_part_p1985 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1985_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1985-01-01') TO ('1986-01-01');
CREATE TABLE employees.department_employee_part_p1986 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1986_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1986-01-01') TO ('1987-01-01');
CREATE TABLE employees.department_employee_part_p1987 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1987_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1987-01-01') TO ('1988-01-01');
CREATE TABLE employees.department_employee_part_p1988 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1988_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1988-01-01') TO ('1989-01-01');
CREATE TABLE employees.department_employee_part_p1989 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1989_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1989-01-01') TO ('1990-01-01');
CREATE TABLE employees.department_employee_part_p1990 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1990_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1990-01-01') TO ('1991-01-01');
CREATE TABLE employees.department_employee_part_p1991 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1991_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1991-01-01') TO ('1992-01-01');
CREATE TABLE employees.department_employee_part_p1992 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1992_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1992-01-01') TO ('1993-01-01');
CREATE TABLE employees.department_employee_part_p1993 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1993_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1993-01-01') TO ('1994-01-01');
CREATE TABLE employees.department_employee_part_p1994 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1994_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1994-01-01') TO ('1995-01-01');
CREATE TABLE employees.department_employee_part_p1995 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1995_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1995-01-01') TO ('1996-01-01');
CREATE TABLE employees.department_employee_part_p1996 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1996_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1996-01-01') TO ('1997-01-01');
CREATE TABLE employees.department_employee_part_p1997 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1997_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1997-01-01') TO ('1998-01-01');
CREATE TABLE employees.department_employee_part_p1998 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1998_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1998-01-01') TO ('1999-01-01');
CREATE TABLE employees.department_employee_part_p1999 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p1999_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('1999-01-01') TO ('2000-01-01');
CREATE TABLE employees.department_employee_part_p2000 PARTITION OF employees.department_employee_part
( 
    CONSTRAINT department_employee_part_p2000_pkey PRIMARY KEY (employee_id, department_id)
)
    FOR VALUES FROM ('2000-01-01') TO ('2001-01-01');

CREATE INDEX ON employees.department_employee_part USING btree (department_id);

insert into employees.department_employee_part
select de.*,emp.hire_date 
from employees.employee emp join employees.department_employee de on (emp.id = de.employee_id);

vacuum (verbose, analyze) employees.department_employee_part;

CREATE TABLE employees.employee_part (
	id int8 DEFAULT nextval('employees.id_employee_seq'::regclass) NOT NULL,
	birth_date date NOT NULL,
	first_name varchar(14) NOT NULL,
	last_name varchar(16) NOT NULL,
	gender employees.employee_gender NOT NULL,
	hire_date date NOT NULL
) PARTITION BY RANGE (hire_date);

CREATE TABLE employees.employee_part_p1985 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1985_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1985-01-01') TO ('1986-01-01');
CREATE TABLE employees.employee_part_p1986 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1986_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1986-01-01') TO ('1987-01-01');
CREATE TABLE employees.employee_part_p1987 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1987_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1987-01-01') TO ('1988-01-01');
CREATE TABLE employees.employee_part_p1988 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1988_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1988-01-01') TO ('1989-01-01');
CREATE TABLE employees.employee_part_p1989 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1989_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1989-01-01') TO ('1990-01-01');
CREATE TABLE employees.employee_part_p1990 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1990_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1990-01-01') TO ('1991-01-01');
CREATE TABLE employees.employee_part_p1991 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1991_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1991-01-01') TO ('1992-01-01');
CREATE TABLE employees.employee_part_p1992 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1992_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1992-01-01') TO ('1993-01-01');
CREATE TABLE employees.employee_part_p1993 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1993_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1993-01-01') TO ('1994-01-01');
CREATE TABLE employees.employee_part_p1994 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1994_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1994-01-01') TO ('1995-01-01');
CREATE TABLE employees.employee_part_p1995 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1995_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1995-01-01') TO ('1996-01-01');
CREATE TABLE employees.employee_part_p1996 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1996_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1996-01-01') TO ('1997-01-01');
CREATE TABLE employees.employee_part_p1997 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1997_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1997-01-01') TO ('1998-01-01');
CREATE TABLE employees.employee_part_p1998 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1998_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1998-01-01') TO ('1999-01-01');
CREATE TABLE employees.employee_part_p1999 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p1999_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('1999-01-01') TO ('2000-01-01');
CREATE TABLE employees.employee_part_p2000 PARTITION OF employees.employee_part
( 
    CONSTRAINT employee_part_p2000_pkey PRIMARY KEY (id)
)
    FOR VALUES FROM ('2000-01-01') TO ('2001-01-01');
    
insert into  employees.employee_part select * from  employees.employee;

vacuum (verbose, analyze) employees.employee_part;


2 thoughts on “Why Isn’t My Query Benefiting from Partition Pruning?

  1. YS's avatarYS

    a quick question regarding to the original execution plan:

    ->  IndexScan using contract_employees_p1986_pkey onemployees.department_employee_part_p1986 dept_2  (cost=0.29..2.36 rows=1 width=17) (never executed)

                   Output: dept_2.employee_id, dept_2.department_id, dept_2.hire_date

                   IndexCond: (dept_2.employee_id = emp.id)

                   Filter: (emp.hire_date = dept_2.hire_date)

             ->  IndexScan using contract_employees_p1987_pkey onemployees.department_employee_part_p1987 dept_3  (cost=0.29..2.33 rows=1 width=17) (never executed)

                   Output: dept_3.employee_id, dept_3.department_id, dept_3.hire_date

                   IndexCond: (dept_3.employee_id = emp.id)

                   Filter: (emp.hire_date = dept_3.hire_date)

    ......... LOTS OFPARTITIONS ........

    Since index scan on all other partitions are marked as “never executed”, doesn’t that mean the runtime partitioning pruning is taking in place? Because if you remove “emp1.hire_date = dept.hire_date” filter from where clause, the index scan on each partitions will be actual performed.

    I understand that execution time partition pruning only happens on nested loop join, if there is no index on outer table and planner has to go with hash-join, then the original query can’t utilize any partitioning pruning at all, but that is not the case you show here.

    Like

    Reply
    1. sborden76's avatarsborden76 Post author

      Thank you for pointing this out. I just realized that I published the incorrect version of the blog post. The correct version of the post was much more clear that it was the planner not pruning partitions vs the execution phase not fully pruning partitions.

      This has been corrected.

      Like

      Reply

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.