15.1. How Parallel Query Works

When the optimizer determines that parallel query is the fastest execution strategy for a particular query, it will create a query plan which includes a Gather node. Here is a simple example:

EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Gather  (cost=1000.00..217018.43 rows=1 width=97)
   Workers Planned: 2
   ->  Parallel Seq Scan on pgbench_accounts  (cost=0.00..216018.33 rows=1 width=97)
         Filter: (filler ~~ '%x%'::text)
(4 rows)

In all cases, the Gather node will have exactly one child plan, which is the portion of the plan that will be executed in parallel. If the Gather node is at the very top of the plan tree, then the entire query will execute in parallel. If it is somewhere else in the plan tree, then only that portion of the query will run in parallel. In the example above, the query accesses only one table, so there is only one plan node other than the Gather node itself; since that plan node is a child of the Gather node, it will run in parallel.

Using EXPLAIN, you can see the number of workers chosen by the planner. When the Gather node is reached during query execution, the process which is implementing the user's session will request a number of background worker processes equal to the number of workers chosen by the planner. The total number of background workers that can exist at any one time is limited by max_worker_processes, so it is possible for a parallel query to run with fewer workers than planned, or even with no workers at all. The optimal plan may depend on the number of workers that are available, so this can result in poor query performance. If this occurrence is frequent, considering increasing max_worker_processes so that more workers can be run simultaneously or alternatively reducing max_parallel_workers_per_gather so that the planner requests fewer workers.

Every background worker process which is successfully started for a given parallel query will execute the portion of the plan which is a descendent of the Gather node. The leader will also execute that portion of the plan, but it has an additional responsibility: it must also read all of the tuples generated by the workers. When the parallel portion of the plan generates only a small number of tuples, the leader will often behave very much like an additional worker, speeding up query execution. Conversely, when the parallel portion of the plan generates a large number of tuples, the leader may be almost entirely occupied with reading the tuples generated by the workers and performing any further processing steps which are required by plan nodes above the level of the Gather node. In such cases, the leader will do very little of the work of executing the parallel portion of the plan.