PostgreSQL学徒

SQL 优化之 OR 语句的转化

Word count: 2kReading time: 9 min
2025/04/20
loading

前言

在常见的 SQL 优化技巧中,将 OR 语句改写为 UNION 是一种常见的手段。当 OR 连接的两个条件互斥时,可以改写为 UNION ALL,从而避免去重操作,进一步提升查询性能,主要原因在于优化器在处理 OR 时,很难充分利用已有的索引,导致查询计划可能退化为全表扫描或较低效的执行方式。最近在浏览 PolarDB 的优化器章节中,有专门的OR 子句转 UNION ALL一节,GET 到不少新的知识点。

OR 子句转 UNION ALL

目前,PostgreSQL 优化器对 SQL 中的 OR 子句过滤条件的优化能力较为有限。如果 OR 子句中的过滤条件仅涉及一张表,且所有过滤条件上均具备适当的索引,则优化器会为此类场景生成一个 BitmapOr 的 Index Path。

基于此背景,让我们在 17.4 的版本中验证一下,同时关闭并行,减少并行所带来的影响。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
postgres=# CREATE TABLE t1(id int, num int, dsc text, log_date text);
CREATE TABLE
postgres=# CREATE TABLE t2(id int, cnt int, change text, op_date text);
CREATE TABLE
postgres=# INSERT INTO t1 SELECT i, i%100, 'test'||1, to_char('1990-10-10'::date + i, 'YYYY-MM-DD') FROM generate_series(1,1000000)i;
INSERT 0 1000000
postgres=# INSERT INTO t2 SELECT i, i%1000, 'now'||i, to_char('1990-10-10'::date + i, 'YYYY-MM-DD') FROM generate_series(1,1000000)i;
INSERT 0 1000000
postgres=# CREATE INDEX ON t1(id);
CREATE INDEX
postgres=# CREATE INDEX ON t1(num);
CREATE INDEX
postgres=# CREATE INDEX ON t2(id);
CREATE INDEX
postgres=# CREATE INDEX ON t2(cnt);
CREATE INDEX
postgres=# ANALYZE t1;
ANALYZE
postgres=# ANALYZE t2;
ANALYZE
postgres=# set max_parallel_workers_per_gather to 0;
SET
postgres=# EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num = 1 OR t2.cnt = 2);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=2.07..86670.16 rows=10919 width=53) (actual time=0.037..887.670 rows=11000 loops=1)
Merge Cond: (t1.id = t2.id)
Join Filter: ((t1.num = 1) OR (t2.cnt = 2))
Rows Removed by Join Filter: 989000
-> Index Scan using t1_id_idx on t1 (cost=0.42..33336.43 rows=1000000 width=25) (actual time=0.014..227.958 rows=1000000 loops=1)
-> Index Scan using t2_id_idx on t2 (cost=0.42..33336.43 rows=1000000 width=28) (actual time=0.011..233.349 rows=1000000 loops=1)
Planning Time: 0.313 ms
Execution Time: 888.761 ms
(8 rows)

正如此文中所说,此处选择了 Merge Join,t1 和 t2 表都进行了全表扫描,关联之后再进行过滤,(t1.num = 1 OR t2.cnt = 2) 被当做一个整体进行过滤

上述 OR 子句被当作一个整体,优化器无法使用 t1.num 或者 t2.cnt 上的索引,导致 t1 与 t2 进行全表扫描。实际上,OR 子句在逻辑上可以转化为 UNION ALL,包含两个或多个分支的查询形式。

但是实际上 num 和 cnt 上均有索引,因此假如提前进行索引过滤,可以大幅减少需要关联的数据量。让我们等价改写一下,此处同样使用官方的例子,大致改写逻辑主要关注后方的 AND (t1.num != 1 OR (t1.num = 1) IS NULL);,改写需要考虑 NULL 的三值逻辑。在 SQL 中,当字段值为 NULL 时,用等号 (=) 或不等号 (!=) 比较会返回 unknown。

  • 如果直接使用 t1.num != 1 来排除已经被第一部分查询覆盖的行,那么当 t1.num 为 NULL 时,表达式 t1.num != 1 的结果不是 true,而是 unknown,因此这一行就会在第二个子查询中被过滤掉。
  • 然而,在原始条件 (t1.num = 1 OR t2.cnt = 2) 中,如果 t1.num 为 NULL,但 t2.cnt = 2 为 true,该行实际上是应该返回的。

为了解决以上问题,在第二个子查询的 WHERE 条件中增加 (t1.num != 1 OR (t1.num = 1) IS NULL)

  • t1.num 为 NULL 时,(t1.num = 1) 会返回 unknown,而 (t1.num = 1) IS NULL 则会返回 true,从而整个 OR 表达式的结果为 true。
  • 这样,就确保了那些 t2.cnt = 2t1.num 为 NULL 的行不会因为 NULL 比较而被错误地过滤掉,而能够和原始查询保持一致。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
postgres=# EXPLAIN ANALYZE
SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE t1.num = 1
UNION ALL
SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE t2.cnt = 2 AND (t1.num != 1 OR (t1.num = 1) IS NULL);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=8105.66..39674.93 rows=10918 width=53) (actual time=16.045..271.818 rows=11000 loops=1)
-> Hash Join (cost=8105.66..29307.99 rows=9933 width=53) (actual time=16.044..262.544 rows=10000 loops=1)
Hash Cond: (t2.id = t1.id)
-> Seq Scan on t2 (cost=0.00..17353.00 rows=1000000 width=28) (actual time=0.009..93.759 rows=1000000 loops=1)
-> Hash (cost=7981.50..7981.50 rows=9933 width=25) (actual time=16.013..16.015 rows=10000 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 685kB
-> Bitmap Heap Scan on t1 (cost=113.41..7981.50 rows=9933 width=25) (actual time=3.986..13.570 rows=10000 loops=1)
Recheck Cond: (num = 1)
Heap Blocks: exact=7353
-> Bitmap Index Scan on t1_num_idx (cost=0.00..110.92 rows=9933 width=0) (actual time=1.922..1.922 rows=10000 loops=1)
Index Cond: (num = 1)
-> Nested Loop (cost=12.56..10312.35 rows=985 width=53) (actual time=0.509..7.858 rows=1000 loops=1)
-> Bitmap Heap Scan on t2 t2_1 (cost=12.14..2757.14 rows=995 width=28) (actual time=0.489..2.050 rows=1000 loops=1)
Recheck Cond: (cnt = 2)
Heap Blocks: exact=1000
-> Bitmap Index Scan on t2_cnt_idx (cost=0.00..11.89 rows=995 width=0) (actual time=0.249..0.249 rows=1000 loops=1)
Index Cond: (cnt = 2)
-> Index Scan using t1_id_idx on t1 t1_1 (cost=0.42..7.58 rows=1 width=25) (actual time=0.005..0.005 rows=1 loops=1000)
Index Cond: (id = t2_1.id)
Filter: ((num <> 1) OR ((num = 1) IS NULL))
Planning Time: 0.564 ms
Execution Time: 272.753 ms
(22 rows)

通过这样改写之后,可以看到每个过滤条件都独立地走到了索引扫描中,最后再 Append,查询时间也提升了接近 4 倍。

另外,我去 GP7 和 GP6 里面都跑了一下,都是如下类似的结果,并不能自动处理这样的逻辑。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
postgres=# EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num = 1 OR t2.cnt = 2);
QUERY PLAN
------------------------------------------------------------------------------------
Gather Motion 4:1 (slice1; segments: 4) (cost=0.00..1187.59 rows=10590 width=53)
-> Hash Join (cost=0.00..1185.71 rows=2648 width=53)
Hash Cond: (t2.id = t1.id)
Join Filter: ((t1.num = 1) OR (t2.cnt = 2))
-> Seq Scan on t2 (cost=0.00..438.98 rows=250000 width=28)
-> Hash (cost=438.56..438.56 rows=250000 width=25)
-> Seq Scan on t1 (cost=0.00..438.56 rows=250000 width=25)
Optimizer: GPORCA
(8 rows)

postgres=# set optimizer to off;
SET
postgres=# EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num = 1 OR t2.cnt = 2);
QUERY PLAN
----------------------------------------------------------------------------------------
Gather Motion 4:1 (slice1; segments: 4) (cost=6084.00..10458.28 rows=11019 width=53)
-> Hash Join (cost=6084.00..10320.55 rows=2755 width=53)
Hash Cond: (t1.id = t2.id)
Join Filter: ((t1.num = 1) OR (t2.cnt = 2))
-> Seq Scan on t1 (cost=0.00..2959.00 rows=250000 width=25)
-> Hash (cost=2959.00..2959.00 rows=250000 width=28)
-> Seq Scan on t2 (cost=0.00..2959.00 rows=250000 width=28)
Optimizer: Postgres-based planner
(8 rows)


---GP6
postgres=# EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num = 1 OR t2.cnt = 2);
QUERY PLAN
------------------------------------------------------------------------------------
Gather Motion 6:1 (slice1; segments: 6) (cost=0.00..1078.96 rows=16839 width=50)
-> Hash Join (cost=0.00..1076.46 rows=2807 width=50)
Hash Cond: (t1.id = t2.id)
Join Filter: ((t1.num = 1) OR (t2.cnt = 2))
-> Seq Scan on t1 (cost=0.00..436.04 rows=166667 width=25)
-> Hash (cost=436.04..436.04 rows=166667 width=25)
-> Seq Scan on t2 (cost=0.00..436.04 rows=166667 width=25)
Optimizer: Pivotal Optimizer (GPORCA)
(8 rows)

postgres=# set optimizer to off;
SET
postgres=# EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num = 1 OR t2.cnt = 2);
QUERY PLAN
-----------------------------------------------------------------------------------------
Gather Motion 6:1 (slice1; segments: 6) (cost=24338.00..46274.36 rows=21418 width=50)
-> Hash Join (cost=24338.00..46274.36 rows=3570 width=50)
Hash Cond: (t1.id = t2.id)
Join Filter: ((t1.num = 1) OR (t2.cnt = 2))
-> Seq Scan on t1 (cost=0.00..11838.00 rows=166667 width=25)
-> Hash (cost=11838.00..11838.00 rows=166667 width=25)
-> Seq Scan on t2 (cost=0.00..11838.00 rows=166667 width=25)
Optimizer: Postgres query optimizer
(8 rows)

小结

SQL 优化有很多套路可循,比如此例中的 OR 改写、标量子查询改写等等。另外,PolarDB 优化器章节中提到的”关联子查询上拉”,在 17 版本中测了下,是可以自动改写的。

参考

高级SQL优化 | 你真的了解用 UNION替换OR吗?

OR子句转UNION ALL

索引失效?别慌,PawSQL带你深入了解15种性能优化策略!

CATALOG
  1. 1. 前言
  2. 2. OR 子句转 UNION ALL
  3. 3. 小结
  4. 4. 参考