在常见的 SQL 优化技巧中,将 OR 语句改写为 UNION 是一种常见的手段。当 OR 连接的两个条件互斥时,可以改写为 UNION ALL,从而避免去重操作,进一步提升查询性能,主要原因在于优化器在处理 OR 时,很难充分利用已有的索引,导致查询计划可能退化为全表扫描或较低效的执行方式。最近在浏览 PolarDB 的优化器章节中,有专门的OR 子句转 UNION ALL一节,GET 到不少新的知识点。
OR 子句转 UNION ALL
目前,PostgreSQL 优化器对 SQL 中的 OR 子句过滤条件的优化能力较为有限。如果 OR 子句中的过滤条件仅涉及一张表,且所有过滤条件上均具备适当的索引,则优化器会为此类场景生成一个 BitmapOr 的 Index Path。
postgres=# CREATE TABLE t1(id int, num int, dsc text, log_date text); CREATE TABLE postgres=# CREATE TABLE t2(id int, cnt int, change text, op_date text); CREATE TABLE postgres=# INSERT INTO t1 SELECT i, i%100, 'test'||1, to_char('1990-10-10'::date+ i, 'YYYY-MM-DD') FROM generate_series(1,1000000)i; INSERT01000000 postgres=# INSERT INTO t2 SELECT i, i%1000, 'now'||i, to_char('1990-10-10'::date+ i, 'YYYY-MM-DD') FROM generate_series(1,1000000)i; INSERT01000000 postgres=# CREATE INDEX ON t1(id); CREATE INDEX postgres=# CREATE INDEX ON t1(num); CREATE INDEX postgres=# CREATE INDEX ON t2(id); CREATE INDEX postgres=# CREATE INDEX ON t2(cnt); CREATE INDEX postgres=# ANALYZE t1; ANALYZE postgres=# ANALYZE t2; ANALYZE postgres=# set max_parallel_workers_per_gather to0; SET postgres=# EXPLAIN ANALYZE SELECT*FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num =1OR t2.cnt =2); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- MergeJoin (cost=2.07..86670.16rows=10919 width=53) (actual time=0.037..887.670rows=11000 loops=1) Merge Cond: (t1.id = t2.id) JoinFilter: ((t1.num =1) OR (t2.cnt =2)) Rows Removed byJoinFilter: 989000 -> Index Scan using t1_id_idx on t1 (cost=0.42..33336.43rows=1000000 width=25) (actual time=0.014..227.958rows=1000000 loops=1) -> Index Scan using t2_id_idx on t2 (cost=0.42..33336.43rows=1000000 width=28) (actual time=0.011..233.349rows=1000000 loops=1) Planning Time: 0.313 ms Execution Time: 888.761 ms (8rows)