PostgreSQL学徒

DB Killer?原来是标量子查询!

Word count: 1.9kReading time: 9 min
2024/06/27
loading

前言

最近遇到一起标量子查询导致的性能问题,标量子查询也是我们日常在写 SQL 语句中需要尤为注意的地方。

何如

A scalar subquery is a regular SELECT query in parentheses that returns exactly one value: one row with one column. The query is run and the returned value is used in the outer query. If the subquery returns zero rows, the value of the subquery expression is null.

标量子查询是圆括号中的常规 SELECT 查询,仅返回一个值:带有一个列的一行。执行此查询,返回值将在外部查询中使用。

**通俗来讲,标量子查询就是嵌套在语句中的 SELECT 子句。**标量子查询由于需要传值,因此就和嵌套循环十分类似,驱动表返回多少条 (主查询),被驱动表就要循环多少次 (标量子查询)。

不难想象,容易产生性能问题的点在于:

  1. 主查询返回结果的执行路径,例如全表扫描、错误的索引扫描等
  2. 主查询返回的条数,如果过多,会导致子查询循环多次
  3. 标量部分的执行路径

描述过于晦涩,让我们看个栗子,也是实际案例中遇到的

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
postgres=# create table t1(ts timestamp,pk_system text,deal_batchno text);                                                                                                                                                                              
CREATE TABLE
postgres=# insert into t1 select now(),'hello','world' from generate_series(1,10000);
INSERT 0 10000
postgres=# insert into t1 select now(),'xiongcc','postgres' from generate_series(1,100);
INSERT 0 100
postgres=# insert into t1 select now(),'nihao','mydb' from generate_series(1,100);
INSERT 0 100

postgres=# create table t2(systypecode text,systypename text);
CREATE TABLE
postgres=# insert into t2 select 'amazon','mydb' from generate_series(1,10000);
INSERT 0 10000
postgres=# insert into t2 select 'aurora','nihao' from generate_series(1,1);
INSERT 0 1

查询语句如下:

1
2
3
4
5
6
7
8
9
10
mydb=# explain update t1 set ts = now(),pk_system = (select systypecode from t2 where t2.systypename = t1.pk_system) where t1.deal_batchno = 'mydb';;
QUERY PLAN
---------------------------------------------------------------------
Update on t1 (cost=0.00..18195.25 rows=0 width=0)
-> Seq Scan on t1 (cost=0.00..18195.25 rows=100 width=46)
Filter: (deal_batchno = 'mydb'::text)
SubPlan 1
-> Seq Scan on t2 (cost=0.00..180.01 rows=5000 width=7)
Filter: (systypename = t1.pk_system)
(6 rows)

此处的标量部分对应于执行计划中的 SubPlan,还有个类似的 InitPlan,不过不同于 SubPlan 节点,InitPlan 仅评估一次。因此,在此例中,有多少 deal_batchno = ‘mydb’ 行,SubPlan 就需要运行多少次。

回到最开始的性能问题第一条——主查询的执行路径,此处明显可以创建一个索引

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
mydb=# create index on t1(deal_batchno);
CREATE INDEX
mydb=# analyze t1,t2;
ANALYZE
mydb=# begin;
BEGIN
mydb=*# explain analyze update t1 set ts = now(),pk_system = (select systypecode from t2 where t2.systypename = t1.pk_system) where t1.deal_batchno = 'mydb';
QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------
--------------
Update on t1 (cost=0.29..18060.48 rows=0 width=0) (actual time=213.580..213.581 rows=0 loops=1)
-> Index Scan using t1_deal_batchno_idx on t1 (cost=0.29..18060.48 rows=100 width=46) (actual time=2.375..211.907 rows
=100 loops=1)
Index Cond: (deal_batchno = 'mydb'::text)
SubPlan 1
-> Seq Scan on t2 (cost=0.00..180.01 rows=5000 width=7) (actual time=2.114..2.114 rows=1 loops=100)
Filter: (systypename = t1.pk_system)
Rows Removed by Filter: 10000
Planning Time: 0.132 ms
Execution Time: 213.626 ms
(9 rows)

可以看到,此处子查询循环了 100 次 (loops = 100)。由于是 UPDATE 语句,因此如果子查询关联出的结果返回多条,更新无疑会报错

ERROR: more than one row returned by a subquery used as an expression

所以此处构建的例子是匹配的 1 条 (rows=1)。可以看到,这条语句执行了 213 ms,效率低下,对于这条 SQL,只需要修改其写法,先关联取出数据,这样就避免了子查询的多次循环,效率自然也就有所提升。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
mydb=# begin;
BEGIN
mydb=*# explain analyze update t1 set ts = now(),pk_system = systypecode from t2 where t2.systypename = t1.pk_system and t1.deal_batchno = 'mydb';
QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------
---------------------
Update on t1 (cost=60.91..1516.56 rows=0 width=0) (actual time=6.846..6.849 rows=0 loops=1)
-> Hash Join (cost=60.91..1516.56 rows=34 width=27) (actual time=4.904..4.995 rows=100 loops=1)
Hash Cond: (t2.systypename = t1.pk_system)
-> Seq Scan on t2 (cost=0.00..155.01 rows=10001 width=18) (actual time=0.019..2.831 rows=10001 loops=1)
-> Hash (cost=59.64..59.64 rows=102 width=12) (actual time=0.305..0.306 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 13kB
-> Index Scan using t1_deal_batchno_idx on t1 (cost=0.29..59.64 rows=102 width=12) (actual time=0.045..0.2
51 rows=100 loops=1)
Index Cond: (deal_batchno = 'mydb'::text)
Planning Time: 0.166 ms
Execution Time: 6.902 ms
(10 rows)

可以看到,性能从 213.626 ms → 6.902 ms,提升了接近 40 倍,还是很可观的。

值得注意的是,如果是 SELECT 的话,那么需要用 LEFT JOIN 进行修改,因为是传值,没有的值会用 NULL 进行填充,因此用 INNER JOIN 改写就不等价了,当然如果存在主外键的关系,自然可以用 INNER JOIN 进行改写。看个小栗子

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
mydb=# create table t1(id1 int,info1 text);                                                                                                                                                                                                                 
CREATE TABLE
mydb=# create table t2(id2 int,info2 text);
CREATE TABLE
mydb=# insert into t1 select n,md5(random()::text) from generate_series(1,10) as n;
INSERT 0 10
mydb=# insert into t2 select n,md5(random()::text) from generate_series(1,10,2) as n;
INSERT 0 5

mydb=# explain analyze select id1,info1, (select t2.info2 from t2 where t2.id2 = t1.id1 ) as t from t1;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Seq Scan on t1 (cost=0.00..11.72 rows=10 width=69) (actual time=0.027..0.058 rows=10 loops=1)
SubPlan 1
-> Seq Scan on t2 (cost=0.00..1.06 rows=1 width=33) (actual time=0.002..0.002 rows=0 loops=10)
Filter: (id2 = t1.id1)
Rows Removed by Filter: 4
Planning Time: 0.163 ms
Execution Time: 0.082 ms
(7 rows)

mydb=# select id1,info1, (select t2.info2 from t2 where t2.id2 = t1.id1 ) as t from t1;
id1 | info1 | t
-----+----------------------------------+----------------------------------
1 | d2db92012a9fde800b2bd1801f688aff | d9a14de394aa22d1ce283e2a9c86ae24
2 | e92aab1f4521bd3273cefb858125cd80 |
3 | 9ac682bab8e324d7cd31cfe7d32e6f05 | 539d58f89091e199e9a2583f381541dd
4 | bf27aff302d83352ad2d51757de99a17 |
5 | 2d39a2ed1e73b74e9cc7eacb0f41c433 | a4af0f0e128490dddf88c934f93ab153
6 | dfe0498b188cd7f52f3d6e37b19eb77b |
7 | 24042fb7db4256068a03af5538f343b8 | 2937175d0637830f79af5074bd04f075
8 | 90ef76e0945bfbee57f174b01992f40e |
9 | 68a52163e2e45adafcaf34488efdd2b9 | c10749e0342e8fea1c8242e5da337cb6
10 | 7b06b18d22871cd9aa64ad10930542f3 |
(10 rows)

无法并行

标量子查询还有一个潜在的危害便是——上层节点无法并行执行,又回到刚刚第一条,主查询一般我们都会设法优化其执行路径,比如通过索引减少扫描的数据量。回到之前的例子,先把索引删了,然后多插入点数据,看看优化器能否选择并行

1
2
3
4
mydb=# insert into t1 select now(),'nihao','mydb' from generate_series(1,100000);
INSERT 0 100000
mydb=# insert into t1 select now(),'nihao','mydb' from generate_series(1,1000000);
INSERT 0 1000000

默认情况下,并行度的计算方式是:

也就意味着每当一个表增大三倍时,就会分配一个额外的并行工作进程。t1 表目前大约 60MB,大概可以分配 2 个工作进程。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
mydb=# select pg_size_pretty(pg_relation_size('t1'));
pg_size_pretty
----------------
60 MB
(1 row)

mydb=# begin;
BEGIN
mydb=*# explain update t1 set ts = now(),pk_system = (select systypecode from t2 where t2.systypename = t1.pk_system) where t1.deal_batchno = 'mydb';;
QUERY PLAN
-----------------------------------------------------------------------
Update on t1 (cost=0.00..198045493.90 rows=0 width=0)
-> Seq Scan on t1 (cost=0.00..198045493.90 rows=1100023 width=46)
Filter: (deal_batchno = 'mydb'::text)
SubPlan 1
-> Seq Scan on t2 (cost=0.00..180.01 rows=5000 width=7)
Filter: (systypename = t1.pk_system)
(6 rows)

mydb=*# explain select count(*) from t1;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize Aggregate (cost=15008.51..15008.52 rows=1 width=8)
-> Gather (cost=15008.29..15008.50 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=14008.29..14008.30 rows=1 width=8)
-> Parallel Seq Scan on t1 (cost=0.00..12851.83 rows=462583 width=0)
(5 rows)

但是执行计划显示,压根无法进行并行,上层的 Seq Scan 节点不能参与并行执行,因为它依赖于 SubPlan 节点返回的数据,类似地,InitPlan 的父节点也不能参与并行执行 (但是那些接收 InitPlan 评估结果的节点可以)。

小结

标量子查询,一般都会导致 SQL 效率低下

  1. 对于查询,可以照搬套路,用 LEFT JOIN 进行改写即可。
  2. 对于 UPDATE,可以考虑使用 MERGE JOIN,15 版本之后也支持了MERGE JOIN。

其次,标量子查询如本文所述,还可能导致无法并行,一核有难,八核围观。切忌标量子查询的危害。

参考

https://docs.aws.amazon.com/zh_cn/redshift/latest/dg/r_scalar_subqueries.html

CATALOG
  1. 1. 前言
  2. 2. 何如
  3. 3. 无法并行
  4. 4. 小结
  5. 5. 参考