The many hazards of unique key conflicts: do you know them all?


 2024/12/26 

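Preface

First, thank you to all readers, old friends and new, for supporting the Chinese edition of "PostgreSQL 14 Internals" (https://postgres-internals.cn/). Two days after launch it has had about 5,000 visitors, and I fixed or addressed the bugs and suggestions readers reported as soon as they came in. (I also quietly set up a "14 Internals" reader group, meant mainly for readers; if you are already in one of the four existing groups, there is no need to join this one.)

Back to this issue's topic. It started with a question a student asked in the hands-on training group:

Teacher, I just read a document saying that with a unique index, inserting duplicate data also produces dead tuples. Why is that?

In other words, with a unique constraint or unique index in place, inserting a duplicate value the regular way produces a dead tuple. If the insert did not succeed, why is a dead tuple left behind? Or, put differently, why not pre-check before inserting and avoid the dead tuple altogether? That would surely be more efficient. Let's take a look.

Analysis

First of all, this behavior is normal. With a regular insert (note my wording: regular), a dead tuple really is left behind: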

postgres=# create table t4(id int primary key);
CREATE TABLE
postgres=# insert into t4 values(1);
INSERT 0 1
postgres=# insert into t4 values(1);
ERROR:  duplicate key value violates unique constraint "t4_pkey"
DETAIL:  Key (id)=(1) already exists.
postgres=# SELECT lp,t_xmin,t_ctid,infomask(t_infomask, 1) as infomask,t_data FROM heap_page_items(get_raw_page('t4', 0));
 lp | t_xmin | t_ctid |          infomask           |   t_data   
----+--------+--------+-----------------------------+------------
  1 | 500289 | (0,1)  | XMAX_INVALID|XMIN_COMMITTED | \x01000000
  2 | 500290 | (0,2)  | XMAX_INVALID                | \x01000000
(2 rows)

postgres=# select txid_status('500290');
 txid_status 
-------------
 aborted
(1 row)

As you can see, the second transaction ended up aborted and left a dead tuple behind (lp 2), which VACUUM then removes:

postgres=# vacuum t4;
VACUUM
postgres=# SELECT lp,t_xmin,t_ctid,infomask(t_infomask, 1) as infomask,t_data FROM heap_page_items(get_raw_page('t4', 0));
 lp | t_xmin | t_ctid |                 infomask                 |   t_data   
----+--------+--------+------------------------------------------+------------
  1 | 500289 | (0,1)  | XMAX_INVALID|XMIN_INVALID|XMIN_COMMITTED | \x01000000
(1 row)
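To make the rule behind this output concrete, here is a minimal toy model in Python. It is not PostgreSQL code: the names HeapTuple, is_dead and vacuum are hypothetical, and real visibility checks (commit log lookups, hint bits, deleted tuples) have many more cases. It only captures the one case shown above: a tuple whose inserting transaction aborted can never become visible, so a vacuum may remove it.

```python
from dataclasses import dataclass

# Transaction states, using the strings txid_status() reports.
COMMITTED, ABORTED, IN_PROGRESS = "committed", "aborted", "in progress"


@dataclass
class HeapTuple:
    lp: int     # line pointer number within the page
    xmin: int   # XID of the inserting transaction
    data: int   # the single int column


def is_dead(tup, xid_status):
    """Toy rule: a tuple inserted by an aborted transaction is dead."""
    return xid_status[tup.xmin] == ABORTED


def vacuum(page, xid_status):
    """Toy VACUUM: keep only the tuples that are not dead."""
    return [t for t in page if not is_dead(t, xid_status)]


# Values taken from the heap_page_items() and txid_status() output above.
xid_status = {500289: COMMITTED, 500290: ABORTED}
page = [HeapTuple(1, 500289, 1), HeapTuple(2, 500290, 1)]

dead = [t.lp for t in page if is_dead(t, xid_status)]
after = vacuum(page, xid_status)
print(dead)                    # line pointers of dead tuples
print([t.lp for t in after])   # line pointers surviving the toy vacuum
```

Note that nothing in the model marks the tuple as dead at insert time; its status follows entirely from the fate of the inserting transaction.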

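Besides the dead tuple, the failed insert also consumed a transaction ID and storage space. Is there another way to get a pre-check, that is, to detect the conflict up front and skip the insert when there is one, avoiding all of the costs above? Sure: use INSERT ... ON CONFLICT DO NOTHING. Let's verify: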

postgres=# create table t5(id int primary key);
CREATE TABLE
postgres=# insert into t5 values(1);
INSERT 0 1
postgres=# insert into t5 values(1) on conflict do nothing;
INSERT 0 0
postgres=# insert into t5 values(1) on conflict do nothing;
INSERT 0 0
postgres=# SELECT lp,t_xmin,t_ctid,infomask(t_infomask, 1) as infomask,t_data FROM heap_page_items(get_raw_page('t5', 0));
 lp | t_xmin | t_ctid |          infomask           |   t_data   
----+--------+--------+-----------------------------+------------
  1 | 500292 | (0,1)  | XMAX_INVALID|XMIN_COMMITTED | \x01000000
(1 row)

postgres=# insert into t5 values(2) on conflict do nothing;
INSERT 0 1
postgres=# SELECT lp,t_xmin,t_ctid,infomask(t_infomask, 1) as infomask,t_data FROM heap_page_items(get_raw_page('t5', 0));
 lp | t_xmin | t_ctid |          infomask           |   t_data   
----+--------+--------+-----------------------------+------------
  1 | 500292 | (0,1)  | XMAX_INVALID|XMIN_COMMITTED | \x01000000
  2 | 500293 | (0,2)  | XMAX_INVALID                | \x02000000
(2 rows)

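As you can see, with INSERT ON CONFLICT DO NOTHING:

No transaction ID was consumed (the next successful insert got 500293, right after 500292)
No dead tuple was left behind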
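The difference between the two sessions can be sketched with a toy Python model. This is an illustration under simplifying assumptions, not PostgreSQL code: the class ToyTable and its methods are hypothetical, there is a single session with no concurrency, and "assigning an XID" is reduced to bumping a counter. It only shows the ordering that matters here: the regular path writes to the heap before the uniqueness check, while the ON CONFLICT DO NOTHING path checks first.

```python
class ToyTable:
    """Toy model of a table with one unique key; not PostgreSQL code."""

    def __init__(self):
        self.next_xid = 1
        self.heap = []         # (xmin, key) for every tuple written to the heap
        self.index = set()     # keys of successfully inserted rows
        self.aborted = set()   # XIDs of rolled-back transactions

    def _assign_xid(self):
        xid = self.next_xid
        self.next_xid += 1
        return xid

    def insert(self, key):
        """Regular INSERT: heap write first, uniqueness check second."""
        xid = self._assign_xid()
        self.heap.append((xid, key))
        if key in self.index:
            self.aborted.add(xid)   # rollback; the heap tuple stays behind
            return False
        self.index.add(key)
        return True

    def insert_do_nothing(self, key):
        """ON CONFLICT DO NOTHING: pre-check before touching the heap."""
        if key in self.index:
            return False            # no XID assigned, nothing written
        xid = self._assign_xid()
        self.heap.append((xid, key))
        self.index.add(key)
        return True

    def dead_tuples(self):
        return sum(1 for xid, _ in self.heap if xid in self.aborted)


regular, guarded = ToyTable(), ToyTable()
regular.insert(1)
guarded.insert(1)
for _ in range(1000):               # 1000 duplicate inserts on each table
    regular.insert(1)
    guarded.insert_do_nothing(1)

regular_stats = (regular.dead_tuples(), regular.next_xid - 1)
guarded_stats = (guarded.dead_tuples(), guarded.next_xid - 1)
print(regular_stats)   # (dead tuples, XIDs consumed) for regular inserts
print(guarded_stats)   # same for the pre-checked inserts
```

In this model every duplicate on the regular path costs one XID and one dead tuple, while the pre-checked path costs neither, which matches what the two psql sessions show.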

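Now that we have seen the behavior, let's look at how it works underneath, so that we know not only what happens but why.

Kernel internals

The code path for a regular insert is simple: ExecInsert → heap_insert(). INSERT ON CONFLICT was committed for 9.5, and the commit (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=168d5805e4c08bed7b95d351bf097cff7c07dd65) contains this description: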

This is implemented using a new infrastructure called “speculative
insertion”. It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.

To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.

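This pre-check avoids the overhead of inserting a tuple into the heap only to delete it again when it turns out to be a duplicate. The HEAP_INSERT_SPECULATIVE flag marks the so-called "speculative insertion": if a conflict is found, the insertion is simply undone, without having to abort the whole transaction. Other sessions can wait until a speculative insertion is either confirmed, which turns it into a regular tuple, or cancelled.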
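The protocol described in the commit message can be walked through with a small Python sketch. This is a toy, not the executor's code: speculative_insert, the concurrent_writer hook and max_attempts are all hypothetical names introduced for illustration, the "heap" is a plain list, and deleting the speculative tuple is modeled by dropping it from that list.

```python
def speculative_insert(key, index, heap, concurrent_writer=None, max_attempts=3):
    """Toy walk-through of pre-check, speculative insert, confirm or retry.

    Returns (action, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        # 1. Pre-check: is there already a matching tuple?
        if key in index:
            return "do nothing", attempt

        # 2. Speculative heap insert: written, but not yet confirmed.
        heap.append(key)

        # Hypothetical hook: another session inserts between steps 1 and 3.
        if concurrent_writer is not None:
            concurrent_writer(index)
            concurrent_writer = None

        # 3. The index insert re-checks uniqueness.
        if key in index:
            heap.pop()      # delete the speculative tuple, then retry
            continue

        # 4. No conflict detected: the tuple is deemed inserted.
        index.add(key)
        return "inserted", attempt

    raise RuntimeError("too many retries in this toy model")


index, heap = set(), []
r1 = speculative_insert(1, index, heap)    # no conflict at all
r2 = speculative_insert(1, index, heap)    # pre-check finds the row

index2, heap2 = set(), []
r3 = speculative_insert(                   # lost a race to a concurrent insert
    2, index2, heap2, concurrent_writer=lambda idx: idx.add(2)
)
print(r1, r2, r3)
print(heap, heap2)
```

The third call is the interesting one: the pre-check passes, the concurrent writer wins the race, the speculative tuple is removed, and the second attempt ends in "do nothing" without the statement ever raising an error.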

Let's look at the code flow by setting breakpoints on table_tuple_insert and ExecInsertIndexTuples:

postgres=# create table t4(id int primary key);
CREATE TABLE
postgres=# insert into t4 values(1);
INSERT 0 1

(gdb) b table_tuple_insert
Breakpoint 1 at 0x543ce7: table_tuple_insert. (6 locations)
(gdb) b ExecInsertIndexTuples
Breakpoint 2 at 0x721260: file execIndexing.c, line 307.

Insert another row and the session stops at the breakpoint; continue manually with c in gdb:

postgres=# insert into t4 values(1);
... (the session hangs here)

(gdb) c
Continuing.

Breakpoint 1, table_tuple_insert (rel=0x7f8811b4f770, slot=0x1d34420, cid=0, options=0, bistate=0x0)
    at ../../../src/include/access/tableam.h:1377
1377    ../../../src/include/access/tableam.h: No such file or directory.

At this point the second tuple is not yet in the table, because the heap insert has not completed:

postgres=# SELECT lp,t_xmin,t_ctid,infomask(t_infomask, 1) as infomask,t_data FROM heap_page_items(get_raw_page('t4', 0));
 lp | t_xmin | t_ctid |   infomask   |   t_data   
----+--------+--------+--------------+------------
  1 | 500303 | (0,1)  | XMAX_INVALID | \x01000000
(1 row)

After continuing again, the tuple is written to the heap, and execution stops in the index insertion path:

(gdb) c
Continuing.

Breakpoint 2, ExecInsertIndexTuples (resultRelInfo=0x1d332d8, slot=0x1d34420, estate=0x1d32e88, update=false, 
    noDupErr=false, specConflict=0x0, arbiterIndexes=0x0, onlySummarizing=false) at execIndexing.c:307
307     execIndexing.c: No such file or directory.

postgres=# SELECT lp,t_xmin,t_ctid,infomask(t_infomask, 1) as infomask,t_data FROM heap_page_items(get_raw_page('t4', 0));
 lp | t_xmin | t_ctid |   infomask   |   t_data   
----+--------+--------+--------------+------------
  1 | 500303 | (0,1)  | XMAX_INVALID | \x01000000
  2 | 500304 | (0,2)  | XMAX_INVALID | \x01000000
(2 rows)

postgres=# select txid_status('500304');
 txid_status 
-------------
 in progress
(1 row)

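The transaction is still "in progress". After continuing once more, the uniqueness check performed during index insertion fails (for a B-tree unique index such as this primary key, the check is done inside the B-tree code, in _bt_check_unique(); check_exclusion_or_unique_constraint() is used for exclusion constraints and for the deferred and speculative cases), the error is raised, and the whole transaction is rolled back. The index still contains only the original entry: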

postgres=# SELECT itemoffset, ctid, itemlen, nulls, vars, data, dead, htid, tids[0:2] AS some_tids
        FROM bt_page_items('t4_pkey', 1);
 itemoffset | ctid  | itemlen | nulls | vars |          data           | dead | htid  | some_tids 
------------+-------+---------+-------+------+-------------------------+------+-------+-----------
          1 | (0,1) |      16 | f     | f    | 01 00 00 00 00 00 00 00 | f    | (0,1) | 
(1 row)

postgres=# insert into t4 values(1);
ERROR:  duplicate key value violates unique constraint "t4_pkey"
DETAIL:  Key (id)=(1) already exists.
postgres=# select txid_status('500304');
 txid_status 
-------------
 aborted
(1 row)

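With INSERT ON CONFLICT DO NOTHING the flow is somewhat different. When the pre-check finds no conflict, it goes table_tuple_insert_speculative → ExecInsertIndexTuples → table_tuple_complete_speculative. I won't show the details here; readers can trace it themselves.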

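Summary

To sum up, when a regular insert runs into a unique key conflict, the flow is roughly:

1. Build the HeapTuple to be inserted
2. Call heap_insert()
- Physically writes the new tuple into the heap page.
- Sets t_xmin to the current transaction's XID, indicating that this transaction inserted it.
- Marks the buffer dirty, writes the WAL record, and returns (at this point the tuple physically exists in the heap page, although the page has not necessarily been flushed to disk).
3. Write to the indexes (ExecInsertIndexTuples()) and check the unique constraint
- If the table has a unique index or primary key, an index entry is now inserted into that index.
- When the index layer finds a duplicate key, it immediately raises ERROR: duplicate key value ...
4. Error handling
- When the error is raised, the whole transaction is rolled back, which invalidates the tuple already inserted into the heap (and all related changes):
  - To other transactions, this tuple is invisible;
  - As far as storage is concerned, the tuple has been written, but because its transaction rolled back it can never become visible. It is a dead tuple, and a later VACUUM has to clean it up.

This is where "insert first, delete later (or rather, mark as dead)" comes from. It does not show up as dedicated logic of the form "on conflict, delete the tuple or mark it dead"; it simply falls out of PostgreSQL's rollback mechanism: when the inserting transaction fails, everything it wrote becomes invalid. The invalid tuples then have to be reclaimed by VACUUM.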

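With the INSERT ... ON CONFLICT syntax came new infrastructure, "speculative insertion". The main changes are:

heap_insert() gained the HEAP_INSERT_SPECULATIVE flag. Its purpose is to tell the system, at heap-insert time, "this tuple is provisional; I may confirm it, or I may undo it very soon".
If a conflict is found, the speculative insertion can be undone cheaply
- If a concurrent conflict is detected before or during the uniqueness check in the index, the tuple, which has not yet become "officially" visible, can be undone cleanly, instead of aborting the transaction and leaving behind a tuple whose xmin belongs to an aborted transaction as before.
- Together with the pre-check, which skips the heap write entirely when a matching row already exists, this means no dead tuple is left waiting for cleanup, which improves efficiency considerably under high concurrency.

So it was only with the speculative insertion introduced in 9.5 that the insert path gained explicit logic for "I insert into the heap first, but the tuple becomes truly visible only once it is confirmed, and it can be undone on conflict", and that is what avoids the dead tuple.

OK, a rather nice optimization. In practice: if your logs show a large number of "ERROR: duplicate key" messages, switch those statements to INSERT ... ON CONFLICT.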
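As a starting point for that check, here is a minimal Python sketch that counts duplicate-key errors per constraint in a server log. The regular expression matches the message text shown in the psql sessions above; everything else is an assumption for illustration: the function name count_duplicate_key_errors is made up, and the timestamp and process-ID prefix in the sample lines is hypothetical, since the real prefix depends on your log_line_prefix setting.

```python
import re
from collections import Counter

# Message text as it appears in the psql sessions above.
DUP_KEY = re.compile(
    r'ERROR:\s+duplicate key value violates unique constraint "([^"]+)"'
)


def count_duplicate_key_errors(lines):
    """Count duplicate-key errors per constraint name."""
    counts = Counter()
    for line in lines:
        match = DUP_KEY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample; the prefix format is an assumption.
sample_log = [
    '2024-12-26 10:00:00 UTC [123] ERROR:  duplicate key value violates unique constraint "t4_pkey"',
    '2024-12-26 10:00:00 UTC [123] DETAIL:  Key (id)=(1) already exists.',
    '2024-12-26 10:00:01 UTC [124] ERROR:  duplicate key value violates unique constraint "t4_pkey"',
    '2024-12-26 10:00:02 UTC [125] ERROR:  duplicate key value violates unique constraint "t5_pkey"',
    '2024-12-26 10:00:03 UTC [90] LOG:  checkpoint starting: time',
]

counts = count_duplicate_key_errors(sample_log)
for constraint, n in counts.most_common():
    print(constraint, n)
```

To use it on a real log, pass an open file object instead of sample_log; the constraints with the highest counts are the first candidates for a rewrite to INSERT ... ON CONFLICT.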

References

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=168d5805e4c08bed7b95d351bf097cff7c07dd65

https://aws.amazon.com/cn/blogs/database/hidden-dangers-of-duplicate-key-violations-in-postgresql-and-how-to-avoid-them/
