PostgreSQL17性能优化(2)允许将相关的IN子查询转换为连接

本文将介绍PostgreSQL 17服务端优化器允许将相关的IN子查询转换为连接。本次测试使用benchmark5.0压测场景的bmsql_oorder、bmsql_customer表,数据都为300w(100仓数据压测的结果)

对比两个版本的执行计划

--PostgreSQL16.3版本
testdb=# explain analyze select * from bmsql_oorder bo where bo.o_c_id in (
testdb(#  select bc.c_id from bmsql_customer bc 
testdb(#  where bc.c_w_id =bo.o_w_id and bc.c_d_id =bo.o_d_id and bo.o_c_id = bc.c_id );
                                                                        QUERY PLAN                                                                        
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on bmsql_oorder bo  (cost=0.00..7389126.00 rows=1500000 width=36) (actual time=0.029..4240.642 rows=3000000 loops=1)
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Index Only Scan using bmsql_customer_pkey on bmsql_customer bc  (cost=0.43..4.45 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=3000000)
           Index Cond: ((c_w_id = bo.o_w_id) AND (c_d_id = bo.o_d_id) AND (c_id = bo.o_c_id))
           Heap Fetches: 0
 Planning Time: 0.096 ms
 Execution Time: 4311.509 ms
(8 rows)

Time: 4312.153 ms (00:04.312)

查看PostgreSQL16.3版本执行计划,是将SUBLINK作为子查询拉上来,同时由于当时无法使用varlevelsup=1调出Var,因此,在该版本中导致无法执行横向连接。


--PostgreSQL17版本
testdb=# explain analyze select * from bmsql_oorder bo where bo.o_c_id in (
testdb(#  select bc.c_id from bmsql_customer bc 
testdb(#  where bc.c_w_id =bo.o_w_id and bc.c_d_id =bo.o_d_id and bo.o_c_id = bc.c_id );
                                                                                QUERY PLAN                                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=158357.99..297634.01 rows=3000000 width=36) (actual time=629.099..2756.044 rows=3000000 loops=1)
   Hash Cond: ((bo.o_w_id = bc.c_w_id) AND (bo.o_d_id = bc.c_d_id) AND (bo.o_c_id = bc.c_id))
   ->  Seq Scan on bmsql_oorder bo  (cost=0.00..54127.00 rows=3000000 width=36) (actual time=0.417..176.648 rows=3000000 loops=1)
   ->  Hash  (cost=91213.46..91213.46 rows=2999802 width=12) (actual time=628.226..628.227 rows=3000000 loops=1)
         Buckets: 262144  Batches: 32  Memory Usage: 6072kB
         ->  Index Only Scan using bmsql_customer_pkey on bmsql_customer bc  (cost=0.43..91213.46 rows=2999802 width=12) (actual time=0.018..221.598 rows=3000000 loops=1)
               Heap Fetches: 0
 Planning Time: 0.207 ms
 Execution Time: 2827.836 ms
(9 rows)

Time: 2828.682 ms (00:02.829)

查看PostgreSQL17版本执行计划,它是拉起一个半连接,这个时候外表和内表是同级,因此有了更多连接方法或连接顺序的可能。

总结

通过对PostgreSQL16.3和PostgreSQL17的对比,整理如下的表格。

对比项PostgreSQL16.3PostgreSQL17
实现方式子查询,有varlevelsup=1限制半连接,消除了varlevelsup=1限制
外表执行次数11
内表执行次数30000001
执行耗时4312.153 ms2828.682 ms

通过对比发现,PostgreSQL17通过使用半连接的方式优化允许将相关的IN子查询转换为半连接,使得内表执行次数变为了1次,极大的提升了查询的性能,查询耗时也大概提升了30%左右,在一些个别场景下,性能提升大概7倍多。另外我们也需要注意的是并不所有in的子查询场景优化,仅仅只是优化了外表和内表有关联得场景。

点赞(0) 打赏

评论列表 共有 0 条评论

暂无评论

微信公众账号

微信扫一扫加关注

发表
评论
返回
顶部