PostgreSQL 10.0 preview: sharding enhancement - parallel execution of Append subplans

Tags

PostgreSQL , 10.0 , multi-core parallelism , Append


Background

An Append node typically appears when a query combines multiple tables with UNION or UNION ALL, or when it reads a partitioned parent table covering several child partitions.

Each child table, or each subquery, is scanned separately, and the results are then concatenated.

PostgreSQL 10.0 adds a new form of parallelism: parallelism within the Append node itself, so that its subplans can be executed concurrently.

This can greatly improve efficiency.

Asynchronous execution through postgres_fdw also depends on this parallel Append support.
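As a quick illustration of where an Append node shows up (the tables here are hypothetical, and the exact plan shape depends on server version and statistics), a UNION ALL of two tables typically plans as an Append over two scans:

```sql
-- Hypothetical example: two small tables combined with UNION ALL.
CREATE TABLE t1 (id int);
CREATE TABLE t2 (id int);

EXPLAIN (COSTS OFF)
SELECT id FROM t1
UNION ALL
SELECT id FROM t2;
--  Append
--    ->  Seq Scan on t1
--    ->  Seq Scan on t2
```

Before this patch, the two Seq Scans above always run one after the other, even when parallel workers are available.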

Currently an Append plan node does not execute its subplans in  
parallel. There is no distribution of workers across its subplans. The  
second subplan starts running only after the first subplan finishes,  
although the individual subplans may be running parallel scans.  

Secondly, we create a partial Append path for an appendrel, but we do  
that only if all of its member subpaths are partial paths. If one or  
more of the subplans is a non-parallel path, there will be only a  
non-parallel Append. So whatever node is sitting on top of Append is  
not going to do a parallel plan; for example, a select count(*) won't  
divide it into partial aggregates if the underlying Append is not  
partial.  

The attached patch removes both of the above restrictions.  There has  
already been a mail thread [1] that discusses an approach suggested by  
Robert Haas for implementing this feature. This patch uses this same  
approach.  

Attached is pgbench_create_partition.sql (derived from the one  
included in the above thread) that distributes pgbench_accounts table  
data into 3 partitions pgbench_account_[1-3]. The below queries use  
this schema.  
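The actual script is attached to the referenced thread; as a rough sketch only (the DDL below is my assumption, using table inheritance, which matches the plans shown further down where the parent table is scanned alongside the children), the layout it creates might look like:

```sql
-- Hypothetical sketch of the partitioned layout; the real
-- pgbench_create_partition.sql is in the referenced thread.
CREATE TABLE pgbench_accounts_1 () INHERITS (pgbench_accounts);
CREATE TABLE pgbench_accounts_2 () INHERITS (pgbench_accounts);
CREATE TABLE pgbench_accounts_3 () INHERITS (pgbench_accounts);

-- Move roughly a quarter of the rows into each child, e.g.:
WITH moved AS (
    DELETE FROM ONLY pgbench_accounts
    WHERE aid % 4 = 1
    RETURNING *
)
INSERT INTO pgbench_accounts_1 SELECT * FROM moved;
-- ... and similarly for pgbench_accounts_2 and _3.
```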

Consider a query such as :  
select count(*) from pgbench_accounts;  

Now suppose these two partitions do not allow parallel scan :  
alter table pgbench_accounts_1 set (parallel_workers=0);  
alter table pgbench_accounts_2 set (parallel_workers=0);  

On HEAD, because some of the partitions only have non-parallel scans, the  
whole Append is executed without parallelism :  

 Aggregate  
   ->  Append  
         ->  Index Only Scan using pgbench_accounts_pkey on pgbench_accounts  
         ->  Seq Scan on pgbench_accounts_1  
         ->  Seq Scan on pgbench_accounts_2  
         ->  Seq Scan on pgbench_accounts_3  

Whereas, with the patch, the Append looks like this :  

 Finalize Aggregate  
   ->  Gather  
         Workers Planned: 6  
         ->  Partial Aggregate  
               ->  Parallel Append  
                     ->  Parallel Seq Scan on pgbench_accounts  
                     ->  Seq Scan on pgbench_accounts_1  
                     ->  Seq Scan on pgbench_accounts_2  
                     ->  Parallel Seq Scan on pgbench_accounts_3  

Above, Parallel Append is generated, and it executes all these  
subplans in parallel, with 1 worker executing each of the sequential  
scans, and multiple workers executing each of the parallel subplans.  


======= Implementation details ========  

------- Adding parallel-awareness -------  

In a given worker, this Append plan node will be executing just like  
the usual partial Append node. It will run a subplan until completion.  
The subplan may or may not be a partial parallel-aware plan like  
parallelScan. After the subplan is done, Append will choose the next  
subplan. It is here that it will differ from the current  
partial Append plan: it is parallel-aware. The Append nodes in the  
workers will be aware that there are other Append nodes running in  
parallel. The partial Append will have to coordinate with other Append  
nodes while choosing the next subplan.  

------- Distribution of workers --------  

The coordination info is stored in a shared array, each element of  
which describes the per-subplan info. This info contains the number of  
workers currently executing the subplan, and the maximum number of  
workers that should be executing it at the same time. For non-partial  
subplans, max workers would always be 1. For choosing the next subplan,  
the Append executor will sequentially iterate over the array to find a  
subplan having the least number of workers currently being executed,  
AND which is not already being executed by the maximum number of  
workers assigned for the subplan. Once it gets one, it increments  
current_workers, and releases the Spinlock, so that other workers can  
choose their next subplan if they are waiting.  

This way, workers would be fairly distributed across subplans.  
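The selection logic described above can be sketched outside the executor. The following is a plain-Python illustration of the idea only (names such as `choose_next_subplan` and `SubplanSlot` are my own, not from the patch), with a `threading.Lock` standing in for the spinlock that protects the shared array:

```python
import threading

class SubplanSlot:
    """Per-subplan entry in the (simulated) shared array."""
    def __init__(self, max_workers):
        self.current_workers = 0          # workers currently on this subplan
        self.max_workers = max_workers    # 1 for non-partial subplans

def choose_next_subplan(slots, lock):
    """Pick the subplan with the fewest active workers that still has
    capacity, increment its worker count under the lock, and return
    its index; return None when every subplan is at its maximum."""
    with lock:
        best = None
        for i, slot in enumerate(slots):
            if slot.current_workers >= slot.max_workers:
                continue
            if best is None or slot.current_workers < slots[best].current_workers:
                best = i
        if best is not None:
            slots[best].current_workers += 1
        return best

# Four subplans: one parallel-aware (up to 3 workers) and three
# non-partial ones (1 worker each), loosely mirroring the pgbench example.
lock = threading.Lock()
slots = [SubplanSlot(3), SubplanSlot(1), SubplanSlot(1), SubplanSlot(1)]
assignments = [choose_next_subplan(slots, lock) for _ in range(6)]
print(assignments)  # → [0, 1, 2, 3, 0, 0]
```

The first four workers spread across the four subplans; the remaining two pile onto the only subplan that still has capacity, which is the behavior the shared-array scheme is designed to produce.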

The shared array needs to be initialized and made available to  
workers. For this, we can do exactly what sequential scan does for  
being parallel-aware : Using function ExecAppendInitializeDSM()  
similar to ExecSeqScanInitializeDSM() in the backend to allocate the  
array. Similarly, for workers, have ExecAppendInitializeWorker() to  
retrieve the shared array.  

The discussion of this patch can be found in the mailing-list thread linked at the end of this article.

The PostgreSQL community is extremely rigorous: a patch may be discussed on the mailing list for months or even years and revised repeatedly based on feedback, so by the time it is merged into master it is already very mature. This is why PostgreSQL is so well known for its stability.

References

https://commitfest.postgresql.org/13/987/

https://www.postgresql.org/message-id/flat/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com#CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com
