PostgreSQL 10.0 preview multi-core parallel enhancements - index scans, subqueries, VACUUM, FDW/CSP hook

2017-03-24

Background

PostgreSQL 9.6 introduced multi-core parallel query execution, with support for sequential (full-table) scans, hash joins, and aggregation.
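
To make the idea of a parallel scan with aggregation concrete, here is a toy model in Python. It is not PostgreSQL code: threads stand in for worker processes, a list of lists stands in for table blocks, and the name `parallel_sum` is invented for this sketch. Workers claim the next unclaimed block from shared state, each keeps a partial sum, and the leader combines the partial results.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List


def parallel_sum(blocks: List[List[int]], n_workers: int) -> int:
    """Toy model: workers share a block counter and return partial sums."""
    next_block = 0           # shared scan position (index of next block)
    lock = threading.Lock()  # protects next_block

    def worker() -> int:
        nonlocal next_block
        partial = 0
        while True:
            # Claim the next unscanned block, or stop when none are left.
            with lock:
                if next_block >= len(blocks):
                    return partial
                i = next_block
                next_block += 1
            # "Scan" the claimed block and fold it into the partial result.
            partial += sum(blocks[i])

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(worker) for _ in range(n_workers)]
        partials = [f.result() for f in futures]

    # The leader finalizes the aggregate from the workers' partial results.
    return sum(partials)


print(parallel_sum([[1, 2, 3], [4, 5], [6]], n_workers=2))  # 21
```

Which worker scans which block varies from run to run, but the final result does not, because addition is associative and each block is claimed exactly once.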

The 10.0 development cycle builds on this with patches that extend parallelism to more operations:

1. Parallel bitmap heap scan

2. Parallel Index Scans

3. Parallel Merge Join

4. Parallelize queries containing subplans

5. Block level parallel vacuum

6. Extending the parallelism for index-only scans

7. ParallelFinish Hook of FDW/CSP

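This is a hook for foreign data wrappers (FDW) and custom scan providers (CSP). It gives the extension control in the master backend, under the Gather node, after all worker processes have finished but before the dynamic shared memory (DSM) segment used by the FDW/CSP node is released.

At that point the extension can still read the per-node statistics that each worker wrote to the DSM segment. In the PG-Strom project, for example, these are the DMA transfer rate during the custom scan and the execution time of GPU kernels.

Without the hook these statistics are lost, because the per-node DSM area is already released by the time ExecEndNode() runs on the child node of Gather. The patch author describes the proposal as follows: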

Hello,  
  
The attached patch implements the suggestion by Amit before.  
  
What I'm motivated is to collect extra run-time statistics specific  
to a particular ForeignScan/CustomScan, not only the standard  
Instrumentation; like DMA transfer rate or execution time of GPU  
kernels in my case.  
  
Per-node DSM toc is one of the best way to return run-time statistics  
to the master backend, because FDW/CSP can assign arbitrary length of  
the region according to its needs. It is quite easy to require.  
However, one problem is, the per-node DSM toc is already released when  
ExecEndNode() is called on the child node of Gather.  
  
This patch allows extensions to get control on the master backend's  
context when all the worker node gets finished but prior to release  
of the DSM segment. If FDW/CSP has its special statistics on the  
segment, it can move to the private memory area for EXPLAIN output  
or something other purpose.  
  
One design consideration is whether the hook shall be called from  
ExecParallelRetrieveInstrumentation() or ExecParallelFinish().  
The former is a function to retrieve the standard Instrumentation  
information, thus, it is valid only if EXPLAIN ANALYZE.  
On the other hands, if we put entrypoint at ExecParallelFinish(),  
extension can get control regardless of EXPLAIN ANALYZE, however,  
it also needs an extra planstate_tree_walker().  
  
Right now, we don't assume anything onto the requirement by FDW/CSP.  
It may want run-time statistics regardless of EXPLAIN ANALYZE, thus,  
hook shall be invoked always when Gather node confirmed termination  
of the worker processes.  
  
Thanks,  
--  
NEC OSS Promotion Center / PG-Strom Project  
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
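
The ordering problem described in the email can be shown with a toy model in Python. This is only a sketch, not PostgreSQL code: `SharedSegment`, `CustomScanNode`, `parallel_finish`, and `run_gather` are names invented here, a dictionary stands in for the per-node DSM toc, and the counters are made-up values. The point is that the callback runs after all workers have finished but before the segment is released, and that it runs unconditionally rather than only under EXPLAIN ANALYZE.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class SharedSegment:
    """Toy stand-in for a DSM segment with a per-node table of contents."""

    def __init__(self) -> None:
        self._toc: Optional[Dict[int, List[dict]]] = {}

    def append(self, node_id: int, stats: dict) -> None:
        if self._toc is None:
            raise RuntimeError("segment already released")
        self._toc.setdefault(node_id, []).append(stats)

    def lookup(self, node_id: int) -> List[dict]:
        if self._toc is None:
            raise RuntimeError("segment already released")
        return self._toc.get(node_id, [])

    def release(self) -> None:
        self._toc = None


@dataclass
class CustomScanNode:
    node_id: int
    private_stats: List[dict] = field(default_factory=list)

    def run_in_worker(self, seg: SharedSegment, worker_no: int) -> None:
        # Each worker records extension-specific counters (made-up values).
        seg.append(self.node_id, {
            "worker": worker_no,
            "dma_bytes": 1024 * (worker_no + 1),
            "gpu_ms": 5 * (worker_no + 1),
        })

    def parallel_finish(self, seg: SharedSegment) -> None:
        # Hypothetical callback: copy the statistics out of the shared
        # segment into private memory while the segment still exists.
        self.private_stats = [dict(s) for s in seg.lookup(self.node_id)]


def run_gather(node: CustomScanNode, n_workers: int,
               call_finish_hook: bool = True) -> SharedSegment:
    seg = SharedSegment()
    for w in range(n_workers):       # workers run to completion
        node.run_in_worker(seg, w)
    if call_finish_hook:             # invoked whether or not EXPLAIN ANALYZE
        node.parallel_finish(seg)
    seg.release()                    # after this, the statistics are gone
    return seg


node = CustomScanNode(node_id=1)
seg = run_gather(node, n_workers=2)
total_dma = sum(s["dma_bytes"] for s in node.private_stats)
print(total_dma)  # 3072
```

Calling `run_gather(..., call_finish_hook=False)` leaves `private_stats` empty, and any later `seg.lookup()` raises, which mirrors the situation the patch is meant to fix.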

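For the full discussion of this patch, see the mailing list threads linked from the URLs at the end of this article.

The PostgreSQL community works rigorously: a patch may be discussed on the mailing list for months or even years and revised repeatedly in response to feedback, so it is already mature by the time it is merged into master. This is one reason PostgreSQL is well known for its stability.

References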

https://commitfest.postgresql.org/13/812/

https://commitfest.postgresql.org/13/849/

https://commitfest.postgresql.org/13/918/

https://commitfest.postgresql.org/13/941/

https://commitfest.postgresql.org/13/954/

https://commitfest.postgresql.org/13/867/

https://commitfest.postgresql.org/13/917/
