【重新发现PostgreSQL之美】- 22 黄帝内经

本文涉及的产品
云原生数据库 PolarDB PostgreSQL 版,标准版 2核4GB 50GB
云原生数据库 PolarDB MySQL 版,通用型 2核8GB 50GB
简介: 大家好,这里是重新发现PostgreSQL之美 - 22 黄帝内经

背景


场景:
电商、社交、SAAS软件行业.
用户表、租户表、用户行为表.

挑战:
按企业、appidhash分区可能出现倾斜, 例如某些APPID的数据量可能非常大, 热数据可能在同一个分区, saas的某些大企业可能被分到同一分区, 使得数据倾斜.

PG 解决方案:
非对称分区方法.

例子


1、创建1个分区表, appid hash分区


         
appid int,
c1 int,
c2 int,
info text,
ts timestamp
) partition by hash (appid);

         
create unlogged table t0 partition of t for values with (modulus 4, REMAINDER 0);
create unlogged table t1 partition of t for values with (modulus 4, REMAINDER 1);
create unlogged table t2 partition of t for values with (modulus 4, REMAINDER 2);
create unlogged table t3 partition of t for values with (modulus 4, REMAINDER 3);

2、写入2000万记录, 其中1000appid=1, 制造倾斜


         

         
insert into t select 1, random()*100, random()*1000, md5(random()::text), clock_timestamp() from generate_series(1,10000000);

         
List of relations
Schema |   Name   |       Type        |  Owner   | Persistence | Access method |  Size   | Description
--------+----------+-------------------+----------+-------------+---------------+---------+-------------
public | t        | partitioned table | postgres | unlogged    |               | 0 bytes |
public | t0       | table             | postgres | unlogged    | heap          | 1014 MB |
public | t1       | table             | postgres | unlogged    | heap          | 189 MB  |
public | t2       | table             | postgres | unlogged    | heap          | 222 MB  |
public | t3       | table             | postgres | unlogged    | heap          | 186 MB  |

3、解决办法, 对于appid=1的分区, 再次按c1 hash分区, 其他分区保持不变.


         
create unlogged table tt0(like t) PARTITION BY hash (c1);
create unlogged table tt0_0 partition of tt0 for values with (modulus 4, REMAINDER 0);
create unlogged table tt0_1 partition of tt0 for values with (modulus 4, REMAINDER 1);
create unlogged table tt0_2 partition of tt0 for values with (modulus 4, REMAINDER 2);
create unlogged table tt0_3 partition of tt0 for values with (modulus 4, REMAINDER 3);
alter table t DETACH PARTITION t0;
insert into tt0 select * from t0;
alter table t attach partition tt0 FOR VALUES WITH ( MODULUS 4, REMAINDER 0 );
end;

         
Unlogged partitioned table "public.t"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition key: HASH (appid)
Partitions: t1 FOR VALUES WITH (modulus 4, remainder 1),
t2 FOR VALUES WITH (modulus 4, remainder 2),
t3 FOR VALUES WITH (modulus 4, remainder 3),
tt0 FOR VALUES WITH (modulus 4, remainder 0), PARTITIONED

         
postgres=# \d+ tt0
Unlogged partitioned table "public.tt0"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition of: t FOR VALUES WITH (modulus 4, remainder 0)
Partition constraint: satisfies_hash_partition('16552'::oid, 4, 0, appid)
Partition key: HASH (c1)
Partitions: tt0_0 FOR VALUES WITH (modulus 4, remainder 0),
tt0_1 FOR VALUES WITH (modulus 4, remainder 1),
tt0_2 FOR VALUES WITH (modulus 4, remainder 2),
tt0_3 FOR VALUES WITH (modulus 4, remainder 3)

可以看到appid=1的数据已经打散


         
List of relations
Schema |   Name   |       Type        |  Owner   | Persistence | Access method |  Size   | Description
--------+----------+-------------------+----------+-------------+---------------+---------+-------------
public | t        | partitioned table | postgres | unlogged    |               | 0 bytes |
public | t1       | table             | postgres | unlogged    | heap          | 189 MB  |
public | t2       | table             | postgres | unlogged    | heap          | 222 MB  |
public | t3       | table             | postgres | unlogged    | heap          | 186 MB  |
public | tt0      | partitioned table | postgres | unlogged    |               | 0 bytes |
public | tt0_0    | table             | postgres | unlogged    | heap          | 279 MB  |
public | tt0_1    | table             | postgres | unlogged    | heap          | 309 MB  |
public | tt0_2    | table             | postgres | unlogged    | heap          | 254 MB  |
public | tt0_3    | table             | postgres | unlogged    | heap          | 172 MB  |

4、检查执行计划, 分区被正常路由


         
QUERY PLAN
------------------------------------------------------------
Seq Scan on t2 t  (cost=0.00..62947.58 rows=9965 width=53)
Filter: (appid = 2)
(2 rows)

         

         
postgres=# explain select * from t where appid=1;
QUERY PLAN
--------------------------------------------------------------------------
Append  (cost=0.00..337327.48 rows=10031536 width=53)
->  Seq Scan on tt0_0 t_1  (cost=0.00..78943.27 rows=2771208 width=53)
Filter: (appid = 1)
->  Seq Scan on tt0_1 t_2  (cost=0.00..87582.51 rows=3057186 width=53)
Filter: (appid = 1)
->  Seq Scan on tt0_2 t_3  (cost=0.00..71825.61 rows=2499200 width=53)
Filter: (appid = 1)
->  Seq Scan on tt0_3 t_4  (cost=0.00..48818.40 rows=1703942 width=53)
Filter: (appid = 1)
(9 rows)

         

         
postgres=# explain select * from t where appid=1 and c1=1;
QUERY PLAN
----------------------------------------------------------------
Seq Scan on tt0_0 t  (cost=0.00..87595.73 rows=99763 width=53)
Filter: ((appid = 1) AND (c1 = 1))
(2 rows)

层次不限、分区方法不

1、如果t1也有问题, 我们可以继续对t1分区进行二次分区, 选择不同的分区个数


         
create unlogged table tt1(like t) PARTITION BY hash (c1);
create unlogged table tt1_0 partition of tt1 for values with (modulus 5, REMAINDER 0);
create unlogged table tt1_1 partition of tt1 for values with (modulus 5, REMAINDER 1);
create unlogged table tt1_2 partition of tt1 for values with (modulus 5, REMAINDER 2);
create unlogged table tt1_3 partition of tt1 for values with (modulus 5, REMAINDER 3);
create unlogged table tt1_4 partition of tt1 for values with (modulus 5, REMAINDER 4);
alter table t DETACH PARTITION t1;
insert into tt1 select * from t1;
alter table t attach partition tt1 FOR VALUES WITH ( MODULUS 4, REMAINDER 1 );
end;

         
Unlogged partitioned table "public.t"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition key: HASH (appid)
Partitions: t2 FOR VALUES WITH (modulus 4, remainder 2),
t3 FOR VALUES WITH (modulus 4, remainder 3),
tt0 FOR VALUES WITH (modulus 4, remainder 0), PARTITIONED,
tt1 FOR VALUES WITH (modulus 4, remainder 1), PARTITIONED

         
postgres=# \d+ tt1
Unlogged partitioned table "public.tt1"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition of: t FOR VALUES WITH (modulus 4, remainder 1)
Partition constraint: satisfies_hash_partition('16552'::oid, 4, 1, appid)
Partition key: HASH (c1)
Partitions: tt1_0 FOR VALUES WITH (modulus 5, remainder 0),
tt1_1 FOR VALUES WITH (modulus 5, remainder 1),
tt1_2 FOR VALUES WITH (modulus 5, remainder 2),
tt1_3 FOR VALUES WITH (modulus 5, remainder 3),
tt1_4 FOR VALUES WITH (modulus 5, remainder 4)

路由正常


         
QUERY PLAN
-----------------------------------------------------------------------
Append  (cost=0.00..53418.63 rows=9951 width=53)
->  Seq Scan on tt1_0 t_1  (cost=0.00..12278.30 rows=2289 width=53)
Filter: (appid = 3)
->  Seq Scan on tt1_1 t_2  (cost=0.00..8540.11 rows=1593 width=53)
Filter: (appid = 3)
->  Seq Scan on tt1_2 t_3  (cost=0.00..10130.17 rows=1886 width=53)
Filter: (appid = 3)
->  Seq Scan on tt1_3 t_4  (cost=0.00..9849.98 rows=1837 width=53)
Filter: (appid = 3)
->  Seq Scan on tt1_4 t_5  (cost=0.00..12570.31 rows=2346 width=53)
Filter: (appid = 3)
(11 rows)

         
postgres=# explain select * from t where appid=3 and c1=2;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on tt1_1 t  (cost=0.00..9476.14 rows=103 width=53)
Filter: ((appid = 3) AND (c1 = 2))
(2 rows)

2、我们也可以对t2使用rangelist分区


         
create unlogged table tt2(like t) PARTITION BY range (ts);
create unlogged table tt2_2021 partition of tt2 for values from ('2021-01-01') to ('2022-01-01');
create unlogged table tt2_2022 partition of tt2 for values from ('2022-01-01') to ('2023-01-01');

         
alter table t DETACH PARTITION t2;
insert into tt2 select * from t2;
alter table t attach partition tt2 FOR VALUES WITH ( MODULUS 4, REMAINDER 2 );
end;

         
Unlogged partitioned table "public.t"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition key: HASH (appid)
Partitions: t3 FOR VALUES WITH (modulus 4, remainder 3),
tt0 FOR VALUES WITH (modulus 4, remainder 0), PARTITIONED,
tt1 FOR VALUES WITH (modulus 4, remainder 1), PARTITIONED,
tt2 FOR VALUES WITH (modulus 4, remainder 2), PARTITIONED

         
postgres=# \d+ tt2
Unlogged partitioned table "public.tt2"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition of: t FOR VALUES WITH (modulus 4, remainder 2)
Partition constraint: satisfies_hash_partition('16552'::oid, 4, 2, appid)
Partition key: RANGE (ts)
Partitions: tt2_2021 FOR VALUES FROM ('2021-01-01 00:00:00') TO ('2022-01-01 00:00:00'),
tt2_2022 FOR VALUES FROM ('2022-01-01 00:00:00') TO ('2023-01-01 00:00:00')

使用PG, 支持非对称分区(深度不对称、分区方法不对称、分区数不对称), 彻底解决数据倾斜问题.

除了分, 还能合, 对于某些比较小的分区可以合起来减少分区数. 和黄帝内经的人与自然,五行平衡学说类似, 实则泻之,虚则补之.

参考


《如何修改PostgreSQL分区表分区范围- detach attach - 拆分、合并、非平衡分区表、深度不一致分区表》

PostgreSQL hash分区表扩容、缩容(增加分区、减少分区、分区重分布、拆分区、合并分区), hash算法hash_any, 混合hash MODULUS 分区- attach , detach

https://www.postgresql.org/docs/14/sql-createtable.html

 

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
5月前
|
存储 JSON 关系型数据库
带你走进PostgreSQL的世界
带你走进PostgreSQL的世界
175 0
|
关系型数据库 数据库 PostgreSQL
【重新发现PostgreSQL之美】- 43 快速破镜重圆
大家好,这里是重新发现PostgreSQL之美 - 43 快速破镜重圆
|
JSON 关系型数据库 数据库
【重新发现PostgreSQL之美】- 44 摩斯电码
大家好,这里是重新发现PostgreSQL之美 - 44 摩斯电码
|
自然语言处理 算法 关系型数据库
【重新发现PostgreSQL之美 】- 30 打蛇打七寸
大家好,这里是重新发现PostgreSQL之美 - 30 打蛇打七寸
|
安全 Oracle 关系型数据库
【重新发现PostgreSQL之美】- 36 方世玉 安全第一
大家好,这里是重新发现PostgreSQL之美 - 36 方世玉 安全第一
|
传感器 SQL 监控
【重新发现PostgreSQL之美】- 28 旋转门
大家好,这里是重新发现PostgreSQL之美 - 28 旋转门
|
SQL 算法 自动驾驶
【重新发现PostgreSQL之美】- 27 无中生有
大家好,这里是重新发现PostgreSQL之美 - 27 无中生有
|
SQL 关系型数据库 数据库
【重新发现PostgreSQL之美】- 50 一粒老鼠屎
大家好,这里是重新发现PostgreSQL之美 - 50 一粒老鼠屎
|
SQL NoSQL 关系型数据库
【重新发现PostgreSQL之美】- 37 三焦者,决渎之官,水道出焉. FDW
大家好,这里是重新发现PostgreSQL之美 - 37 三焦者,决渎之官,水道出焉. FDW
|
存储 负载均衡 搜索推荐
【重新发现PostgreSQL之美】- 23 彭祖的长寿秘诀
大家好,这里是重新发现PostgreSQL之美 - 23 彭祖的长寿秘诀