PostgreSQL 持续在基于fdw的sharding技术上深耕,9.6开始,在符合条件的前提下,支持JOIN和SORT下推到数据节点执行。
下面是一个测试
创建几个shard库
for subfix in 0 1 2 3
do
psql -c "create database db$subfix"
done
创建master库
psql -c "create database master;"
psql master -c "create extension postgres_fdw;"
在master库创建foreign server和user mapping
for subfix in 0 1 2 3
do
psql master -c "create server db$subfix foreign data wrapper postgres_fdw options (hostaddr 'xxx.xxx.xxx.xxx', dbname 'db$subfix', port '1923');"
psql master -c "create user mapping for postgres server db$subfix options (user 'postgres', password 'postgres');"
done
在shard库创建分片表
for subfix in 0 1 2 3
do
psql db$subfix -c "drop table if exists tbl; create table tbl(id int primary key, info text)"
psql db$subfix -c "drop table if exists tab; create table tab(id int primary key, info text)"
done
在master库创建foreign 表,并设置约束
for subfix in 0 1 2 3
do
psql master -c "drop foreign table if exists tbl$subfix ; create foreign table tbl$subfix (id int not null, info text) server db$subfix options (schema_name 'public', table_name 'tbl');"
psql master -c "alter foreign table tbl$subfix add constraint ck1 check (mod(id,4) = $subfix );"
psql master -c "drop foreign table if exists tab$subfix ; create foreign table tab$subfix (id int not null, info text) server db$subfix options (schema_name 'public', table_name 'tab');"
psql master -c "alter foreign table tab$subfix add constraint ck1 check (mod(id,4) = $subfix );"
done
查看
psql master <<EOF
\det
EOF
结果
List of foreign tables
Schema | Table | Server
--------+-------+--------
public | tab0 | db0
public | tab1 | db1
public | tab2 | db2
public | tab3 | db3
public | tbl0 | db0
public | tbl1 | db1
public | tbl2 | db2
public | tbl3 | db3
(8 rows)
在master库创建父表
psql master -c "create table tbl(id int primary key, info text);"
psql master -c "create table tab(id int primary key, info text);"
在master库创建foreign表和父表的继承关系
for subfix in 0 1 2 3
do
psql master -c "alter foreign table tbl$subfix inherit tbl;"
psql master -c "alter foreign table tab$subfix inherit tab;"
done
测试JOIN的下推
master=# explain verbose select * from tbl1,tab1 where tab1.id=tbl1.id and mod(tbl1.id,4)=1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..226.75 rows=48 width=72)
Output: tbl1.id, tbl1.info, tab1.id, tab1.info
Relations: (public.tbl1) INNER JOIN (public.tab1)
Remote SQL: SELECT r1.id, r1.info, r2.id, r2.info FROM (public.tbl r1 INNER JOIN public.tab r2 ON (((r1.id = r2.id)) AND ((mod(r1.id, 4) = 1))))
(4 rows)
目前sort下推需要关闭优化器enable_sort开关才会下推,也是值得改进的地方
master=# set enable_sort=off;
SET
master=# explain verbose select * from tbl1 where mod(id,4)=mod(100,4) order by id;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Foreign Scan on public.tbl1 (cost=100.00..136.71 rows=7 width=36)
Output: id, info
Remote SQL: SELECT id, info FROM public.tbl WHERE ((mod(id, 4) = 0)) ORDER BY id ASC NULLS LAST
(3 rows)
还需要改进的地方
这样的查询优化器能优化什么?
- 如果要更纯粹的sharding,父表不应该参与计算,只是一个别名而已。因此JOIN 可以根据前提条件下推。
- tab.id=tbl.id and mod(tbl.id,4)=1 可以推演出 and mod(tab.id,4)=1 。 因此tab表只需要扫描tab1。
master=# explain verbose select * from tbl,tab where tab.id=tbl.id and mod(tbl.id,4)=1;
QUERY PLAN
------------------------------------------------------------------------------------------------
Gather (cost=0.00..0.00 rows=0 width=0)
Output: tbl.id, tbl.info, tab.id, tab.info
Workers Planned: 1
Single Copy: true
-> Hash Join (cost=130.71..757.17 rows=218 width=72)
Output: tbl.id, tbl.info, tab.id, tab.info
Hash Cond: (tab.id = tbl.id)
-> Append (cost=0.00..603.80 rows=5461 width=36)
-> Seq Scan on public.tab (cost=0.00..0.00 rows=1 width=36)
Output: tab.id, tab.info
-> Foreign Scan on public.tab0 (cost=100.00..150.95 rows=1365 width=36)
Output: tab0.id, tab0.info
Remote SQL: SELECT id, info FROM public.tab
-> Foreign Scan on public.tab1 (cost=100.00..150.95 rows=1365 width=36)
Output: tab1.id, tab1.info
Remote SQL: SELECT id, info FROM public.tab
-> Foreign Scan on public.tab2 (cost=100.00..150.95 rows=1365 width=36)
Output: tab2.id, tab2.info
Remote SQL: SELECT id, info FROM public.tab
-> Foreign Scan on public.tab3 (cost=100.00..150.95 rows=1365 width=36)
Output: tab3.id, tab3.info
Remote SQL: SELECT id, info FROM public.tab
-> Hash (cost=130.61..130.61 rows=8 width=36)
Output: tbl.id, tbl.info
-> Append (cost=0.00..130.61 rows=8 width=36)
-> Seq Scan on public.tbl (cost=0.00..0.00 rows=1 width=36)
Output: tbl.id, tbl.info
Filter: (mod(tbl.id, 4) = 1)
-> Foreign Scan on public.tbl1 (cost=100.00..130.61 rows=7 width=36)
Output: tbl1.id, tbl1.info
Remote SQL: SELECT id, info FROM public.tbl WHERE ((mod(id, 4) = 1))
(31 rows)
修改源码,允许删除继承的约束
vi src/backend/commands/tablecmds.c
/* Don't drop inherited constraints */
// if (con->coninhcount > 0 && !recursing)
// ereport(ERROR,
// (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
// errmsg("cannot drop inherited constraint \"%s\" of relation \"%s\"",
// constrName, RelationGetRelationName(rel))));
重新编译,重启
make && make install
pg_ctl restart -m fast
添加主表的约束,删除子表的约束,造成主表不被访问的假象。
alter table tab add constraint tab_ck check (mod(id, 4) is null);
alter table tbl add constraint tbl_ck check (mod(id, 4) is null);
alter table only tab0 drop constraint tab_ck;
alter table only tab1 drop constraint tab_ck;
alter table only tab2 drop constraint tab_ck;
alter table only tab3 drop constraint tab_ck;
alter table only tbl0 drop constraint tbl_ck;
alter table only tbl1 drop constraint tbl_ck;
alter table only tbl2 drop constraint tbl_ck;
alter table only tbl3 drop constraint tbl_ck;
我的目标是这样能做到下推JOIN,但实际上没有下推,这个是非常痛苦的。
一个真正的分库中间件应该解决这样的问题。
master=# explain verbose select * from tbl ,tab where tab.id=tbl.id and mod(tbl.id,4)=1 and mod(tab.id,4)=1;
QUERY PLAN
------------------------------------------------------------------------------------------
Nested Loop (cost=200.00..261.98 rows=7 width=72)
Output: tbl1.id, tbl1.info, tab1.id, tab1.info
Join Filter: (tbl1.id = tab1.id)
-> Append (cost=100.00..130.61 rows=7 width=36)
-> Foreign Scan on public.tbl1 (cost=100.00..130.61 rows=7 width=36)
Output: tbl1.id, tbl1.info
Remote SQL: SELECT id, info FROM public.tbl WHERE ((mod(id, 4) = 1))
-> Materialize (cost=100.00..130.65 rows=7 width=36)
Output: tab1.id, tab1.info
-> Append (cost=100.00..130.61 rows=7 width=36)
-> Foreign Scan on public.tab1 (cost=100.00..130.61 rows=7 width=36)
Output: tab1.id, tab1.info
Remote SQL: SELECT id, info FROM public.tab WHERE ((mod(id, 4) = 1))
(13 rows)
这才是我要的结果
QUERY PLAN
-------------------------
Foreign Scan (cost=100.00..161.58 rows=1 width=72)
Output: tbl.id, tbl.info, tab.id, tab.info
Relations: (public.tbl1 tbl) INNER JOIN (public.tab1 tab)
Remote SQL: SELECT r1.id, r1.info, r2.id, r2.info FROM (public.tbl r1 INNER JOIN public.tab r2 ON (((r1.id = r2.id)) AND ((mod(r2.id, 4) = 1)) AND ((mod(r1.id, 4) = 1))))
(4 rows)
小结
- 通常应用会以父表作为常用的表,而不是直接访问子表,但是可以要求用户带上分区约束的条件,从而满足下推的排他性。
- PG应该允许父表和子表有不一样的约束,从而可以利用排他约束把父表的访问过滤掉。
- 父表访问过滤掉之后,子表的JOIN应该可以下推使用。
- 复制表的实现和下推?
- 在PostgreSQL的shard完美之前,用户应该尽量避免JOIN。