PostgreSQL 空间、多维 序列 生成方法

简介:

标签

PostgreSQL , GIS , PostGIS , 序列 , 空间序列


背景

数据库的一维序列是很好理解的东西,就是在一个维度上自增。

那么二维、多维序列怎么理解呢?显然就是在多个维度上齐头并进的自增咯。

二维序列

以二维序列为例,应该是这样增长的:

0,0  
0,1  
1,0  
1,1  
1,2  
2,1  
2,2  
...  

那么如何生成以上二维序列呢?实际上可以利用数据库的多个一维序列来生成。

create sequence seq1;  
create sequence seq2;  
  
create or replace function seq_2d() returns point[] as $$  
declare  
  res point[];  
begin  
  select array_cat(res, array[point(nextval('seq1'), nextval('seq2'))]) into res;  
  select array_cat(res, array[point(currval('seq1')+1, currval('seq2'))]) into res;   
  select array_cat(res, array[point(currval('seq1'), currval('seq2')+1)]) into res;    
  return res;  
end;  
$$ language plpgsql strict;  

测试

test=# select seq_2d();  
          seq_2d             
---------------------------  
 {"(1,1)","(2,1)","(1,2)"}  
(1 row)  
  
test=# select seq_2d();  
          seq_2d             
---------------------------  
 {"(2,2)","(3,2)","(2,3)"}  
(1 row)  
  
test=# select seq_2d();  
          seq_2d             
---------------------------  
 {"(3,3)","(4,3)","(3,4)"}  
(1 row)  

三维序列

三维序列的生成方法类似:

0,0,0  
1,0,0  
0,1,0  
0,0,1  
1,1,0  
0,1,1  
1,0,1  
  
1,1,1  
2,1,1  
1,2,1  
1,1,2  
2,2,1  
1,2,2  
2,1,2  
  
2,2,2  
......  
create sequence seq1;  
create sequence seq2;  
create sequence seq3;  
create extension cube;  
  
create or replace function seq_3d() returns cube[] as $$  
declare  
  res cube[];  
begin  
  select array_cat(res, array[cube(array[nextval('seq1'), nextval('seq2'), nextval('seq3')])]) into res;  
  select array_cat(res, array[cube(array[currval('seq1')+1, currval('seq2'), currval('seq3')])]) into res;  
  select array_cat(res, array[cube(array[currval('seq1'), currval('seq2')+1, currval('seq3')])]) into res;  
  select array_cat(res, array[cube(array[currval('seq1'), currval('seq2'), currval('seq3')+1])]) into res;  
  select array_cat(res, array[cube(array[currval('seq1')+1, currval('seq2')+1, currval('seq3')])]) into res;  
  select array_cat(res, array[cube(array[currval('seq1'), currval('seq2')+1, currval('seq3')+1])]) into res;  
  select array_cat(res, array[cube(array[currval('seq1')+1, currval('seq2'), currval('seq3')+1])]) into res;  
  return res;  
end;  
$$ language plpgsql strict;  

例子

test=# select seq_3d();  
                                        seq_3d                                           
---------------------------------------------------------------------------------------  
 {"(1, 1, 1)","(2, 1, 1)","(1, 2, 1)","(1, 1, 2)","(2, 2, 1)","(1, 2, 2)","(2, 1, 2)"}  
(1 row)  
  
test=# select seq_3d();  
                                        seq_3d                                           
---------------------------------------------------------------------------------------  
 {"(2, 2, 2)","(3, 2, 2)","(2, 3, 2)","(2, 2, 3)","(3, 3, 2)","(2, 3, 3)","(3, 2, 3)"}  
(1 row)  
  

多维序列

以此类推,可以得到多维序列。

多维数据的空间存放和BRIN块级索引

《PostgreSQL 黑科技 - 空间聚集存储》

前面讲到了空间聚集存储,如果数据按空间顺序存放,使用BRIN块级索引,可以在任意维度上得到最好的查询效率,真正做到一个块级索引支持任意列的高效过滤。

例子

create sequence seq1;  
create sequence seq2;  
create sequence seq3;  
  
create table tbl(c1 int, c2 int, c3 int);  
  
create or replace function cluster_insert() returns void as $$  
declare  
begin  
  insert into tbl values (nextval('seq1'), nextval('seq2'), nextval('seq3'));  
  insert into tbl values (currval('seq1')+1, currval('seq2'), currval('seq3'));  
  insert into tbl values (currval('seq1'), currval('seq2')+1, currval('seq3'));  
  insert into tbl values (currval('seq1'), currval('seq2'), currval('seq3')+1);  
  insert into tbl values (currval('seq1')+1, currval('seq2')+1, currval('seq3'));  
  insert into tbl values (currval('seq1'), currval('seq2')+1, currval('seq3')+1);  
  insert into tbl values (currval('seq1')+1, currval('seq2'), currval('seq3')+1);  
end;  
$$ language plpgsql strict;  

压测,写入大量数据

vi test.sql  
select count(*) from (select cluster_insert() from generate_series(1,100)) t;  
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 1200  

检查多维聚集性

test=# select * from tbl limit 10;  
   c1    |   c2    |   c3      
---------+---------+---------  
 1992652 | 1992653 | 1992652  
 1992573 | 1992574 | 1992578  
 1992574 | 1992574 | 1992578  
 1992573 | 1992575 | 1992578  
 1992573 | 1992574 | 1992579  
 1992574 | 1992575 | 1992578  
 1992573 | 1992575 | 1992579  
 1992574 | 1992574 | 1992579  
 1992658 | 1992658 | 1992658  
 1992659 | 1992658 | 1992658  
(10 rows)  

创建BRIN块级索引。

create index idx on tbl using brin (c1,c2,c3);  
test=# \dt+ tbl  
                             List of relations
 Schema |        Name         | Type  |  Owner   |    Size    | Description 
--------+---------------------+-------+----------+------------+-------------
 public | tbl                 | table | postgres | 97 GB      | 
(1 row)  
  
test=# \di+ idx  
                        List of relations  
 Schema | Name | Type  |  Owner   | Table |  Size  | Description   
--------+------+-------+----------+-------+--------+-------------  
 public | idx  | index | postgres | tbl   | 456 kB |   
(1 row)  

看看456KB的索引,在97 GB的数据层面,查询效率如何。

任意列、组合查询过滤性。

explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000;  
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c2 between 1 and 1000;  
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c3 between 1 and 1000;  
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c2 between 100 and 2000;  
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c3 between 100 and 2000;  
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c2 between 100 and 2000 and c3 between 1 and 2000;  
test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000;
                              
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31623.80 rows=1 width=12) (actual time=27.302..50.284 rows=6997 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000))
   Rows Removed by Index Recheck: 229803
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=26.881..26.881 rows=12800 loops=1)
         Index Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000))
         Buffers: shared hit=662
 Planning time: 0.095 ms
 Execution time: 50.636 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c2 between 1 and 1000;
                                                       QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31623.80 rows=1 width=12) (actual time=27.886..49.011 rows=6997 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c2 >= 1) AND (tbl.c2 <= 1000))
   Rows Removed by Index Recheck: 229803
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=27.512..27.512 rows=12800 loops=1)
         Index Cond: ((tbl.c2 >= 1) AND (tbl.c2 <= 1000))
         Buffers: shared hit=662
 Planning time: 0.040 ms
 Execution time: 49.348 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c3 between 1 and 1000;
                                                       QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31623.80 rows=1 width=12) (actual time=25.238..46.292 rows=6997 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c3 >= 1) AND (tbl.c3 <= 1000))
   Rows Removed by Index Recheck: 229803
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=24.875..24.875 rows=12800 loops=1)
         Index Cond: ((tbl.c3 >= 1) AND (tbl.c3 <= 1000))
         Buffers: shared hit=662
 Planning time: 0.044 ms
 Execution time: 46.631 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c2 between 100 and 2000;
                                                       QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31742.85 rows=1 width=12) (actual time=30.018..48.522 rows=6307 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c2 >= 100) AND (tbl.c2 <= 2000))
   Rows Removed by Index Recheck: 230493
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=27.273..27.273 rows=12800 loops=1)
         Index Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c2 >= 100) AND (tbl.c2 <= 2000))
         Buffers: shared hit=662
 Planning time: 0.049 ms
 Execution time: 48.829 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c3 between 100 and 2000;
                                                       QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31742.85 rows=1 width=12) (actual time=27.565..46.347 rows=6307 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c3 >= 100) AND (tbl.c3 <= 2000))
   Rows Removed by Index Recheck: 230493
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=24.799..24.799 rows=12800 loops=1)
         Index Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c3 >= 100) AND (tbl.c3 <= 2000))
         Buffers: shared hit=662
 Planning time: 0.055 ms
 Execution time: 46.656 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c2 between 100 and 2000 and c3 between 1 and 2000;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31861.90 rows=1 width=12) (actual time=28.703..49.599 rows=6307 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c2 >= 100) AND (tbl.c2 <= 2000) AND (tbl.c3 >= 1) AND (tbl.c3 <= 2000))
   Rows Removed by Index Recheck: 230493
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=25.590..25.590 rows=12800 loops=1)
         Index Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c2 >= 100) AND (tbl.c2 <= 2000) AND (tbl.c3 >= 1) AND (tbl.c3 <= 2000))
         Buffers: shared hit=662
 Planning time: 0.114 ms
 Execution time: 49.919 ms
(11 rows)

小结

本文介绍了如何创建、生成多维序列。

本文验证了数据如果按照多维序列聚集存放,可以达到块级索引最强过滤性,任意字段都能实现高效率过滤。

如果数据的多列本身不存在相关性,可以参考这篇文档,对数据进行空间重分布存储。得到最强过滤性。

《PostgreSQL 黑科技 - 空间聚集存储》

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍如何基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
相关文章
|
9月前
|
关系型数据库 测试技术 Linux
PostgreSQL配置文件修改及启用方法
总的来说,修改和启用PostgreSQL的配置文件是一个直接而简单的过程。只需要找到配置文件,修改你想要改变的选项,然后重启服务器即可。但是,你需要注意的是,不正确的配置可能会导致服务器性能下降,甚至导致服务器无法启动。因此,在修改配置文件之前,你应该充分理解每个选项的含义和影响,如果可能的话,你应该在测试环境中先进行试验。
729 72
|
存储 NoSQL 关系型数据库
深入探索地理空间查询:如何优雅地在MySQL、PostgreSQL及Redis中实现精准的地理数据存储与检索技巧
深入探索地理空间查询:如何优雅地在MySQL、PostgreSQL及Redis中实现精准的地理数据存储与检索技巧
3107 0
|
SQL 关系型数据库 Linux
在CentOS 6上安装和使用PostgreSQL的方法
在CentOS 6上安装和使用PostgreSQL的方法
286 2
|
Ubuntu 关系型数据库 数据库
在Ubuntu 18.04上安装和使用PostgreSQL的方法
在Ubuntu 18.04上安装和使用PostgreSQL的方法
332 1
|
Ubuntu 关系型数据库 Linux
在Ubuntu 14.04上安装和使用PostgreSQL的方法
在Ubuntu 14.04上安装和使用PostgreSQL的方法
208 1
|
关系型数据库 Linux 数据库
在CentOS 7上安装和使用PostgreSQL的方法
在CentOS 7上安装和使用PostgreSQL的方法
1963 0
|
分布式计算 关系型数据库 大数据
MaxCompute产品使用合集之怎么才可以将 PostgreSQL 中的 geometry 空间类型字段同步到 MaxCompute 或另一个 PostgreSQL 数据库
MaxCompute作为一款全面的大数据处理平台,广泛应用于各类大数据分析、数据挖掘、BI及机器学习场景。掌握其核心功能、熟练操作流程、遵循最佳实践,可以帮助用户高效、安全地管理和利用海量数据。以下是一个关于MaxCompute产品使用的合集,涵盖了其核心功能、应用场景、操作流程以及最佳实践等内容。
|
SQL 关系型数据库 MySQL
postgresql |数据库 |数据库的常用备份和恢复方法总结
postgresql |数据库 |数据库的常用备份和恢复方法总结
501 0
|
缓存 关系型数据库 MySQL
postgresql|数据库|序列Sequence的创建和管理
postgresql|数据库|序列Sequence的创建和管理
628 0
|
安全 关系型数据库 数据库
创建 PostgreSQL 表空间时没有指定空间的总大小
创建 PostgreSQL 表空间时没有指定空间的总大小
239 1

相关产品

  • 云原生数据库 PolarDB
  • 云数据库 RDS PostgreSQL 版
  • 推荐镜像

    更多