背景
在当今快速发展的数字化时代,企业对实时数据分析和高效事务处理的需求日益增长。传统的数据库架构往往将在线交易处理(OLTP)与在线分析处理(OLAP)分离,导致系统复杂性增加、维护成本高昂,并且难以满足实时决策支持的要求。随着技术的进步,混合事务与分析处理(HTAP)逐渐成为主流趋势,它允许在同一平台上同时进行高效的交易处理和复杂的数据分析。
阿里云 PolarDB for PostgreSQL作为一款领先的云原生关系型数据库,利用向量化引擎+列存索引等技术实现了 OLTP 和 OLAP 的一体化。用户不仅能够享受到高并发事务处理的卓越性能,还能即时获取最新业务数据的深度洞察,大大缩短了从数据到价值的时间周期。
本方案将为您展示如何通过 PolarDB for PostgreSQL 来实现一体化的 HTAP 能力。
TPC-H 性能测试
TPC-H是业界常用的性能标准测试,由TPC委员会制定发布,用于评测数据库的分析能力。包含8张数据表、22条复杂的SQL查询,包含单表统计、多表 Join、子查询、聚合、排序等。
本次测试数据量:10GB。
测试环境:
- ECS:规格为 2C2GB(ecs.e-c1m1.large),挂载的磁盘 40GB
- PolarDB PostgreSQL:
- 计费类型:按量付费
- 引擎版本:PostgreSQL 14
- 规格为 4C 16GB(标准版,通用规格,X86)
- 节点个数:1 个读写节点,0 个只读节点
注:ECS实例和PolarDB集群需保证在同一个VPC中。
步骤
- 登录到 ECS 服务器,下载 TPC-H 工具:dbgen.tar.gz,解压后进入到目录编译 dbgen,并生成 10GB 的测试数据。约 5 分钟。
mkdir /data cd /data wget https://ganos-hz.oss-cn-hangzhou.aliyuncs.com/csi/dbgen.tar.gz tar -zxvf dbgen.tar.gz cd dbgen make -f makefile.suite ./dbgen -s 10 -f
- 在 ECS 上通过 psql 连接到 PolarDB PostgreSQL 实例。
## 下载polardb tools工具箱,里面包含psql命令 sudo yum install https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241113/phrrjc/PolarDB-Tools-2.0.14.26.0-20241023104506.al8.x86_64.rpm ## 执行psql命令,连接到数据库 /u01/polardb_pg_tools/bin/psql -h xxx -p xxx -Uxxx postgres
- 在 PolarDB PostgreSQL 实例中导入生成的 TPC-H 数据。 约 5 分钟
--创建tpchdb CREATE DATABASE tpchdb WITH ENCODING = 'UTF8' LC_COLLATE = 'C' LC_CTYPE = 'C' TEMPLATE template0; --切换到tpchdb \c tpchdb --在 tpchdb 中创建 polar_csi 插件 CREATE EXTENSION polar_csi; --执行dbgen目录中的dss.ddl文件,创建tpch 8张表结构 \i ./dss.ddl --将dbgen目录中生成的8张表的数据导入到tpchdb中 \copy part from ./part.tbl with delimiter as '|' NULL ''; \copy region from ./region.tbl with delimiter as '|' NULL ''; \copy nation from ./nation.tbl with delimiter as '|' NULL ''; \copy orders from ./orders.tbl with delimiter as '|' NULL ''; \copy customer from ./customer.tbl with delimiter as '|' NULL ''; \copy lineitem from ./lineitem.tbl with delimiter as '|' NULL ''; \copy partsupp from ./partsupp.tbl with delimiter as '|' NULL ''; \copy supplier from ./supplier.tbl with delimiter as '|' NULL '';
- 设置向量化引擎和列存索引相关的参数,用于后续查询
SET polar_csi.enable_pk to on; set polar_csi.enable_query to on; set polar_csi.exec_parallel to 4; set polar_csi.cost_threshold = 0; set polar_csi.memory_limit = 2048;
- 数据导入后,为每张表创建列存索引。约 5 分钟
--partsupp表 ALTER TABLE PARTSUPP ADD CONSTRAINT partsupp_pkey PRIMARY KEY (PS_PARTKEY, PS_SUPPKEY); CREATE INDEX imps ON partsupp USING csi(ps_partkey, ps_suppkey, ps_availqty, ps_supplycost, ps_comment); --part表 ALTER TABLE PART ADD CONSTRAINT part_kpey PRIMARY KEY (P_PARTKEY); CREATE INDEX im_p ON part USING csi(p_partkey, p_name, p_mfgr, p_brand, p_type, p_size, p_container, p_retailprice, p_comment); --supplier表 ALTER TABLE SUPPLIER ADD CONSTRAINT supplier_pkey PRIMARY KEY (S_SUPPKEY); CREATE INDEX im_s ON supplier USING csi(s_suppkey, s_name, s_address, s_nationkey, s_phone, s_acctbal, s_comment); --customer表 ALTER TABLE CUSTOMER ADD CONSTRAINT customer_pkey PRIMARY KEY (C_CUSTKEY); CREATE INDEX im_c ON customer USING csi(c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment); --orders表 ALTER TABLE ORDERS ADD CONSTRAINT orders_pkey PRIMARY KEY (O_ORDERKEY); CREATE INDEX im_o ON orders USING csi(o_orderkey, o_custkey, o_orderstatus, o_totalprice, o_orderdate, o_orderpriority, o_clerk, o_shippriority, o_comment); --lineitem表 ALTER TABLE LINEITEM ADD CONSTRAINT lineitem_pkey PRIMARY KEY (L_ORDERKEY, L_LINENUMBER); CREATE INDEX im_l ON lineitem USING csi(l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment); --nation表 ALTER TABLE NATION ADD CONSTRAINT nation_pkey PRIMARY KEY (N_NATIONKEY); CREATE INDEX im_n ON nation USING csi(n_nationkey, n_name, n_regionkey, n_comment); --region表 ALTER TABLE REGION ADD CONSTRAINT region_pkey PRIMARY KEY (R_REGIONKEY); CREATE INDEX im_r ON region USING csi(r_regionkey, r_name, r_comment);
- 开启向量化引擎,执行 22 条 SQL 语句,统计耗时
\timing --Q1 explain analyze select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order from lineitem where l_shipdate <= date '1998-12-01' - '60 day'::interval group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus; --Q2 explain analyze select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment from part, supplier, partsupp, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and p_size = 43 and p_type like '%NICKEL' and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' and ps_supplycost = ( select min(ps_supplycost) from partsupp, supplier, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' ) order by s_acctbal desc, n_name, s_name, p_partkey limit 100; --Q3 explain analyze select l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriority from customer, orders, lineitem where c_mktsegment = 'FURNITURE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < date '1995-03-05' and l_shipdate > date '1995-03-05' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 100; --Q4 explain analyze select o_orderpriority, count(*) as order_count from orders where o_orderdate >= date '1993-05-01' and o_orderdate < date '1993-05-01' + interval '3 month'::interval and exists ( select * from lineitem where l_orderkey = o_orderkey and l_commitdate < l_receiptdate ) group by o_orderpriority order by o_orderpriority; --Q5 explain analyze select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue from customer, orders, lineitem, supplier, nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey and l_suppkey = s_suppkey and c_nationkey = s_nationkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' and o_orderdate >= date '1995-01-01' and o_orderdate < date '1995-01-01' + '1 year'::interval group by n_name order by revenue desc; --Q6 explain analyze select sum(l_extendedprice * l_discount) as revenue from lineitem where l_shipdate >= date '1993-01-01' and l_shipdate < date '1993-01-01' + '1 year'::interval and l_discount between 0.03 - 0.01 and 0.03 + 0.01 and l_quantity < 24; --Q7 explain analyze SELECT supp_nation, cust_nation, l_year, sum(volume) AS revenue FROM ( SELECT n1.n_name AS supp_nation, n2.n_name AS cust_nation, extract(year FROM l_shipdate) AS l_year, l_extendedprice * (1 - l_discount) AS volume FROM supplier, lineitem, orders, customer, nation n1, nation n2 WHERE s_suppkey = l_suppkey AND o_orderkey = l_orderkey AND c_custkey = o_custkey AND s_nationkey = n1.n_nationkey AND c_nationkey = n2.n_nationkey AND ((n1.n_name = 'FRANCE' AND n2.n_name = 'GERMANY') OR (n1.n_name = 'GERMANY' AND n2.n_name = 'FRANCE')) AND l_shipdate BETWEEN CAST('1995-01-01' AS date) AND CAST('1996-12-31' AS date)) AS shipping GROUP BY supp_nation, cust_nation, l_year ORDER BY supp_nation, cust_nation, l_year; --Q8 explain analyze select o_year, sum(case when nation = 'INDONESIA' then volume else 0 end) / sum(volume) as mkt_share from ( select extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) as volume, n2.n_name as nation from part, supplier, lineitem, orders, customer, nation n1, nation n2, region where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between '1995-01-01'::date and '1996-12-31'::date and p_type = 'PROMO POLISHED NICKEL' ) as all_nations group by o_year order by o_year; --Q9 explain analyze select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%navajo%' ) as profit group by nation, o_year order by nation, o_year desc limit 100; --Q10 explain analyze select c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone, c_comment from customer, orders, lineitem, nation where c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate >= date '1993-08-01' and o_orderdate < date '1993-08-01' + '3 month'::interval and l_returnflag = 'R' and c_nationkey = n_nationkey group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment order by revenue desc LIMIT 20; --Q11 explain analyze select ps_partkey, sum(ps_supplycost * ps_availqty) as value from partsupp, supplier, nation where ps_suppkey = s_suppkey and s_nationkey = n_nationkey and n_name = 'ALGERIA' group by ps_partkey having sum(ps_supplycost * ps_availqty) > ( select sum(ps_supplycost * ps_availqty) * 0.0001000000 from partsupp, supplier, nation where ps_suppkey = s_suppkey and s_nationkey = n_nationkey and n_name = 'ALGERIA' ) order by value desc; --Q12 explain analyze select l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 else 0 end) as high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 else 0 end) as low_line_count from orders, lineitem where o_orderkey = l_orderkey and l_shipmode in ('AIR', 'FOB') and l_commitdate < l_receiptdate and l_shipdate < l_commitdate and l_receiptdate >= date '1996-01-01' and l_receiptdate < date '1996-01-01' + '1 year'::interval group by l_shipmode order by l_shipmode; --Q13 explain analyze select c_count, count(*) as custdist from ( select c_custkey, count(o_orderkey) from customer left outer join orders on c_custkey = o_custkey and o_comment not like '%pending%requests%' group by c_custkey ) as c_orders (c_custkey, c_count) group by c_count order by custdist desc, c_count desc; --Q14 explain analyze select 100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount) else 0 end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue from lineitem, part where l_partkey = p_partkey and l_shipdate >= date '1995-02-01' and l_shipdate < date '1995-02-01' + interval '1 month'::interval; --Q15 explain analyze with revenue0 as ( select l_suppkey as supplier_no, sum(l_extendedprice * (1 - l_discount)) as total_revenue from lineitem where l_shipdate >= date '1995-08-01' and l_shipdate < date '1995-08-01' + '3 month'::interval group by l_suppkey) select s_suppkey, s_name, s_address, s_phone, total_revenue from supplier, revenue0 where s_suppkey = supplier_no and total_revenue = ( select max(total_revenue) from revenue0 ) order by s_suppkey; --Q16 explain analyze select p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt from partsupp, part where p_partkey = ps_partkey and p_brand <> 'Brand#13' and p_type not like 'ECONOMY BRUSHED%' and p_size in (11, 8, 10, 31, 21, 13, 32, 28) and ps_suppkey not in ( select s_suppkey from supplier where s_comment like '%Customer%Complaints%' ) group by p_brand, p_type, p_size order by supplier_cnt desc, p_brand, p_type, p_size limit 100; --Q17 explain analyze select sum(l_extendedprice) / 7.0 as avg_yearly from lineitem, part where p_partkey = l_partkey and p_brand = 'Brand#44' and p_container = 'MED PKG' and l_quantity < ( select 0.2 * avg(l_quantity) from lineitem where l_partkey = p_partkey ); --Q18 explain analyze select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_orderkey in ( select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 313 ) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate limit 100; --LIMIT 100 --Q19 explain analyze select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where ( p_partkey = l_partkey and p_brand = 'Brand#15' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#45' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 18 and l_quantity <= 18 + 10 and p_size between 1 and 10 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#21' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 30 and l_quantity <= 30 + 10 and p_size between 1 and 15 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ); --LIMIT -1 --Q20 explain analyze select s_name, s_address from supplier, nation where s_suppkey in ( select ps_suppkey from partsupp where ps_partkey in ( select p_partkey from part where p_name like 'lemon%' ) and ps_availqty > ( select 0.5 * sum(l_quantity) from lineitem where l_partkey = ps_partkey and l_suppkey = ps_suppkey and l_shipdate >= date '1997-01-01' and l_shipdate < date '1997-01-01' + '1 year'::interval ) ) and s_nationkey = n_nationkey and n_name = 'INDONESIA' order by s_name limit 100; --Q21 explain analyze select s_name, count(*) as numwait from supplier, lineitem l1, orders, nation where s_suppkey = l1.l_suppkey and o_orderkey = l1.l_orderkey and o_orderstatus = 'F' and l1.l_receiptdate > l1.l_commitdate and exists ( select * from lineitem l2 where l2.l_orderkey = l1.l_orderkey and l2.l_suppkey <> l1.l_suppkey ) and not exists ( select * from lineitem l3 where l3.l_orderkey = l1.l_orderkey and l3.l_suppkey <> l1.l_suppkey and l3.l_receiptdate > l3.l_commitdate ) and s_nationkey = n_nationkey and n_name = 'INDIA' group by s_name order by numwait desc, s_name limit 100; --LIMIT 100 --Q22 explain analyze select cntrycode, count(*) as numcust, sum(c_acctbal) as totacctbal from ( select substring(c_phone from 1 for 2) as cntrycode, c_acctbal from customer where substring(c_phone from 1 for 2) in ('16', '17', '24', '21', '19', '22', '15') and c_acctbal > ( select avg(c_acctbal) from customer where c_acctbal > 0.00 and substring(c_phone from 1 for 2) in ('16', '17', '24', '21', '19', '22', '15') ) and not exists ( select * from orders where o_custkey = c_custkey ) ) as custsale group by cntrycode order by cntrycode; --LIMIT -1
- 关闭向量化引擎,执行 22 条语句,统计耗时
\timing set polar_csi.enable_query to off; --Q1 explain analyze select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order from lineitem where l_shipdate <= date '1998-12-01' - '60 day'::interval group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus; --Q2 explain analyze select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment from part, supplier, partsupp, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and p_size = 43 and p_type like '%NICKEL' and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' and ps_supplycost = ( select min(ps_supplycost) from partsupp, supplier, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' ) order by s_acctbal desc, n_name, s_name, p_partkey limit 100; --Q3 explain analyze select l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriority from customer, orders, lineitem where c_mktsegment = 'FURNITURE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < date '1995-03-05' and l_shipdate > date '1995-03-05' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 100; --Q4 explain analyze select o_orderpriority, count(*) as order_count from orders where o_orderdate >= date '1993-05-01' and o_orderdate < date '1993-05-01' + interval '3 month'::interval and exists ( select * from lineitem where l_orderkey = o_orderkey and l_commitdate < l_receiptdate ) group by o_orderpriority order by o_orderpriority; --Q5 explain analyze select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue from customer, orders, lineitem, supplier, nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey and l_suppkey = s_suppkey and c_nationkey = s_nationkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' and o_orderdate >= date '1995-01-01' and o_orderdate < date '1995-01-01' + '1 year'::interval group by n_name order by revenue desc; --Q6 explain analyze select sum(l_extendedprice * l_discount) as revenue from lineitem where l_shipdate >= date '1993-01-01' and l_shipdate < date '1993-01-01' + '1 year'::interval and l_discount between 0.03 - 0.01 and 0.03 + 0.01 and l_quantity < 24; --Q7 explain analyze SELECT supp_nation, cust_nation, l_year, sum(volume) AS revenue FROM ( SELECT n1.n_name AS supp_nation, n2.n_name AS cust_nation, extract(year FROM l_shipdate) AS l_year, l_extendedprice * (1 - l_discount) AS volume FROM supplier, lineitem, orders, customer, nation n1, nation n2 WHERE s_suppkey = l_suppkey AND o_orderkey = l_orderkey AND c_custkey = o_custkey AND s_nationkey = n1.n_nationkey AND c_nationkey = n2.n_nationkey AND ((n1.n_name = 'FRANCE' AND n2.n_name = 'GERMANY') OR (n1.n_name = 'GERMANY' AND n2.n_name = 'FRANCE')) AND l_shipdate BETWEEN CAST('1995-01-01' AS date) AND CAST('1996-12-31' AS date)) AS shipping GROUP BY supp_nation, cust_nation, l_year ORDER BY supp_nation, cust_nation, l_year; --Q8 explain analyze select o_year, sum(case when nation = 'INDONESIA' then volume else 0 end) / sum(volume) as mkt_share from ( select extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) as volume, n2.n_name as nation from part, supplier, lineitem, orders, customer, nation n1, nation n2, region where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between '1995-01-01'::date and '1996-12-31'::date and p_type = 'PROMO POLISHED NICKEL' ) as all_nations group by o_year order by o_year; --Q9 explain analyze select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%navajo%' ) as profit group by nation, o_year order by nation, o_year desc limit 100; --Q10 explain analyze select c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone, c_comment from customer, orders, lineitem, nation where c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate >= date '1993-08-01' and o_orderdate < date '1993-08-01' + '3 month'::interval and l_returnflag = 'R' and c_nationkey = n_nationkey group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment order by revenue desc LIMIT 20; --Q11 explain analyze select ps_partkey, sum(ps_supplycost * ps_availqty) as value from partsupp, supplier, nation where ps_suppkey = s_suppkey and s_nationkey = n_nationkey and n_name = 'ALGERIA' group by ps_partkey having sum(ps_supplycost * ps_availqty) > ( select sum(ps_supplycost * ps_availqty) * 0.0001000000 from partsupp, supplier, nation where ps_suppkey = s_suppkey and s_nationkey = n_nationkey and n_name = 'ALGERIA' ) order by value desc; --Q12 explain analyze select l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 else 0 end) as high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 else 0 end) as low_line_count from orders, lineitem where o_orderkey = l_orderkey and l_shipmode in ('AIR', 'FOB') and l_commitdate < l_receiptdate and l_shipdate < l_commitdate and l_receiptdate >= date '1996-01-01' and l_receiptdate < date '1996-01-01' + '1 year'::interval group by l_shipmode order by l_shipmode; --Q13 explain analyze select c_count, count(*) as custdist from ( select c_custkey, count(o_orderkey) from customer left outer join orders on c_custkey = o_custkey and o_comment not like '%pending%requests%' group by c_custkey ) as c_orders (c_custkey, c_count) group by c_count order by custdist desc, c_count desc; --Q14 explain analyze select 100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount) else 0 end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue from lineitem, part where l_partkey = p_partkey and l_shipdate >= date '1995-02-01' and l_shipdate < date '1995-02-01' + interval '1 month'::interval; --Q15 explain analyze with revenue0 as ( select l_suppkey as supplier_no, sum(l_extendedprice * (1 - l_discount)) as total_revenue from lineitem where l_shipdate >= date '1995-08-01' and l_shipdate < date '1995-08-01' + '3 month'::interval group by l_suppkey) select s_suppkey, s_name, s_address, s_phone, total_revenue from supplier, revenue0 where s_suppkey = supplier_no and total_revenue = ( select max(total_revenue) from revenue0 ) order by s_suppkey; --Q16 explain analyze select p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt from partsupp, part where p_partkey = ps_partkey and p_brand <> 'Brand#13' and p_type not like 'ECONOMY BRUSHED%' and p_size in (11, 8, 10, 31, 21, 13, 32, 28) and ps_suppkey not in ( select s_suppkey from supplier where s_comment like '%Customer%Complaints%' ) group by p_brand, p_type, p_size order by supplier_cnt desc, p_brand, p_type, p_size limit 100; --Q17 explain analyze select sum(l_extendedprice) / 7.0 as avg_yearly from lineitem, part where p_partkey = l_partkey and p_brand = 'Brand#44' and p_container = 'MED PKG' and l_quantity < ( select 0.2 * avg(l_quantity) from lineitem where l_partkey = p_partkey ); --Q18 explain analyze select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_orderkey in ( select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 313 ) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate limit 100; --LIMIT 100 --Q19 explain analyze select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where ( p_partkey = l_partkey and p_brand = 'Brand#15' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#45' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 18 and l_quantity <= 18 + 10 and p_size between 1 and 10 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#21' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 30 and l_quantity <= 30 + 10 and p_size between 1 and 15 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ); --LIMIT -1 --Q20 explain analyze select s_name, s_address from supplier, nation where s_suppkey in ( select ps_suppkey from partsupp where ps_partkey in ( select p_partkey from part where p_name like 'lemon%' ) and ps_availqty > ( select 0.5 * sum(l_quantity) from lineitem where l_partkey = ps_partkey and l_suppkey = ps_suppkey and l_shipdate >= date '1997-01-01' and l_shipdate < date '1997-01-01' + '1 year'::interval ) ) and s_nationkey = n_nationkey and n_name = 'INDONESIA' order by s_name limit 100; --Q21 explain analyze select s_name, count(*) as numwait from supplier, lineitem l1, orders, nation where s_suppkey = l1.l_suppkey and o_orderkey = l1.l_orderkey and o_orderstatus = 'F' and l1.l_receiptdate > l1.l_commitdate and exists ( select * from lineitem l2 where l2.l_orderkey = l1.l_orderkey and l2.l_suppkey <> l1.l_suppkey ) and not exists ( select * from lineitem l3 where l3.l_orderkey = l1.l_orderkey and l3.l_suppkey <> l1.l_suppkey and l3.l_receiptdate > l3.l_commitdate ) and s_nationkey = n_nationkey and n_name = 'INDIA' group by s_name order by numwait desc, s_name limit 100; --LIMIT 100 --Q22 explain analyze select cntrycode, count(*) as numcust, sum(c_acctbal) as totacctbal from ( select substring(c_phone from 1 for 2) as cntrycode, c_acctbal from customer where substring(c_phone from 1 for 2) in ('16', '17', '24', '21', '19', '22', '15') and c_acctbal > ( select avg(c_acctbal) from customer where c_acctbal > 0.00 and substring(c_phone from 1 for 2) in ('16', '17', '24', '21', '19', '22', '15') ) and not exists ( select * from orders where o_custkey = c_custkey ) ) as custsale group by cntrycode order by cntrycode; --LIMIT -1
结果
列存索引占用空间
对比向量化引擎列存索引和行存引擎 Heap 表的空间占用情况。
测试结论:
- 行存引擎 Heap 表占用空间为 12.7GB。
- 列存索引占用空间为 5.5GB
表名称 |
表中包含的数据行数 |
PostgreSQL 行存 |
向量化引擎的列存索引 (包含主键) |
LINEITEM |
599,860,52 |
8792 MB |
3678 MB |
ORDERS |
150,000,00 |
2044 MB |
914 MB |
PARTSUPP |
80,000,00 |
1372 MB |
787 MB |
PART |
20,000,00 |
324 MB |
66 MB |
CUSTOMER |
15,000,00 |
284 MB |
118 MB |
SUPPLIER |
1,000,00 |
20 MB |
8 MB |
NATION |
25 |
8 KB |
528 KB |
REGION |
5 |
8 KB |
528 KB |
合计 |
12.7GB |
5.5GB |
查询性能
对比 PostgreSQL 行存引擎与向量化引擎查询效率。
测试结果:向量化引擎性能是 PG 行存引擎性能的近 60 倍。
注:请将下方Q1-Q22任务全部完成,填写表后,并截图
查询语句 |
PolarDB PG 向量化引擎耗时 (单位:秒) |
PostgreSQL 行存引擎耗时 (单位:秒) |
Q1 |
||
Q2 |
||
Q3 |
||
Q4 |
||
Q5 |
||
Q6 |
||
Q7 |
||
Q8 |
||
Q9 |
||
Q10 |
||
Q11 |
||
Q12 |
||
Q13 |
||
Q14 |
||
Q15 |
||
Q16 |
||
Q17 |
||
Q18 |
||
Q19 |
||
Q20 |
||
Q21 |
||
Q22 |
||
总耗时: |