1 设置PolarDB MySQL集群白名单,创建数据库及账号
打开PolarDB MySQL控制台,找到创建的集群
设置白名单,白名单内加入实验创建的ECS资源的公网ip
创建普通账号
创建数据库,绑定到刚才创建的账号
2申请集群公网地址,开启行存/列存自动引流功能
这个设置是RDS没有的,需要点击左侧的基本信息
找到数据库代理企业通讯版,点击公网右侧的申请,申请一个公网地址。公网地址申请完后,点击编辑配置,配置行列存自动引流。
这里的行存/列存自动引流需要开启,默认是关闭的。可以看到系统已经自动选择了主节点、只读节点和只读列存节点。我在做这个实验时,忽略了打开这个自动引流功能,发现前两个sql运行时只需要2秒左右,而在这里打开这个功能后,如果use_imci_engine变量值为off,运行的时间就在10秒以上,甚至接近20秒了。
3 创建测试表,导入数据
在DMS中创建测试表,登录DMS,复制表创建sql脚本,粘贴运行。
运行之后,在表一栏中点击刷新图标即可看到新建的表。
数据导入在ECS中进行,打开web终端,复制数据导入脚本,更改脚本中的xxxx为申请的集群公网地址,粘贴到终端内运行即可。
4
bash /root/benchtpch/tpch/data_kit.sh --parallel2--base /usr -s1-c64--data /root/benchtpch/tpchdata1g --database tpch1g --ddl /root/benchtpch/tpch/columnar.ddl --host mypolar.rwlb.rds.aliyuncs.com --port3306-u test_user -p Password123 load
数据装载大概需要十分钟左右,看到下面的输出,就知道数据装载成功了。
mysql: [Warning] Using a password on the command line interface can be insecure. Fri Aug 2603:56:47 PM CST 2022 [INFO] Loaded data part 63for tpch1g.lineitem from /root/benchtpch/tpchdata1g/lineitem.tbl.63 ... Fri Aug 2603:56:48 PM CST 2022 [INFO] All 2 threads for tpch1g.lineitem finish Fri Aug 2603:56:48 PM CST 2022 [INFO] Finish loading data for database tpch1g with 64 chunks in2 threads
4 测试行列存自动引流的效果
测试之前需要登录数据库
[root@iZuf6c5z7jjuumae5hhkr7Z ~]# /usr/bin/mysql --host mypolar.rwlb.rds.aliyuncs.com --port 3306 -utest_user -pPassword123 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 536874751 Server version: 8.0.13 Source distribution Copyright (c) 2000, 2021, Oracle and/or its affiliates.
可以看到数据库的版本为8.0.13
4.1 单表分析查询
自动引流默认是打开的
mysql> show variables like'use_imci%';+-----------------+-------+| Variable_name | Value |+-----------------+-------+| use_imci_engine |ON|+-----------------+-------+1 row inset(0.01 sec)
执行单表查询的sql语句
mysql>select sum(l_extendedprice * l_discount)as revenue from lineitem where l_shipdate >=date '1994-01-01'and l_shipdate <date '1994-01-01'+ interval '1'yearand l_discount between.06-0.01and.06+0.01and l_quantity <24;+----------------+| revenue |+----------------+|123141078.2283|+----------------+1 row inset(0.47 sec)
执行用时0.47秒,看一下执行计划
mysql> explain select sum(l_extendedprice * l_discount)as revenue from lineitem where l_shipdate >=date '1994-01-01'and l_shipdate <date '1994-01-01'+ interval '1'yearand l_discount between.06-0.01and.06+0.01and l_quantity <24;+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+| IMCI Execution Plan (max_dop =2, max_query_mem =428867584)|+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+|Project | Exprs: temp_table1.SUM(lineitem.l_extendedprice* lineitem.l_discount) HashGroupby | OutputTable(1): temp_table1 | Grouping: None | Output Grouping: None | Aggrs: SUM(lineitem.l_extendedprice* lineitem.l_discount) CTableScan | InputTable(0): lineitem | Pred:((lineitem.l_shipdate>=01/01/199400:00:00.000000)AND(lineitem.l_shipdate<01/01/199500:00:00.000000)AND(lineitem.l_quantity<24.00)AND( lineitem.l_discount BTW 0.05AND0.07))|+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+1 row inset(0.01 sec)
使用了IMCI Execution Plan,先扫描后进行了HashGroupby。设置use_imci_engine为off后再看。
mysql>set use_imci_engine = off;Query OK,0 rows affected (0.00 sec)mysql>select-> sum(l_extendedprice * l_discount)as revenue ->from-> lineitem ->where-> l_shipdate >=date '1994-01-01'->and l_shipdate <date '1994-01-01'+ interval '1'year->and l_discount between.06-0.01and.06+0.01->and l_quantity <24;+----------------+| revenue |+----------------+|123141078.2283|+----------------+1 row inset(19.61 sec)
执行用时为19.61秒,慢的不是一点半点。看一下执行计划
mysql> explain select sum(l_extendedprice * l_discount)as revenue from lineitem where l_shipdate >=date '1994-01-01'and l_shipdate <date '1994-01-01'+ interval '1'yearand l_discount between.06-0.01and.06+0.01and l_quantity <24;+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+| id | select_type |table| partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+|1| SIMPLE | lineitem |NULL| ALL |NULL|NULL|NULL|NULL|5970318|0.41| Using where|+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+1 row inset,1 warning (0.00 sec)
做这个测试的时候遇到了以下小插曲,再设置行列存自动引流功能时,忽略了在集群设置时打开这个功能。在做这个测试时,第一次运行 时间为20秒左右,在第二次运行时执行用时就是2秒左右了。猜测是第二次运行时应为数据载到了缓冲区内,所以快了不少。而打开行列存自动引流功能之后,这条语句的执行在use_imci_engine为off时执行用时就不低于10s了。
4.2 多表join查询
mysql>set use_imci_engine =on;Query OK,0 rows affected (0.00 sec)mysql>select-> l_orderkey,-> sum(l_extendedprice *(1- l_discount))as revenue,-> o_orderdate,-> o_shippriority ->from-> customer,-> orders,-> lineitem ->where-> c_mktsegment ='BUILDING'->and c_custkey = o_custkey ->and l_orderkey = o_orderkey ->and o_orderdate <date '1995-03-15'->and l_shipdate >date '1995-03-15'->groupby-> l_orderkey,-> o_orderdate,-> o_shippriority ->orderby-> revenue desc,-> o_orderdate ->limit10;+------------+-------------+-------------+----------------+| l_orderkey | revenue | o_orderdate | o_shippriority |+------------+-------------+-------------+----------------+|2456423|406181.0111|1995-03-05|0||3459808|405838.6989|1995-03-04|0||492164|390324.0610|1995-02-19|0||1188320|384537.9359|1995-03-09|0||2435712|378673.0558|1995-02-26|0||4878020|378376.7952|1995-03-12|0||5521732|375153.9215|1995-03-13|0||2628192|373133.3094|1995-02-22|0||993600|371407.4595|1995-03-05|0||2300070|367371.1452|1995-03-13|0|+------------+-------------+-------------+----------------+10 rows inset(0.75 sec)
use_imci_engine设置为on,执行时间为0.75秒,执行计划如下:
mysql> explain select l_orderkey, sum(l_extendedprice *(1- l_discount))as revenue, o_orderdate, o_shippriority from customer, orders, lineitem where c_mktsegment ='BUILDING'and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate <date '1995-03-15'and l_shipdate >date '1995-03-15'groupby l_orderkey, o_orderdate, o_shippriority orderby revenue desc, o_orderdate limit10\G;***************************1. row ***************************IMCI Execution Plan (max_dop =2, max_query_mem =428867584):Project | Exprs: temp_table4.lineitem.l_orderkey, temp_table4.SUM(lineitem.l_extendedprice*1.00- lineitem.l_discount), temp_table4.orders.o_orderdate, temp_table4.orders.o_shippriority TopK |Limit=10| Exprs: temp_table4.SUM(lineitem.l_extendedprice*1.00- lineitem.l_discount)DESC,temp_table4.orders.o_orderdateASC HashGroupby | OutputTable(4): temp_table4 | Grouping: lineitem.l_orderkey orders.o_orderdate orders.o_shippriority| Output Grouping: lineitem.l_orderkey, orders.o_orderdate, orders.o_shippriority| Aggrs: SUM(lineitem.l_extendedprice*1.00- lineitem.l_discount) HashJoin | HashMode: DYNAMIC | JoinMode: INNER | JoinPred: orders.o_orderkey= lineitem.l_orderkey HashJoin | HashMode: DYNAMIC | JoinMode: INNER | JoinPred: orders.o_custkey= customer.c_custkey CTableScan | InputTable(0): orders | Pred:(orders.o_orderdate<03/15/199500:00:00.000000) CTableScan | InputTable(1): customer | Pred:(customer.c_mktsegment="BUILDING") CTableScan | InputTable(2): lineitem | Pred:(lineitem.l_shipdate>03/15/199500:00:00.000000)1 row inset(0.03 sec)
从执行计划来看,执行的hashjoin,看一下use_imci_engine为off时的执行时间
mysql>select-> l_orderkey,-> sum(l_extendedprice *(1- l_discount))as revenue,-> o_orderdate,-> o_shippriority ->from-> customer,-> orders,-> lineitem ->where-> c_mktsegment ='BUILDING'->and c_custkey = o_custkey ->and l_orderkey = o_orderkey ->and o_orderdate <date '1995-03-15'->and l_shipdate >date '1995-03-15'->groupby-> l_orderkey,-> o_orderdate,-> o_shippriority ->orderby-> revenue desc,-> o_orderdate ->limit10;+------------+-------------+-------------+----------------+| l_orderkey | revenue | o_orderdate | o_shippriority |+------------+-------------+-------------+----------------+|2456423|406181.0111|1995-03-05|0||3459808|405838.6989|1995-03-04|0||492164|390324.0610|1995-02-19|0||1188320|384537.9359|1995-03-09|0||2435712|378673.0558|1995-02-26|0||4878020|378376.7952|1995-03-12|0||5521732|375153.9215|1995-03-13|0||2628192|373133.3094|1995-02-22|0||993600|371407.4595|1995-03-05|0||2300070|367371.1452|1995-03-13|0|+------------+-------------+-------------+----------------+10 rows inset(19.69 sec)
执行时间为19.69秒,执行计划如下:
mysql> explain select l_orderkey, sum(l_extendedprice *(1- l_discount))as revenue, o_orderdate, o_shippriority from customer, orders, lineitem where c_mktsegment ='BUILDING'and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate <date '1995-03-15'and l_shipdate >date '1995-03-15'groupby l_orderkey, o_orderdate, o_shippriority orderby revenue desc, o_orderdate limit10;+----+-------------+----------+------------+--------+---------------+---------+---------+--------------------------+---------+----------+----------------------------------------------+| id | select_type |table| partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+----------+------------+--------+---------------+---------+---------+--------------------------+---------+----------+----------------------------------------------+|1| SIMPLE | orders |NULL| ALL | PRIMARY |NULL|NULL|NULL|1490641|33.33| Using where; Using temporary; Using filesort ||1| SIMPLE | customer |NULL| eq_ref | PRIMARY | PRIMARY |8| tpch1g.orders.o_custkey|1|10.00| Using where||1| SIMPLE | lineitem |NULL| ref | PRIMARY | PRIMARY |8| tpch1g.orders.o_orderkey|4|33.33| Using where|+----+-------------+----------+------------+--------+---------------+---------+---------+--------------------------+---------+----------+----------------------------------------------+3 rows inset,1 warning (0.00 sec)
从执行计划来看,执行的是MySQL的嵌套join,执行时间长了不少。
4.3 点查
mysql>set use_imci_engine =on;Query OK,0 rows affected (0.00 sec)mysql>select*from lineitem where l_orderkey =1;+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+--------------+--------------+------------+--------------+---------------+-------------------+------------+------------------------------------+| l_orderkey | l_partkey | l_suppkey | l_linenumber | l_quantity | l_extendedprice | l_discount | l_tax | l_returnflag | l_linestatus | l_shipdate | l_commitdate | l_receiptdate | l_shipinstruct | l_shipmode | l_comment |+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+--------------+--------------+------------+--------------+---------------+-------------------+------------+------------------------------------+|1|155190|7706|1|17.00|21168.23|0.04|0.02| N | O |1996-03-13|1996-02-12|1996-03-22| DELIVER IN PERSON | TRUCK | egular courts above the ||1|67310|7311|2|36.00|45983.16|0.09|0.06| N | O |1996-04-12|1996-02-28|1996-04-20| TAKE BACK RETURN | MAIL | ly final dependencies: slyly bold ||1|63700|3701|3|8.00|13309.60|0.10|0.02| N | O |1996-01-29|1996-03-05|1996-01-31| TAKE BACK RETURN | REG AIR | riously. regular, express dep ||1|2132|4633|4|28.00|28955.64|0.09|0.06| N | O |1996-04-21|1996-03-30|1996-05-16| NONE | AIR | lites. fluffily even de ||1|24027|1534|5|24.00|22824.48|0.10|0.04| N | O |1996-03-30|1996-03-14|1996-04-01| NONE | FOB | pending foxes. slyly re ||1|15635|638|6|32.00|49620.16|0.07|0.02| N | O |1996-01-30|1996-02-07|1996-02-03| DELIVER IN PERSON | MAIL | arefully slyly ex |+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+--------------+--------------+------------+--------------+---------------+-------------------+------------+------------------------------------+6 rows inset(0.00 sec)mysql> explain select*from lineitem where l_orderkey =1;+----+-------------+----------+------------+------+---------------+---------+---------+-------+------+----------+-------+| id | select_type |table| partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+----------+------------+------+---------------+---------+---------+-------+------+----------+-------+|1| SIMPLE | lineitem |NULL| ref | PRIMARY | PRIMARY |8| const |6|100.00|NULL|+----+-------------+----------+------------+------+---------------+---------+---------+-------+------+----------+-------+1 row inset,1 warning (0.01 sec)
在use_imci_engine设置为on时,点查仍然走的是MySQL的执行计划,不是IMCI Execution Plan。
5 小结
PolarDB myslq的启行存/列存自动引流把分析性的查询分流到只读列存实例上,把OLTP类的查询引流到主节点或者是只读节点。列存节点的存储方式更适合OLAP类查询,同时也能执行hash连接、合并排序连接等更适合分析性查询的多表连接方式,也支持并行执行,分析性的查询性能会有大幅度的提升。