3. 索引的设计原则
3.1 数据准备
第1步:创建数据库、创建表
CREATE DATABASE testdb1; USE atguigudb1; #1.创建学生表和课程表 CREATE TABLE `student_info` ( `id` INT(11) NOT NULL AUTO_INCREMENT, `student_id` INT NOT NULL , `name` VARCHAR(20) DEFAULT NULL, `course_id` INT NOT NULL , `class_id` INT(11) DEFAULT NULL, `create_time` DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`) ) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8; CREATE TABLE `course` ( `id` INT(11) NOT NULL AUTO_INCREMENT, `course_id` INT NOT NULL , `course_name` VARCHAR(40) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
第2步:创建模拟数据必需的存储函数
#函数1:创建随机产生字符串函数 DELIMITER // CREATE FUNCTION rand_string(n INT) RETURNS VARCHAR(255) #该函数会返回一个字符串 BEGIN DECLARE chars_str VARCHAR(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFJHIJKLMNOPQRSTUVWXYZ'; DECLARE return_str VARCHAR(255) DEFAULT ''; DECLARE i INT DEFAULT 0; WHILE i < n DO SET return_str =CONCAT(return_str,SUBSTRING(chars_str,FLOOR(1+RAND()*52),1)); SET i = i + 1; END WHILE; RETURN return_str; END // DELIMITER ;
#函数2:创建随机数函数 DELIMITER // CREATE FUNCTION rand_num (from_num INT ,to_num INT) RETURNS INT(11) BEGIN DECLARE i INT DEFAULT 0; SET i = FLOOR(from_num +RAND()*(to_num - from_num+1)) ; RETURN i; END // DELIMITER ;
创建函数,假如报错:
This function has none of DETERMINISTIC......
由于开启过慢查询日志 bin-log, 我们就必须为我们的 function 指定一个参数。
主从复制,主机会将写操作记录在 bin-log 日志中。从机读取 bin-log 日志,执行语句来同步数据。如果使用函数来操作数据,会导致从机和主机操作时间不一致。所以,默认情况下,mysql 不开启创建函数设置。
查看 mysql 是否允许创建函数:
show variables like 'log_bin_trust_function_creators';
命令开启:允许创建函数设置:
set global log_bin_trust_function_creators=1;
mysqld 重启,上述参数又会消失。永久方法:
- windows下:my.ini[mysqld]加上:
log_bin_trust_function_creators=1
linux下:/etc/my.cnf 下 my.cnf[mysqld] 加上:
log_bin_trust_function_creators=1
第3步:创建插入模拟数据的存储过程
# 存储过程1:创建插入课程表存储过程 DELIMITER // CREATE PROCEDURE insert_course( max_num INT ) BEGIN DECLARE i INT DEFAULT 0; SET autocommit = 0; #设置手动提交事务 REPEAT #循环 SET i=i+1; #赋值 INSERT INTO course(course_id, course_name)VALUES(rand_num(10000,10100),rand_string(6)); UNTIL i = max_num END REPEAT; COMMIT; #提交事务 END // DELIMITER ;
# 存储过程2:创建插入学生信息表存储过程 DELIMITER // CREATE PROCEDURE insert_stu( max_num INT ) BEGIN DECLARE i INT DEFAULT 0; SET autocommit = 0; #设置手动提交事务 REPEAT #循环 SET i=i+1; #赋值 INSERT INTO student_info (course_id, class_id ,student_id ,NAME ) VALUES (rand_num(10000,10100),rand_num(10000,10200),rand_num(1,200000),rand_string(6)); UNTIL i = max_num END REPEAT; COMMIT; #提交事务 END // DELIMITER ;
第4步:调用存储过程
CALL insert_course(100); # 课程表中添加100条数据 CALL insert_stu(1000000);# 学生表中插入1000000条数据
3.2 哪些情况适合创建索引
1、字段的数值有唯一性的限制
2、频繁作为 WHERE 查询条件的字段
某个字段在 SELECT 语句的 WHERE 条件中经常被使用到,那么就需要给这个字段创建索引了。尤其是在数据量大的情况下,创建普通索引就可以大幅提升数据查询的效率。
比如 student_info 数据表(含 100 万条数据),假设我们想要查询 student_id=123110 的用户信息。
①查看student_info
表中的索引
可以看出,我们没有对student_id字段创建索引。
②进行如下查询,耗时220ms
③添加索引
alter table student_info add index idx_sid(student_id);
④再查询。耗时0ms。性能提升杠杠的~
3、经常 GROUP BY 和 ORDER BY 的列
索引其实就是让数据按照某种顺序进行存储或检索。当我们使用 GROUP BY 对数据进行分组查询,或者使用 ORDER BY 对数据进行排序的时候,如果 对分组或者排序的字段建立索引,本身索引的数据就已经排好序了,进行分组查询和排序操作性能不是很nice吗?另外,如果待排序的列有多个,那么可以在这些列上建立 组合索引 。
①下面在有student_id索引的情况下,查询:
mysql> SELECT student_id,COUNT(*) AS num -> FROM student_info -> GROUP BY student_id -> LIMIT 100; +------------+-----+ | student_id | num | +------------+-----+ | 1 | 5 | .....此处省略n行...... | 3 | 4 | | 101 | 7 | +------------+-----+ 100 rows in set (0.00 sec)
②删除索引
#删除idx_sid索引 DROP INDEX idx_sid ON student_info;
③再次查询 ,慢的像蜗牛~
mysql> SELECT student_id,COUNT(*) AS num -> FROM student_info -> GROUP BY student_id -> LIMIT 100; +------------+-----+ | student_id | num | +------------+-----+ | 95666 | 9 | .....此处省略n行...... | 173440 | 14 | | 67234 | 9 | +------------+-----+ 100 rows in set (0.78 sec)
同样,如果是ORDER BY,也需要对字段创建索引
④如果同时使用GROUP BY
和ORDER BY
,先看看不加索引的情况
mysql> SELECT student_id,COUNT(*) AS num FROM student_info -> GROUP BY student_id -> ORDER BY create_time DESC -> LIMIT 100; ERROR 1055 (42000): Expression #1 of ORDER BY clause is not in GROUP BY clause and contains nonaggregated column 'atguigudb1.student_info.create_time' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
⑤出现了一个异常信息,这是因为我们使用的sql_mode
是only_full_group_by
。修改下再来查询,时间代价是6.61s
mysql> SELECT @@sql_mode; +-----------------------------------------------------------------------------------------------------------------------+ | @@sql_mode | +-----------------------------------------------------------------------------------------------------------------------+ | ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION | +-----------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SET @@sql_mode = 'STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'; Query OK, 0 rows affected (0.00 sec); # 去掉ONLY_FULL_GROUP_BY mysql> SELECT student_id,COUNT(*) AS num FROM student_info -> GROUP BY student_id -> ORDER BY create_time DESC -> LIMIT 100; +------------+-----+ | student_id | num | +------------+-----+ | 21497 | 1 | | 17311 | 1 | .....此处省略n行...... | 183509 | 1 | +------------+-----+ 100 rows in set (6.61 sec)
⑥再看看两个字段分别建立单列索引的情况,耗时5.26 s,快了一点点
mysql> ALTER TABLE student_info ADD INDEX idx_sid(student_id); Query OK, 0 rows affected (1.77 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE student_info ADD INDEX idx_cre_time(create_time); Query OK, 0 rows affected (1.49 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> SELECT student_id,COUNT(*) AS num FROM student_info -> GROUP BY student_id -> ORDER BY create_time DESC -> LIMIT 100; +------------+-----+ | student_id | num | +------------+-----+ | 64044 | 1 | .....此处省略n行...... | 101052 | 1 | | 152620 | 1 | +------------+-----+ 100 rows in set (5.26 sec)
**注意:**建立多个单列索引,并不会都走,像刚才这个例子,只会走idx_sid索引
⑦分析下它的查询过程,原来我们只用了一个索引,由于我们是先GROUP BY student_id
,后ORDER BY create_time
,我们实际上只使用了索引idx_sid
mysql> EXPLAIN SELECT student_id,COUNT(*) AS num FROM student_info -> GROUP BY student_id -> ORDER BY create_time DESC -> LIMIT 100; +----+-------------+--------------+------------+-------+---------------+---------+---------+------+--------+----------+---------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+--------------+------------+-------+---------------+---------+---------+------+--------+----------+---------------------------------+ | 1 | SIMPLE | student_info | NULL | index | idx_sid | idx_sid | 4 | NULL | 997130 | 100.00 | Using temporary; Using filesort | +----+-------------+--------------+------------+-------+---------------+---------+---------+------+--------+----------+---------------------------------+ 1 row in set, 1 warning (0.00 sec)
⑧建立联合索引的情况,芜湖起飞,直接0.25s。此时我们用EXPLAIN
查看命中的也是 联合索引
mysql> ALTER TABLE student_info ADD INDEX idx_sid_cre_time(student_id,create_time DESC); Query OK, 0 rows affected (2.09 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> SELECT student_id,COUNT(*) AS num FROM student_info -> GROUP BY student_id -> ORDER BY create_time DESC -> LIMIT 100; +------------+-----+ | student_id | num | +------------+-----+ | 1226 | 8 | .....此处省略n行...... | 1400 | 2 | +------------+-----+ 100 rows in set (0.25 sec) mysql> EXPLAIN SELECT student_id,COUNT(*) AS num FROM student_info -> GROUP BY student_id -> ORDER BY create_time DESC -> LIMIT 100; +----+-------------+--------------+------------+-------+--------------------------+------------------+---------+------+--------+----------+----------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+--------------+------------+-------+--------------------------+------------------+---------+------+--------+----------+----------------------------------------------+ | 1 | SIMPLE | student_info | NULL | index | idx_sid,idx_sid_cre_time | idx_sid_cre_time | 10 | NULL | 997130 | 100.00 | Using index; Using temporary; Using filesort | +----+-------------+--------------+------------+-------+--------------------------+------------------+---------+------+--------+----------+----------------------------------------------+ 1 row in set, 1 warning (0.00 sec)
⑨再来测试,交换字段顺序建立联合索引idx_cre_time_sid
,耗时5.24s。下面查询真正使用的索引key
是idx_sid
mysql> ALTER TABLE student_info ADD INDEX idx_cre_time_sid(create_time DESC,student_id); Query OK, 0 rows affected (2.10 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> DROP INDEX idx_sid_cre_time ON student_info; #删除联合索引idx_sid_cre_time Query OK, 0 rows affected (0.01 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> show INDEX from student_info; # 查看student_info中的索引 +--------------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +--------------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+ | student_info | 0 | PRIMARY | 1 | id | A | 993366 | NULL | NULL | | BTREE | | | YES | NULL | | student_info | 1 | idx_sid | 1 | student_id | A | 199180 | NULL | NULL | | BTREE | | | YES | NULL | | student_info | 1 | idx_cre_time | 1 | create_time | A | 82 | NULL | NULL | YES | BTREE | | | YES | NULL | | student_info | 1 | idx_cre_time_sid | 1 | create_time | D | 77 | NULL | NULL | YES | BTREE | | | YES | NULL | | student_info | 1 | idx_cre_time_sid | 2 | student_id | A | 967825 | NULL | NULL | | BTREE | | | YES | NULL | +--------------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+ 5 rows in set (0.00 sec) mysql> SELECT student_id,COUNT(*) AS num FROM student_info -> GROUP BY student_id -> ORDER BY create_time DESC -> LIMIT 100; +------------+-----+ | student_id | num | +------------+-----+ | 64044 | 1 | .....此处省略n行...... | 101052 | 1 | | 152620 | 1 | +------------+-----+ 100 rows in set (5.24 sec) mysql> EXPLAIN SELECT student_id,COUNT(*) AS num FROM student_info -> GROUP BY student_id -> ORDER BY create_time DESC -> LIMIT 100;#起作用的是idx_sid +----+-------------+--------------+------------+-------+--------------------------+---------+---------+------+--------+----------+---------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+--------------+------------+-------+--------------------------+---------+---------+------+--------+----------+---------------------------------+ | 1 | SIMPLE | student_info | NULL | index | idx_sid,idx_cre_time_sid | idx_sid | 4 | NULL | 997130 | 100.00 | Using temporary; Using filesort | +----+-------------+--------------+------------+-------+--------------------------+---------+---------+------+--------+----------+---------------------------------+ 1 row in set, 1 warning (0.00 sec)
**总结:**如果我们仅仅使用GROUP BY 或者 ORDER BY,且后面只有一个字段,则单独建立索引;如果后面跟多个字段,则建立联合索引。如果既有GROUP BY 又有 ORDER BY,那就建立联合索引,且GROUP BY的字段写在前面,ORDER BY的字段写在后面。8.0后的版本也可以考虑使用降序索引