Mysql页面crash问题复现&恢复方法

本文涉及的产品
云数据库 RDS MySQL,集群系列 2核4GB
推荐场景:
搭建个人博客
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
云数据库 RDS MySQL,高可用系列 2核4GB
简介: title: mysql页面crash问题复现&恢复方法 date: 2019-07-25 11:33:06 categories: Mysql 看到一个MYSQL单节点CRASH的案例,这里用MYSQL5.
title: mysql页面crash问题复现&恢复方法
date: 2019-07-25 11:33:06
categories: Mysql

看到一个MYSQL单节点CRASH的案例,这里用MYSQL5.7尝试复现一个页面损坏的场景,并尝试恢复数据!

构造数据

sysbench构造10测试表

/home/mingjie.gmj/bin/sysbench-1.0.16/bin/sysbench oltp_common --threads=64 --events=0 --mysql-socket=/home/mingjie.gmj/databases/data/mydata5405/mysql5405.sock --mysql-user=root --tables=10 --table_size=1000 prepare
sysbench 1.0.16 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest4'...
Creating table 'sbtest8'...
Creating table 'sbtest6'...
Creating table 'sbtest2'...
Creating table 'sbtest10'...
Creating table 'sbtest7'...
Creating table 'sbtest1'...
Creating table 'sbtest5'...
Creating table 'sbtest9'...
Creating table 'sbtest3'...
Inserting 1000 records into 'sbtest10'
Inserting 1000 records into 'sbtest7'
Inserting 1000 records into 'sbtest1'
Inserting 1000 records into 'sbtest2'
Inserting 1000 records into 'sbtest4'
Inserting 1000 records into 'sbtest6'
Inserting 1000 records into 'sbtest3'
Inserting 1000 records into 'sbtest9'
Inserting 1000 records into 'sbtest8'
Inserting 1000 records into 'sbtest5'
Creating a secondary index on 'sbtest10'...
Creating a secondary index on 'sbtest6'...
Creating a secondary index on 'sbtest1'...
Creating a secondary index on 'sbtest5'...
Creating a secondary index on 'sbtest7'...
Creating a secondary index on 'sbtest3'...
Creating a secondary index on 'sbtest2'...
Creating a secondary index on 'sbtest9'...
Creating a secondary index on 'sbtest8'...
Creating a secondary index on 'sbtest4'...

查看表数据

mysql> show tables;
+------------------+
| Tables_in_sbtest |
+------------------+
| sbtest1         |
| sbtest10         |
| sbtest2         |
| sbtest3         |
| sbtest4         |
| sbtest5         |
| sbtest6         |
| sbtest7         |
| sbtest8         |
| sbtest9         |
+------------------+
10 rows in set (0.00 sec)
mysql> select * from sbtest9 limit 1\G
*************************** 1. row ***************************
id: 1
k: 505
c: 92419600644-86829681637-42100581414-80298414140-81768158898-74430369956-50895721992-62087272403-75473465539-28369755814
pad: 65092491791-76928308446-68130154933-07155890946-00453047346
1 row in set (0.00 sec)

 

手动破坏表文件

对表sbtest9,模拟页面损坏,首先查看数据文件位置

mysql> show global variables like '%datadir%';
+---------------+----------------------------------------------+
| Variable_name | Value                                       |
+---------------+----------------------------------------------+
| datadir       | /home/mingjie.gmj/databases/data/mydata5405/ |
+---------------+----------------------------------------------+
1 row in set (0.01 sec)

vim直接删除前十个字符!

vi sbtest9.ibd

删除后查询还可以成功(表文件删了也能查)因为现在查的还是缓存。

 

重启数据库触发问题

正常重启

/home/mingjie.gmj/databases/mysql5405/bin/mysqladmin -S /home/mingjie.gmj/databases/data/mydata5405/mysql5405.sock -uroot -p shutdown

/bin/sh /home/mingjie.gmj/databases/mysql5405/bin/mysqld_safe --defaults-file=/home/mingjie.gmj/databases/data/mydata5405/my.cnf &

切换数据库时,数据库crash!

mysql> use sbtest
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> 190725 14:32:01 mysqld_safe Number of processes running now: 0
190725 14:32:01 mysqld_safe mysqld restarted
190725 14:32:02 mysqld_safe mysqld from pid file /home/mingjie.gmj/databases/data/mydata5405/iZbp1d4tisi44j6vxze02fZ.pid ended

mysql> use sbtest
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/home/mingjie.gmj/databases/data/mydata5405/mysql5405.sock' (2)
ERROR:
Can't connect to the server

先看下重启启动时的日志,已经有一些问题了

提示pageheader错误(因为我前面删的是文件头部的数据)

2019-07-25 14:32:02 6933 [Note] InnoDB: Database was not shutdown normally!
2019-07-25 14:32:02 6933 [Note] InnoDB: Starting crash recovery.
2019-07-25 14:32:02 6933 [Note] InnoDB: Reading tablespace information from the .ibd files...
2019-07-25 14:32:02 6933 [ERROR] InnoDB: Space id in fsp header 1441792,but in the page header 0
2019-07-25 14:32:02 6933 [ERROR] InnoDB: innodb-page-size mismatch in tablespace ./sbtest/sbtest9.ibd (table sbtest/sbtest9)
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size:1024 Pages to analyze:64
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size: 1024, Possible space_id count:0
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size:2048 Pages to analyze:64
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size: 2048, Possible space_id count:0
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size:4096 Pages to analyze:64
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size: 4096, Possible space_id count:0
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size:8192 Pages to analyze:43
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size: 8192, Possible space_id count:0
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size:16384 Pages to analyze:21
2019-07-25 14:32:02 6933 [Note] InnoDB: Page size: 16384, Possible space_id count:0
2019-07-25 14:32:02 7f4e65962740 InnoDB: Operating system error number 2 in a file operation.

在看一下crash的报错

InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
InnoDB: Error: could not open single-table tablespace file ./sbtest/sbtest9.ibd
InnoDB: We do not continue the crash recovery, because the table may become
InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it.
InnoDB: To fix the problem and start mysqld:
InnoDB: 1) If there is a permission problem in the file and mysqld cannot
InnoDB: open the file, you should modify the permissions.
InnoDB: 2) If the table is not needed, or you can restore it from a backup,
InnoDB: then you can remove the .ibd file, and InnoDB will do a normal
InnoDB: crash recovery and ignore that table.
InnoDB: 3) If the file system or the disk is broken, and you cannot remove
InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf
InnoDB: and force InnoDB to continue crash recovery here.
190725 14:32:02 mysqld_safe mysqld from pid file /home/mingjie.gmj/databases/data/mydata5405/iZbp1d4tisi44j6vxze02fZ.pid ended

注意看下3),已经建议怎么操作了!

InnoDB: 3) If the file system or the disk is broken, and you cannot remove
InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf
InnoDB: and force InnoDB to continue crash recovery here.

 

开始修复

尝试再次重启数据库报一样的错误

现在数据库已经不可服务了!需要尽快恢复!

增加参数force拉起数据库innodb_force_recovery = 1

2019-07-25 14:42:33 8563 [Note] InnoDB: innodb_force_recovery was set to 1. Continuing crash recovery even though we cannot access the .ibd file of this table.
2019-07-25 14:42:33 8563 [Note] InnoDB: Restoring possible half-written data pages
2019-07-25 14:42:33 8563 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Last MySQL binlog file position 0 1914387, file name mysql-bin.000013
2019-07-25 14:42:33 8563 [ERROR] InnoDB: Table sbtest/sbtest9 in the InnoDB data dictionary has tablespace id 43, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
2019-07-25 14:42:33 8563 [Note] InnoDB: 128 rollback segment(s) are active.
2019-07-25 14:42:33 8563 [Note] InnoDB: Waiting for purge to start
2019-07-25 14:42:33 8563 [Note] InnoDB: 5.6.44 started; log sequence number 4247920472
2019-07-25 14:42:33 8563 [Note] InnoDB: !!! innodb_force_recovery is set to 1 !!!
2019-07-25 14:42:33 8563 [Note] Recovering after a crash using /home/mingjie.gmj/databases/data/mydata5405/mysql-bin
2019-07-25 14:42:33 8563 [Note] Starting crash recovery...
2019-07-25 14:42:33 8563 [Note] Crash recovery finished.
2019-07-25 14:42:33 8563 [Note] Server hostname (bind-address): '*'; port: 5405
2019-07-25 14:42:33 8563 [Note] IPv6 is available.
2019-07-25 14:42:33 8563 [Note]   - '::' resolves to '::';
2019-07-25 14:42:33 8563 [Note] Server socket created on IP: '::'.
InnoDB: innodb_force_recovery is on: we do not allow
InnoDB: database modifications by the user. Shut down
InnoDB: mysqld and edit my.cnf so thatInnoDB: innodb_force_... is removed.
2019-07-25 14:42:33 8563 [ERROR] Error writing master configuration.
2019-07-25 14:42:33 8563 [ERROR] Error reading master configuration.
InnoDB: innodb_force_recovery is on: we do not allow
InnoDB: database modifications by the user. Shut down
InnoDB: mysqld and edit my.cnf so thatInnoDB: innodb_force_... is removed.
2019-07-25 14:42:33 8563 [ERROR] Error writing relay log configuration.
2019-07-25 14:42:33 8563 [ERROR] Error reading relay log configuration.
2019-07-25 14:42:33 8563 [ERROR] Failed to initialize the master info structure
2019-07-25 14:42:33 8563 [Note] Check error log for additional messages. You will not be able to start replication until the issue is resolved and the server restarted.
2019-07-25 14:42:33 8563 [Note] Event Scheduler: Loaded 0 events
2019-07-25 14:42:33 8563 [Note] /home/mingjie.gmj/databases/mysql5405/bin/mysqld: ready for connections.
Version: '5.6.44-log' socket: '/home/mingjie.gmj/databases/data/mydata5405/mysql5405.sock' port: 5405 Source distribution

连上去尝试修复数据

mysql> create table sbtest9_recovery like sbtest9;
ERROR 1146 (42S02): Table 'sbtest.sbtest9' doesn't exist

文件头被损坏了,数据库已经读不到这个表了!只能干掉这个表了

mysql> drop table sbtest9;
Query OK, 0 rows affected (0.01 sec)

注释参数重启后恢复正常,但是数据已经无法恢复了!

 

启动起来之后,需要尝试查询有问题的表,如果只是某个页面损坏了应该大部分数据是可以查出来的,可以用这样的SQL来试

insert ignore into test_recovery select * from test limit 10;
insert ignore into test_recovery select * from test limit 20;
insert ignore into test_recovery select * from test limit 30;

 

页面crash可以出现各种个样的情况,我在测试的过程中出现了很多种情况,有的情况改参数也拉不起来数据库,还有COREDUMP等等,具体问题需要具体分析!

 

附:参数说明

innodb_force_recovery

1 (SRV_FORCE_IGNORE_CORRUPT): 忽略检查到的 corrupt 页。尽管检测到了损坏的 page 仍强制服务运行。一般设置为该值即可,然后 dump 出库表进行重建。

2 (SRV_FORCE_NO_BACKGROUND): 阻止主线程的运行,如主线程需要执行 full purge 操作,会导致 crash。 阻止 master thread 和任何 purge thread 运行。若 crash 发生在 purge 环节则使用该值。

3 (SRV_FORCE_NO_TRX_UNDO): 不执行事务回滚操作。

4 (SRV_FORCE_NO_IBUF_MERGE): 不执行插入缓冲的合并操作。如果可能导致崩溃则不要做这些操作。不要进行统计操作。该值可能永久损坏数据文件。若使用了该值,则将来要删除和重建辅助索引。

5 (SRV_FORCE_NO_UNDO_LOG_SCAN): 不查看重做日志,InnoDB 存储引擎会将未提交的事务视为已提交。此时 InnoDB 甚至把未完成的事务按照提交处理。该值可能永久性的损坏数据文件。

6 (SRV_FORCE_NO_LOG_REDO): 不执行前滚的操作。恢复时不做 redo log roll-forward。使数据库页处于废止状态,继而可能引起 B 树或者其他数据库结构更多的损坏。

官方文档:Forcing InnoDB Recovery

  • 1 (SRV_FORCE_IGNORE_CORRUPT)

    Lets the server run even if it detects a corrupt page. Tries to make SELECT * FROM *tbl_name* jump over corrupt index records and pages, which helps in dumping tables.

  • 2 (SRV_FORCE_NO_BACKGROUND)

    Prevents the master thread and any purge threads from running. If a crash would occur during the purge operation, this recovery value prevents it.

  • 3 (SRV_FORCE_NO_TRX_UNDO)

    Does not run transaction rollbacks after crash recovery.

  • 4 (SRV_FORCE_NO_IBUF_MERGE)

    Prevents insert buffer merge operations. If they would cause a crash, does not do them. Does not calculate tablestatistics. This value can permanently corrupt data files. After using this value, be prepared to drop and recreate all secondary indexes. Sets InnoDB to read-only.

  • 5 (SRV_FORCE_NO_UNDO_LOG_SCAN)

    Does not look at undo logs when starting the database: InnoDB treats even incomplete transactions as committed. This value can permanently corrupt data files. Sets InnoDB to read-only.

  • 6 (SRV_FORCE_NO_LOG_REDO)

    Does not do the redo log roll-forward in connection with recovery. This value can permanently corrupt data files. Leaves database pages in an obsolete state, which in turn may introduce more corruption into B-trees and other database structures. Sets InnoDB to read-only.

相关实践学习
如何在云端创建MySQL数据库
开始实验后,系统会自动创建一台自建MySQL的 源数据库 ECS 实例和一台 目标数据库 RDS。
全面了解阿里云能为你做什么
阿里云在全球各地部署高效节能的绿色数据中心,利用清洁计算为万物互联的新世界提供源源不断的能源动力,目前开服的区域包括中国(华北、华东、华南、香港)、新加坡、美国(美东、美西)、欧洲、中东、澳大利亚、日本。目前阿里云的产品涵盖弹性计算、数据库、存储与CDN、分析与搜索、云通信、网络、管理与监控、应用服务、互联网中间件、移动服务、视频服务等。通过本课程,来了解阿里云能够为你的业务带来哪些帮助     相关的阿里云产品:云服务器ECS 云服务器 ECS(Elastic Compute Service)是一种弹性可伸缩的计算服务,助您降低 IT 成本,提升运维效率,使您更专注于核心业务创新。产品详情: https://www.aliyun.com/product/ecs
目录
相关文章
|
25天前
|
关系型数据库 MySQL 索引
MySQL的全文索引查询方法
【8月更文挑战第26天】MySQL的全文索引查询方法
37 0
|
1月前
|
存储 关系型数据库 MySQL
mysql数据库查询时用到的分页方法有哪些
【8月更文挑战第16天】在MySQL中,实现分页的主要方法包括:1)使用`LIMIT`子句,简单直接但随页数增加性能下降;2)通过子查询优化`LIMIT`分页,提高大页码时的查询效率;3)利用存储过程封装分页逻辑,便于复用但需额外维护;4)借助MySQL变量实现,可能提供更好的性能但实现较复杂。这些方法各有优缺点,可根据实际需求选择适用方案。
128 2
|
4天前
|
存储 关系型数据库 MySQL
技术解析:MySQL中取最新一条重复数据的方法
以上提供的两种方法都可以有效地从MySQL数据库中提取每个类别最新的重复数据。选择哪种方法取决于具体的使用场景和MySQL版本。子查询加分组的方法兼容性更好,适用于所有版本的MySQL;而窗口函数方法代码更简洁,执行效率可能更高,但需要MySQL 8.0及以上版本。在实际应用中,应根据数据量大小、查询性能需求以及MySQL版本等因素综合考虑,选择最合适的实现方案。
26 6
|
3天前
|
存储 缓存 关系型数据库
MySQL 查询优化方法
在数据库应用中,高效的查询性能至关重要。本文探讨了常用的 MySQL 查询优化方法,包括索引优化(选择合适的索引字段、复合索引、定期维护索引)、查询语句优化(避免全表扫描、限制返回行数、避免使用不必要的函数)、表结构优化(选择合适的数据类型、分区表、定期清理无用数据)及数据库配置优化(调整缓存大小、优化存储引擎参数)。通过这些方法,可以显著提高 MySQL 的查询性能,为应用程序提供更好的用户体验。
|
16天前
|
SQL 关系型数据库 MySQL
学习MySQL操作的有效方法
学习MySQL操作的有效方法
29 3
|
4月前
|
SQL 关系型数据库 MySQL
MYSQL根据查询结果删除sql 去除重复id 新增对比前一条与后一条数据 去重3种方法​ 窗口函数
MYSQL根据查询结果删除sql 去除重复id 新增对比前一条与后一条数据 去重3种方法​ 窗口函数
|
1月前
|
安全 关系型数据库 MySQL
分享一个 MySQL 简单快速进行自动备份和还原的脚本和方法
分享一个 MySQL 简单快速进行自动备份和还原的脚本和方法
|
3月前
|
SQL 关系型数据库 MySQL
MYSQL————DDL方法使用(包含在数据库,以及具体数据库表格的一些操纵)
MYSQL————DDL方法使用(包含在数据库,以及具体数据库表格的一些操纵)
|
3月前
|
存储 算法 关系型数据库
mysql存储地理信息的方法
MySQL 支持 `GEOMETRY` 及其子类型(如 `POINT`, `LINESTRING`, `POLYGON`)存储地理信息,并提供 `SPATIAL` 索引来加速查询。创建带有 `SPATIAL INDEX` 的表,使用 `GeomFromText` 或 `PointFromText` 插入数据,通过 `MBRContains`, `Distance_Sphere`, `ST_Distance_Sphere` 等函数查询。例如,查找矩形区域内位置、一定距离内的点,以及判断点是否在多边形内并计算距离。
63 1
|
3月前
|
SQL 关系型数据库 MySQL
MySQL大数据量分页查询方法及其优化
MySQL大数据量分页查询方法及其优化

热门文章

最新文章