[ERROR] WSREP no such a transition REPLICATING

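Summary:

I was happily testing the fancy Percona XtraDB Cluster (PXC), and within a few days of tinkering I ran into a bug. The key error message was [ERROR] WSREP: FSM: no such a transition REPLICATING -> REPLICATING, and the text after it suggested that a bug had been hit. This article describes the problem in detail and how it was resolved.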

1. Symptoms

The following was captured in the MySQL error log:

2018-01-26T15:00:00.736954+08:00 2109 [Warning] WSREP: Percona-XtraDB-Cluster doesn't recommend using SERIALIZABLE isolation with pxc_strict_mode = PERMISSIVE
2018-01-26T15:00:00.737164+08:00 2116 [Warning] WSREP: Percona-XtraDB-Cluster doesn't recommend using SERIALIZABLE isolation with pxc_strict_mode = PERMISSIVE
2018-01-26T15:00:00.738619+08:00 2113 [Warning] WSREP: Percona-XtraDB-Cluster doesn't recommend using SERIALIZABLE isolation with pxc_strict_mode = PERMISSIVE
2018-01-26T15:00:00.738717+08:00 2112 [Warning] WSREP: Percona-XtraDB-Cluster doesn't recommend use of DML command on a table (S70.T7048) without an explicit primary key with pxc_strict_mode = PERMISSIVE
2018-01-26T15:00:00.738809+08:00 2112 [Warning] Event Scheduler: [root@localhost][S70.EVT_T7048] Percona-XtraDB-Cluster doesn't recommend use of DML command on a table (S70.T7048) without an explicit primary key with pxc_strict_mode = PERMISSIVE
2018-01-26T15:00:00.739163+08:00 2111 [Warning] WSREP: Percona-XtraDB-Cluster doesn't recommend using SERIALIZABLE isolation with pxc_strict_mode = PERMISSIVE
2018-01-26T15:00:00.739465+08:00 2114 [Warning] WSREP: Percona-XtraDB-Cluster doesn't recommend using SERIALIZABLE isolation with pxc_strict_mode = PERMISSIVE
2018-01-26T15:00:00.741091+08:00 2110 [ERROR] WSREP: FSM: no such a transition REPLICATING -> REPLICATING
07:00:00 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=67108864
read_buffer_size=2097152
max_used_connections=0
max_threads=1001
thread_count=14
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 18512159 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f6f7800e110
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f6fe0059abf thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x3b)[0xf3de9b]
/usr/sbin/mysqld(handle_fatal_signal+0x471)[0x7adfc1]
/lib64/libpthread.so.0(+0xf370)[0x7f71318fb370]
/lib64/libc.so.6(gsignal+0x37)[0x7f712fa051d7] 
/lib64/libc.so.6(abort+0x148)[0x7f712fa068c8]   
/usr/lib64/galera3/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x1c1)[0x7f711d7b80b1]
/usr/lib64/galera3/libgalera_smm.so(_ZN6galera13ReplicatorSMM9replicateEPNS_9TrxHandleEP14wsrep_trx_meta+0x1c8)[0x7f711d7abdf8]
/usr/lib64/galera3/libgalera_smm.so(galera_replicate+0xdf)[0x7f711d7c143f]
/usr/sbin/mysqld(_Z15wsrep_replicateP3THD+0x897)[0xde0ee7]
/usr/sbin/mysqld(_Z14ha_prepare_lowP3THDb+0x76)[0x829206]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDbb+0x19f)[0x828a1f]
/usr/sbin/mysqld(_Z17trans_commit_stmtP3THD+0x38)[0xdb37b8]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x825)[0xcfd155]
/usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc7bb40]
/usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x374)[0xc7d774]
/usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc7e15b]
/usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x330)[0xc7f510]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x53b)[0xc7737b]
/usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x7a7)[0xc7b017]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x7b51)[0xd04481]
/usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc7bb40]
/usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x374)[0xc7d774]
/usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc7e15b]
/usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x330)[0xc7f510]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x53b)[0xc7737b]
/usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x7a7)[0xc7b017]
/usr/sbin/mysqld(_ZN14Event_job_data7executeEP3THDb+0xabb)[0xdd9ccb]
/usr/sbin/mysqld(_ZN19Event_worker_thread3runEP3THDP28Event_queue_element_for_exec+0x188)[0xe95cd8]
/usr/sbin/mysqld(event_worker_thread+0x57)[0xe95da7]
/usr/sbin/mysqld(pfs_spawn_thread+0x1b4)[0xfc2394]
/lib64/libpthread.so.0(+0x7dc5)[0x7f71318f3dc5]
/lib64/libc.so.6(clone+0x6d)[0x7f712fac776d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f6f64106460): is an invalid pointer
Connection ID (thread ID): 2110
Status: NOT_KILLED

You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
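
To check whether another node has hit the same crash, one option is to search its error log for the two telltale lines above. The following is a minimal sketch and was not part of the original troubleshooting: the function name find_fsm_crash is hypothetical, and the log path in the usage comment is the one reported by mysqld_safe later in this article, so adjust it for your node.

```shell
#!/usr/bin/env bash
# Sketch: look for the crash signature from this article in a MySQL error log.

# find_fsm_crash LOGFILE
# Prints matching lines prefixed with their line numbers.
# Returns 0 if the signature is present, non-zero otherwise (grep's exit status).
find_fsm_crash() {
  local logfile="$1"
  grep -n -E 'WSREP: FSM: no such a transition|mysqld got signal' "$logfile"
}

# Example (not run automatically; the path is an assumption for your node):
#   find_fsm_crash /var/log/mysqld.log
```

A non-zero exit status only means the two patterns were not found in that file; it does not rule out other causes of a crash.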

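2. Analysis

Judging from the messages above, we had triggered a MySQL bug. Worse still, after I lowered some parameter values, the node would not start at all.
This is no fun. With an Oracle database, if a parameter is wrong you can fix it and the instance still comes up; this one goes a step further and simply refuses to start.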

# systemctl start mysql@bootstrap
Job for mysql@bootstrap.service failed because the control process exited with error code. See "systemctl status mysql@bootstrap.service" and "journalctl -xe" for details.
# systemctl status mysql@bootstrap.service
● mysql@bootstrap.service - Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap
  Loaded: loaded (/usr/lib/systemd/system/mysql@.service; disabled; vendor preset: disabled)
  Active: failed (Result: exit-code) since Tue 2018-01-25 13:35:27 CST; 7s ago
  Process: 19792 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
  Process: 19761 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=2)
  Process: 18679 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=1/FAILURE)
  Process: 18678 ExecStart=/usr/bin/mysqld_safe --basedir=/usr ${EXTRA_ARGS} (code=exited, status=0/SUCCESS)
  Process: 18635 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 18678 (code=exited, status=0/SUCCESS)

Jan 26 15:05:27 db-50 mysql-systemd[18679]: ERROR! mysqld_safe with PID 18678 has already exited: FAILURE
Jan 26 15:05:27 db-50 systemd[1]: mysql@bootstrap.service: control process exited, code=exited status=1
Jan 26 15:05:27 db-50 mysql-systemd[19761]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Jan 26 15:05:27 db-50 mysql-systemd[19761]: ERROR! mysql already dead
Jan 26 15:05:27 db-50 systemd[1]: mysql@bootstrap.service: control process exited, code=exited status=2
Jan 26 15:05:27 db-50 mysql-systemd[19792]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Jan 26 15:05:27 db-50 mysql-systemd[19792]: WARNING: mysql may be already dead
Jan 26 15:05:27 db-50 systemd[1]: Failed to start Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap.
Jan 26 15:05:27 db-50 systemd[1]: Unit mysql@bootstrap.service entered failed state.
Jan 26 15:05:27 db-50 systemd[1]: mysql@bootstrap.service failed.
# journalctl -xe
Jan 26 13:35:48 db-50 su[18607]: (to root) robinson on pts/0
Jan 26 13:35:48 db-50 systemd[1]: Starting Cleanup of Temporary Directories...
-- Subject: Unit systemd-tmpfiles-clean.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit systemd-tmpfiles-clean.service has begun starting up.
Jan 26 15:05:12 db-50 su[18607]: pam_unix(su-l:session): session opened for user root by robin(uid=0)
Jan 26 15:05:12 db-50 systemd[1]: Started Cleanup of Temporary Directories.
-- Subject: Unit systemd-tmpfiles-clean.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit systemd-tmpfiles-clean.service has finished starting up.
-- 
-- The start-up result is done.
Jan 26 13:36:03 db-50 polkitd[533]: Registered Authentication Agent for unix-process:18629:1141138560 (system bus name :1.51491 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object
Jan 26 13:36:03 db-50 systemd[1]: Starting Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap...
-- Subject: Unit mysql@bootstrap.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit mysql@bootstrap.service has begun starting up.
Jan 26 15:05:12 db-50 mysql-systemd[18679]: State transfer in progress, setting sleep higher
Jan 26 15:05:12 db-50 mysqld_safe[18678]: 2018-02-27T05:35:18.095889Z mysqld_safe Logging to '/var/log/mysqld.log'.
Jan 26 15:05:12 db-50 mysqld_safe[18678]: 2018-02-27T05:35:18.098340Z mysqld_safe Logging to '/var/log/mysqld.log'.
Jan 26 15:05:12 db-50 mysqld_safe[18678]: 2018-02-27T05:35:18.122684Z mysqld_safe Starting mysqld daemon with databases from /u02/pxcdata
Jan 26 15:05:12 db-50 mysqld_safe[18678]: 2018-02-27T05:35:18.135690Z mysqld_safe WSREP: Running position recovery with --log_error='/u02/pxcdata/wsrep_recovery.vZdxS7' --pid-file='/
Jan 26 15:05:27 db-50 mysql-systemd[18679]: /usr/bin/mysql-systemd: line 140: kill: (18678) - No such process
Jan 26 15:05:27 db-50 mysql-systemd[18679]: ERROR! mysqld_safe with PID 18678 has already exited: FAILURE
Jan 26 15:05:27 db-50 systemd[1]: mysql@bootstrap.service: control process exited, code=exited status=1
Jan 26 15:05:27 db-50 mysql-systemd[19761]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Jan 26 15:05:27 db-50 mysql-systemd[19761]: ERROR! mysql already dead
Jan 26 15:05:27 db-50 systemd[1]: mysql@bootstrap.service: control process exited, code=exited status=2
Jan 26 15:05:27 db-50 mysql-systemd[19792]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Jan 26 15:05:27 db-50 mysql-systemd[19792]: WARNING: mysql may be already dead
Jan 26 15:05:27 db-50 systemd[1]: Failed to start Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap.
-- Subject: Unit mysql@bootstrap.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Author : Leshami
-- Unit mysql@bootstrap.service has failed.
-- Blog    : http://blog.csdn.net/leshami
-- The result is failed.
Jan 26 15:05:27 db-50 systemd[1]: Unit mysql@bootstrap.service entered failed state.
Jan 26 15:05:27 db-50 systemd[1]: mysql@bootstrap.service failed.
Jan 26 15:05:27 db-50 polkitd[533]: Unregistered Authentication Agent for unix-process:18629:1141138560 (system bus name :1.51491, object path /org/freedesktop/PolicyKit1/Authenticat

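3. Resolution

A Google search showed that the official fix is to upgrade to 5.7.20-29.24.
Because the original installation was done with yum, you can either uninstall first or upgrade in place with yum update (back up the configuration file before upgrading).
Once the upgrade finished, I copied the configuration file back to its original path and restarted; the node came up fine.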
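
The steps above can be sketched as a small script. This is an illustration under stated assumptions, not the exact commands used here: the configuration path /etc/my.cnf and the package name Percona-XtraDB-Cluster-57 are assumptions, the helper names are hypothetical, and the version passed in the usage example is only a sample value. By default the script runs in dry-run mode and only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch of the upgrade steps described above.
# Assumptions (not from the article): config path and yum package name.
# DRY_RUN=1 (the default) only prints the commands instead of running them.

FIXED_VERSION="5.7.20-29.24"                # fixed version named in the article
CONF="${CONF:-/etc/my.cnf}"                 # assumed config path
PKG="${PKG:-Percona-XtraDB-Cluster-57}"     # assumed package name
DRY_RUN="${DRY_RUN:-1}"

# version_lt A B: succeeds when A sorts strictly before B (relies on sort -V).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# run CMD...: print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# upgrade_pxc CURRENT_VERSION
upgrade_pxc() {
  local current="$1"
  if ! version_lt "$current" "$FIXED_VERSION"; then
    echo "version $current is already at or above $FIXED_VERSION, nothing to do"
    return 0
  fi
  run cp -p "$CONF" "$CONF.bak"          # back up the config before upgrading
  run yum -y update "$PKG"               # upgrade in place
  run cp -p "$CONF.bak" "$CONF"          # copy the config back to its original path
  run systemctl start mysql@bootstrap    # unit name as used in this article
}

# Example (dry run, sample version value):
#   upgrade_pxc 5.7.19-29.22
```

Keeping dry-run as the default makes it possible to review the exact commands before anything touches a node; set DRY_RUN=0 only after confirming the path and package name match your installation.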

For details, see: https://jira.percona.com/browse/PXC-2020
