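Hello everyone, I'd like to ask about a problem I ran into recently. Our database runs on PostgreSQL 9.3 and the operating system is CentOS 6.4. For reasons I can't determine, the database started reporting "worker took too long to start; canceled", and CPU usage on the server then climbed to 96%, which in turn brought down both the database and the server. What is the likely cause, and what triggers it? Once I understand the cause I can continue troubleshooting. Any advice would be much appreciated. Thank you.
Part of the error output from the log is attached: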
2016-03-31 05:04:53.993 CST - zyml - glptuser - fb09 : LOG: process 108742 acquired ExclusiveLock on extension of relation 80385 of database 24576 after 2803790.559 ms
2016-03-31 05:04:53.993 CST - zyml - glptuser - fb09 : STATEMENT: insert into T_ZY_VALUE_TEMP (C_ID,N_ZY_ID,N_FY,D_TJSJ,N_VALUE,N_ZL,N_JB) select replace(''||uuid_generate_v4(), '-', '') as C_ID,N_ZY_ID,N_JBFY as N_FY,now() as D_TJSJ,N_VALUE,N_ZL, 1 from ( SELECT COUNT (1) as N_VALUE,MSES.N_JBFY as N_JBFY,437 as N_ZY_ID,1 as N_ZL from DB_MSES.T_MSESDSR DSR INNER JOIN DB_MSES.T_MSESYASTML STML ON DSR.N_AJBS = STML.N_AJBS AND DSR.N_DSR=STML.N_XH INNER JOIN DB_MSES.T_MSES MSES ON DSR.N_AJBS = MSES.N_AJBS WHERE (STML.N_LX = 1 OR STML.N_LX = 4) AND STML.N_SF IS NULL GROUP BY MSES.N_JBFY) as temp
2016-03-31 05:05:54.672 CST - - - : WARNING: worker took too long to start; canceled
2016-03-31 05:07:05.375 CST - - - : WARNING: worker took too long to start; canceled
2016-03-31 05:18:38.168 CST - - - : WARNING: worker took too long to start; canceled
2016-03-31 05:52:36.340 CST - - - : WARNING: worker took too long to start; canceled
2016-03-31 07:16:01.209 CST - - - : WARNING: worker took too long to start; canceled
2016-03-31 07:31:17.312 CST - - - : WARNING: worker took too long to start; canceled
2016-03-31 07:40:58.997 CST - [unknown] - appuser - postgres : LOG: could not receive data from client: Connection reset by peer
2016-03-31 07:59:57.668 CST - - - : WARNING: autovacuum worker started without a worker entry
2016-03-31 08:04:53.137 CST - - - : WARNING: autovacuum worker started without a worker entry
2016-03-31 08:29:09.090 CST - [unknown] - appuser - postgres : LOG: could not receive data from client: Connection reset by peer
2016-03-31 09:32:50.259 CST - - - : WARNING: worker took too long to start; canceled
2016-03-31 09:35:41.776 CST - [unknown] - [unknown] - [unknown] : WARNING: pg_getnameinfo_all() failed: 域名解析暂时失败
2016-03-31 09:37:21.383 CST - - - : WARNING: worker took too long to start; canceled
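Judging from the error, it looks as though the autovacuum launcher wanted to start a new worker but found that its previous start request had still not succeeded. I'm not certain of this, though. If you run into the problem again, please save the corresponding call stacks.
Is there a way to reproduce it?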
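While waiting for a reproduction, it may help to line up the warnings against the long lock wait recorded at the top of the log excerpt. The following is a minimal Python sketch, not a definitive tool: it assumes the log line prefix looks exactly like the excerpt in the question, and the names `summarize`, `parse_ts`, and `SAMPLE` are hypothetical helpers introduced only for illustration. It counts the "worker took too long to start; canceled" warnings and works out when each logged lock wait must have begun. It does not establish that the lock wait caused the warnings; it only produces a timeline to inspect.

```python
import re
from datetime import datetime, timedelta

# Timestamp at the start of each log line, as seen in the excerpt.
TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
WORKER_RE = re.compile(TS + r" .*WARNING: worker took too long to start; canceled")
LOCK_RE = re.compile(
    TS + r" .*LOG: process (\d+) acquired (\w+) on (.+?) after ([\d.]+) ms"
)


def parse_ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS.mmm' timestamp (time zone suffix ignored)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def summarize(lines):
    """Return (warning timestamps, lock wait records) found in the log lines."""
    warnings, lock_waits = [], []
    for line in lines:
        m = WORKER_RE.search(line)
        if m:
            warnings.append(parse_ts(m.group(1)))
            continue
        m = LOCK_RE.search(line)
        if m:
            acquired = parse_ts(m.group(1))
            wait_ms = float(m.group(5))
            lock_waits.append({
                "pid": int(m.group(2)),
                "lock": m.group(3),
                "target": m.group(4),
                "wait_minutes": wait_ms / 60000.0,
                # The wait began wait_ms before the lock was acquired.
                "wait_started": acquired - timedelta(milliseconds=wait_ms),
            })
    return warnings, lock_waits


# A few lines copied from the excerpt in the question.
SAMPLE = [
    "2016-03-31 05:04:53.993 CST - zyml - glptuser - fb09 : LOG: process 108742 "
    "acquired ExclusiveLock on extension of relation 80385 of database 24576 "
    "after 2803790.559 ms",
    "2016-03-31 05:05:54.672 CST - - - : WARNING: worker took too long to start; canceled",
    "2016-03-31 05:07:05.375 CST - - - : WARNING: worker took too long to start; canceled",
    "2016-03-31 07:59:57.668 CST - - - : WARNING: autovacuum worker started without a worker entry",
]

warnings, lock_waits = summarize(SAMPLE)
print("worker start warnings:", len(warnings))
for w in lock_waits:
    print(
        "pid %d waited %.1f min for %s on %s (wait began %s)"
        % (w["pid"], w["wait_minutes"], w["lock"], w["target"], w["wait_started"])
    )
```

To run it against a real log file, something like `summarize(open(path, encoding="utf-8"))` should work, where `path` is a placeholder for wherever the server writes its log.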