PostgreSQL 12 preview - Reliability improvement - data_sync_retry eliminates the problem of unreliable OS-level write-back failure status - Alibaba Cloud Developer Community


2020-02-25

Tags: PostgreSQL, data_sync_retry, write back, retry, failed status

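Background

On some operating systems, a repeated fsync call cannot be trusted. The OS may keep its own cache, so if buffered writes are used and a write-back fails, some operating systems will not correctly report on the next fsync whether the data is actually durable. (The previous write-back of the block may already have failed without that state being tracked correctly, so there is no way to know whether a later fsync really succeeded.)

In PG, the processes that handle important files such as data files, WAL files, and CLOG files (bgwriter, wal writer, and backend processes) all use buffered writes. If the OS layer fails to hold the line (that is, retrying fsync is unreliable), an fsync issued at checkpoint time may return success despite an earlier write-back failure. The data files may then contain damaged blocks that need to be repaired from the WAL. However, because the OS told the database that fsync succeeded, the database considers the checkpoint successful and will not use the WAL to repair them.

PG 12 fixes this problem, and the fix has been back-patched to all supported releases.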
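
The failure mode described above can be illustrated with a small simulation. This is only a toy model, not kernel or PostgreSQL code: `ToyPageCache`, `fail_next_writeback`, and `checkpoint_with_retry` are hypothetical names invented for this sketch, and the model assumes an OS that reports a write-back error once and then discards the dirty pages.

```python
class ToyPageCache:
    """Toy model of a page cache that drops dirty pages on write-back failure."""

    def __init__(self):
        self.disk = {}    # block number -> bytes that are durably stored
        self.dirty = {}   # block number -> bytes still waiting for write-back
        self.fail_next_writeback = False  # hypothetical fault-injection switch

    def write(self, block, data):
        # Buffered write: the data only reaches the cache, not the disk.
        self.dirty[block] = data

    def fsync(self):
        """Return True on success, False if write-back failed."""
        if self.fail_next_writeback:
            self.fail_next_writeback = False
            # The hazardous behavior: the error is reported once and the
            # dirty pages are discarded instead of being kept for a retry.
            self.dirty.clear()
            return False
        self.disk.update(self.dirty)
        self.dirty.clear()
        return True


def checkpoint_with_retry(cache):
    """Naive checkpoint: keep calling fsync until it reports success."""
    attempts = 0
    while True:
        attempts += 1
        if cache.fsync():
            return attempts


cache = ToyPageCache()
cache.write(7, b"new row version")
cache.fail_next_writeback = True
attempts = checkpoint_with_retry(cache)
# The retry "succeeds", yet block 7 never reached the disk.
print(attempts, 7 in cache.disk)
```

In this model the second fsync has nothing left to flush, so it reports success even though the block was lost. That is exactly why a checkpoint that trusts the retry would skip the WAL replay it actually needs.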

PANIC on fsync() failure.  
  
On some operating systems, it doesn't make sense to retry fsync(),  
because dirty data cached by the kernel may have been dropped on  
write-back failure.  In that case the only remaining copy of the  
data is in the WAL.  A subsequent fsync() could appear to succeed,  
but not have flushed the data.  That means that a future checkpoint  
could apparently complete successfully but have lost data.  
  
Therefore, violently prevent any future checkpoint attempts by  
panicking on the first fsync() failure.  Note that we already  
did the same for WAL data; this change extends that behavior to  
non-temporary data files.  
  
Provide a GUC data_sync_retry to control this new behavior, for  
users of operating systems that don't eject dirty data, and possibly  
forensic/testing uses.  If it is set to on and the write-back error  
was transient, a later checkpoint might genuinely succeed (on a  
system that does not throw away buffers on failure); if the error is  
permanent, later checkpoints will continue to fail.  The GUC defaults  
to off, meaning that we panic.  
  
Back-patch to all supported releases.  
  
There is still a narrow window for error-loss on some operating  
systems: if the file is closed and later reopened and a write-back  
error occurs in the intervening time, but the inode has the bad  
luck to be evicted due to memory pressure before we reopen, we could  
miss the error.  A later patch will address that with a scheme  
for keeping files with dirty data open at all times, but we judge  
that to be too complicated to back-patch.  
  
Author: Craig Ringer, with some adjustments by Thomas Munro  
Reported-by: Craig Ringer  
Reviewed-by: Robert Haas, Thomas Munro, Andres Freund  
Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de

User-settable parameter

data_sync_retry (boolean)

When set to off, which is the default, PostgreSQL will raise a PANIC-level error on failure to flush modified data files to the filesystem. This causes the database server to crash. This parameter can only be set at server start.

On some operating systems, the status of data in the kernel's page cache is unknown after a write-back failure. In some cases it might have been entirely forgotten, making it unsafe to retry; the second attempt may be reported as successful, when in fact the data has been lost. In these circumstances, the only way to avoid data loss is to recover from the WAL after any failure is reported, preferably after investigating the root cause of the failure and replacing any faulty hardware.

If set to on, PostgreSQL will instead report an error but continue to run so that the data flushing operation can be retried in a later checkpoint. Only set it to on after investigating the operating system's treatment of buffered data in case of write-back failure.

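The default value is the safe choice.

If you want to set it to on, first make sure that fsync at the OS level can be retried and that the retry is reliable.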
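
The choice the parameter makes can be sketched as a small wrapper around fsync. This is a simplified illustration in Python, not PostgreSQL's actual C code: `sync_data_file`, `DataSyncError`, and `DataSyncPanic` are hypothetical names, and the injectable `fsync` argument exists only so the behavior can be exercised without a faulty disk.

```python
import os


class DataSyncError(Exception):
    """Recoverable failure: the flush may be retried in a later checkpoint."""


class DataSyncPanic(Exception):
    """Unrecoverable failure: stop, then recover from the WAL on restart."""


def sync_data_file(fd, data_sync_retry=False, fsync=os.fsync):
    """Flush fd to stable storage, escalating failures per data_sync_retry.

    data_sync_retry=False mirrors the default described above: the first
    failure is treated as fatal because a retry cannot be trusted.
    """
    try:
        fsync(fd)
    except OSError as exc:
        if data_sync_retry:
            # Only safe if the OS keeps dirty data after a write-back failure.
            raise DataSyncError(f"could not fsync file: {exc}") from exc
        raise DataSyncPanic(f"could not fsync file: {exc}") from exc
```

With the default, a caller never gets the chance to call fsync a second time on the same dirty data. The only way forward is a restart followed by WAL recovery, which matches the documented behavior of the parameter when it is off.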

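Further thoughts

1. The database currently ships with data_sync_retry disabled, meaning that an fsync failure raises a PANIC. In principle, it could instead try to extract from the WAL the full-page write (FPW) of the failed block together with the subsequent changes and use them to repair the block, sparing users the poor experience of an outright crash.

References

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9ccdd7f66e3324d2b6d3dec282cfa9ff084083f1

This patch has been applied to all supported releases, so PG 11 includes it as well.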
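
The repair idea in item 1 can be sketched with a toy model. This is not a PostgreSQL feature and does not reflect the real WAL record format. The tuple layout and the `rebuild_block` function are hypothetical, chosen only to show the principle of starting from the latest full-page image and reapplying later changes.

```python
def rebuild_block(wal_records, block_no):
    """Rebuild one block from its latest full-page image plus later deltas.

    wal_records is a list of tuples in WAL order (hypothetical format):
      ("fpw", block_no, page_bytes)            full-page image
      ("delta", block_no, offset, new_bytes)   in-place change
    """
    page = None
    for record in wal_records:
        if record[1] != block_no:
            continue  # record belongs to another block
        if record[0] == "fpw":
            # A full-page image supersedes everything seen so far.
            page = bytearray(record[2])
        elif record[0] == "delta" and page is not None:
            _, _, offset, new_bytes = record
            page[offset:offset + len(new_bytes)] = new_bytes
    if page is None:
        raise LookupError(f"no full-page image for block {block_no}")
    return bytes(page)


wal = [
    ("fpw", 3, b"aaaaaaaa"),
    ("delta", 3, 2, b"XY"),
    ("delta", 9, 0, b"zz"),   # different block, ignored for block 3
    ("delta", 3, 6, b"Q"),
]
print(rebuild_block(wal, 3))
```

The sketch also shows the limit of the idea: a block with no full-page image in the available WAL cannot be rebuilt this way, so a real implementation would still need a fallback.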
