btrfs vs ext4 fsync

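Summary:

In a virtual machine, on a single block device, btrfs delivers roughly 1/3 of ext4's fdatasync throughput after tuning, and only about 1/12 with default settings.

PostgreSQL calls fsync in many places, for example when flushing the WAL (xlog), at checkpoints, in CREATE DATABASE, in ALTER DATABASE ... SET TABLESPACE, when rewriting a table, when writing pg_clog, and so on.

Reference: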

http://blog.163.com/digoal@126/blog/static/1638770402015840480734/

fsync performance therefore directly affects database performance.
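To make the measured operation concrete, here is a minimal sketch of a write-then-flush loop in Python. It is not pg_test_fsync and will not reproduce its numbers exactly; the function name `sync_ops_per_sec` and its parameters are made up for illustration. It rewrites one 8kB block at offset 0 and calls fdatasync after each write, which is roughly the pattern the "one 8kB write" test exercises.

```python
import os
import tempfile
import time


def sync_ops_per_sec(directory, writes=200, block_size=8192):
    """Time `writes` cycles of: rewrite one block at offset 0, then flush it."""
    # os.fdatasync is missing on some platforms (e.g. macOS); fall back to fsync.
    flush = getattr(os, "fdatasync", os.fsync)
    block = b"\0" * block_size
    # The temp file is created inside `directory`, so it lands on the
    # filesystem mounted there (hypothetical helper, not part of PostgreSQL).
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.pwrite(fd, block, 0)
            flush(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    # Guard against a zero interval on very fast (e.g. memory-backed) storage.
    return writes / max(elapsed, 1e-9)
```

Calling `sync_ops_per_sec("/data01")` once per filesystem would give a rough ops/sec figure for the mount used in the runs below; for real comparisons pg_test_fsync remains the better tool.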

The comparison below was run on CentOS 7 x64, with btrfs 4.3.1 built from source:

http://blog.163.com/digoal@126/blog/static/16387704020151025102118544/


ext4:

[root@digoal ~]# mkfs.ext4 /dev/sdb1
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
2621440 inodes, 10485504 blocks
524275 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2157969408
320 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
        4096000, 7962624

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done   

[root@digoal ~]# mount /dev/sdb1 /data01 -o defaults,noatime,nodiratime,discard,data=ordered
[root@digoal ~]# cd /data01/
[root@digoal data01]# /opt/pgsql9.5/bin/pg_test_fsync 
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                      5496.006 ops/sec     182 usecs/op
        fdatasync                          5357.773 ops/sec     187 usecs/op
        fsync                              2872.555 ops/sec     348 usecs/op
        fsync_writethrough                            n/a
        open_sync                          3059.961 ops/sec     327 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                      2997.891 ops/sec     334 usecs/op
        fdatasync                          4980.309 ops/sec     201 usecs/op
        fsync                              2934.537 ops/sec     341 usecs/op
        fsync_writethrough                            n/a
        open_sync                          1608.287 ops/sec     622 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write          2909.899 ops/sec     344 usecs/op
         2 *  8kB open_sync writes         1565.073 ops/sec     639 usecs/op
         4 *  4kB open_sync writes          830.664 ops/sec    1204 usecs/op
         8 *  2kB open_sync writes          459.544 ops/sec    2176 usecs/op
        16 *  1kB open_sync writes          227.552 ops/sec    4395 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close                3082.501 ops/sec     324 usecs/op
        write, close, fsync                2798.324 ops/sec     357 usecs/op

Non-sync'ed 8kB writes:
        write                            300198.383 ops/sec       3 usecs/op

btrfs with default settings:

[root@digoal ~]# mkfs.btrfs /dev/sdb1 -f
btrfs-progs v4.3.1
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               26f9fd42-0933-4382-8124-437091e1cddf
Node size:          16384
Sector size:        4096
Filesystem size:    40.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP               1.01GiB
  System:           DUP              12.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1    40.00GiB  /dev/sdb1

[root@digoal ~]# mount /dev/sdb1 /data01
[root@digoal ~]# cd /data01/
[root@digoal data01]# /opt/pgsql9.5/bin/pg_test_fsync 
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                       672.325 ops/sec    1487 usecs/op
        fdatasync                           460.352 ops/sec    2172 usecs/op
        fsync                               385.227 ops/sec    2596 usecs/op
        fsync_writethrough                            n/a
        open_sync                           392.941 ops/sec    2545 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                       179.161 ops/sec    5582 usecs/op
        fdatasync                           358.958 ops/sec    2786 usecs/op
        fsync                               518.578 ops/sec    1928 usecs/op
        fsync_writethrough                            n/a
        open_sync                           273.567 ops/sec    3655 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write           566.545 ops/sec    1765 usecs/op
         2 *  8kB open_sync writes          268.357 ops/sec    3726 usecs/op
         4 *  4kB open_sync writes          144.014 ops/sec    6944 usecs/op
         8 *  2kB open_sync writes           79.028 ops/sec   12654 usecs/op
        16 *  1kB open_sync writes           31.814 ops/sec   31433 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close                 570.831 ops/sec    1752 usecs/op
        write, close, fsync                 562.849 ops/sec    1777 usecs/op

Non-sync'ed 8kB writes:
        write                            225085.038 ops/sec       4 usecs/op

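btrfs after tuning:
(Metadata stored as a single copy (-m single), a 4K node size to reduce write-lock contention, compression off, space cache on, and data COW off. Note that nodatacow also disables data checksums.)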

[root@digoal ~]# mkfs.btrfs /dev/sdb1 -m single -n 4096 -f
btrfs-progs v4.3.1
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               1e859a5c-570b-4426-83ac-b73a473d1936
Node size:          4096
Sector size:        4096
Filesystem size:    40.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         single            8.00MiB
  System:           single            4.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1    40.00GiB  /dev/sdb1

[root@digoal ~]# mount /dev/sdb1 /data01 -o ssd,discard,nodatacow,noatime,nodiratime,compress=no,space_cache
[root@digoal ~]# cd /data01/
[root@digoal data01]# /opt/pgsql9.5/bin/pg_test_fsync 
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                      1424.383 ops/sec     702 usecs/op
        fdatasync                          1870.474 ops/sec     535 usecs/op
        fsync                              1816.084 ops/sec     551 usecs/op
        fsync_writethrough                            n/a
        open_sync                          1458.938 ops/sec     685 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                       750.109 ops/sec    1333 usecs/op
        fdatasync                          1747.257 ops/sec     572 usecs/op
        fsync                              1729.970 ops/sec     578 usecs/op
        fsync_writethrough                            n/a
        open_sync                           723.056 ops/sec    1383 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write          1413.624 ops/sec     707 usecs/op
         2 *  8kB open_sync writes          720.379 ops/sec    1388 usecs/op
         4 *  4kB open_sync writes          352.704 ops/sec    2835 usecs/op
         8 *  2kB open_sync writes          157.877 ops/sec    6334 usecs/op
        16 *  1kB open_sync writes           73.355 ops/sec   13632 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close                1827.975 ops/sec     547 usecs/op
        write, close, fsync                1664.630 ops/sec     601 usecs/op

Non-sync'ed 8kB writes:
        write                            243183.732 ops/sec       4 usecs/op
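The fdatasync rows of the three "one 8kB write" sections above can be compared directly. The sketch below parses those rows (copied verbatim from the output) and prints each result relative to ext4; the names `RUNS`, `ops_per_sec` and `relative_to` are illustrative only.

```python
import re

# fdatasync rows copied verbatim from the three pg_test_fsync runs above
# ("one 8kB write" section).
RUNS = {
    "ext4": "        fdatasync                          5357.773 ops/sec     187 usecs/op",
    "btrfs default": "        fdatasync                           460.352 ops/sec    2172 usecs/op",
    "btrfs tuned": "        fdatasync                          1870.474 ops/sec     535 usecs/op",
}

# One result row: test name, ops/sec, usecs/op.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s+(?P<ops>\d+\.\d+) ops/sec\s+(?P<usecs>\d+) usecs/op\s*$"
)


def ops_per_sec(line):
    """Extract the ops/sec figure from one pg_test_fsync result row."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"not a result row: {line!r}")
    return float(match.group("ops"))


def relative_to(baseline, runs=RUNS):
    """Return each run's ops/sec divided by the baseline run's ops/sec."""
    base = ops_per_sec(runs[baseline])
    return {name: ops_per_sec(line) / base for name, line in runs.items()}


for name, ratio in relative_to("ext4").items():
    print(f"{name:14s} {ratio:.3f}x of ext4")
```

On these figures the tuned btrfs run comes out at roughly 0.35 of ext4 and the default run at roughly 0.09, which matches the "about 1/3" estimate for the tuned configuration.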

Striping btrfs across multiple devices may improve performance further; that was not measured here.

btrfs mount options:

https://btrfs.wiki.kernel.org/index.php/Mount_options
