ORACLE归档日志比联机重做日志小很多的情况总结

本文涉及的产品
日志服务 SLS,月写入数据量 50GB 1个月
简介: ORACLE归档日志比联机重做日志小很多的情况   前几天一网友在群里反馈他遇到归档日志比联机重做日志(redo log)小很多的情况,个人第一次遇到这种情况,非常感兴趣,于是在一番交流沟通后,终于弄清楚了情况,顺便将这方面的资料整理于此!也是一个总结、学习过程!     归档日志比联机重做日志小很多,出现这种情况的原因有很多,我们可以从下面这几方面着手检查,一一排除确认。

ORACLE归档日志比联机重做日志小很多的情况

 

前几天一网友在群里反馈他遇到归档日志比联机重做日志(redo log)小很多的情况,个人第一次遇到这种情况,非常感兴趣,于是在一番交流沟通后,终于弄清楚了情况,顺便将这方面的资料整理于此!也是一个总结、学习过程!

 

 

归档日志比联机重做日志小很多,出现这种情况的原因有很多,我们可以从下面这几方面着手检查,一一排除确认。

 

 

1:检查参数ARCHIVE_LAG_TARGET

 

 

ARCHIVE_LAG_TARGET参数可以设置一个时间,通过时间限制,指定数据库强制进行Log Switch进行归档。 如果这个参数设置过小,有可能导致联机重做日志还没有写满就切换了,这样就有可能导致归档日志远小于重做日志。

 

 

SQL> show parameter archive_lag_target;
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
archive_lag_target                   integer     0
SQL> 

 

该网友反馈参数archive_lag_target0,那么也可以排除这方面的因素。

 

 

 

2:检查是否存在人为切换redo log的可能性。

 

 

  一些命令可以引起重做日志的切换,具体请见下面

 

SQL> alter system archive log current; #归档命令也会造成日志切换

 

SQL> alter system switch logfile;        #直接切换日志组

 

RMAN> backup archivelog all;

 

RMAN> backup database plus archivelog;

 

 

SELECT TO_CHAR(FIRST_TIME, 'YYYY-MM-DD HH24:MI:SS'), 
       BLOCKS * BLOCK_SIZE / 1024 / 1024, 
       COMPRESSED 
FROM   V$ARCHIVED_LOG; 

 

 

该网友QQ提供的截图如下所示,从截图看归档日志的大小在31M左右徘徊。另外,可以看到没有启用归档日志压缩选项(其实ORACLE不支持归档日志压缩,这个后面说明)。从归档日志大小的规律可以看出,这个不是某个重做日志切换命令引起的。

 

image

 

 

 

3:一些Bug引起的,如下metalink文档所示:

 

 

o   BUG 9272059 - REDO LOG SWITCH AT 1/8 OF SIZE DUE TO CMT CPU'S

o   BUG 10354739 - REDOLOGSIZE NOT COMPLETLY USED

o   BUG 12317474 - FREQUENT REDO LOG SWITCHES GENERATING SMALL SIZED ARCHIVELOGS

o   BUG 5450861 - ARCHIVE LOGS ARE GENERATED WITH A SMALLER SIZE THAN THE REDO LOG FILES

o   BUG 7016254 - DECREASE CONTROL FILE ENQUEUE WAIT AT LOG SWITCH

 

 

 

4: 跟CPU个数CPU_COUNT以及log_bufferredo log size有关。

 

归档日志的大小是真实的在线日志文件的使用量,也就是在线日志文件切换前其中写入的内容的大小。为了更好的并行减少冲突,提高并发,减少redo allocation latch的等待,ORACLE会将redo buffer分成若干小的buffer,每份小的bufferstrand。按每16CPU分一股(strand),每一股独立从redo buffer以及redo log中分配一块空间,当这一块redo buffer用完,会写入redo log并且继续从redo log中分配相同大小的空间,如果无法分配空闲空间就会进行日志切换,而不管其他strand是否写完。

 

clip_image002

 

 

如上所示CPU_COUNT112,那么 112/16=7 ,那么redo buffer redo log 都可以分成7部分

 

 

 

SQLselect 112.0/16 from dual;
 
  112.0/16
----------
         7
 
SQLselect 341655552/1024/1024/7 from dual;   --log buffer
 
341655552/1024/1024/7
---------------------
            46.546875
 
SQL> select 200/7 from dual;                    --redo log size
 
     200/7
----------
28.5714286
 
SQL> 

 

log buffer的大小是325.828125M(341655552),分成7股(strand)的话,每个strand还是325.828125M/7=46.546875M。而redo log的大小是200M的时候,redo log中的空间会按strand的个数平均分配,也就是每块200M/7=28.5714286M

 

这样,当每个strand中的内容写到28M多左右的时候,就会日志切换,而不是46M。相当于log buffer中的一部分空间被浪费了。所以你看到的归档日志基本是30M左右大小(其中一股28.6再加上其它各股也有部分内容,所以归档日志的大小就是一个波动的范围

 

 

 

其它各个特殊场景分析,可以参考归档日志的大小比在线日志的大小小很多这篇文章的介绍。如下所示,当然这篇文章分析过程还忽略了其它各股其实也是有部分数据的。这个需要特别注意。

 

 

 

 

 

 

5: 检查是否开启归档日志压缩

 

  

此功能的目的是在归档传输到远程或者归档存储到磁盘之前进行压缩,以便减少归档传输的时间和占用的磁盘空间。可以使用下面脚本检查。

 

SELECT NAME,
    ARCHIVELOG_COMPRESSION 
FROM V$DATABASE;
 
 
SELECT TO_CHAR(FIRST_TIME, 'YYYY-MM-DD HH24:MI:SS'), 
       BLOCKS * BLOCK_SIZE / 1024 / 1024, 
       COMPRESSED 
FROM   V$ARCHIVED_LOG; 
 
 
 
SQL> SELECT NAME,
  2         ARCHIVELOG_COMPRESSION
  3  FROM V$DATABASE;
 
NAME      ARCHIVEL
--------- --------
GSPP      DISABLED

 

起初,估计很多人都会被这个所迷惑,估计也认同这种说法,其实ORACLE 10g 11g都是不支持归档日志压缩的,也没有明确的官方文档说明,其实归档日志压缩本来是ORACLE 10g计划引入的新特性,不幸的是这个计划放弃了,而且ORACLE 11g也不支持。所以其实这种说法是不靠谱的,所以对应这种原因,你完全可以忽略这个,之所以也写到这里,是因为网上很多文章都介绍了这种可能性,而且很多人都没有认真分辨、查证!

 

 

Archive compression was a planned new feature for 10G, but unfortunately it was withdrawn and it is still

not available in 11g .This feature is expected in future releases

 

 

最后大家可以去metalink上看看Archived redolog is (significant) smaller than the redologfile. (文档 ID 1356604.1)这篇文章,官方文档不愧是官方文档,最全面的阐述了归档日志比重做日志小的原因。下面截取部分内容展示如下:

 

 

Archived redolog is (significant) smaller than the redologfile. (文档 ID 1356604.1)

 

 

There are 2 possible causes for this :

 

1. Documented and designed behaviour due to explicit forcing an archive creation before the redolog file is full

o   SQL> alter system switch logfile;

o   SQL> alter system archive log current;

o   RMAN> backup archivelog all;

o   RMAN> backup database plus archivelog;

o   ARCHIVE_LAG_TARGET : limits the amount of data that can be lost and effectively increases the availability of the standby database by forcing a log switch after the specified amount of time elapses. you can see this aswell in RAC with an idle/low-load instance.

2. Undocumented, but designed behaviour :

o   BUG 9272059 - REDO LOG SWITCH AT 1/8 OF SIZE DUE TO CMT CPU'S

o   BUG 10354739 - REDOLOGSIZE NOT COMPLETLY USED

o   BUG 12317474 - FREQUENT REDO LOG SWITCHES GENERATING SMALL SIZED ARCHIVELOGS

o   BUG 5450861 - ARCHIVE LOGS ARE GENERATED WITH A SMALLER SIZE THAN THE REDO LOG FILES

o   BUG 7016254 - DECREASE CONTROL FILE ENQUEUE WAIT AT LOG SWITCH

 

Explanation :

As per Bug: 5450861 (closed as 'Not a Bug'):

* The archive logs do not have to be even in size. This was decided a very long time ago,

when blank padding the archive logs was stopped, for a very good reason - in order to save disk space.

* The log switch does not occur when a redo log file is 100% full. There is an internal algorithm

that determines the log switch moment. This also has a very good reason - doing the log switch

at the last moment could incur performance problems (for various reasons, out of the scope of this note).

As a result, after the log switch occurs, the archivers are copying only the actual information from the

redo log files. Since the redo logs are not 100% full after the log switch and the archive logs are

not blank padded after the copy operation has finished, this results in uneven, smaller files than

the original redo log files.

There are a number of factors which combine to determine the log

switch frequency. These are the most relevant factors in this case:

 

a) RDBMS parameter LOG_BUFFER_SIZE

        If this is not explicitly set by the DBA then we use a default;

        at instance startup the RDBMS  calculates the number of shared redo

        strands as ncpus/16, and the size of each strand is 128Kb * ncpus

        (where ncpus is the number of CPUs in the system). The log buffer

        size is the number of stands multiplied by the strand size.

        The calculated or specified size is rounded up to a multiple of the granule size

        of a memory segment in the SGA. For 11.2 if

        SGA size >= 128GB then granule size is 512MB

        64GB <= SGA size < 128GB then granule size is 256MB

        32GB <= SGA size < 64GB then granule size is 128MB

        16GB <= SGA size < 32GB then granule size is 64MB

        8GB <= SGA size < 16GB then granule size is 32MB

        1GB <= SGA size < 8GB then granule size is 16MB

        SGA size < 1GB then granule size is 4MB

        There are some minimums and maximums enforced.

 

b) System load

        Initially only one redo strand is used, ie the number of "active"

        redo strands is 1, and all the processes copy their redo into

        that one strand. When/if there is contention for that strand then

        the number of active redo strands is raised to 2. As contention

        for the active strands increases, the number of active strands

        increases. The maxmum possible number of active redo strands is

        the number of strands initially allocated in the log buffer.

        (This feature is called "dynamic strands", and there is a hidden

        parameter to disable it which then allows processes to use all

        the strands from the outset).

 

 

c) Log file size

        This is the logfile size decided by the DBA when the logfiles are created.

 

d) The logfile space reservation algorithm

        When the RDBMS switches into a new online redo logfile, all the

        log buffer redo strand memory is "mapped" to the logfile space.

        If the logfile is larger than the log buffer then each strand

        will map/reserve its strand size worth of logfile space, and the

        remaining logfile space (the "log residue") is still available.

        If the logfile is smaller than the log buffer, then the whole

        logfile space is divided/mapped/reserved equally among all the

        strands, and there is no unreserved space (ie no log residue).

        When any process fills a strand such that all the reserved

        underlying logfile space for that strand is used, AND there is

        no log residue, then a log switch is scheduled.

 

        Example : 128 CPU's so the RDBMS allocates a

        log_buffer of size 128Mb containing 8 shared strands of size 16Mb.

        It may be a bit larger than 128Mb as it rounds up to an SGA granule boundary.

        The logfiles are 100Mb, so when the RDBMS switches into a

        new online redo logfile each strand reserves 100Mb/8 = 25600 blocks

        and there is no log residue. If there is low system load, only one

        of the redo strands will be active/used and when 25600 blocks of

        that strand are filled then a log switch will be scheduled - the created

        archive logs have a size around 25600 blocks.

 

        With everything else staying the same (128 cpu's and low load),

        using a larger logfile would not really reduce the amount of

        unfilled space when the log switches are requested, but it would

        make that unfilled space less significant as a percentage of the

        total logfile space, eg

 

        - with a 100Mb logfile, the log switch happens with 7 x 16Mb

        logfile space unfilled (ie the logfile is 10% full when the

        log switch is requested)

 

        - with a 1Gb logfile, the log switch would happen with 7 x 16Mb

        logfile space unfilled (ie the logfile is 90% full when the

        log switch is requested)

        With a high CPU_COUNT, a low load and a redo log file size smaller than

        the redolog buffer, you may see small archived log files because of log switches

        at about 1/8 of the size of the define log file size.

        This is because CPU_COUNT defines the number of redo strands (ncpus/16).

        With a low load only a single strand may be used. With redo log file size smaller

        than the redolog buffer, the log file space is divided over the available strands.

        When for instance only a single active strand is used, a log switch can already occur

        when that strand is filled.

 

 

 

 

 

参考资料:

         

         

        https://oracleblog.org/study-note/archive-size-is-much-smaller-than-redo-size/

   

        http://www.ctonote.com/oracle/3236/

    

         

相关实践学习
日志服务之使用Nginx模式采集日志
本文介绍如何通过日志服务控制台创建Nginx模式的Logtail配置快速采集Nginx日志并进行多维度分析。
相关文章
|
2月前
|
SQL Oracle 关系型数据库
【赵渝强老师】Oracle的控制文件与归档日志文件
本文介绍了Oracle数据库中的控制文件和归档日志文件。控制文件记录了数据库的物理结构信息,如数据库名、数据文件和联机日志文件的位置等。为了保护数据库,通常会进行控制文件的多路复用。归档日志文件是联机重做日志文件的副本,用于记录数据库的变更历史。文章还提供了相关SQL语句,帮助查看和设置数据库的日志模式。
【赵渝强老师】Oracle的控制文件与归档日志文件
|
2月前
|
Oracle 关系型数据库 数据库
【赵渝强老师】Oracle的参数文件与告警日志文件
本文介绍了Oracle数据库的参数文件和告警日志文件。参数文件分为初始化参数文件(PFile)和服务器端参数文件(SPFile),在数据库启动时读取并分配资源。告警日志文件记录了数据库的重要活动、错误和警告信息,帮助诊断问题。文中还提供了相关视频讲解和示例代码。
|
2月前
|
SQL Oracle 关系型数据库
【赵渝强老师】Oracle的联机重做日志文件与数据写入过程
在Oracle数据库中,联机重做日志文件记录了数据库的变化,用于实例恢复。每个数据库有多组联机重做日志,每组建议至少有两个成员。通过SQL语句可查看日志文件信息。视频讲解和示意图进一步解释了这一过程。
|
5月前
|
SQL Oracle NoSQL
实时计算 Flink版操作报错合集之报错“找不到对应的归档日志文件”,怎么处理
在使用实时计算Flink版过程中,可能会遇到各种错误,了解这些错误的原因及解决方法对于高效排错至关重要。针对具体问题,查看Flink的日志是关键,它们通常会提供更详细的错误信息和堆栈跟踪,有助于定位问题。此外,Flink社区文档和官方论坛也是寻求帮助的好去处。以下是一些常见的操作报错及其可能的原因与解决策略。
|
5月前
|
SQL Oracle 关系型数据库
"揭秘!一键解锁Oracle日志清理魔法,让海量归档日志无处遁形,守护数据库健康,告别磁盘空间告急噩梦!"
【8月更文挑战第9天】随着Oracle数据库在企业应用中的普及,归档日志管理对保持数据库健康至关重要。归档日志记录所有更改,对数据恢复极为重要,但也可能迅速占用大量磁盘空间影响性能。利用Oracle提供的RMAN工具,可通过编写Shell脚本来自动清理归档日志。脚本包括设置环境变量、连接数据库、检查和删除指定时间前的日志,并记录执行情况。通过Cron作业定时运行脚本,可有效管理日志文件,确保数据库稳定运行。
145 7
|
5月前
|
SQL 监控 Oracle
Oracle数据误删不用怕,跟我来学日志挖掘
Oracle数据误删不用怕,跟我来学日志挖掘
111 0
|
3月前
|
存储 Oracle 关系型数据库
Oracle数据库的应用场景有哪些?
【10月更文挑战第15天】Oracle数据库的应用场景有哪些?
249 64
|
1月前
|
存储 Oracle 关系型数据库
数据库数据恢复—ORACLE常见故障的数据恢复方案
Oracle数据库常见故障表现: 1、ORACLE数据库无法启动或无法正常工作。 2、ORACLE ASM存储破坏。 3、ORACLE数据文件丢失。 4、ORACLE数据文件部分损坏。 5、ORACLE DUMP文件损坏。
116 11
|
2月前
|
Oracle 关系型数据库 数据库
Oracle数据恢复—Oracle数据库文件有坏快损坏的数据恢复案例
一台Oracle数据库打开报错,报错信息: “system01.dbf需要更多的恢复来保持一致性,数据库无法打开”。管理员联系我们数据恢复中心寻求帮助,并提供了Oracle_Home目录的所有文件。用户方要求恢复zxfg用户下的数据。 由于数据库没有备份,无法通过备份去恢复数据库。

推荐镜像

更多