《ORACLE 数据分析和动态采样》
引言:oracle 数据库性能调优最需要重视的也最常遇到的就是SQL执行效率,而反映SQL效率最直观的工具就是CBO生成的执行计划,那么如何让CBO生成最精准的效率最高的执行计划成为我们当前需要研究的课题。同一条语句,好的执行计划能带来飞一样的速度,坏的执行计划让我们痛苦不堪,下面我们从原理到实践来把如何产生高效计划的方法教给大家。
一 CBO介绍
CBO全称叫Cost Based optimization基于代价优化器,它是一个数学模型,同一个SQL语句在不同的oracle版本中计算出来的代价结果也是不一样的,因为每个版本CBO优化器的设计结构有很大不同,现在还不是很完善很智能很通人性,因此我们不能完全依赖它,只能辅助我们。
如何生成精确的执行计划:公式数据+CBO=执行计划,传入CBO的数据越精确得到结果越精确,我们能做的保证输入数据更准确,通过精确数据计算出精确执行计划
二演示一个表分析后执行计划比动态采样更准确的例子
动态采样:顾名思义就是oracle自动为你进行的初步数据分析,由于是随机在表上取一些数据,因此并不能保证得出的执行计划很准确,只能作为一种辅助分析手段,在不得已的情况下来分析数据,有一定的局限性。
场景:当表没有分析信息时,oracle会使用动态采样技术,而且动态采样是在SQL硬解析的时候发生的,传入->CBO参数->生成执行计划。
级别:oracle 10g oracle 11g 默认动态采样级别是2,它有level1-10,设置的级别越高采集的数据块越多,结果越精确,运行时间越长,level10对所有数据进行采样分析。
实验
LEO1@LEO1> drop table leo1 purge;清理环境
Table dropped.
LEO1@LEO1> drop table leo2 purge;
Table dropped.
LEO1@LEO1> create table leo1 as select * from dba_objects;创建leo1表
Table created.
LEO1@LEO1> create table leo2 as select * from leo1;创建leo2表,采用leo1一样数据和结构
Table created.
LEO1@LEO1> col segment_name for a10
LEO1@LEO1> select segment_name,extents,blocks from dba_segments where segment_name in ('LEO1','LEO2');
SEGMENT_NA EXTENTS BLOCKS
---------- ---------- -------------- -------------- ----
LEO1 24 1152
LEO2 24 1152
查询leo1和leo2表这两个段对象存储参数,都是占用24个区,1152个块,2个表一模一样嘛
LEO1@LEO1> col table_name for a10
LEO1@LEO1> select table_name,num_rows,blocks,status from dba_tables wheretable_name in ('LEO1','LEO2');
TABLE_NAME NUM_ROWS BLOCKS STATUS
---------- ---------- ---------- ------------ ---------- ------------
LEO1 VALID
LEO2 VALID
在这个数据字典里只显示表名和当前状态(有效),没有行信息和块信息,这是为神马呢,嗯从上面的操作可以看出,我们只是建立了表,但没有分析表统计信息,现在我们分析一下后看看效果
LEO1@LEO1> execute dbms_stats.gather_table_stats('LEO1','LEO1');对LEO1表进行统计分析
PL/SQL procedure successfully completed.
LEO1@LEO1> select table_name,num_rows,blocks,status from dba_tables wheretable_name in ('LEO1','LEO2');
TABLE_NAME NUM_ROWS BLOCKS STATUS
---------- ---------- ---------- ------------ ---------- ------------
LEO1 71968 1051 VALID
LEO2 VALID
现在leo1表已经有行信息和块信息了,leo2由于没有进行表分析现在还什么都没有
LEO1@LEO1> set autotrace trace exp
LEO1@LEO1> select * from leo1 where object_id=10000;
Execution Plan
----------------------------------------------------------
Plan hash value: 2716644435
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 97 | 287 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| LEO1 | 1 | 97 | 287 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=10000)
Leo1表进行了统计分析,执行计划显示Rows=1,嗯由于object_id字段是不重复的,我们都知道object_id=10000只有一条记录,这和执行计划的判断结果是一致,所以表分析可以让oracle收集表数据的信息,让CBO了解表数据分布情况,有多少条,占用多少空间等。
总之“分析”就是让oracle理解我们的数据是怎么样一个情况,以便更好的去处理它
LEO1@LEO1> select * from leo2 where object_id=10000;
Execution Plan
----------------------------------------------------------
Plan hash value: 2258638698
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 12 | 2484 | 287 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| LEO2 | 12 | 2484 | 287 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=10000)
Note
-----
- dynamic sampling used for this statement (level=2)
Leo2表没有进行统计分析,oracle采用了动态采样技术(红色字显示),执行计划显示Rows=12,由于动态采样只是随机在表上取一些数据来评估,没有进行全表完整分析,固然得出的执行计划没有leo1表更精确更好。所以说“动态采样”只是一个补救措施。
三演示在缺乏直方图(柱状图)时,CBO计算出错误的执行计划的例子,并给出最后正确的执行计划
直方图(柱状图):oracle对列上的数据分布情况进行统计,得出的图示。
场合:常用于查看数据倾斜情况(一个值在整个字段中所占比例),了解某个键值所拥有的记录数
如果是数字1这个键值在整个字段中占用的比例
如果是字符leonarding这个键值在整个字段中占用的比例
如果是日期 2012-12-21 这个日期在整个字段中占用的比例
直方图类型:
(1)Frequency-频率直方图:用于基数少,记录多,重复率高的字段
(2)Height Balanced-高度平衡直方图:用于基数多,唯一值多,重复率低的字段,这个就是等分相同数据量的桶(bucket),值可能不同,来显示桶中记录数的比例
实验
LEO1@LEO1> drop table leo3 purge;清除环境
Table dropped.
LEO1@LEO1> drop table leo4 purge;
Table dropped.
LEO1@LEO1> create table leo3 as select * from dba_objects;创建leo3表
Table created.
LEO1@LEO1> create table leo4 as select * from dba_objects;创建leo4表
Table created.
LEO1@LEO1> execute dbms_stats.gather_table_stats('LEO1','LEO3',method_opt=>'for all columns size 1');
PL/SQL procedure successfully completed.
LEO1@LEO1> execute dbms_stats.gather_table_stats('LEO1','LEO4',method_opt=>'for all columns size 254');
PL/SQL procedure successfully completed.
对leo3表做信息收集但没有做直方图分析(for all columns size 1)1 代表不做直方图
对leo4表做信息收集但又做了直方图分析(for all columns size 254)254 代表最精确直方图对所有列都做分析,在分析表时把更多数据告诉CBO,得出最精确的执行计划。
注:删除表上分析信息语句
LEO1@LEO1> execute dbms_stats.delete_table_stats('LEO1','LEO3');
PL/SQL procedure successfully completed.
LEO1@LEO1> select table_name,column_name,num_distinct,num_buckets,histogram from dba_tab_col_statistics where table_name in ('LEO3','LEO4');
TABLE_NAME COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
---------- ------------------------------ ------------ ----------- ---------------
LEO3 EDITION_NAME 0 0 NONE
LEO3 NAMESPACE 20 1 NONE
LEO3 SECONDARY 2 1 NONE
LEO3 GENERATED 2 1 NONE
LEO3 TEMPORARY 2 1 NONE
LEO3 STATUS 2 1 NONE
LEO3 TIMESTAMP 1227 1 NONE
LEO3 LAST_DDL_TIME 1170 1 NONE
LEO3 CREATED 1097 1 NONE
LEO3 OBJECT_TYPE 43 1 NONE
LEO3 DATA_OBJECT_ID 7262 1 NONE
LEO3 OBJECT_ID 71966 1 NONE
LEO3 SUBOBJECT_NAME 58 1 NONE
LEO3 OBJECT_NAME 43424 1 NONE
LEO3 OWNER 24 1 NONE
LEO4 EDITION_NAME 0 0 NONE
LEO4 NAMESPACE 20 14 FREQUENCY
LEO4 SECONDARY 2 2 FREQUENCY
LEO4 GENERATED 2 2 FREQUENCY
LEO4 TEMPORARY 2 2 FREQUENCY
LEO4 STATUS 2 1 FREQUENCY
LEO4 TIMESTAMP 1228 254 HEIGHT BALANCED
LEO4 LAST_DDL_TIME 1171 254 HEIGHT BALANCED
LEO4 CREATED 1098 254 HEIGHT BALANCED
LEO4 OBJECT_TYPE 43 30 FREQUENCY
LEO4 DATA_OBJECT_ID 7263 254 HEIGHT BALANCED
LEO4 OBJECT_ID 71967 254 HEIGHT BALANCED
LEO4 SUBOBJECT_NAME 58 58 FREQUENCY
LEO4 OBJECT_NAME 43424 254 HEIGHT BALANCED
LEO4 OWNER 24 19 FREQUENCY
30 rows selected.
参数说明:
TABLE_NAME:表名 leo3我们没有做直方图分析显示NONE 而leo4做了
COLUMN_NAME:列名
NUM_DISTINCT:列中不同键值数(就是我们所说基数)
NUM_BUCKETS:直方图的桶数,用几个桶来装数据
HISTOGRAM:直方图类型,FREQUENCY and HEIGHT BALANCED
例对于status列由于基数少重复率很高则使用frequency直方图
对于object_id由于都是唯一值没有重复率则使用height balance直方图
与我们上面写的情况一致
LEO1@LEO1> select count(*) from leo3 where object_type='TABLE';标准记录数2816
COUNT(*)
-----------------
2816
LEO1@LEO1> select count(*) from leo3 where object_type='TABLE';
Execution Plan
----------------------------------------------------------
Plan hash value: 3505406240
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 287 (1)| 00:00:04 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | TABLE ACCESS FULL| LEO3 | 1674 | 15066 | 287 (1)| 00:00:04 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OBJECT_TYPE"='TABLE')
由于leo3我们没有做直方图分析,CBO就不知道object_type='TABLE'在整个字段中所占比例,导致Rows=1674与标准2816偏差太大,这就是没有足够分析数据的结果
LEO1@LEO1> select count(*) from leo4 where object_type='TABLE';
Execution Plan
----------------------------------------------------------
Plan hash value: 3210696650
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 287 (1)| 00:00:04 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | TABLE ACCESS FULL| LEO4 | 2697 | 24273 | 287 (1)| 00:00:04 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OBJECT_TYPE"='TABLE')
由于leo4做了全表整体直方图分析,生成254个桶来装数据,oracle必然了解object_type='TABLE'在整个字段中所占比例,Rows=2697 和标准2816 已经非常接近了,相对比较精确,这就是有足够分析数据的结果。
四演示在分区表上,全局信息和分区信息是如何影响执行计划的,给出演示过程
全局统计信息:如果跨分区CBO使用全局统计信息
分区统计信息:如果不跨分区CBO使用分区统计信息
全局信息和分区信息是2个独立信息源,某些情况下会有联系
实验
LEO1@LEO1> drop table leo5 purge;清除环境
Table dropped.
LEO1@LEO1> drop table leo6 purge;
Table dropped.
LEO1@LEO1> create table leo5建立一个分区表
partition by range (object_id)
(
partition p1 values less than(10000),
partition p2 values less than(20000),
partition p3 values less than(30000),
partition p4 values less than(40000),
partition p5 values less than(50000),
partition p6 values less than(60000),
partition pm values less than(maxvalue)
) as select * from dba_objects; 2 3 4 5 6 7 8 9 10 11
Table created.
LEO1@LEO1> execute dbms_stats.gather_table_stats('LEO1','LEO5');默认对全局和分区都做分析
PL/SQL procedure successfully completed.
LEO1@LEO1> select table_name,num_rows,blocks,global_stats from dba_tables where table_name in ('LEO5');
TABLE_NAME NUM_ROWS BLOCKS GLO
---------- ---------- ---------- ----- ---------- ---------- ------------ ------
LEO5 71979 1111 YES
从全局上看,Leo5表有71979行记录占用1111个块
LEO1@LEO1> select table_name,partition_name,num_rows,blocks,global_stats from dba_tab_partitionswhere table_name='LEO5';
TABLE_NAME PARTITION_NAME NUM_ROWS BLOCKS GLO
---------- ------------------------------ ---------- ---------- ---
LEO5 P1 9708 137 YES
LEO5 P2 9806 152 YES
LEO5 P3 10000 158 YES
LEO5 P4 10000 159 YES
LEO5 P5 10000 159 YES
LEO5 P6 9606 150 YES
LEO5 PM 12859 196 YES
从分区上看,每个分区有多少行记录占用了多少个数据块,总和应该与全局信息一致
NUM_ROWS:是一个估算值
LEO1@LEO1> execute dbms_stats.gather_table_stats('LEO1','LEO5',granularity=>'partition');
PL/SQL procedure successfully completed.
Granularity参数:针对分区表来说的,在哪个粒度上做分析
Global:全局级别分析
Partition:分区级别分析
Subpartition:子分区级别分析
我们这条命令单独对分区做分析,更新分区统计信息,不更新全局统计信息
LEO1@LEO1> select table_name,num_rows,blocks,global_stats from dba_tables where table_name in ('LEO5');
TABLE_NAME NUM_ROWS BLOCKS GLO
---------- ---------- ---------- ------------ ---------- ---------------
LEO5 71979 1111 YES
全局统计信息没有变化,与实际数据信息不相符了
LEO1@LEO1> select table_name,partition_name,num_rows,blocks,global_stats from dba_tab_partitions where table_name='LEO5';
TABLE_NAME PARTITION_NAME NUM_ROWS BLOCKS GLO
---------- ------------------------------ ---------- ---------- ---
LEO5 P1 19416 382 YES
LEO5 P2 19612 382 YES
LEO5 P3 20000 382 YES
LEO5 P4 20000 382 YES
LEO5 P5 20000 382 YES
LEO5 P6 19212 382 YES
LEO5 PM 25718 382 YES
分区统计信息已经被更新了
现在看一下统计信息是如何被使用的
LEO1@LEO1> select count(*) from leo5 where object_id<8000;符合条件有15668行
COUNT(*)
------------------
15668
LEO1@LEO1> select count(*) from leo5 where object_id<8000;
Execution Plan
----------------------------------------------------------
Plan hash value: 546738359
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 |105 (0)| 00:00:02 | | |
| 1 | SORT AGGREGATE | | 1| 4 | | | | |
| 2 | PARTITION RANGE SINGLE| | 15534 | 62136 | 105 (0)| 00:00:02 | 1 | 1 |
|* 3 | TABLE ACCESS FULL | LEO5 | 15534 | 62136 | 105 (0)| 00:00:02 | 1 | 1 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("OBJECT_ID"<8000)
由于我们在分区级别重新进行了分析,分区统计信息是最新版,CBO使用了最新分区统计信息后Rows=15534 与实际15668 已经相差无疑,基本上得出的是正确结果。
原理:我们说过如果不跨分区CBO使用分区统计信息,针对某个分区oracle获取信息
Pstart| Pstop起始分区和结束分区都是1,说明结果集都落在了第一个分区只需扫描第一个分区即可
LEO1@LEO1> select count(*) from leo5 where object_id<30001;符合条件有59030行
COUNT(*)
----------------
59030
LEO1@LEO1> select count(*) from leo5 where object_id<30001;
Execution Plan
----------------------------------------------------------
Plan hash value: 3650668575
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 416 (1)| 00:00:05 | | |
| 1 | SORT AGGREGATE | | 1 | 5 | | | | |
| 2 | PARTITION RANGE ITERATOR| | 29209 | 142K| 416 (1)| 00:00:05 | 1 | 4 |
|* 3 | TABLE ACCESS FULL | LEO5 | 29209 | 142K| 416 (1)| 00:00:05 | 1 | 4 |
--------------------------------------------------------------------------------------------------
跨4个分区扫描
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("OBJECT_ID"<30001)
LEO1@LEO1> select count(*) from leo5 where object_id<30000;
Execution Plan
----------------------------------------------------------
Plan hash value: 3650668575
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 313 (1)| 00:00:04 | | |
| 1 | SORT AGGREGATE | | 1 | 5 | | | | |
| 2 | PARTITION RANGE ITERATOR| | 29208 | 142K| 313 (1)| 00:00:04 | 1 | 3 |
| 3 | TABLE ACCESS FULL | LEO5 | 29208 | 142K| 313 (1)| 00:00:04 | 1 | 3 |
--------------------------------------------------------------------------------------------------
跨3个分区扫描
小结:只要跨分区CBO就使用全局统计信息,由于全局信息过旧,不管扫描3个分区还是4个分区Rows=29209 这与实际值59030都相差甚远,这就是为什么分区里数据是准的,而全局里数据是不准的。
我们再一次对全局级别分析一次看看全局统计信息是不是更新了,执行计划是不是精准了
LEO1@LEO1> execute dbms_stats.gather_table_stats('LEO1','LEO5',granularity=>'global');
PL/SQL procedure successfully completed.
LEO1@LEO1> select table_name,num_rows,blocks,global_stats from dba_tables where table_name in ('LEO5');
TABLE_NAME NUM_ROWS BLOCKS GLO
---------- ---------- ---------- ------ ---------- ---------- --------
LEO5 143958 2674 YES
这次全局统计信息更新了
LEO1@LEO1> select count(*) from leo5 where object_id<30001;
Execution Plan
----------------------------------------------------------
Plan hash value: 3650668575
--------------------------------------------------------------------------------------------------
|Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 416 (1)| 00:00:05 | | |
| 1 | SORT AGGREGATE | | 1 | 5 | | | | |
| 2 | PARTITION RANGE ITERATOR| | 58418 | 285K| 416 (1)| 00:00:05 | 1 | 4 |
|* 3 | TABLE ACCESS FULL | LEO5 | 58418 | 285K| 416 (1)| 00:00:05 | 1 | 4 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("OBJECT_ID"<30001)
小结:我们使用更新后的全局统计信息得出Rows=58418 与实际值59030差不多了,说明单独对分区分析,不会更新全局统计信息。
五演示用extended statistics 解决列相关性的例子,给出演示过程
列的相关性:就是where子句中同时存在2个列以上条件时,CBO不做列与列之间相关性分析,这会导致CBO计算执行计划出现一种偏差
实验
由于oracle11g之前 CBO不做列与列之间相关性分析,我的实验环境是11g的没有10g环境,我们就用其他方式模拟一下吧。
LEO1@LEO1> select count(*) from leo6 where object_type='TABLE' and owner='LEO1';
COUNT(*)
------------------
20符合条件有20行
LEO1@LEO1> select count(*) from leo6 where object_type='TABLE' and owner='LEO1';
Execution Plan
----------------------------------------------------------
Plan hash value: 3995977415
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 28 | 287 (1)| 00:00:04 |
| 1 | SORT AGGREGATE | | 1 | 28 | | |
|* 2 | TABLE ACCESS FULL| LEO6 | 8 | 224 | 287 (1)| 00:00:04 |
---------------------------------------------------------------------------
CBO不做列与列之间相关性分析前提下,Rows=8显然CBO计算错误,应该有20行的
LEO1@LEO1>
executedbms_stats.gather_table_stats('LEO1','LEO6',method_opt=>'for columns (object_type,owner) size skewonly');
PL/SQL procedure successfully completed.
我们对leo6表做列相关性分析,看看CBO计算是否准确
LEO1@LEO1> select count(*) from leo6 where object_type='TABLE' and owner='SYS';
COUNT(*)
----------------
969符合条件有969行
LEO1@LEO1> select count(*) from leo6 where object_type='TABLE' and owner='SYS';
Execution Plan
----------------------------------------------------------
Plan hash value: 3995977415
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 28 | 287 (1)| 00:00:04 |
| 1 | SORT AGGREGATE | | 1 | 28 | | |
|* 2 | TABLE ACCESS FULL| LEO6 | 819 | 22932 | 287 (1)| 00:00:04 |
---------------------------------------------------------------------------
小结:这次CBO的Rows=819与实际值969非常接近了,可以看出作完列与列相关性分析后,CBO知道2个列的数据分布情况是一个什么样子了,从而在计算执行计划上有更多数据参考得出结果更准确。
六对一张表进行导出/导入,看它的统计信息是否发生改变,给出演示过程。
LEO1@LEO1> drop table leo7 purge;清除环境
Table dropped.
LEO1@LEO1> create table leo7 as select * from dba_objects;创建leo7表
Table created.
LEO1@LEO1> execute dbms_stats.gather_table_stats('LEO1','LEO7');进行表分析
PL/SQL procedure successfully completed.
LEO1@LEO1> select count(*) from leo7;现在我们表里有71979行记录
COUNT(*)
-----------------
71979
LEO1@LEO1> select table_name,num_rows,blocks,global_stats from dba_tables where table_name in ('LEO7');
TABLE_NAME NUM_ROWS BLOCKS GLO
---------- ---------- ---------- ---- ---------- ----------- -----
LEO7 71979 1051 YES
做完表分析后,统计信息已经有了,有71979行,1051个块,只包括数据段
LEO1@LEO1> select segment_name,extents,blocks from dba_segmentswhere segment_name in ('LEO7');
SEGMENT_NA EXTENTS BLOCKS
---------- ---------- ----------- ---------- -----------
LEO7 24 1152
这是leo7段对象的存储参数,占用24个区1152个数据块,段对象里面的块数包括数据段+表头段的信息
[oracle@leonarding1 ~]$ expuserid=leo1/leo1@leo1 file=/home/oracle/leo7_dump_file1.dmp tables=leo7 rows=y
Export: Release 11.2.0.1.0 - Production on Sun Jan 20 14:16:45 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses ZHS16GBK character set (possible charset conversion)
About to export specified tables via Conventional Path ...
. . exporting table LEO7 71979 rows exported
EXP-00091: Exporting questionable statistics.
Export terminated successfully with warnings.
导出leo7表进行导出
[oracle@leonarding1 ~]$ ll
total 8208
drwxrwxrwx 8 oracle oinstall 4096 Aug 21 2009 database
drwxr-xr-x 2 oracle oinstall 4096 Nov 17 14:48 Desktop
-rw-r--r-- 1 oracle oinstall 7 Nov 17 15:43 em
-rw-r--r-- 1 oracle oinstall 8314880 Jan 20 14:16 leo7_dump_file1.dmp
-rw-r--r-- 1 oracle oinstall 42201 Dec 16 21:25 orahdfs-1.0.0.0.0.zip
-rwxrwxrwx 1 oracle oinstall 795 Dec 16 23:53 test1.txt
-rwxrwxrwx 1 oracle oinstall 13325 Dec 27 23:13 test.txt
看一下统计信息,现在没有变化,也就是说导出不会更新表统计信息
LEO1@LEO1> select table_name,num_rows,blocks,global_stats from dba_tables where table_name in ('LEO7');
TABLE_NAME NUM_ROWS BLOCKS GLO
---------- ---------- ---------- ---
LEO7 71979 1051 YES
我们把leo7_dump_file1.dmp在导入到leo7表中
[oracle@leonarding1 ~]$ imp userid=leo1/leo1@leo1 file=/home/oracle/leo7_dump_file1.dmp full=y ignore=y
Import: Release 11.2.0.1.0 - Production on Sun Jan 20 14:31:12 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export file created by EXPORT:V11.02.00 via conventional path
import done in US7ASCII character set and AL16UTF16 NCHAR character set
import server uses ZHS16GBK character set (possible charset conversion)
. importing LEO1's objects into LEO1
. importing LEO1's objects into LEO1
. . importing table "LEO7" 71979 rows imported
Import terminated successfully without warnings.
LEO1@LEO1> select count(*) from leo7;
COUNT(*)
----------------
143958
默认追加方式导入数据,原来71979行,导入后143958行
LEO1@LEO1> select table_name,num_rows,blocks,global_stats from dba_tables where table_name in ('LEO7');
TABLE_NAME NUM_ROWS BLOCKS GLO
---------- ---------- ---------- ---------- ---------- ----------
LEO7 71979 1051 YES
我们对leo7表进行了追加导入,但是它的统计信息还是原来收集的统计信息,也就是说导入也不会更新表统计信息
LEO1@LEO1> execute dbms_stats.gather_table_stats('LEO1','LEO7');
PL/SQL procedure successfully completed.
LEO1@LEO1> select table_name,num_rows,blocks,global_stats from dba_tables where table_name in ('LEO7');
TABLE_NAME NUM_ROWS BLOCKS GLO
---------- ---------- ---------- ---
LEO7 143958 2160 YES
小结:只有重新进行表分析才能更新表统计信息
动态采样 直方图 CBO gather granularity extendedstatisticsexp imp 列相关性
本文转自 leonarding151CTO博客,原文链接:http://blog.51cto.com/leonarding/1122802,如需转载请自行联系原作者