现在的位置: 首页 > 综合 > 正文

创建分区索引时统计信息的准确性问题

2013年09月05日 ⁄ 综合 ⁄ 共 3893字 ⁄ 字号 评论关闭

本实验基于Oracle10g

 

先建立一个组合分区表

create table t_sub(id number, data varchar2(20))
partition by range(id)subpartition by hash(data)
(partition p1 values less than (100)
 (subpartition p1_s_1,
  subpartition p1_s_2
 ),
 partition p2 values less than (maxvalue)
 (subpartition p2_s_1,
  subpartition p2_s_2
 )
 );
 

插入数据

 BEGIN
 FOR i in 1..200 loop
 insert into t_sub values(i,i||'A');
 end loop;
 END;

 

SQL> create index i_id on t_sub(id)  local nologging;

Index created

SQL> select a.index_name,
  2  decode(partitioned,'YES',b.partition_name,'NO')
  3  partition,b.subpartition_name subpartition_name,b.num_rows,
  4  b.distinct_keys,b.num_rows/b.distinct_keys
  5  avg_row_per_key,b.distinct_keys/b.num_rows SELECTIVITY,
  6  b.clustering_factor from dba_indexes
  7  a,dba_ind_statistics b where a.owner=b.owner
  8  and a.index_name=b.index_name and a.owner='&owner' and
  9  a.index_name='&index_name';

INDEX_NAME PARTITION  SUBPARTITI   NUM_ROWS DISTINCT_KEYS AVG_ROW_PER_KEY SELECTIVITY CLUSTERING_FACTOR
---------- ---------- ---------- ---------- ------------- --------------- ----------- -----------------
I_ID                    P2          P2_S_2                   58            58               1           1                 1
I_ID                    P2          P2_S_1                   43            43               1           1                 1
I_ID                    P1          P1_S_2                   44            44               1           1                 1
I_ID                    P1          P1_S_1                   55            55               1           1                 1
I_ID                    P2                                         101          58 1.7413793103448 0.574257425                 2
I_ID                    P1                                         99            55             1.8 0.555555555                 2
I_ID                                                                 200          58 3.4482758620689        0.29                 4

7 rows selected

 

我们都知道,创建索引的时候会自动收集统计信息,以前创建索引如果要收集统计信息还需要加上compute statistics,不过现在不需要了,已经默认会收集。从查询看出,子分区的统计信息 是准确的, 不过分区,以及全局的统计信息 是不准确 的。

注意看 distinct_keys。我插入的数据都没有重复的,不过Oracle却统计出有重复数据。

 

好了现在我收集统计信息

SQL> BEGIN
  2     DBMS_STATS.GATHER_TABLE_STATS(ownname => 'ROBINSON',
  3                                   tabname => 'T_SUB',
  4                                   estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
  5                                   method_opt => 'for all columns size repeat',
  6                                   degree => DBMS_STATS.AUTO_DEGREE,
  7                                   granularity => 'ALL',
  8                                   cascade=>TRUE
  9                                   );
 10  END;
 11  /

PL/SQL procedure successfully completed

再次查询 统计信息

SQL> select a.index_name,
  2  decode(partitioned,'YES',b.partition_name,'NO')
  3  partition,b.subpartition_name subpartition_name,b.num_rows,
  4  b.distinct_keys,b.num_rows/b.distinct_keys
  5  avg_row_per_key,b.distinct_keys/b.num_rows SELECTIVITY,
  6  b.clustering_factor from dba_indexes
  7  a,dba_ind_statistics b where a.owner=b.owner
  8  and a.index_name=b.index_name and a.owner='&owner' and
  9  a.index_name='&index_name';

INDEX_NAME PARTITION  SUBPARTITI   NUM_ROWS DISTINCT_KEYS AVG_ROW_PER_KEY SELECTIVITY CLUSTERING_FACTOR
---------- ---------- ---------- ---------- ------------- --------------- ----------- -----------------
I_ID       P2         P2_S_2             58            58               1           1                 1
I_ID       P2         P2_S_1             43            43               1           1                 1
I_ID       P1         P1_S_2             44            44               1           1                 1
I_ID       P1         P1_S_1             55            55               1           1                 1
I_ID       P2                           101           101               1           1                 2
I_ID       P1                            99            99               1           1                 2
I_ID                                    200           200               1           1                 4

7 rows selected

现在准确了。

 

从这个实验,可以看出如果表是组合分区表, 那么创建索引之后,最好收集索引的统计信息。

【上篇】
【下篇】

抱歉!评论已关闭.