现在的位置: 首页 > 综合 > 正文

NULL 值与索引(一)

2018年04月12日 ⁄ 综合 ⁄ 共 14678字 ⁄ 字号 评论关闭
分类: SQL/PLSQL 优化 4185人阅读 评论(0) 收藏 举报
    NULL值是关系数据库系统布尔型(true,false,unknown)中比较特殊类型的一种值,通常称为UNKNOWN或空值,即是未知的,不确定的。由于
NULL存在着无数的可能,因此NULL值也不等于NULL值,所以与NULL值相关的操作同样都为NULL值。正是基于这样一个特性,对于NULL值列上的B
树索引导致了is null/is not null不走索引的情形,下面描述了NULL值与索引以及索引NULL列上的执行计划,如何使得NULL值走索引的情形。
注:本文仅仅讨论的是B树索引上的NULL值,位图索引不在此范围之内。

一、null值与索引的关系

scott@ORCL> create table t1(id number,val varchar2(1));

-->为表t1创建<a target=_blank name="baidusnap0" style="color: rgb(255, 153, 0);"></a><span style="background-color: rgb(255, 255, 102);">唯一索引</span>
scott@ORCL> create unique index i_t1_id on t1(id);

scott@ORCL> insert into t1 select <span style="background-color: rgb(153, 255, 153);">null</span>,'Y' from dual;

scott@ORCL> insert into t1 select <span style="background-color: rgb(153, 255, 153);">null</span>,'N' from dual;

-->从上面的操作可知,尽管列id上存在<span style="background-color: rgb(255, 255, 102);">唯一索引</span>,但由于<span style="background-color: rgb(153, 255, 153);">null</span>值不等于任一<span style="background-color: rgb(153, 255, 153);">null</span>值,因此能够成功插入
scott@ORCL> commit;

-->再次为表添加唯一复合索引,即基于id列与val列
scott@ORCL> create unique index i_t1_id_val on t1(id,val);

Index created.

-->插入<span style="background-color: rgb(153, 255, 153);">null</span>,'N'的记录时失败,提示违反唯一性约束
scott@ORCL> insert into t1 select <span style="background-color: rgb(153, 255, 153);">null</span>,'N' from dual;
insert into t1 select <span style="background-color: rgb(153, 255, 153);">null</span>,'N' from dual
*
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.I_T1_ID_VAL) violated

-->插入<span style="background-color: rgb(153, 255, 153);">null</span>,'Y'的记录时同样失败,提示违反唯一性约束
scott@ORCL> insert into t1 select <span style="background-color: rgb(153, 255, 153);">null</span>,'Y' from dual;
insert into t1 select <span style="background-color: rgb(153, 255, 153);">null</span>,'Y' from dual
*
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.I_T1_ID_VAL) violated

-->插入两个<span style="background-color: rgb(153, 255, 153);">null</span>值成功
scott@ORCL> insert into t1 select <span style="background-color: rgb(153, 255, 153);">null</span>,<span style="background-color: rgb(153, 255, 153);">null</span> from dual;

1 row created.

scott@ORCL> insert into t1 select <span style="background-color: rgb(153, 255, 153);">null</span>,<span style="background-color: rgb(153, 255, 153);">null</span> from dual;

1 row created.

scott@ORCL> insert into t1 select <span style="background-color: rgb(153, 255, 153);">null</span>,'A' from dual;

1 row created.

scott@ORCL> commit;

Commit complete.

scott@ORCL> set <span style="background-color: rgb(153, 255, 153);">null</span> unknown;
scott@ORCL> select * from t1;

        ID VAL
---------- ------------------------------
unknown    Y
unknown    N
unknown    unknown
unknown    unknown
unknown    A

scott@ORCL> exec dbms_stats.gather_table_stats('SCOTT','T1',cascade=>true);
		 
scott@ORCL> select index_name,index_type,blevel,leaf_blocks,num_rows,status,distinct_keys
  2  from user_indexes  where table_name='T1';

INDEX_NAME      INDEX_TYPE     BLEVEL LEAF_BLOCKS   NUM_ROWS STATUS   DISTINCT_KEYS
--------------- ---------- ---------- ----------- ---------- -------- -------------
I_T1_ID         NORMAL              0           0          0 VALID                0
I_T1_ID_VAL     NORMAL              0           1          3 VALID                3

-->从上面的情形可知,
-->基于单列的<span style="background-color: rgb(255, 255, 102);">唯一索引</span>,可以多次插入<span style="background-color: rgb(153, 255, 153);">null</span>值,但其索引上并不存储<span style="background-color: rgb(153, 255, 153);">null</span>值。
-->基于多列的复合索引,尽管全为<span style="background-color: rgb(153, 255, 153);">null</span>值的行可以多次插入,但不全为<span style="background-color: rgb(153, 255, 153);">null</span>的重复行则不能被插入(注,非唯一复合索引不存在此限制,此处不演示)。
-->基于多列的复合索引,对于全为<span style="background-color: rgb(153, 255, 153);">null</span>值的索引值也不会被存储。如上面的情形,尽管插入了5条记录,复合索引中只存储了3条。
-->注:对于唯一性约束,<span style="background-color: rgb(153, 255, 153);">null</span>值不等于<span style="background-color: rgb(153, 255, 153);">null</span>值,同样(<span style="background-color: rgb(153, 255, 153);">null</span>,<span style="background-color: rgb(153, 255, 153);">null</span>)也不等同于(<span style="background-color: rgb(153, 255, 153);">null</span>,<span style="background-color: rgb(153, 255, 153);">null</span>),所以上面的两次<span style="background-color: rgb(153, 255, 153);">null</span>能够被插入。

二、null值与执行计划

scott@ORCL> set autot trace exp;
scott@ORCL> select * from t1 where id is <span style="background-color: rgb(153, 255, 153);">null</span>;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     5 |     5 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     5 |     5 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID" IS <span style="background-color: rgb(153, 255, 153);">NULL</span>)

-->从上面的测试可知,由于<span style="background-color: rgb(153, 255, 153);">null</span>值是不被存储的,因此当使用id is <span style="background-color: rgb(153, 255, 153);">null</span>作为谓词时,走了全表扫描
   
scott@ORCL> select * from t1 where id is not <span style="background-color: rgb(153, 255, 153);">null</span>;

Execution Plan
----------------------------------------------------------
Plan hash value: 796913935

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     1 |     1 |     0   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1      |     1 |     1 |     0   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | I_T1_ID |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("ID" IS NOT <span style="background-color: rgb(153, 255, 153);">NULL</span>)

-->从上面的测试可知,尽管当前表上id列上的所有值都为<span style="background-color: rgb(153, 255, 153);">null</span>,但不排除后续记录插入的id不为<span style="background-color: rgb(153, 255, 153);">null</span>的列。
-->故当使用id is not <span style="background-color: rgb(153, 255, 153);">null</span>作为谓词时,此时执行计划中走了索引全扫描。   

-->下面来看看复合索引的情形   
scott@ORCL> select * from t1 where val is <span style="background-color: rgb(153, 255, 153);">null</span>;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     2 |     2 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     2 |     2 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("VAL" IS <span style="background-color: rgb(153, 255, 153);">NULL</span>)

scott@ORCL> select * from t1 where val is not <span style="background-color: rgb(153, 255, 153);">null</span>;

Execution Plan
----------------------------------------------------------
Plan hash value: 1931510411

--------------------------------------------------------------------------------
| Id  | Operation        | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |             |     3 |     3 |     1   (0)| 00:00:01 |
|*  1 |  INDEX FULL SCAN | I_T1_ID_VAL |     3 |     3 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("VAL" IS NOT <span style="background-color: rgb(153, 255, 153);">NULL</span>)

-->对于复合<span style="background-color: rgb(255, 255, 102);">唯一索引</span>的情形,当使用单列且非前导列谓词时,使用is <span style="background-color: rgb(153, 255, 153);">null</span>与 is not <span style="background-color: rgb(153, 255, 153);">null</span>等同于单列<span style="background-color: rgb(255, 255, 102);">唯一索引</span>的情形。
-->即原理也是一样的,val is <span style="background-color: rgb(153, 255, 153);">null</span>走全表扫描而val is not <span style="background-color: rgb(153, 255, 153);">null</span>走索引。因为<span style="background-color: rgb(153, 255, 153);">null</span>值不会被存储。

-->下面看看两个列都作为谓词的情形   
scott@ORCL> select * from t1 where id is <span style="background-color: rgb(153, 255, 153);">null</span> and val is not <span style="background-color: rgb(153, 255, 153);">null</span>;

Execution Plan
----------------------------------------------------------
Plan hash value: 1040510552

--------------------------------------------------------------------------------
| Id  | Operation        | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |             |     3 |     3 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| I_T1_ID_VAL |     3 |     3 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ID" IS <span style="background-color: rgb(153, 255, 153);">NULL</span>)
       filter("VAL" IS NOT <span style="background-color: rgb(153, 255, 153);">NULL</span>)

-->从上面的测试可知,尽管两个谓词列上都存在索引,一个为单列<span style="background-color: rgb(255, 255, 102);">唯一索引</span>,一个为复合<span style="background-color: rgb(255, 255, 102);">唯一索引</span>。Oracle 选择了复合索引I_T1_ID_VAL。	   

scott@ORCL> select * from t1 where id is not <span style="background-color: rgb(153, 255, 153);">null</span> and val is <span style="background-color: rgb(153, 255, 153);">null</span>;

Execution Plan
----------------------------------------------------------
Plan hash value: 796913935

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     1 |     1 |     0   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T1      |     1 |     1 |     0   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | I_T1_ID |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("VAL" IS <span style="background-color: rgb(153, 255, 153);">NULL</span>)
   2 - filter("ID" IS NOT <span style="background-color: rgb(153, 255, 153);">NULL</span>)	   

-->同样的情形,谓词的顺序与复合索引定义的顺序一样,只不过第一个谓词为id is not <span style="background-color: rgb(153, 255, 153);">null</span>,而第二个谓词为val is <span style="background-color: rgb(153, 255, 153);">null</span>。
-->此时Oracle 选择了单列<span style="background-color: rgb(255, 255, 102);">唯一索引</span>I_T1_ID
-->看到此,不知道大家是否已明白,即哪个列为is not <span style="background-color: rgb(153, 255, 153);">null</span>,则会使用该列上的索引,原因还是那句话,索引不存储<span style="background-color: rgb(153, 255, 153);">null</span>值。
-->对于颠倒id列与val列以及id,val列为<span style="background-color: rgb(153, 255, 153);">null</span>或not <span style="background-color: rgb(153, 255, 153);">null</span>的其他不同组合情形不再演示,其执行计划类似。

三、使用is null走索引的情形

scott@ORCL> set autot off;
--删除原有表上的<span style="background-color: rgb(153, 255, 153);">null</span>值记录
scott@ORCL> delete from t1 where val not in('Y','N') or val is <span style="background-color: rgb(153, 255, 153);">null</span>;

3 rows deleted.

scott@ORCL> update t1 set id=1 where val='Y';

1 row updated.

scott@ORCL> update t1 set id=2 where val='N';

1 row updated.

scott@ORCL> commit;

Commit complete.

-->对原有记录更新后的情形
scott@ORCL> select * from t1;

        ID VAL
---------- ------------------------------
         1 Y
         2 N

scott@ORCL> exec dbms_stats.gather_table_stats('SCOTT','T1',cascade=>true);

PL/SQL procedure successfully completed.

-->修改表列id使之具有not <span style="background-color: rgb(153, 255, 153);">null</span>约束的特性
scott@ORCL> alter table t1 modify(id not <span style="background-color: rgb(153, 255, 153);">null</span>);

Table altered.

scott@ORCL> set autot trace exp;
scott@ORCL> select * from t1 where id is <span style="background-color: rgb(153, 255, 153);">null</span>;

Execution Plan
----------------------------------------------------------
Plan hash value: 3160894736

--------------------------------------------------------------------------------
| Id  | Operation        | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |             |     1 |     5 |     0   (0)|          |
|*  1 |  FILTER          |             |       |       |            |          |
|   2 |   INDEX FULL SCAN| I_T1_ID_VAL |     2 |    10 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(<span style="background-color: rgb(153, 255, 153);">NULL</span> IS NOT <span style="background-color: rgb(153, 255, 153);">NULL</span>)

-->从上面的执行计划中可知,当表t1列id上具有not <span style="background-color: rgb(153, 255, 153);">null</span> 约束时,此时使用id is <span style="background-color: rgb(153, 255, 153);">null</span>选择了索引范围扫描

-->下面来看看列val is <span style="background-color: rgb(153, 255, 153);">null</span> 的情形   
scott@ORCL> select * from t1 where val is <span style="background-color: rgb(153, 255, 153);">null</span>;

Execution Plan
----------------------------------------------------------
Plan hash value: 48744011

------------------------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |     5 |     2   (0)| 00:00:01 |
|*  1 |  INDEX FAST FULL SCAN| I_T1_ID_VAL |     1 |     5 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("VAL" IS <span style="background-color: rgb(153, 255, 153);">NULL</span>)

-->尽管val列上允许<span style="background-color: rgb(153, 255, 153);">null</span>值存在,但由于列id上具有not <span style="background-color: rgb(153, 255, 153);">null</span> 约束,且id列与val列存在复合<span style="background-color: rgb(255, 255, 102);">唯一索引</span>,因此此时选择了索引快速全扫描
-->其余不同组合情形大致相同,不再演示

-->为表t1新增一条val为<span style="background-color: rgb(153, 255, 153);">null</span>的记录
scott@ORCL> insert into t1 select 3,<span style="background-color: rgb(153, 255, 153);">null</span> from dual;

1 row created.

scott@ORCL> commit;

Commit complete.

scott@ORCL> exec dbms_stats.gather_table_stats('SCOTT','T1',cascade=>true);

PL/SQL procedure successfully completed.

-->下面的查询中可以看出尽管只有列id有not <span style="background-color: rgb(153, 255, 153);">null</span>约束,当所有的索引值都被存储
scott@ORCL> select index_name,index_type,blevel,leaf_blocks,num_rows,status,distinct_keys
  2  from user_indexes  where table_name='T1';

INDEX_NAME      INDEX_TYPE     BLEVEL LEAF_BLOCKS   NUM_ROWS STATUS   DISTINCT_KEYS
--------------- ---------- ---------- ----------- ---------- -------- -------------
I_T1_ID         NORMAL              0           1          3 VALID                3
I_T1_ID_VAL     NORMAL              0           1          3 VALID                3

-->Author : Robinson Cheng
-->Blog :   http://blog.csdn.net/robinson_0612

四、总结
    无论是单列唯一索引或复合唯一索引,对于可以为null的列或复合null值,Oracle不会为其存储索引值。
    故在基于单列创建B树唯一索引或多列创建B树复合唯一索引的情形下,
    当列上允许为null值时
        where子句使用了基于is null的情形,其执行计划走全表扫描。
        where子句使用了基于is not null的情形,其执行计划走索引扫描(索引范围扫描或索引全扫描)。
    当列上不允许为null值时,存在非null约束
        where子句使用了基于is null的情行,其执行计划走索引扫描。
        where子句使用了基于is not null的情形,其执行计划也是走索引扫描。
    注:此在Oracle 10g R2(linux)下的情形,不同的优化器版本可能会有偏差。

五、更多参考

NULL 值与索引(二)

SQL tuning 步骤

高效SQL语句必杀技

父游标、子游标及共享游标

绑定变量及其优缺点

dbms_xplan之display_cursor函数的使用

dbms_xplan之display函数的使用

执行计划中各字段各模块描述

使用 EXPLAIN PLAN 获取SQL语句执行计划

启用
AUTOTRACE 功能

函数使得索引列失效

Oracle 绑定变量窥探

Oracle 自适应共享游标

   

抱歉!评论已关闭.