Oracle IN vs. EXISTS and NOT IN vs. NOT EXISTS: how they work and how they perform

August 1, 2014 ⁄ General

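Many people are still confused about IN versus EXISTS and NOT IN versus NOT EXISTS. Some teams go so far as to ban NOT IN and require NOT EXISTS everywhere. Is NOT EXISTS really always more efficient? The tests below put that to the proof.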

First, create some test data.

SQL> drop table test1 purge;
SQL> drop table test2 purge;
SQL> create table test1 as select * from dba_objects where rownum <=1000;
SQL> create table test2 as select * from dba_objects;
SQL> exec dbms_stats.gather_table_stats(user,'test1');
SQL> exec dbms_stats.gather_table_stats(user,'test2');
SQL> set autotrace traceonly

IN vs. EXISTS: mechanism and performance test:

SQL> select * from test1 t1 where t1.object_id in (select t2.object_id from test2 t2);
1000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3819917785
----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |   997 | 84745 |   168   (3)| 00:00:03 |
|*  1 |  HASH JOIN SEMI    |       |   997 | 84745 |   168   (3)| 00:00:03 |
|   2 |   TABLE ACCESS FULL| TEST1 |  1000 | 80000 |     5   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| TEST2 | 50687 |   247K|   162   (2)| 00:00:02 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         95  consistent gets
          0  physical reads
          0  redo size
      45820  bytes sent via SQL*Net to client
       1111  bytes received via SQL*Net from client
         68  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1000  rows processed

SQL> select *  from test1 t1
  2   where exists (select 1 from test2 t2 where t1.object_id = t2.object_id);
1000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3819917785
----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |   997 | 84745 |   168   (3)| 00:00:03 |
|*  1 |  HASH JOIN SEMI    |       |   997 | 84745 |   168   (3)| 00:00:03 |
|   2 |   TABLE ACCESS FULL| TEST1 |  1000 | 80000 |     5   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| TEST2 | 50687 |   247K|   162   (2)| 00:00:02 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         95  consistent gets
          0  physical reads
          0  redo size
      45820  bytes sent via SQL*Net to client
       1111  bytes received via SQL*Net from client
         68  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1000  rows processed
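Conclusion: in Oracle 10g, IN and EXISTS behave the same in this test. Both statements get the same plan (Plan hash value 3819917785), a HASH JOIN SEMI between the two tables, with the same 95 consistent gets. A 10053 event trace also shows that the optimizer ends up transforming the two statements into the same query.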
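
The 10053 check mentioned in the conclusion can be sketched as follows. This is a minimal sketch, assuming a SQL*Plus session that is allowed to run ALTER SESSION; the tracefile identifier `in_vs_exists` is an arbitrary label chosen for this example, not something from the original test. The optimizer trace is only written when a statement is actually optimized, which is why the sketch uses EXPLAIN PLAN to force that step.

```sql
-- Tag the trace file so it is easy to find (the label is arbitrary).
alter session set tracefile_identifier = 'in_vs_exists';

-- Turn on the optimizer (10053) trace for this session.
alter session set events '10053 trace name context forever, level 1';

-- Optimize the IN form.
explain plan for
select * from test1 t1
 where t1.object_id in (select t2.object_id from test2 t2);

-- Optimize the EXISTS form.
explain plan for
select * from test1 t1
 where exists (select 1 from test2 t2 where t1.object_id = t2.object_id);

-- Turn the trace off again.
alter session set events '10053 trace name context off';

-- In 10g the trace file is written to the user dump directory.
show parameter user_dump_dest
```

The trace file can then be opened in a text editor to compare how the optimizer rewrote each statement. The exact layout of the trace differs between Oracle versions, so treat the file as something to read through rather than something to parse by section name.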

NOT IN vs. NOT EXISTS: mechanism and performance test:
An example where NOT EXISTS is more efficient than NOT IN

SQL> select count(*) from test1 where object_id not in(select object_id from test2);
Execution Plan
----------------------------------------------------------
Plan hash value: 3641219899
-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |     1 |     4 | 81076   (2)| 00:16:13 |
|   1 |  SORT AGGREGATE     |       |     1 |     4 |            |          |
|*  2 |   FILTER            |       |       |       |            |          |
|   3 |    TABLE ACCESS FULL| TEST1 |  1000 |  4000 |     5   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| TEST2 |     1 |     5 |   162   (2)| 00:00:02 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST2" "TEST2" WHERE
              LNNVL("OBJECT_ID"<>:B1)))
   4 - filter(LNNVL("OBJECT_ID"<>:B1))
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       9410  consistent gets
          0  physical reads
          0  redo size
        407  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select count(*) from test1 t1 where not exists
    (select 1 from test2 t2 where t1.object_id=t2.object_id);
Execution Plan
----------------------------------------------------------
Plan hash value: 240185659
-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |     1 |     9 |   168   (3)| 00:00:03 |
|   1 |  SORT AGGREGATE     |       |     1 |     9 |            |          |
|*  2 |   HASH JOIN ANTI    |       |     3 |    27 |   168   (3)| 00:00:03 |
|   3 |    TABLE ACCESS FULL| TEST1 |  1000 |  4000 |     5   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| TEST2 | 50687 |   247K|   162   (2)| 00:00:02 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        717  consistent gets
          0  physical reads
          0  redo size
        407  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
          
An example where NOT IN is more efficient than NOT EXISTS
SQL> Set autotrace off
SQL> drop table test1 purge;
Table dropped.
SQL> drop table test2 purge;
Table dropped.
SQL> create table test1 as select * from dba_objects where rownum <=5;
Table created.
SQL> create table test2 as select * from dba_objects;
Table created.
SQL> Insert into test2 select * from dba_objects;
50687 rows created.
SQL> Insert into test2 select * from test2;
101374 rows created.
SQL> Insert into test2 select * from test2;
202748 rows created.
SQL> Commit;
Commit complete.
SQL> exec dbms_stats.gather_table_stats(user,'test1');
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user,'test2');
PL/SQL procedure successfully completed.
SQL> Set autotrace traceonly
SQL> select count(*) from test1 where object_id not in(select object_id from test2);
Execution Plan
----------------------------------------------------------
Plan hash value: 3641219899
-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |     1 |     3 |  3143   (2)| 00:00:38 |
|   1 |  SORT AGGREGATE     |       |     1 |     3 |            |          |
|*  2 |   FILTER            |       |       |       |            |          |
|   3 |    TABLE ACCESS FULL| TEST1 |     5 |    15 |     3   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| TEST2 |     8 |    40 |  1256   (2)| 00:00:16 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST2" "TEST2" WHERE
              LNNVL("OBJECT_ID"<>:B1)))
   4 - filter(LNNVL("OBJECT_ID"<>:B1))
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         23  consistent gets
          0  physical reads
          0  redo size
        407  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select count(*) from test1 t1 where not exists
    (select 1 from test2 t2 where t1.object_id=t2.object_id);
Execution Plan
----------------------------------------------------------
Plan hash value: 240185659
-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |     1 |     8 |  1263   (3)| 00:00:16 |
|   1 |  SORT AGGREGATE     |       |     1 |     8 |            |          |
|*  2 |   HASH JOIN ANTI    |       |     1 |     8 |  1263   (3)| 00:00:16 |
|   3 |    TABLE ACCESS FULL| TEST1 |     5 |    15 |     3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| TEST2 |   405K|  1981K|  1253   (2)| 00:00:16 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       5609  consistent gets
          0  physical reads
          0  redo size
        407  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
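Conclusion: the difference between NOT IN and NOT EXISTS in these tests is the difference between nested loops and a HASH JOIN. The FILTER operation used for NOT IN works much like nested loops: for each row of test1 it probes test2. Comparing the two forms therefore comes down to comparing nested loops with a hash join. In these examples:
    NOT IN outperforms NOT EXISTS when test1 has 5 rows and test2 has about 405,000 rows (23 vs. 5609 consistent gets).
    NOT EXISTS outperforms NOT IN when test1 has 1000 rows and test2 has 50687 rows (717 vs. 9410 consistent gets).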
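
The LNNVL wrapper in the NOT IN predicate sections above is there because OBJECT_ID is nullable in tables created from dba_objects, so the optimizer has to preserve NULL semantics. A sketch, assuming the test1 and test2 tables from the tests above: first confirm the nullability, then rule NULLs out explicitly on both sides. With NULLs excluded, NOT IN and NOT EXISTS mean the same thing, which generally lets the optimizer consider an anti join for NOT IN as well. Whether the plan actually changes depends on the version and the statistics, so it should be checked with autotrace rather than assumed.

```sql
-- Check whether OBJECT_ID is nullable in the two test tables.
select table_name, column_name, nullable
  from user_tab_columns
 where table_name in ('TEST1', 'TEST2')
   and column_name = 'OBJECT_ID';

-- NOT IN with NULLs ruled out on both sides.
-- Note: this also drops test1 rows whose OBJECT_ID is NULL,
-- so it is only equivalent when such rows are not wanted.
select count(*)
  from test1
 where object_id is not null
   and object_id not in (select object_id
                           from test2
                          where object_id is not null);
```

Declaring the column NOT NULL on both tables is another way to give the optimizer the same information, at the cost of changing the schema.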

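There is one more important difference between NOT IN and NOT EXISTS: when the column returned by the subquery contains NULL values, NOT IN returns no rows at all, which is usually not the result the query was meant to produce. NOT EXISTS is not affected by those NULLs.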
http://blog.csdn.net/stevendbaguo/article/details/8270572
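
The NULL behavior can be reproduced with two tiny tables. This is a minimal sketch; `t_outer` and `t_inner` are hypothetical demo tables introduced here, not part of the tests above.

```sql
-- Hypothetical demo tables, not part of the tests above.
create table t_outer (id number);
create table t_inner (id number);

insert into t_outer values (1);
insert into t_outer values (2);
insert into t_outer values (3);
insert into t_inner values (1);
insert into t_inner values (null);
commit;

-- NOT IN: id 1 is found in the list, and for ids 2 and 3 the
-- comparison with the NULL row is UNKNOWN, so no row qualifies.
select count(*) from t_outer o
 where o.id not in (select i.id from t_inner i);
-- returns 0

-- NOT EXISTS: the NULL row never matches, so ids 2 and 3 qualify.
select count(*) from t_outer o
 where not exists (select 1 from t_inner i where i.id = o.id);
-- returns 2

-- Filtering NULLs out of the subquery makes NOT IN agree with NOT EXISTS here.
select count(*) from t_outer o
 where o.id not in (select i.id from t_inner i where i.id is not null);
-- returns 2

drop table t_outer purge;
drop table t_inner purge;
```

The two forms are therefore only interchangeable when the subquery column cannot be NULL, either by constraint or by an explicit IS NOT NULL filter.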
