现在的位置: 首页 > 综合 > 正文

oracle hash join和nested loop下的驱动表相关测试

2018年02月06日 ⁄ 综合 ⁄ 共 13081字 ⁄ 字号 评论关闭

Oracle 驱动表

Oracle驱动表也叫做外部表,也叫外层表,是在多表关联查询中首先遍历的表,驱动表的每一行都要到另一个表中寻找相应的记录,然后计算返回最终数据。

驱动表的概念只在nested loopshash join时存在。

原则:

1.        驱动表一般是小表,但不绝对,看下边

2.        驱动表一般是通过where条件筛选后剩余行数较少的表。

3.        如果表的一条记录很长,占用几个数据块也适合做驱动表

4.        CBORBO中,对于驱动表的选择是不同的,CBO中通过对统计信息的参考进行计算来选择驱动表,而RBO中按照既定原则选择驱动表。

5.        RBO中,from后边最右边的表为驱动表(from后边表从右向左遍历,where条件从下向上遍历)

6.        涉及驱动表的查询,连接条件的索引很重要,驱动表连接字段可以没有索引,但是被驱动表需要被扫描驱动表经过筛选后剩余条数的遍数,所以被驱动表的连接字段上有一条索引是非常重要的。

分析:

假设a10行记录,b1000行记录,两个表都有id列,查询时使用id列进行关联

Select * from a,b where a.id=b.id anda.id=100;

A表作为驱动表比较合适,假设a.id=100只有1行,即使全表扫描a表也就几个块,假设a表占用10个块。

B表的id假如非唯一,如果b表的id列有索引,b表占用100个块,每个块10行记录,id列索引占用10个块,并且id1002条记录,在两个块中

那么这条语句的成本(以块计算,下同):

A表(10个块)*b表索引(10个块)+bid1002个块=102个块

如果b表没有索引,成本为:

A表(10个块)*b表(100个块)=1000个块

如果ab表都没有索引,可以看出不管哪个表作为驱动表,语句的执行成本都是一样的。

如果abid列都有索引,aid列索引占2个块,成本为:

Aid列索引(2个块)*bid列索引(10个块)+
b
id1002个块=22个块

 

如果B表的记录很长,可以作为驱动表的情况比较复杂,大家可以自己想象适合的场景。

 

可以看出,在连接中,如果连接列有索引是多么的重要。

实验支撑

SQL> create table a(id,name) as selectobject_id,object_name from all_objects where rownum < 200;

 

Table created.

 

SQL>      

SQL> create table b as select * fromall_objects ;             

 

Table created.

 

SQL> select count(*) from a;

 

 COUNT(*)

----------

      199

 

SQL> select count(*) from b

SQL>

 

 COUNT(*)

----------

    89083

 

SQL>

SQL> execdbms_stats.gather_table_stats('TEST','A');

 

PL/SQL procedure successfully completed.

 

SQL>

SQL> execdbms_stats.gather_table_stats('TEST','B');

 

PL/SQL procedure successfully completed.

两个表都没有索引

Select count(*) from a,b wherea.id=b.object_id

And a.id=53

执行计划:(B表驱动)

SQL> Select count(*) from a,b wherea.id=b.object_id

 2  And a.id=53

 3  /

 

 COUNT(*)

----------

        1

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 319234518

 

----------------------------------------------------------------------------

| Id | Operation           | Name |Rows  | Bytes | Cost (%CPU)| Time     |

----------------------------------------------------------------------------

|   0| SELECT STATEMENT    |      |    1 |     9 |   420  (1)| 00:00:01 |

|   1| 
SORT AGGREGATE    
|     |     1 |     9 |            |          |

|*  2|   HASH JOIN         |     |     1 |    9 |   420   (1)| 00:00:01 |

|*  3|    TABLE ACCESS FULL| B    |    1 |     5 |   417  (1)| 00:00:01 |

|*  4|    TABLE ACCESS FULL| A    |    1 |     4 |     3  (0)| 00:00:01 |

----------------------------------------------------------------------------

 

Predicate Information (identified byoperation id):

---------------------------------------------------

 

   2- access("A"."ID"="B"."OBJECT_ID")

   3- filter("B"."OBJECT_ID"=53)

   4- filter("A"."ID"=53)

 

 

Statistics

----------------------------------------------------------

         1  recursive calls

         0  db block gets

      1506  consistent gets

         0  physical reads

         0  redo size

       542  bytes sent via SQL*Net toclient

       543  bytes received via SQL*Netfrom client

         2  SQL*Net roundtrips to/fromclient

         0  sorts (memory)

         0  sorts (disk)

         1  rows processed

 

SQL>

A表作为驱动表

SQL> Select /*+
ordered use_nl(a) 
*/count(*) from a,b where a.id=b.object_id

  2  Anda.id=53;

 

 COUNT(*)

----------

        1

 

1 row selected.

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 1397777030

 

----------------------------------------------------------------------------

| Id | Operation           | Name |Rows  | Bytes | Cost (%CPU)| Time     |

----------------------------------------------------------------------------

|   0| SELECT STATEMENT    |      |    1 |     9 |   420  (1)| 00:00:01 |

|   1|  SORT AGGREGATE     |      |    1 |     9 |            |          |

|*  2|  
HASH JOIN
         |     |     1 |     9 |  420   (1)| 00:00:01 |

|*  3|    TABLE ACCESS FULL| A    |    1 |     4 |     3  (0)| 00:00:01 |

|*  4|    TABLE ACCESS FULL| B    |    1 |     5 |   417  (1)| 00:00:01 |

----------------------------------------------------------------------------

 

Predicate Information (identified byoperation id):

---------------------------------------------------

 

   2 -access("A"."ID"="B"."OBJECT_ID")

   3- filter("A"."ID"=53)

   4- filter("B"."OBJECT_ID"=53)

 

 

Statistics

----------------------------------------------------------

         1  recursive calls

         0  db block gets

      1506  consistent gets

         0  physical reads

         0  redo size

       542  bytes sent via SQL*Net toclient

       543  bytes received via SQL*Netfrom client

         2  SQL*Net roundtrips to/fromclient

         0  sorts (memory)

         0  sorts (disk)

          1 rows processed

 

SQL>

 

发现上面两个语句的代价是一样的

/*+ Ordered use_nl(table_name) */   --使用hint强制表作为驱动表,另外,这里使用的use_nl,但是走的是hash join,说明在没有索引的情况下,oracle优化器更倾向hash
join

执行计划中,hash join下第一个表为驱动表,此处为A表

 

B  object_id列有索引的情况

SQL> create index id_b_object_id onb(object_id);

 

Index created.

 

SQL> execdbms_stats.gather_table_stats(ownname => 'TEST',TABNAME => 'B',CASCADE=> TRUE);

 

PL/SQL procedure successfully completed.

 

SQL>

 

执行计划:

SQL> Select count(*) from a,b wherea.id=b.object_id

 2  And a.id=53;

 

 COUNT(*)

----------

        1

 

1 row selected.

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 3168189658

 

----------------------------------------------------------------------------------------

| Id | Operation             |Name           | Rows  | Bytes | Cost (%CPU)| Time     |

----------------------------------------------------------------------------------------

|   0| SELECT STATEMENT      |                |     1 |    9 |     4   (0)| 00:00:01 |

|   1|  SORT AGGREGATE       |                |     1 |    9 |            |          |

|   2|  
MERGE JOINCARTESIAN|
                |     1 |    9 |     4  (0)| 00:00:01 |

|*  3|    TABLE ACCESS FULL  | A              |     1 |    4 |     3   (0)| 00:00:01 |

|   4|    BUFFER SORT        |                |     1 |    5 |     1   (0)| 00:00:01 |

|*  5|     INDEX RANGE SCAN  | ID_B_OBJECT_ID |     1 |    5 |     1   (0)| 00:00:01 |

----------------------------------------------------------------------------------------

 

Predicate Information (identified byoperation id):

---------------------------------------------------

 

   3- filter("A"."ID"=53)

   5 -access("B"."OBJECT_ID"=53)

 

 

Statistics

----------------------------------------------------------

        92  recursive calls

         0  db block gets

       134  consistent gets

        23  physical reads

         0  redo size

       542  bytes sent via SQL*Net toclient

       543  bytes received via SQL*Netfrom client

         2  SQL*Net roundtrips to/fromclient

        12  sorts (memory)

         0  sorts (disk)

         1  rows processed

 

SQL>

发现执行计划并没有使用nested loophash join,不过走索引后,执行代价明显减少。Merge join发生了排序,如果内存够用还好,不够用就比较耗时了。

 

强制hash

A表驱动

SQL> Select /*+ use_hash(a,b)  */count(*) from a,b where a.id=b.object_id

 2  And a.id=53;

 

 COUNT(*)

----------

        1

 

1 row selected.

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 895278611

 

--------------------------------------------------------------------------------------

| Id | Operation           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0| SELECT STATEMENT    |                |     1 |    9 |     4   (0)| 00:00:01 |

|   1|  SORT AGGREGATE     |                |     1 |    9 |            |          |

|*  2|   HASH JOIN         |                |     1 |    9 |     4   (0)| 00:00:01 |

|*  3|    TABLE ACCESS FULL| A              |     1 |    4 |     3   (0)| 00:00:01 |

|*  4|    INDEX RANGE SCAN | ID_B_OBJECT_ID|     1 |     5 |    1   (0)| 00:00:01 |

--------------------------------------------------------------------------------------

 

Predicate Information (identified byoperation id):

---------------------------------------------------

 

   2- access("A"."ID"="B"."OBJECT_ID")

   3- filter("A"."ID"=53)

   4- access("B"."OBJECT_ID"=53)

 

 

Statistics

----------------------------------------------------------

         1  recursive calls

         0  db block gets

         5  consistent gets

         0  physical reads

         0  redo size

       542  bytes sent via SQL*Net toclient

       543  bytes received via SQL*Netfrom client

         2  SQL*Net roundtrips to/fromclient

         0  sorts (memory)

          0 sorts (disk)

         1  rows processed

 

SQL>

--强制使用hash joina表默认变为了驱动表,执行代价很低,符合要求

B表驱动

SQL> Select /*+ ordered use_hash(b)  */count(*) from a,b where a.id=b.object_id

 2  And a.id=53;

 

 COUNT(*)

----------

        1

 

1 row selected.

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 895278611

 

--------------------------------------------------------------------------------------

| Id | Operation           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0| SELECT STATEMENT    |                |     1 |    9 |     4   (0)| 00:00:01 |

|   1|  SORT AGGREGATE     |                |     1 |    9 |            |          |

|*  2|   HASH JOIN         |                |     1 |    9 |     4   (0)| 00:00:01 |

|*  3|    TABLE ACCESS FULL| A              |     1 |    4 |     3   (0)| 00:00:01 |

|*  4|    INDEX RANGE SCAN | ID_B_OBJECT_ID|     1 |     5|     1  (0)| 00:00:01 |

--------------------------------------------------------------------------------------

 

Predicate Information (identified byoperation id):

---------------------------------------------------

 

   2- access("A"."ID"="B"."OBJECT_ID")

   3- filter("A"."ID"=53)

   4- access("B"."OBJECT_ID"=53)

 

 

Statistics

----------------------------------------------------------

         1  recursive calls

         0  db block gets

         5  consistent gets

         0  physical reads

         0  redo size

       542  bytes sent via SQL*Net toclient

       543  bytes received via SQL*Netfrom client

         2  SQL*Net roundtrips to/fromclient

         0  sorts (memory)

         0  sorts (disk)

         1  rows processed

 

SQL>

发现有索引,并且有统计信息的情况下,无法强制B表作为驱动表,oraclehint进行了忽略。

删除统计信息试试:

SQL> EXEC dbms_stats.delete_table_stats(user,'B',cascade_parts =>TRUE);

 

PL/SQL procedure successfully completed

 

SQL> EXEC dbms_stats.delete_table_stats(user,'A',cascade_parts =>TRUE);

 

PL/SQL procedure successfully completed

 

SQL>

--测试发现仍然不能将B表作为驱动表,修改optimizer_moderule

alter session set optimizer_mode=rule;

SQL> Select /*+ ordered use_nl(b)  */count(*) from a,b where a.id=b.object_id

 2  And object_id=53;

--发现仍然不能将B表作为驱动表

 

强制nested loop

SQL> Select /*+ ordered use_nl(b)  */count(*) from a,b where a.id=b.object_id

 2  And object_id=53;

 

 COUNT(*)

----------

        1

 

1 row selected.

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 1183094437

 

--------------------------------------------------------------------------------------

| Id | Operation           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0| SELECT STATEMENT    |                |     1 |   26 |     4   (0)| 00:00:01 |

|   1|  SORT AGGREGATE     |                |     1 |   26 |            |          |

|   2|   NESTED LOOPS      |                |     1 |   26 |     4   (0)| 00:00:01 |

|*  3|    TABLE ACCESS FULL| A              |     1 |   13 |     3   (0)| 00:00:01 |

|*  4|    INDEX RANGE SCAN | ID_B_OBJECT_ID|     1 |    13 |    1   (0)| 00:00:01 |

--------------------------------------------------------------------------------------

 

Predicate Information (identified byoperation id):

---------------------------------------------------

 

   3- filter("A"."ID"=53)

   4- access("OBJECT_ID"=53)

 

Note

-----

   -dynamic statistics used: dynamic sampling (level=2)

 

 

Statistics

----------------------------------------------------------

        10  recursive calls

         0  db block gets

        73  consistent gets

         1  physical reads

         0  redo size

       542  bytes sent via SQL*Net toclient

       543  bytes received via SQL*Netfrom client

         2  SQL*Net roundtrips to/fromclient

         0  sorts (memory)

         0  sorts (disk)

         1  rows processed

 

SQL>

--代价和hash join差不多,另外,即使强制B表作为驱动表,仍然不能将B表作为驱动表。

两个都有索引的情况

SQL> create index id_a_id on a(id);

 

Index created.

 

SQL> execdbms_stats.gather_table_stats(user,'A',CASCADE=>TRUE);

 

PL/SQL procedure successfully completed.

 

SQL> execdbms_stats.gather_table_stats(user,'B',cascade => true);

 

PL/SQL procedure successfully completed.

 

SQL>

 

SQL> Select /*+ ordered use_nl(b)  */count(*) from a,b where a.id=b.object_id

 2  And object_id=53;

 

 COUNT(*)

----------

        1

 

1 row selected.

 

Elapsed: 00:00:00.01

 

Execution Plan

----------------------------------------------------------

Plan hash value: 2751652919

 

-------------------------------------------------------------------------------------

| Id | Operation          | Name           | Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------------

|   0| SELECT STATEMENT   |                |     1 |    9 |     2   (0)| 00:00:01 |

|   1|  SORT AGGREGATE    |                |     1 |    9 |            |          |

|   2|   NESTED LOOPS     |                |     1 |    9 |     2   (0)| 00:00:01 |

|*  3|    INDEX RANGE SCAN| ID_A_ID        |    1 |     4 |     1  (0)| 00:00:01 |

|*  4|    INDEX RANGE SCAN| ID_B_OBJECT_ID|     1 |     5 |    1   (0)| 00:00:01 |

-------------------------------------------------------------------------------------

 

Predicate Information (identified byoperation id):

---------------------------------------------------

 

   3- access("A"."ID"=53)

   4- access("OBJECT_ID"=53)

 

 

Statistics

----------------------------------------------------------

         1  recursive calls

         0  db block gets

         3  consistent gets

         0  physical reads

         0  redo size

       542  bytes sent via SQL*Net toclient

        543 bytes received via SQL*Net from client

         2  SQL*Net roundtrips to/fromclient

         0  sorts (memory)

         0  sorts (disk)

         1  rows processed

 

SQL>

--hint强制不能将B表作为驱动表

代价明显变小,又减少一倍(索引是多么重要)

  

 我这里使用的是12c的库,发现12c对于执行计划的准确性确实有提高,hint作为辅助手段越来越显得必要性很小,这是dba要失业的劲头还是帮助dba减轻负担,~~

 

 

 

 

 

 

抱歉!评论已关闭.