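Source: http://space.itpub.net/15415488/viewspace-663969
People often mix up these three hints (no_unnest/unnest, push_subq, and push_pred), mainly because they are unclear about how the three query rewrites work. This is a summary. (Test environment: Oracle 10.2.0.4.)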
1. no_unnest, unnest
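unnest refers to subquery unnesting. As the name suggests, it means not leaving the subquery nested on its own inside the statement.
no_unnest is therefore a double negative that amounts to a positive: do not unnest the subquery, and leave it nested inside.
Here is a simple experiment: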
create table hao1 as select * from dba_objects;
create table hao2 as select * from dba_objects;
analyze table hao1 compute statistics;
analyze table hao2 compute statistics;
SQL> select hao1.object_id from hao1 where exists
2 (select 1 from hao2 where hao1.object_id=hao2.object_id*10);
1038 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2662903432
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 8 | 80 (3)| 00:00:01 |
|* 1 | HASH JOIN SEMI | | 1 | 8 | 80 (3)| 00:00:01 |
| 2 | TABLE ACCESS FULL| HAO1 | 10662 | 42648 | 40 (3)| 00:00:01 |
| 3 | TABLE ACCESS FULL| HAO2 | 10663 | 42652 | 40 (3)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID"*10)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
352 consistent gets
0 physical reads
0 redo size
18715 bytes sent via SQL*Net to client
1251 bytes received via SQL*Net from client
71 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1038 rows processed
Here the subquery was unnested automatically: HAO2 and HAO1 are combined with a hash join (HASH JOIN SEMI).
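As a rough illustration of what the unnested plan does, the following Python sketch mimics a hash semi-join for the predicate hao1.object_id = hao2.object_id*10. The function name and the sample lists are made up for illustration, and which input Oracle actually hashes is the optimizer's choice; this sketch simply builds on the subquery side.

```python
def hash_semi_join(hao1_ids, hao2_ids):
    """Return hao1 ids that have at least one match in hao2 (EXISTS semantics)."""
    # Build phase: one pass over the subquery table, hashing the join expression.
    build = {object_id * 10 for object_id in hao2_ids}
    # Probe phase: one pass over the outer table. Each outer row is returned at
    # most once, no matter how many hao2 rows match it.
    return [object_id for object_id in hao1_ids if object_id in build]


if __name__ == "__main__":
    print(hash_semi_join([10, 20, 20, 25, 30], [1, 2, 2, 7]))
```

Each table is read once, which is consistent with the low consistent gets of the hash join plan above.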
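Next, suppose we do not want the HAO2 subquery unnested. Instead we want it evaluated on its own and then combined with the outer query through an operation called FILTER.
To do that, we add the no_unnest hint: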
SQL> select hao1.object_id from hao1 where exists
2 (select /*+no_unnest*/ 1 from hao2 where hao1.object_id=hao2.object_id*10);
1038 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2565749733
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 10750 (1)| 00:01:48 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| HAO1 | 10662 | 42648 | 40 (3)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| HAO2 | 1 | 4 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "HAO2" "HAO2"
WHERE "HAO2"."OBJECT_ID"*10=:B1))
3 - filter("HAO2"."OBJECT_ID"*10=:B1)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1369157 consistent gets
0 physical reads
0 redo size
18715 bytes sent via SQL*Net to client
1251 bytes received via SQL*Net from client
71 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1038 rows processed
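Here HAO1 and HAO2 go through a FILTER operation, which is covered in chapter 9 of Cost-Based Oracle Fundamentals. It closely resembles the familiar nested loop, but what sets it apart is that it maintains a hash table.
For example, if object_id=1 is fetched from HAO1, the subquery against HAO2 becomes select 1 from hao2 where hao2.object_id*10=1. If the condition is satisfied, the subquery's input/output pair is (1, 1): the input is HAO1.object_id=1 and the output is the constant 1.
This pair is stored in the hash table, and because the condition is satisfied, HAO1.object_id=1 is added to the result set.
Next, object_id=2 is fetched from HAO1. If the subquery condition is satisfied again, the subquery produces another input/output pair, (2, 1), which is also stored in the hash table, and HAO1.object_id=2 is added to the result set.
Now suppose HAO1 contains duplicate object_id values, so that the third row fetched from HAO1 also has object_id=2. Because the pair (2, 1) is already in the hash table, there is no need to full-scan HAO2 again; Oracle already knows that object_id=2 belongs in the result set. Compared with a nested loop, FILTER saves one full scan of HAO2 here.
The hash table has a size limit. Once it is full, FILTER behaves much like a nested loop for any further new HAO1.object_id values.
So in terms of buffer gets, FILTER should do better than a nested loop, and the advantage grows when the input passed from the outer query to the subquery (HAO1.object_id in this example) has only a small number of distinct values.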
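The caching behavior described above can be sketched in Python. This is a simplified model under stated assumptions, not Oracle's actual implementation: the function name, the plain dict used as the cache, and the cache_limit parameter are all hypothetical, and the real hash table also has to deal with hash collisions.

```python
def filter_exists(outer_values, inner_values, match, cache_limit=1024):
    """Simulate an EXISTS-style FILTER.

    Returns (result rows, number of times the subquery was actually run).
    """
    cache = {}              # subquery input -> subquery answer (True/False)
    subquery_runs = 0
    result = []
    for value in outer_values:
        if value in cache:
            hit = cache[value]          # answer reused, no scan of the inner table
        else:
            subquery_runs += 1          # one scan of the inner table
            hit = any(match(value, inner) for inner in inner_values)
            if len(cache) < cache_limit:
                cache[value] = hit      # remember the pair while there is room
        if hit:
            result.append(value)
    return result, subquery_runs


if __name__ == "__main__":
    outer = [1, 2, 2, 20, 20, 30]
    inner = [1, 2, 3]
    same_times_ten = lambda o, i: o == i * 10
    print(filter_exists(outer, inner, same_times_ten))                 # cache enabled
    print(filter_exists(outer, inner, same_times_ten, cache_limit=0))  # acts like a nested loop
```

With the cache enabled the subquery runs once per distinct input value; with cache_limit=0 it runs once per outer row, which is the nested-loop behavior.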
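Even in my HAO1/HAO2 example, where HAO1.object_id has more than ten thousand distinct values, I compared FILTER against a nested loop and FILTER still came out slightly ahead: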
SQL> select /*+use_nl(hao1 hao2)*/ hao1.object_id from hao1,hao2 where hao1.object_id=hao2.object_id*10;
1038 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 251947914
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10663 | 85304 | 404K (2)| 01:07:23 |
| 1 | NESTED LOOPS | | 10663 | 85304 | 404K (2)| 01:07:23 |
| 2 | TABLE ACCESS FULL| HAO1 | 10662 | 42648 | 40 (3)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| HAO2 | 1 | 4 | 38 (3)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID"*10)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1503621 consistent gets
0 physical reads
0 redo size
18715 bytes sent via SQL*Net to client
1251 bytes received via SQL*Net from client
71 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1038 rows processed
FILTER used 1369157 consistent gets; the nested loop used 1503621.
To verify the conclusion above, we can run a similar comparison on object_type, which has far fewer distinct values.
SQL> select hao1.object_id from hao1 where exists
2 (select /*+no_unnest*/ 1 from hao2 where hao1.object_type=hao2.object_type);
10662 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2565749733
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 288 | 3168 | 114 (1)| 00:00:02 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| HAO1 | 10662 | 114K| 40 (3)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| HAO2 | 2 | 14 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "HAO2" "HAO2"
WHERE "HAO2"."OBJECT_TYPE"=:B1))
3 - filter("HAO2"."OBJECT_TYPE"=:B1)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
17012 consistent gets
0 physical reads
0 redo size
187491 bytes sent via SQL*Net to client
8302 bytes received via SQL*Net from client
712 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10662 rows processed
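So although this is the same FILTER over full scans of HAO1 and HAO2, the consistent gets differ enormously (17012 versus 1369157) purely because of the number of distinct values passed into the subquery. A nested loop behaves completely differently in this respect.
Of course, for two full-scan row sources like these, a hash join is the best method: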
SQL> select hao1.object_id from hao1 where exists
2 (select 1 from hao2 where hao1.object_type=hao2.object_type);
10662 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3371915275
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10662 | 187K| 81 (4)| 00:00:01 |
|* 1 | HASH JOIN RIGHT SEMI| | 10662 | 187K| 81 (4)| 00:00:01 |
| 2 | TABLE ACCESS FULL | HAO2 | 10663 | 74641 | 40 (3)| 00:00:01 |
| 3 | TABLE ACCESS FULL | HAO1 | 10662 | 114K| 40 (3)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO1"."OBJECT_TYPE"="HAO2"."OBJECT_TYPE")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
985 consistent gets
0 physical reads
0 redo size
187491 bytes sent via SQL*Net to client
8302 bytes received via SQL*Net from client
712 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10662 rows processed
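So when should no_unnest be used, so that the subquery is evaluated on its own and then combined with the outer query through FILTER?
First, the subquery should return a small result set. Second, the input from the outer query should have few distinct values (object_type, for example).
2. push_subq
If no_unnest keeps the subquery from being unnested so that it is evaluated on its own, then push_subq makes the subquery be evaluated as early as possible.
In other words, this hint really controls the order of evaluation relative to the joins.
Here is a simplified simulation of a SQL statement I once ran into on a production database: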
create table hao1 as select * from dba_objects;
create table hao2 as select * from dba_objects;
create table hao3 as select * from dba_objects;
create table hao4 as select * from dba_objects;
create index hao3idx on hao3(object_id);
(Analyze all tables.)
select hao1.object_name from
hao1,hao2,hao4
where hao1.object_name like '%a%'
and hao1.object_id+hao2.object_id>50
and hao4.object_type=hao1.object_type
and 11 in
(SELECT hao3.object_id FROM hao3 WHERE hao1.object_id = hao3.object_id);
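In the SQL above, hao3 and hao1 are joined inside the subquery.
Clearly, if hao1 and hao3 are joined first, the result set will probably contain only one row, or none at all.
However, the CBO produces the following execution plan: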
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 89077 | 3131K| 2070M (1)|999:59:59 |
|* 1 | FILTER | | | | | |
|* 2 | HASH JOIN | | 3234M| 108G| 289K (24)| 00:48:17 |
| 3 | TABLE ACCESS FULL | HAO4 | 36309 | 212K| 126 (3)| 00:00:02 |
| 4 | NESTED LOOPS | | 3296K| 94M| 224K (2)| 00:37:28 |
|* 5 | TABLE ACCESS FULL| HAO1 | 1816 | 47216 | 126 (3)| 00:00:02 |
|* 6 | TABLE ACCESS FULL| HAO2 | 1815 | 7260 | 124 (2)| 00:00:02 |
|* 7 | FILTER | | | | | |
|* 8 | INDEX RANGE SCAN | HAO3IDX | 1 | 4 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ */ 0 FROM "HAO3" "HAO3" WHERE 11=:B1
AND "HAO3"."OBJECT_ID"=11))
2 - access("HAO4"."OBJECT_TYPE"="HAO1"."OBJECT_TYPE")
5 - filter("HAO1"."OBJECT_NAME" LIKE '%a%')
6 - filter("HAO1"."OBJECT_ID"+"HAO2"."OBJECT_ID">50)
7 - filter(11=:B1)
8 - access("HAO3"."OBJECT_ID"=11)
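As the plan shows, hao1, hao2, and hao4 are first joined into an enormous intermediate result (an estimated 3234M rows), and only at the very end is the hao3 subquery applied. This is a very bad plan.
We would rather have the subquery on hao1 and hao3 evaluated first, which push_subq can do: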
select /*+push_subq(@tmp)*/ hao1.object_name from
hao1,hao2,hao4
where hao1.object_name like '%a%'
and hao1.object_id+hao2.object_id>50
and hao4.object_type=hao1.object_type
and 11 in
(SELECT /*+QB_Name(tmp)*/ hao3.object_id FROM hao3 WHERE hao1.object_id = hao3.object_id);
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 161M| 5552M| 14749 (24)| 00:02:28 |
|* 1 | HASH JOIN | | 161M| 5552M| 14748 (24)| 00:02:28 |
| 2 | TABLE ACCESS FULL | HAO4 | 36309 | 212K| 126 (3)| 00:00:02 |
| 3 | NESTED LOOPS | | 164K| 4828K| 11386 (2)| 00:01:54 |
|* 4 | TABLE ACCESS FULL | HAO1 | 91 | 2366 | 126 (3)| 00:00:02 |
|* 5 | FILTER | | | | | |
|* 6 | INDEX RANGE SCAN| HAO3IDX | 1 | 4 | 1 (0)| 00:00:01 |
|* 7 | TABLE ACCESS FULL | HAO2 | 1815 | 7260 | 124 (2)| 00:00:02 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO4"."OBJECT_TYPE"="HAO1"."OBJECT_TYPE")
4 - filter("HAO1"."OBJECT_NAME" LIKE '%a%' AND EXISTS (SELECT /*+
PUSH_SUBQ QB_NAME ("TMP") */ 0 FROM "HAO3" "HAO3" WHERE 11=:B1 AND
"HAO3"."OBJECT_ID"=11))
5 - filter(11=:B1)
6 - access("HAO3"."OBJECT_ID"=11)
7 - filter("HAO1"."OBJECT_ID"+"HAO2"."OBJECT_ID">50)
With the hint added, the SQL finishes in under one second.
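Why the order matters can be shown with a toy Python model. It is only a sketch under assumptions: the two function names are hypothetical, the tables are plain lists of ids, the hao4 join is left out, and "pairs examined" stands in for join work. It compares applying the subquery filter after the join (the unhinted plan shape) with applying it to hao1 before the join (the push_subq plan shape).

```python
def filter_last(hao1, hao2, join_pred, subquery):
    """Join first, then apply the subquery filter at the end."""
    pairs_examined = len(hao1) * len(hao2)
    joined = [(a, b) for a in hao1 for b in hao2 if join_pred(a, b)]
    return [pair for pair in joined if subquery(pair[0])], pairs_examined


def filter_first(hao1, hao2, join_pred, subquery):
    """Apply the subquery filter to hao1 first, then join the survivors."""
    survivors = [a for a in hao1 if subquery(a)]
    pairs_examined = len(survivors) * len(hao2)
    joined = [(a, b) for a in survivors for b in hao2 if join_pred(a, b)]
    return joined, pairs_examined


if __name__ == "__main__":
    ids = list(range(1, 101))
    join_pred = lambda a, b: a + b > 50     # hao1.object_id + hao2.object_id > 50
    subquery = lambda a: a == 11            # 11 in (select ... where hao1.object_id = hao3.object_id)
    late_rows, late_work = filter_last(ids, ids, join_pred, subquery)
    early_rows, early_work = filter_first(ids, ids, join_pred, subquery)
    print(late_rows == early_rows, late_work, early_work)
```

Both orders return the same rows, but the early filter leaves only one hao1 row to join, so far fewer row pairs are examined.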
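3. push_pred
Before discussing the push_pred hint, we first need to be clear about the difference between a mergeable view and an unmergeable view.
The Oracle Concepts documentation explains this clearly: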
Mergeable and Unmergeable Views
The optimizer can merge a view into a referencing query block when the view has one or more base tables, provided the view does not contain:
- set operators (UNION, UNION ALL, INTERSECT, MINUS)
- a CONNECT BY clause
- a ROWNUM pseudocolumn
- aggregate functions (AVG, COUNT, MAX, MIN, SUM) in the select list
- a GROUP BY clause
- a DISTINCT operator in the select list
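At the end of that passage in the documentation, we find that one kind of unmergeable view is a view on the right side of an outer join.
In this situation, the familiar merge hint has no effect either.
For example: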
create or replace view haoview as
select hao1.* from hao1,hao2
where hao1.object_id=hao2.object_id;
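For a simple query like the one below, the view is merged into the outer query, so the join predicate hao3.object_name=haoview.object_name is applied directly to the view's base tables (note that no VIEW operation appears in the plan):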
select hao3.object_name
from hao3,haoview
where hao3.object_name=haoview.object_name
and hao3.object_id=999;
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 44 | 129 (3)| 00:00:02 |
| 1 | NESTED LOOPS | | 1 | 44 | 129 (3)| 00:00:02 |
|* 2 | HASH JOIN | | 1 | 40 | 128 (3)| 00:00:02 |
| 3 | TABLE ACCESS BY INDEX ROWID| HAO3 | 1 | 20 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | HAO3IDX | 1 | | 1 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | HAO1 | 36311 | 709K| 125 (2)| 00:00:02 |
|* 6 | INDEX RANGE SCAN | HAO2IDX | 1 | 4 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("HAO3"."OBJECT_NAME"="HAO1"."OBJECT_NAME")
4 - access("HAO3"."OBJECT_ID"=999)
6 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID")
Next, I put haoview on the right side of an outer join. Now haoview is an unmergeable view, and by default the optimizer does not push the join predicate into it, so haoview is executed on its own first:
select hao3.object_name
from hao3,haoview
where hao3.object_name=haoview.object_name(+)
and hao3.object_id=999;
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 86 | 153 (5)| 00:00:02 |
|* 1 | HASH JOIN OUTER | | 1 | 86 | 153 (5)| 00:00:02 |
| 2 | TABLE ACCESS BY INDEX ROWID| HAO3 | 1 | 20 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | HAO3IDX | 1 | | 1 (0)| 00:00:01 |
| 4 | VIEW | HAOVIEW | 36309 | 2340K| 150 (4)| 00:00:02 |
|* 5 | HASH JOIN | | 36309 | 850K| 150 (4)| 00:00:02 |
| 6 | INDEX FAST FULL SCAN | HAO2IDX | 36309 | 141K| 22 (5)| 00:00:01 |
| 7 | TABLE ACCESS FULL | HAO1 | 36311 | 709K| 125 (2)| 00:00:02 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO3"."OBJECT_NAME"="HAOVIEW"."OBJECT_NAME"(+))
3 - access("HAO3"."OBJECT_ID"=999)
5 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID")
Next, we use the push_pred hint to force the optimizer to push the join predicate into the view, which shows up in the plan as "VIEW PUSHED PREDICATE":
select /*+push_pred(haoview)*/ hao3.object_name
from hao3,haoview
where hao3.object_name=haoview.object_name(+)
and hao3.object_id=999;
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 40 | 128 (2)| 00:00:02 |
| 1 | NESTED LOOPS OUTER | | 1 | 40 | 128 (2)| 00:00:02 |
| 2 | TABLE ACCESS BY INDEX ROWID| HAO3 | 1 | 36 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | HAO3IDX | 1 | | 1 (0)| 00:00:01 |
| 4 | VIEW PUSHED PREDICATE | HAOVIEW | 1 | 4 | 126 (2)| 00:00:02 |
| 5 | NESTED LOOPS | | 1 | 24 | 126 (2)| 00:00:02 |
|* 6 | TABLE ACCESS FULL | HAO1 | 1 | 20 | 125 (2)| 00:00:02 |
|* 7 | INDEX RANGE SCAN | HAO2IDX | 1 | 4 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("HAO3"."OBJECT_ID"=999)
6 - filter("HAO1"."OBJECT_NAME"="HAO3"."OBJECT_NAME")
7 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID")
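The difference between the two outer-join plans can be pictured with a small Python model. Everything in it is an assumption for illustration: the function names, the dict-based rows, and the view_rows_built counter are hypothetical. It only shows how many rows the view has to produce, not how Oracle scans the underlying tables (in the plan above HAO1 is still read with a full scan).

```python
def haoview(hao1, hao2, object_name=None):
    """Hypothetical stand-in for haoview: hao1 joined to hao2 on object_id.

    If object_name is given, that predicate is applied inside the view
    before the join, which is the effect of pushing the join predicate.
    """
    hao2_ids = {row["object_id"] for row in hao2}
    rows = hao1 if object_name is None else [
        row for row in hao1 if row["object_name"] == object_name
    ]
    return [row for row in rows if row["object_id"] in hao2_ids]


def outer_join(driving_rows, hao1, hao2, push_pred):
    """Left outer join of driving_rows to the view; also count view rows built."""
    result, view_rows_built = [], 0
    full_view = None
    if not push_pred:
        full_view = haoview(hao1, hao2)          # whole view evaluated once
        view_rows_built = len(full_view)
    for drv in driving_rows:
        name = drv["object_name"]
        if push_pred:
            matches = haoview(hao1, hao2, name)  # view evaluated per driving row
            view_rows_built += len(matches)
        else:
            matches = [v for v in full_view if v["object_name"] == name]
        if matches:
            result.extend((name, m["object_id"]) for m in matches)
        else:
            result.append((name, None))          # outer join keeps unmatched rows
    return result, view_rows_built


if __name__ == "__main__":
    hao1 = [{"object_id": i, "object_name": f"OBJ{i}"} for i in range(1000)]
    hao2 = [{"object_id": i} for i in range(1000)]
    driving = [{"object_name": "OBJ999"}, {"object_name": "MISSING"}]
    print(outer_join(driving, hao1, hao2, push_pred=False))
    print(outer_join(driving, hao1, hao2, push_pred=True))
```

Both calls return the same joined rows; the pushed version builds only the view rows that match the driving row, which mirrors the drop in the VIEW row estimate from 36309 to 1 in the plans.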
Someone may then ask whether the merge hint can achieve the same effect. The answer is that for this kind of unmergeable view, the merge hint has no effect:
select /*+merge(haoview)*/ hao3.object_name
from hao3,haoview
where hao3.object_name=haoview.object_name(+)
and hao3.object_id=999;
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 86 | 153 (5)| 00:00:02 |
|* 1 | HASH JOIN OUTER | | 1 | 86 | 153 (5)| 00:00:02 |
| 2 | TABLE ACCESS BY INDEX ROWID| HAO3 | 1 | 20 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | HAO3IDX | 1 | | 1 (0)| 00:00:01 |
| 4 | VIEW | HAOVIEW | 36309 | 2340K| 150 (4)| 00:00:02 |
|* 5 | HASH JOIN | | 36309 | 850K| 150 (4)| 00:00:02 |
| 6 | INDEX FAST FULL SCAN | HAO2IDX | 36309 | 141K| 22 (5)| 00:00:01 |
| 7 | TABLE ACCESS FULL | HAO1 | 36311 | 709K| 125 (2)| 00:00:02 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO3"."OBJECT_NAME"="HAOVIEW"."OBJECT_NAME"(+))
3 - access("HAO3"."OBJECT_ID"=999)
5 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID")
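As we can see, for a view on the right side of an outer join like this one, the merge hint is powerless.
To sum up the three hints that people tend to confuse:
no_unnest/unnest controls whether a subquery is unnested, push_subq controls how early a subquery is evaluated relative to the joins, and push_pred pushes a join predicate from the outer query into an unmergeable view.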