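FIRST_VALUE: returns the first value in the data window of a group (partition). It is somewhat like the FIRST() aggregate found in some other databases such as Microsoft Access, but it is used quite differently and is more powerful.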
SQL> create table EMP(
id number,
name varchar2(20),
age number,
POS VARCHAR2(20)
);
SQL> alter table EMP
add constraint emp_PK_id primary key (ID);
...
SQL> SELECT * FROM EMP;
ID | NAME | AGE | POS
1 | A | 25 | PM
2 | B | 20 | PM
3 | C | 30 | PL
4 | D | 35 | PL
5 | E | 36 | PL
Requirement: compute the average age for each position, and also list the oldest and the youngest person in each position.
SQL> SELECT DISTINCT
FIRST_VALUE(NAME) OVER
(PARTITION BY POS ORDER BY AGE DESC)
AS MAXAGE_NAME
,FIRST_VALUE(NAME) OVER
(PARTITION BY POS ORDER BY AGE ASC)
AS MINAGE_NAME
,AVG(AGE) OVER
(PARTITION BY POS)
AS AVG_AGE
,POS
FROM EMP
ORDER BY POS;
MAXAGE_NAME | MINAGE_NAME | AVG_AGE | POS
E | C | 33.66666667 | PL
A | B | 22.5 | PM
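For anyone without an Oracle instance at hand, here is a minimal sketch that replays the query above. It assumes Python's built-in sqlite3 module with a SQLite build recent enough to support window functions (3.25 or later); SQLite is only a stand-in for Oracle here, and the column types are my own approximation of the original DDL.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (an assumption, not the original setup).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, pos TEXT)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(1, "A", 25, "PM"), (2, "B", 20, "PM"), (3, "C", 30, "PL"),
     (4, "D", 35, "PL"), (5, "E", 36, "PL")],
)

# Same shape as the query in the post: oldest name, youngest name and
# average age per position, collapsed to one row per position by DISTINCT.
QUERY = """
SELECT DISTINCT
    FIRST_VALUE(name) OVER (PARTITION BY pos ORDER BY age DESC) AS maxage_name,
    FIRST_VALUE(name) OVER (PARTITION BY pos ORDER BY age ASC)  AS minage_name,
    AVG(age) OVER (PARTITION BY pos)                            AS avg_age,
    pos
FROM emp
ORDER BY pos
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The two FIRST_VALUE calls differ only in sort direction, which is what turns "first row of the partition" into "oldest" or "youngest".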
The example in the post above is a classic, but let me add two points:
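1. Using DISTINCT or not makes a difference to what you get back from FIRST_VALUE. With DISTINCT, the result resembles a FIRST() aggregate as found in some other databases such as Microsoft Access: one row per group, holding that group's first value.
Without DISTINCT, taking the example above (partitioned by POS), FIRST_VALUE returns the first value of the partition once for every NAME, that is, once per row: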
SELECT
FIRST_VALUE(NAME) OVER
(PARTITION BY POS ORDER BY AGE DESC)
AS MAXAGE_NAME
,FIRST_VALUE(NAME) OVER
(PARTITION BY POS ORDER BY AGE ASC)
AS MINAGE_NAME
,AVG(AGE) OVER
(PARTITION BY POS)
AS AVG_AGE
,POS
FROM EMP
ORDER BY POS
Result:
MAXAGE_NAME | MINAGE_NAME | AVG_AGE | POS
E | C | 33.6666666666667 | PL
E | C | 33.6666666666667 | PL
E | C | 33.6666666666667 | PL
A | B | 22.5 | PM
A | B | 22.5 | PM
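The row-count difference can be checked with a small sketch. It assumes the same five EMP rows and uses Python's sqlite3 (SQLite 3.25 or later) in place of Oracle; the `run` helper and the `{distinct}` placeholder are just illustrative names of mine.

```python
import sqlite3

# In-memory SQLite stand-in for the EMP table from the thread (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, pos TEXT)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(1, "A", 25, "PM"), (2, "B", 20, "PM"), (3, "C", 30, "PL"),
     (4, "D", 35, "PL"), (5, "E", 36, "PL")],
)

# One query template; only the DISTINCT keyword is toggled.
TEMPLATE = """
SELECT {distinct}
    FIRST_VALUE(name) OVER (PARTITION BY pos ORDER BY age DESC) AS maxage_name,
    pos
FROM emp
ORDER BY pos
"""

def run(distinct):
    """Execute the template with or without DISTINCT and return all rows."""
    keyword = "DISTINCT" if distinct else ""
    return conn.execute(TEMPLATE.format(distinct=keyword)).fetchall()

without_distinct = run(False)  # one output row per input row
with_distinct = run(True)      # duplicates collapsed: one row per position

print(len(without_distinct), len(with_distinct))
```

An analytic function never reduces the number of rows by itself; it is the DISTINCT that collapses the repeated values afterwards.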
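2. Performance.
Using DISTINCT usually costs a SQL statement some performance, but most people have no concrete feel for how much.
According to analysis with Toad (Tool for Oracle Application Developers), when the same SQL is written once with DISTINCT and once with ROWNUM = 1, the DISTINCT version is more than 30% more expensive. The evidence is given below.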
The DISTINCT version:
SELECT DISTINCT
FIRST_VALUE(NAME) OVER
(PARTITION BY ID ORDER BY AGE DESC)
AS MAXAGE_NAME
FROM EMP
ORDER BY POS
The ROWNUM = 1 version:
SELECT MAXAGE_NAME
(SELECT
FIRST_VALUE(NAME) OVER
(PARTITION BY ID ORDER BY AGE DESC)
AS MAXAGE_NAME
FROM EMP
ORDER BY POS) T
WHERE ROWNUM = 1
The SQL in the post above is wrong; corrected examples follow.
The DISTINCT version:
SELECT DISTINCT
FIRST_VALUE(T.NAME) OVER
(PARTITION BY T.POS ORDER BY T.ID DESC)
AS MAXAGE_NAME
FROM EMP T
WHERE T.POS = 'PM'
Total Cost 18
SELECT STATEMENT, GOAL = ALL_ROWS Cost=5 Cardinality=3 Bytes=24
HASH UNIQUE Cost=5 Cardinality=3 Bytes=24
WINDOW SORT Cost=5 Cardinality=3 Bytes=24
TABLE ACCESS FULL Object owner=NB_S Object name=EMP Cost=3 Cardinality=3 Bytes=24
The ROWNUM = 1 version:
SELECT MAXAGE_NAME
FROM
(SELECT
FIRST_VALUE(T.NAME) OVER
(PARTITION BY T.POS ORDER BY T.ID DESC)
AS MAXAGE_NAME
FROM EMP T
WHERE T.POS = 'PM') T1
WHERE ROWNUM = 1
Total Cost 15
SELECT STATEMENT, GOAL = ALL_ROWS Cost=4 Cardinality=3 Bytes=75
VIEW Object owner=NB_S Cost=4 Cardinality=3 Bytes=75
WINDOW SORT PUSHED RANK Cost=4 Cardinality=3 Bytes=24
TABLE ACCESS FULL Object owner=NB_S Object name=EMP Cost=3 Cardinality=3 Bytes=24
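As these plans show, the DISTINCT version, which adds a HASH UNIQUE step on top of the full table scan, is the more expensive of the two (total cost 18 versus 15).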
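One caveat worth checking before swapping DISTINCT for ROWNUM = 1: the two forms only agree when the query is restricted to a single partition, as the WHERE T.POS = 'PM' filter does here. The sketch below illustrates this under some assumptions: it runs on Python's sqlite3 (SQLite 3.25 or later) rather than Oracle, and since SQLite has no ROWNUM, LIMIT 1 is used as a stand-in. The `run` helper and the alias `v` are hypothetical names.

```python
import sqlite3

# In-memory SQLite stand-in for the EMP table from the thread (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, pos TEXT)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(1, "A", 25, "PM"), (2, "B", 20, "PM"), (3, "C", 30, "PL"),
     (4, "D", 35, "PL"), (5, "E", 36, "PL")],
)

def run(sql):
    """Execute a statement and return all rows."""
    return conn.execute(sql).fetchall()

# The window expression used in the corrected posts.
WINDOW = "FIRST_VALUE(name) OVER (PARTITION BY pos ORDER BY id DESC)"

# Single partition: both forms return the same single row.
distinct_pm = run(f"SELECT DISTINCT {WINDOW} AS v FROM emp WHERE pos = 'PM'")
limit_pm = run(
    f"SELECT v FROM (SELECT {WINDOW} AS v FROM emp WHERE pos = 'PM') LIMIT 1"
)

# No filter: DISTINCT yields one row per partition, LIMIT 1 yields only one row.
distinct_all = run(f"SELECT DISTINCT {WINDOW} AS v FROM emp ORDER BY v")
limit_all = run(f"SELECT v FROM (SELECT {WINDOW} AS v FROM emp) LIMIT 1")

print(distinct_pm, limit_pm, distinct_all, limit_all)
```

So the cheaper form is a drop-in replacement only for the single-group case; for a per-group result, DISTINCT (or a GROUP BY) is still needed.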
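3. During the analysis I also found that using the analytic function FIRST_VALUE() does not necessarily lower a statement's cost. The cost of a SQL statement seems to depend directly on both the complexity of its logic and the amount of data in the tables it queries. Here is a counterexample.
Take the same SQL as above,
rewritten without the analytic function FIRST_VALUE():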
SELECT MAXAGE_NAME
FROM (SELECT T.NAME AS MAXAGE_NAME
FROM EMP T
WHERE T.POS = 'PM'
ORDER BY T.ID DESC) T1
WHERE ROWNUM = 1
Total Cost: 7
SELECT STATEMENT, GOAL = ALL_ROWS Cost=2 Cardinality=1 Bytes=12
COUNT STOPKEY
VIEW Object owner=NB_S Cost=2 Cardinality=1 Bytes=12
TABLE ACCESS BY INDEX ROWID Object owner=NB_S Object name=EMP Cost=2 Cardinality=3 Bytes=24
INDEX FULL SCAN DESCENDING Object owner=NB_S Object name=EMP_PK_ID Cost=1 Cardinality=2
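As a sanity check that the plain rewrite returns the same answer as the FIRST_VALUE() form, here is a small sketch. The assumptions are the same as before: Python's sqlite3 (SQLite 3.25 or later) stands in for Oracle, and LIMIT 1 replaces ROWNUM = 1. It compares results only and says nothing about Oracle's optimizer costs.

```python
import sqlite3

# In-memory SQLite stand-in for the EMP table from the thread (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, pos TEXT)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(1, "A", 25, "PM"), (2, "B", 20, "PM"), (3, "C", 30, "PL"),
     (4, "D", 35, "PL"), (5, "E", 36, "PL")],
)

# Analytic form: first NAME of the PM partition when ordered by ID descending.
analytic = conn.execute(
    """
    SELECT DISTINCT
        FIRST_VALUE(name) OVER (PARTITION BY pos ORDER BY id DESC) AS v
    FROM emp
    WHERE pos = 'PM'
    """
).fetchall()

# Plain form: sort the PM rows by ID descending and keep the first one.
plain = conn.execute(
    "SELECT name FROM emp WHERE pos = 'PM' ORDER BY id DESC LIMIT 1"
).fetchall()

print(analytic, plain)
```

In the Oracle plan above, the plain form can walk the primary key index in descending order and stop at the first match, which fits its lower reported cost.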