
A
 while ago I wrote several similar articles on how to use the different 
Analytic functions in Oracle. I have now chosen to merge these articles 
into one useful article/document that you can print out or save to your 
computer for your own reference.
In this tutorial I will try to 
explain to you how to use them, through practical examples. You may have
 some problems following this tutorial if you do not have knowledge 
about Oracle’s aggregated functions.
An
 analytic function differs from a regular aggregated function (like i.e.
 SUM, AVG, MAX, MIN, etc) in the way that they return multiple rows for 
each group. The group of rows is called a 
window. In a 
way you can think about it as if you have a resultset from your query, 
and for each record there is a little window to set of other data.
All
 the samples in this tutorial are using the tables in the default SCOTT 
schema that follows a standard Oracle database installation. This schema
 is not always available in your database, depending on the 
administrative database setup. And, it might not be available in newer 
versions of Oracle. You can i.e. find the scripts here: Oracle SCOTT demo tables.
  
The common format of the use of an analytic funtion is as follows:
analytic_function(arguments)
 OVER (PARTITION BY  ORDER BY 
) []
 
The syntax can vary a little bit based on which of the analytic functions it is.
 UPDATE 6/27/2012: I added a post on how to use the RANK() and DENSE_RANK() analytic functions:  How To Rank Records in Oracle SQL
COUNT aggregate function sample
Consider this use of the COUNT 
aggregate function
| 2 | 2          ,count(e.empno) emp_count | 
 
 
 
As you can see, for each of the departments in the resultset you will find one number.
COUNT analytic function sample
Now, let us use the COUNT 
analytic function and se what happens:
| 3 | 3           ,count(e.empno) over (partition bye.deptno) emp_count | 
 
| 5 | 5  orderbye.deptno, e.empno | 
 
 
 
I added the 
empno
 field to the query so that you easier can see what happens in the 
resultset. And, as you can see the result set shows the count of the 
number of employees in each department…for EACH of the employees in the 
department.
The equivalent query using the COUNT aggregate function would be:
| 6 | 6               wheree2.deptno = e.deptno | 
 
| 9 | 9  orderbye.deptno, e.empno | 
 
 
 
Ok, so how can we practically use the COUNT analytic funtion?
COUNT analytic function, practical example
Let
 us pretend that the HR department wants to know how many employees 
there are in each department in our company, and also how many percent 
the count of department employees is out of the total employees. This is
 one way you can do this using the COUNT analytic function:
| 1 | SQL> selectdistincte.deptno | 
 
| 2 | 2          ,count(e.empno) over (partition bye.deptno) dept_emp_count | 
 
| 3 | 3          ,count(e.empno) over (partition by1) overall_emp_count | 
 
| 5 | 5              (count(e.empno) over (partition bye.deptno) * 100) / count(e.empno) over (partition by1) | 
 
| 9 | DEPTNO DEPT_EMP_COUNT OVERALL_EMP_COUNT DEPTNO_EMP_PERC | 
 
 
 
In the query above you will see a “trick” that I use many times in my queries containing analytic function: 
partition by 1. This simply means every record in the resultset..
SUM aggregate function sample
Consider this use of the SUM 
aggregate function:
| 2 | 2          ,sum(e.sal) emp_salary_sum | 
 
 
 
As you can see, for each of the departments in the resultset you will find one number.
SUM analytic function sample
Now, let us use the SUM 
analytic function and se what happens:
| 4 | 4           ,sum(e.sal) over (partition bye.deptno) emp_salary_sum | 
 
| 6 | 6  orderbye.deptno, e.empno | 
 
| 8 | DEPTNO EMPNO       SAL EMP_SALARY_SUM | 
 
 
 
I added the 
empno
 field to the query so that you easier can see what happens in the 
resultset. And, as you can see the result set shows the sum of the 
amount of salary for the employees in each department…for EACH of the 
employees in the department.
The equivalent query using the SUM aggregate function would be:
| 7 | 7               wheree2.deptno = e.deptno | 
 
| 10 | 10  orderbye.deptno, e.empno | 
 
| 12 | DEPTNO EMPNO       SAL SUM_DEPT_SALARY | 
 
 
 
Ok, so how can we practically use the SUM analytic funtion?
SUM analytic function, practical example
Let
 us pretend that the HR department wants to know how much money each of 
departments in our company make, and also how many percent this is out 
of the total infome for the whole comapny. This is one way you can do 
this using the SUM analytic function:
| 1 | SQL> selectdistincte.deptno | 
 
| 2 | 2          ,sum(e.sal) over (partition bye.deptno) dept_emp_salary | 
 
| 3 | 3          ,sum(e.sal) over (partition by1) overall_emp_salary | 
 
| 5 | 5              (sum(e.sal) over (partition bye.deptno) * 100) / sum(e.sal) over (partition by1) | 
 
| 9 | DEPTNO DEPT_EMP_SALARY OVERALL_EMP_SALARY DEPTNO_EMP_PERC | 
 
 
 
In the query above you will see a “trick” that I use many times in my queries containing analytic function: 
partition by 1. This simply means every record in the resultset..
AVG aggregate function sample
Consider this use of the AVG 
aggregate function:
| 2 | 2          ,round(avg(e.sal), 2) avg_salary | 
 
 
 
As you can see, for each of the departments in the resultset you will find one number.
AVG analytic function sample
Now, let us use the AVG 
analytic function and se what happens:
| 4 | 4          ,round(avg(e.sal) over (partition bye.deptno), 2) emp_salary_avg | 
 
| 6 | 6  orderbye.deptno, e.empno | 
 
| 8 | DEPTNO EMPNO       SAL EMP_SALARY_AVG | 
 
| 10 | 10      7782   2450.00        2916.67 | 
 
| 11 | 10      7839   5000.00        2916.67 | 
 
| 12 | 10      7934   1300.00        2916.67 | 
 
| 18 | 30      7499   1600.00        1566.67 | 
 
| 19 | 30      7521   1250.00        1566.67 | 
 
| 20 | 30      7654   1250.00        1566.67 | 
 
| 21 | 30      7698   2850.00        1566.67 | 
 
| 22 | 30      7844   1500.00        1566.67 | 
 
 
 
I added the 
empno
 field to the query so that you easier can see what happens in the 
resultset. And, as you can see the result set shows the avg of the 
amount of salary for the employees in each department…for EACH of the 
employees in the department.
The equivalent query using the AVG aggregate function would be:
| 5 | 5              SELECTROUND(AVG(e2.sal), 2) | 
 
| 7 | 7              WHEREe2.deptno = e.deptno | 
 
| 10 | 10  ORDERBYe.deptno, e.empno | 
 
| 12 | DEPTNO EMPNO       SAL SUM_DEPT_SALARY | 
 
| 14 | 10      7782   2450.00         2916.67 | 
 
| 15 | 10      7839   5000.00         2916.67 | 
 
| 16 | 10      7934   1300.00         2916.67 | 
 
| 22 | 30      7499   1600.00         1566.67 | 
 
| 23 | 30      7521   1250.00         1566.67 | 
 
| 24 | 30      7654   1250.00         1566.67 | 
 
| 25 | 30      7698   2850.00         1566.67 | 
 
| 26 | 30      7844   1500.00         1566.67 | 
 
 
 
Ok, so how can we practically use the AVG analytic funtion?
AVG analytic function, practical example
Let
 us pretend that the HR department wants to know how much money, in 
average, each of departments in our company make, and also how many 
percent this is out of the total income for the whole company. This is 
one way you can do this using the AVG analytic function:
| 1 | SQL> selectdistincte.deptno | 
 
| 2 | 2          ,round(avg(e.sal) over (partition bye.deptno), 2) avg_dept_salary | 
 
| 3 | 3          ,round(avg(e.sal) over (partition by1), 2) avg_overall_salary | 
 
| 4 | 4          ,round(((avg(e.sal) over (partition bye.deptno) * 100) / avg(e.sal) over (partition by1)), 2) avg_perc_of_avg_tot | 
 
| 7 | DEPTNO AVG_DEPT_SALARY AVG_OVERALL_SALARY AVG_PERC_OF_AVG_TOT | 
 
| 9 | 10             2916.67            2073.21              140.68 | 
 
| 10 | 30             1566.67            2073.21               75.57 | 
 
 
 
In the query above you will see a “trick” that I use many times in my queries containing analytic function: 
partition by 1. This simply means every record in the resultset..
MIN aggregate function sample
Consider this use of the MIN 
aggregate function:
| 2 | 2          ,ROUND(MIN(e.sal), 2) min_salary | 
 
 
 
As you can see, for each of the departments in the resultset you will find one number.
MIN analytic function sample
Now, let us use the MIN 
analytic function and se what happens:
| 4 | 4          ,MIN(e.sal) OVER (PARTITION BYe.deptno) emp_salary_min | 
 
| 6 | 6  ORDERBYe.deptno, e.empno | 
 
| 8 | DEPTNO EMPNO       SAL EMP_SALARY_MIN | 
 
 
 
I added the 
empno
 field to the query so that you easier can see what happens in the 
resultset. And, as you can see the result set shows the min salary for 
the employees in each department…for EACH of the employees in the 
department.
The equivalent query using the MIN aggregate function would be:
| 5 | 5              SELECTROUND(MIN(e2.sal), 2) | 
 
| 7 | 7              WHEREe2.deptno = e.deptno | 
 
| 10 | 10  ORDERBYe.deptno, e.empno | 
 
| 12 | DEPTNO EMPNO       SAL SUM_DEPT_SALARY | 
 
 
 
Ok, so how can we practically use the MIN analytic funtion?
MIN analytic function, practical example
Let
 us pretend that the HR department wants to know how many employees in 
each department that make more than the minimum salary for the 
department. This is one way you can do this using the MIN analytic 
function (together with an outer SUM aggregate function):
| 2 | 2          ,x.dept_emp_salary_min | 
 
| 4 | 4              WHENx.sal > dept_emp_salary_min THEN1 | 
 
| 6 | 6          END) dept_emp_sal_above_min | 
 
| 10 | 10                      ,MIN(e.sal) OVER (PARTITION BYe.deptno) dept_emp_salary_min | 
 
| 14 | 14          ,x.dept_emp_salary_min | 
 
| 16 | DEPTNO DEPT_EMP_SALARY_MIN DEPT_EMP_SAL_ABOVE_MIN | 
 
 
 
MAX aggregate function sample
Consider this use of the MAX 
aggregate function:
| 2 | 2          ,ROUND(MAX(e.sal), 2) max_salary | 
 
 
 
As you can see, for each of the departments in the resultset you will find one number.
MAX analytic function sample
Now, let us use the MAX 
analytic function and se what happens:
| 4 | 4          ,MAX(e.sal) OVER (PARTITION BYe.deptno) emp_salary_max | 
 
| 6 | 6  ORDERBYe.deptno, e.empno | 
 
| 8 | DEPTNO EMPNO       SAL EMP_SALARY_MAX | 
 
 
 
I added the 
empno
 field to the query so that you easier can see what happens in the 
resultset. And, as you can see the result set shows the max salary for 
the employees in each department…for EACH of the employees in the 
department.
The equivalent query using the MAX aggregate function would be:
| 5 | 5              SELECTROUND(MAX(e2.sal), 2) | 
 
| 7 | 7              WHEREe2.deptno = e.deptno | 
 
| 10 | 10  ORDERBYe.deptno, e.empno | 
 
| 12 | DEPTNO EMPNO       SAL MAX_DEPT_SALARY | 
 
 
 
Ok, so how can we practically use the MAX analytic funtion?
MAX analytic function, practical example
Let
 us pretend that the HR department wants to know how many employees in 
each department that make less than the maximum salary for the 
department. This is one way you can do this using the MAX analytic 
function (together with an outer SUM aggregate function):
| 2 | 2          ,x.dept_emp_salary_max | 
 
| 4 | 4              WHENx.sal < dept_emp_salary_max THEN1 | 
 
| 6 | 6          END) dept_emp_sal_less_max | 
 
| 10 | 10                      ,MAX(e.sal) OVER (PARTITION BYe.deptno) dept_emp_salary_max | 
 
| 14 | 14          ,x.dept_emp_salary_max | 
 
| 16 | DEPTNO DEPT_EMP_SALARY_MAX DEPT_EMP_SAL_LESS_MAX | 
 
 
 
The ROWNUM pseudocolumn
Before
 we start using the ROW_NUMBER analytic function, let’s look a little 
bit on how ROWNUM works in a query. ROWNUM is a so-called pseudocolumn. 
It can be added to any select statement as a field.
Here is a sample of the use of ROWNUM:
| 9 | DEPTNO    HIREDATE     EMPNO     ENAME ROWNUM | 
 
| 10 | 20     12/17/1980     7369     SMITH 1 | 
 
| 11 | 30     2/20/1981     7499     ALLEN 2 | 
 
| 12 | 30     2/22/1981     7521     WARD 3 | 
 
| 13 | 20     4/2/1981     7566     JONES 4 | 
 
| 14 | 30     9/28/1981     7654     MARTIN 5 | 
 
| 15 | 30     5/1/1981     7698     BLAKE 6 | 
 
| 16 | 10     6/9/1981     7782     CLARK 7 | 
 
| 17 | 20     4/19/1987     7788     SCOTT 8 | 
 
| 18 | 10     11/17/1981     7839     KING 9 | 
 
| 19 | 30     9/8/1981     7844     TURNER 10 | 
 
| 20 | 20     5/23/1987     7876     ADAMS 11 | 
 
| 21 | 30     12/3/1981     7900     JAMES 12 | 
 
| 22 | 20     12/3/1981     7902     FORD 13 | 
 
| 23 | 10     1/23/1982     7934     MILLER 14 | 
 
 
 
The
 ROWNUM field just shows the row number in the resultset that is 
returned from the query. If you i.e. wanted to only see the first five 
records in the resultset you could do like this:
 Note
 that if you want to do an ORDER BY you will need to do an ORDER BY in 
the query above, and then use ROWNUM in a query that you wrap around the
 inner query…otherwise the ORDER BY will not work correctly.
The ROW_NUMBER analytic function usage and sample
The
 ROW_NUMBER analytic function is a bit different than the ROWNUM 
pseudocolumn. When you use ROWNUM you get the row number based on the 
WHOLE resultset. When you use the ROW_NUMBER analytical function you 
will get a row number based on what is in the 
PARTITION BY and the 
ORDER BY segments of the expression. Let’s look at a sample.
  
| 5 |     ,ROW_NUMBER() OVER (PARTITION BYe.deptno ORDERBYe.hiredate) rownumber | 
 
| 9 | DEPTNO     HIREDATE     EMPNO     ENAME     ROWNUMBER | 
 
| 10 | 10     6/9/1981     7782     CLARK     1 | 
 
| 11 | 10     11/17/1981     7839     KING     2 | 
 
| 12 | 10     1/23/1982     7934     MILLER     3 | 
 
| 13 | 20     12/17/1980     7369     SMITH     1 | 
 
| 14 | 20     4/2/1981     7566     JONES     2 | 
 
| 15 | 20     12/3/1981     7902     FORD     3 | 
 
| 16 | 20     4/19/1987     7788     SCOTT     4 | 
 
| 17 | 20     5/23/1987     7876     ADAMS     5 | 
 
| 18 | 30     2/20/1981     7499     ALLEN     1 | 
 
| 19 | 30     2/22/1981     7521     WARD     2 | 
 
| 20 | 30     5/1/1981     7698     BLAKE     3 | 
 
| 21 | 30     9/8/1981     7844     TURNER     4 | 
 
| 22 | 30     9/28/1981     7654     MARTIN     5 | 
 
| 23 | 30     12/3/1981     7900     JAMES     6 | 
 
 
 
In
 the above sample the employees are ranged within their department when 
they were hired. The first hire in the department is number 1, etc.
So, if we only want to see i.e. the first employee hired in each department, we can do like this:
| 11 |         ,ROW_NUMBER() OVER (PARTITION BYe.deptno ORDERBYe.hiredate) rownumber | 
 
 
 
Please note that it might be tempting to do this:
| 5 |     ,ROW_NUMBER() OVER (PARTITION BYe.deptno ORDERBYe.hiredate) rownumber | 
 
| 7 | WHEREROW_NUMBER() OVER (PARTITION BYe.deptno ORDERBYe.hiredate) = 1 | 
 
 
 
But, this will cause the following error: “ORA-30483: window functions are not allowed here”.
 
No comments :
Post a Comment