A
while ago I wrote several similar articles on how to use the different
Analytic functions in Oracle. I have now chosen to merge these articles
into one useful article/document that you can print out or save to your
computer for your own reference.
In this tutorial I will try to
explain to you how to use them, through practical examples. You may have
some problems following this tutorial if you do not have knowledge
about Oracle’s aggregated functions.
An
analytic function differs from a regular aggregated function (like i.e.
SUM, AVG, MAX, MIN, etc) in the way that they return multiple rows for
each group. The group of rows is called a
window. In a
way you can think about it as if you have a resultset from your query,
and for each record there is a little window to set of other data.
All
the samples in this tutorial are using the tables in the default SCOTT
schema that follows a standard Oracle database installation. This schema
is not always available in your database, depending on the
administrative database setup. And, it might not be available in newer
versions of Oracle. You can i.e. find the scripts here: Oracle SCOTT demo tables.
The common format of the use of an analytic funtion is as follows:
analytic_function(arguments)
OVER (PARTITION BY ORDER BY
) []
The syntax can vary a little bit based on which of the analytic functions it is.
UPDATE 6/27/2012: I added a post on how to use the RANK() and DENSE_RANK() analytic functions: How To Rank Records in Oracle SQL
COUNT aggregate function sample
Consider this use of the COUNT
aggregate function
2 | 2 , count (e.empno) emp_count |
As you can see, for each of the departments in the resultset you will find one number.
COUNT analytic function sample
Now, let us use the COUNT
analytic function and se what happens:
3 | 3 , count (e.empno) over (partition by e.deptno) emp_count |
5 | 5 order by e.deptno, e.empno |
I added the
empno
field to the query so that you easier can see what happens in the
resultset. And, as you can see the result set shows the count of the
number of employees in each department…for EACH of the employees in the
department.
The equivalent query using the COUNT aggregate function would be:
6 | 6 where e2.deptno = e.deptno |
9 | 9 order by e.deptno, e.empno |
Ok, so how can we practically use the COUNT analytic funtion?
COUNT analytic function, practical example
Let
us pretend that the HR department wants to know how many employees
there are in each department in our company, and also how many percent
the count of department employees is out of the total employees. This is
one way you can do this using the COUNT analytic function:
1 | SQL> select distinct e.deptno |
2 | 2 , count (e.empno) over (partition by e.deptno) dept_emp_count |
3 | 3 , count (e.empno) over (partition by 1) overall_emp_count |
5 | 5 ( count (e.empno) over (partition by e.deptno) * 100) / count (e.empno) over (partition by 1) |
9 | DEPTNO DEPT_EMP_COUNT OVERALL_EMP_COUNT DEPTNO_EMP_PERC |
In the query above you will see a “trick” that I use many times in my queries containing analytic function:
partition by 1. This simply means every record in the resultset..
SUM aggregate function sample
Consider this use of the SUM
aggregate function:
2 | 2 , sum (e.sal) emp_salary_sum |
As you can see, for each of the departments in the resultset you will find one number.
SUM analytic function sample
Now, let us use the SUM
analytic function and se what happens:
4 | 4 , sum (e.sal) over (partition by e.deptno) emp_salary_sum |
6 | 6 order by e.deptno, e.empno |
8 | DEPTNO EMPNO SAL EMP_SALARY_SUM |
I added the
empno
field to the query so that you easier can see what happens in the
resultset. And, as you can see the result set shows the sum of the
amount of salary for the employees in each department…for EACH of the
employees in the department.
The equivalent query using the SUM aggregate function would be:
7 | 7 where e2.deptno = e.deptno |
10 | 10 order by e.deptno, e.empno |
12 | DEPTNO EMPNO SAL SUM_DEPT_SALARY |
Ok, so how can we practically use the SUM analytic funtion?
SUM analytic function, practical example
Let
us pretend that the HR department wants to know how much money each of
departments in our company make, and also how many percent this is out
of the total infome for the whole comapny. This is one way you can do
this using the SUM analytic function:
1 | SQL> select distinct e.deptno |
2 | 2 , sum (e.sal) over (partition by e.deptno) dept_emp_salary |
3 | 3 , sum (e.sal) over (partition by 1) overall_emp_salary |
5 | 5 ( sum (e.sal) over (partition by e.deptno) * 100) / sum (e.sal) over (partition by 1) |
9 | DEPTNO DEPT_EMP_SALARY OVERALL_EMP_SALARY DEPTNO_EMP_PERC |
In the query above you will see a “trick” that I use many times in my queries containing analytic function:
partition by 1. This simply means every record in the resultset..
AVG aggregate function sample
Consider this use of the AVG
aggregate function:
2 | 2 ,round( avg (e.sal), 2) avg_salary |
As you can see, for each of the departments in the resultset you will find one number.
AVG analytic function sample
Now, let us use the AVG
analytic function and se what happens:
4 | 4 ,round( avg (e.sal) over (partition by e.deptno), 2) emp_salary_avg |
6 | 6 order by e.deptno, e.empno |
8 | DEPTNO EMPNO SAL EMP_SALARY_AVG |
10 | 10 7782 2450.00 2916.67 |
11 | 10 7839 5000.00 2916.67 |
12 | 10 7934 1300.00 2916.67 |
18 | 30 7499 1600.00 1566.67 |
19 | 30 7521 1250.00 1566.67 |
20 | 30 7654 1250.00 1566.67 |
21 | 30 7698 2850.00 1566.67 |
22 | 30 7844 1500.00 1566.67 |
I added the
empno
field to the query so that you easier can see what happens in the
resultset. And, as you can see the result set shows the avg of the
amount of salary for the employees in each department…for EACH of the
employees in the department.
The equivalent query using the AVG aggregate function would be:
5 | 5 SELECT ROUND( AVG (e2.sal), 2) |
7 | 7 WHERE e2.deptno = e.deptno |
10 | 10 ORDER BY e.deptno, e.empno |
12 | DEPTNO EMPNO SAL SUM_DEPT_SALARY |
14 | 10 7782 2450.00 2916.67 |
15 | 10 7839 5000.00 2916.67 |
16 | 10 7934 1300.00 2916.67 |
22 | 30 7499 1600.00 1566.67 |
23 | 30 7521 1250.00 1566.67 |
24 | 30 7654 1250.00 1566.67 |
25 | 30 7698 2850.00 1566.67 |
26 | 30 7844 1500.00 1566.67 |
Ok, so how can we practically use the AVG analytic funtion?
AVG analytic function, practical example
Let
us pretend that the HR department wants to know how much money, in
average, each of departments in our company make, and also how many
percent this is out of the total income for the whole company. This is
one way you can do this using the AVG analytic function:
1 | SQL> select distinct e.deptno |
2 | 2 ,round( avg (e.sal) over (partition by e.deptno), 2) avg_dept_salary |
3 | 3 ,round( avg (e.sal) over (partition by 1), 2) avg_overall_salary |
4 | 4 ,round((( avg (e.sal) over (partition by e.deptno) * 100) / avg (e.sal) over (partition by 1)), 2) avg_perc_of_avg_tot |
7 | DEPTNO AVG_DEPT_SALARY AVG_OVERALL_SALARY AVG_PERC_OF_AVG_TOT |
9 | 10 2916.67 2073.21 140.68 |
10 | 30 1566.67 2073.21 75.57 |
In the query above you will see a “trick” that I use many times in my queries containing analytic function:
partition by 1. This simply means every record in the resultset..
MIN aggregate function sample
Consider this use of the MIN
aggregate function:
2 | 2 ,ROUND( MIN (e.sal), 2) min_salary |
As you can see, for each of the departments in the resultset you will find one number.
MIN analytic function sample
Now, let us use the MIN
analytic function and se what happens:
4 | 4 , MIN (e.sal) OVER (PARTITION BY e.deptno) emp_salary_min |
6 | 6 ORDER BY e.deptno, e.empno |
8 | DEPTNO EMPNO SAL EMP_SALARY_MIN |
I added the
empno
field to the query so that you easier can see what happens in the
resultset. And, as you can see the result set shows the min salary for
the employees in each department…for EACH of the employees in the
department.
The equivalent query using the MIN aggregate function would be:
5 | 5 SELECT ROUND( MIN (e2.sal), 2) |
7 | 7 WHERE e2.deptno = e.deptno |
10 | 10 ORDER BY e.deptno, e.empno |
12 | DEPTNO EMPNO SAL SUM_DEPT_SALARY |
Ok, so how can we practically use the MIN analytic funtion?
MIN analytic function, practical example
Let
us pretend that the HR department wants to know how many employees in
each department that make more than the minimum salary for the
department. This is one way you can do this using the MIN analytic
function (together with an outer SUM aggregate function):
2 | 2 ,x.dept_emp_salary_min |
4 | 4 WHEN x.sal > dept_emp_salary_min THEN 1 |
6 | 6 END ) dept_emp_sal_above_min |
10 | 10 , MIN (e.sal) OVER (PARTITION BY e.deptno) dept_emp_salary_min |
14 | 14 ,x.dept_emp_salary_min |
16 | DEPTNO DEPT_EMP_SALARY_MIN DEPT_EMP_SAL_ABOVE_MIN |
MAX aggregate function sample
Consider this use of the MAX
aggregate function:
2 | 2 ,ROUND( MAX (e.sal), 2) max_salary |
As you can see, for each of the departments in the resultset you will find one number.
MAX analytic function sample
Now, let us use the MAX
analytic function and se what happens:
4 | 4 , MAX (e.sal) OVER (PARTITION BY e.deptno) emp_salary_max |
6 | 6 ORDER BY e.deptno, e.empno |
8 | DEPTNO EMPNO SAL EMP_SALARY_MAX |
I added the
empno
field to the query so that you easier can see what happens in the
resultset. And, as you can see the result set shows the max salary for
the employees in each department…for EACH of the employees in the
department.
The equivalent query using the MAX aggregate function would be:
5 | 5 SELECT ROUND( MAX (e2.sal), 2) |
7 | 7 WHERE e2.deptno = e.deptno |
10 | 10 ORDER BY e.deptno, e.empno |
12 | DEPTNO EMPNO SAL MAX_DEPT_SALARY |
Ok, so how can we practically use the MAX analytic funtion?
MAX analytic function, practical example
Let
us pretend that the HR department wants to know how many employees in
each department that make less than the maximum salary for the
department. This is one way you can do this using the MAX analytic
function (together with an outer SUM aggregate function):
2 | 2 ,x.dept_emp_salary_max |
4 | 4 WHEN x.sal < dept_emp_salary_max THEN 1 |
6 | 6 END ) dept_emp_sal_less_max |
10 | 10 , MAX (e.sal) OVER (PARTITION BY e.deptno) dept_emp_salary_max |
14 | 14 ,x.dept_emp_salary_max |
16 | DEPTNO DEPT_EMP_SALARY_MAX DEPT_EMP_SAL_LESS_MAX |
The ROWNUM pseudocolumn
Before
we start using the ROW_NUMBER analytic function, let’s look a little
bit on how ROWNUM works in a query. ROWNUM is a so-called pseudocolumn.
It can be added to any select statement as a field.
Here is a sample of the use of ROWNUM:
9 | DEPTNO HIREDATE EMPNO ENAME ROWNUM |
10 | 20 12/17/1980 7369 SMITH 1 |
11 | 30 2/20/1981 7499 ALLEN 2 |
12 | 30 2/22/1981 7521 WARD 3 |
13 | 20 4/2/1981 7566 JONES 4 |
14 | 30 9/28/1981 7654 MARTIN 5 |
15 | 30 5/1/1981 7698 BLAKE 6 |
16 | 10 6/9/1981 7782 CLARK 7 |
17 | 20 4/19/1987 7788 SCOTT 8 |
18 | 10 11/17/1981 7839 KING 9 |
19 | 30 9/8/1981 7844 TURNER 10 |
20 | 20 5/23/1987 7876 ADAMS 11 |
21 | 30 12/3/1981 7900 JAMES 12 |
22 | 20 12/3/1981 7902 FORD 13 |
23 | 10 1/23/1982 7934 MILLER 14 |
The
ROWNUM field just shows the row number in the resultset that is
returned from the query. If you i.e. wanted to only see the first five
records in the resultset you could do like this:
Note
that if you want to do an ORDER BY you will need to do an ORDER BY in
the query above, and then use ROWNUM in a query that you wrap around the
inner query…otherwise the ORDER BY will not work correctly.
The ROW_NUMBER analytic function usage and sample
The
ROW_NUMBER analytic function is a bit different than the ROWNUM
pseudocolumn. When you use ROWNUM you get the row number based on the
WHOLE resultset. When you use the ROW_NUMBER analytical function you
will get a row number based on what is in the
PARTITION BY and the
ORDER BY segments of the expression. Let’s look at a sample.
5 | ,ROW_NUMBER() OVER (PARTITION BY e.deptno ORDER BY e.hiredate) rownumber |
9 | DEPTNO HIREDATE EMPNO ENAME ROWNUMBER |
10 | 10 6/9/1981 7782 CLARK 1 |
11 | 10 11/17/1981 7839 KING 2 |
12 | 10 1/23/1982 7934 MILLER 3 |
13 | 20 12/17/1980 7369 SMITH 1 |
14 | 20 4/2/1981 7566 JONES 2 |
15 | 20 12/3/1981 7902 FORD 3 |
16 | 20 4/19/1987 7788 SCOTT 4 |
17 | 20 5/23/1987 7876 ADAMS 5 |
18 | 30 2/20/1981 7499 ALLEN 1 |
19 | 30 2/22/1981 7521 WARD 2 |
20 | 30 5/1/1981 7698 BLAKE 3 |
21 | 30 9/8/1981 7844 TURNER 4 |
22 | 30 9/28/1981 7654 MARTIN 5 |
23 | 30 12/3/1981 7900 JAMES 6 |
In
the above sample the employees are ranged within their department when
they were hired. The first hire in the department is number 1, etc.
So, if we only want to see i.e. the first employee hired in each department, we can do like this:
11 | ,ROW_NUMBER() OVER (PARTITION BY e.deptno ORDER BY e.hiredate) rownumber |
Please note that it might be tempting to do this:
5 | ,ROW_NUMBER() OVER (PARTITION BY e.deptno ORDER BY e.hiredate) rownumber |
7 | WHERE ROW_NUMBER() OVER (PARTITION BY e.deptno ORDER BY e.hiredate) = 1 |
But, this will cause the following error: “ORA-30483: window functions are not allowed here”.