Introduction
In the examples in this little tutorial I will be using the tables found in the “OE” (Order Entry) sample schema that comes with most Oracle Installations. You can i.e. read more about it here: Oracle® Database Sample SchemasOk, so let us pretend that we have gotten the task to look at the data for a given year, and then rank the customers with the highest order total, and we want to do this for each month.
Example # 1: Using the RANK() analytic function
If you want to present your data with people having the same order total show as ties, RANK() is the analytic function you should use.The syntax for using the RANK() analytic function is as follows:
RANK() OVER ( [ query_partition_clause] ORDER BY clause )
1 | SQL> set pagesize 500 |
2 | SQL> |
3 | SQL> SELECT x.customer_id |
4 | 2 ,x.mnth |
5 | 3 ,x.order_total |
6 | 4 ,RANK() OVER (PARTITION BY x.mnth ORDER BY x.order_total DESC ) month_rank |
7 | 5 FROM ( |
8 | 6 SELECT o.customer_id |
9 | 7 ,EXTRACT ( MONTH FROM o.order_date) mnth |
10 | 8 ,ROUND( SUM (o.order_total), -3) order_total |
11 | 9 FROM oe.orders o |
12 | 10 WHERE EXTRACT ( YEAR FROM o.order_date) = 2007 |
13 | 11 GROUP BY o.customer_id, EXTRACT ( MONTH FROM o.order_date) |
14 | 12 ) x |
15 | 13 ORDER BY x.mnth, 4 ASC |
16 | 14 ; |
17 |
18 | CUSTOMER_ID MNTH ORDER_TOTAL MONTH_RANK |
19 | ----------- ---------- ----------- ---------- |
20 | 122 2 104000 1 |
21 | 119 2 16000 2 |
22 | 109 3 73000 1 |
23 | 149 3 53000 2 |
24 | 107 3 32000 3 |
25 | 108 3 30000 4 |
26 | 102 3 11000 5 |
27 | 105 3 2000 6 |
28 | 104 3 0 7 |
29 | 147 4 2000 1 |
30 | 118 5 79000 1 |
31 | 143 5 27000 2 |
32 | 142 5 26000 3 |
33 | 146 5 18000 4 |
34 | 169 5 16000 5 |
35 | 150 6 283000 1 |
36 | 141 6 38000 2 |
37 | 123 6 11000 3 |
38 | 116 6 11000 3 |
39 | 165 6 3000 5 |
40 | 148 6 3000 5 |
41 | 145 6 1000 7 |
42 | 167 6 0 8 |
43 | 146 6 0 8 |
44 | 166 6 0 8 |
45 | 117 7 104000 1 |
46 | 109 7 78000 2 |
47 | 159 7 69000 3 |
48 | 154 7 27000 4 |
49 | 163 7 1000 5 |
50 | 161 7 1000 5 |
51 | 160 7 1000 5 |
52 | 164 7 1000 5 |
53 | 162 7 0 9 |
54 | 101 8 78000 1 |
55 | 107 8 71000 2 |
56 | 147 8 37000 3 |
57 | 146 8 27000 4 |
58 | 105 8 22000 5 |
59 | 145 8 10000 6 |
60 | 149 9 269000 1 |
61 | 144 9 62000 2 |
62 | 145 9 14000 3 |
63 | 102 9 11000 4 |
64 | 116 9 6000 5 |
65 | 104 9 5000 6 |
66 | 103 9 0 7 |
67 | 101 10 30000 1 |
68 | 118 10 22000 2 |
69 | 149 10 13000 3 |
70 | 144 10 10000 4 |
71 | 103 10 7000 5 |
72 | 116 10 0 6 |
73 | 108 11 120000 1 |
74 | 109 11 93000 2 |
75 | 117 11 50000 3 |
76 | 102 11 42000 4 |
77 | 158 11 25000 5 |
78 | 116 11 15000 6 |
79 | 146 11 14000 7 |
80 | 145 11 10000 8 |
81 | 149 11 9000 9 |
82 | 152 11 8000 10 |
83 | 157 11 7000 11 |
84 | 107 11 1000 12 |
85 | 147 12 296000 1 |
86 | 148 12 32000 2 |
87 | 144 12 11000 3 |
88 |
89 | 68 rows selected |
As you can see, we have some ties. Take a look at i.e. month # 6, and month # 7.
Also, if you look at month # 7, you will see that the rank numbers are consecutive:
1 | SQL> SELECT x.customer_id |
2 | 2 ,x.mnth |
3 | 3 ,x.order_total |
4 | 4 ,RANK() OVER (PARTITION BY x.mnth ORDER BY x.order_total DESC ) month_rank |
5 | 5 FROM ( |
6 | 6 SELECT o.customer_id |
7 | 7 ,EXTRACT ( MONTH FROM o.order_date) mnth |
8 | 8 ,ROUND( SUM (o.order_total), -3) order_total |
9 | 9 FROM oe.orders o |
10 | 10 WHERE EXTRACT ( YEAR FROM o.order_date) = 2007 |
11 | 11 GROUP BY o.customer_id, EXTRACT ( MONTH FROM o.order_date) |
12 | 12 ) x |
13 | 13 WHERE x.mnth = 7 |
14 | 14 ORDER BY x.mnth, 4 ASC |
15 | 15 ; |
16 |
17 | CUSTOMER_ID MNTH ORDER_TOTAL MONTH_RANK |
18 | ----------- ---------- ----------- ---------- |
19 | 117 7 104000 1 |
20 | 109 7 78000 2 |
21 | 159 7 69000 3 |
22 | 154 7 27000 4 |
23 | 163 7 1000 5 |
24 | 161 7 1000 5 |
25 | 160 7 1000 5 |
26 | 164 7 1000 5 |
27 | 162 7 0 9 |
28 |
29 | 9 rows selected |
Example # 2: Using the DENSE_RANK() analytic function
Some times you do not want to show the rank numbers as consecutive. If this is the case for you, you should use the DENSE_RANK() analytic function.The syntax for using the DENSE_RANK() analytic function is as follows:
DENSE_RANK() OVER ( [ query_partition_clause] ORDER BY clause )
1 | SQL> SELECT x.customer_id |
2 | 2 ,x.mnth |
3 | 3 ,x.order_total |
4 | 4 ,DENSE_RANK() OVER (PARTITION BY x.mnth ORDER BY x.order_total DESC ) month_rank |
5 | 5 FROM ( |
6 | 6 SELECT o.customer_id |
7 | 7 ,EXTRACT ( MONTH FROM o.order_date) mnth |
8 | 8 ,ROUND( SUM (o.order_total), -3) order_total |
9 | 9 FROM oe.orders o |
10 | 10 WHERE EXTRACT ( YEAR FROM o.order_date) = 2007 |
11 | 11 GROUP BY o.customer_id, EXTRACT ( MONTH FROM o.order_date) |
12 | 12 ) x |
13 | 13 WHERE x.mnth = 7 |
14 | 14 ORDER BY x.mnth, 4 ASC |
15 | 15 ; |
16 |
17 | CUSTOMER_ID MNTH ORDER_TOTAL MONTH_RANK |
18 | ----------- ---------- ----------- ---------- |
19 | 117 7 104000 1 |
20 | 109 7 78000 2 |
21 | 159 7 69000 3 |
22 | 154 7 27000 4 |
23 | 163 7 1000 5 |
24 | 161 7 1000 5 |
25 | 160 7 1000 5 |
26 | 164 7 1000 5 |
27 | 162 7 0 6 |
28 |
29 | 9 rows selected |
Example # 3: What if I do not want to show any ties?
In some situations you will not want to show any ties at all in your list. How do we do this? One answer is to actually use the ROW_NUMBER() analytic function like this:1 | SQL> SELECT x.customer_id |
2 | 2 ,x.mnth |
3 | 3 ,x.order_total |
4 | 4 ,ROW_NUMBER() OVER (PARTITION BY x.mnth ORDER BY x.order_total DESC ) month_rank |
5 | 5 FROM ( |
6 | 6 SELECT o.customer_id |
7 | 7 ,EXTRACT ( MONTH FROM o.order_date) mnth |
8 | 8 ,ROUND( SUM (o.order_total), -3) order_total |
9 | 9 FROM oe.orders o |
10 | 10 WHERE EXTRACT ( YEAR FROM o.order_date) = 2007 |
11 | 11 GROUP BY o.customer_id, EXTRACT ( MONTH FROM o.order_date) |
12 | 12 ) x |
13 | 13 WHERE x.mnth = 7 |
14 | 14 ORDER BY x.mnth, 4 ASC |
15 | 15 ; |
16 |
17 | CUSTOMER_ID MNTH ORDER_TOTAL MONTH_RANK |
18 | ----------- ---------- ----------- ---------- |
19 | 117 7 104000 1 |
20 | 109 7 78000 2 |
21 | 159 7 69000 3 |
22 | 154 7 27000 4 |
23 | 163 7 1000 5 |
24 | 161 7 1000 6 |
25 | 160 7 1000 7 |
26 | 164 7 1000 8 |
27 | 162 7 0 9 |
28 |
29 | 9 rows selected |
A bit of cautiousness
I just wanted to add a note for example # 1 and 2. If you use the sql as shown in these examples, you might run into a situation where you get a different result each time you run it. The reason for this is that it is not described in the examples how to sort the people that ties. This might be ok for what you are doing, but if you are i.e. showing results for a contest for hundreds of distributors in a Multi Level Marketing (MLM) company, and the results keep on changing…I guarantee you: The phones will be chiming.So, to solve this you can i.e. change the ORDER BY line to something like this:
ORDER BY x.mnth, 4 ASC, x.customer_id
This way you will always get the same result.
Conclusion
So, in conclusion we have the following rules:- If you want to show ties, and non-consecutive numbers: Use the RANK() analytic function
- If you want to show ties, and consecutive numbers: Use the DENSE_RANK() analytic function
- If you do not want to show ties at all, but consecutive numbers: Use the ROW_NUMBER analytic function
- Make sure you always show the same result
No comments :
Post a Comment