Monday, October 8, 2012

How To Work With Downline Hierarchies In Oracle

Oracle SQL Downline HierarchyHieararchies or trees in Oracle using CONNECT BY is actually pretty fun to work with. Lately I have been a real “tre-hugger”, since I have worked a lot with different tree structures, and downlines. I thought I wanted to share some of my latest experiences.

Introduction

In this post I want to cover some of the things that have a bit interesting for me lately. Things like:
  • How to visually draw a hierarchy?
  • What is the difference between table count and downline count?
  • How to sort a CONNECT BY query best possible?
  • How to get bottom left side or bottom right side node for a given node?
  • How to get the upline ids for a given node?
Before we go on I would just want to mention that for this tutorial I threw together some data in two different tables: CUST and DOWNLINE_TEST.

1SQL> desc cust;
2Name      Type         Nullable Default Comments
3--------- ------------ -------- ------- --------
4CUST_ID   NUMBER(9)
5CUST_NAME VARCHAR2(50)                          
6
7SQL>
8SQL> SELECT  *
9  FROM   cust
10  3  ;
11
12   CUST_ID CUST_NAME
13---------- --------------------------------------------------
14         1 BILL
15         2 BOB
16         3 BARRY
17         4 JALENE
18         5 LISA
19         6 ROBERT
20         7 CHRIS
21         8 JASMINE
22         9 JOHN
23        10 CARL
24        11 BRANDON
25        12 STAN
26        13 STEVE
27        14 MARSHALL
28        15 JUSTIN
29        18 MATT
30        19 MARC
31        30 JOE
32        31 CARLA
33        62 ANDY
34        63 STUART
35
3621 rows selected
37
38SQL> desc downline_test;
39Name      Type        Nullable Default Comments
40--------- ----------- -------- ------- --------
41NODE_ID   NUMBER(9)
42PARENT_ID NUMBER(9)
43CUST_ID   NUMBER(9)
44LEFT_ID   NUMBER(9)   Y
45RIGHT_ID  NUMBER(9)   Y
46SIDE      VARCHAR2(1)                          
47
48SQL>
49SQL> SELECT  *
50  FROM   downline_test
51  3  ;
52
53   NODE_ID  PARENT_ID    CUST_ID    LEFT_ID   RIGHT_ID SIDE
54---------- ---------- ---------- ---------- ---------- ----
55      1001       1000          1       1002       1003 R
56      1002       1001          2       1004       1005 L
57      1003       1001          3       1006       1007 R
58      1004       1002          4       1008       1009 L
59      1005       1002          5       1010       1011 R
60      1006       1003          6       1012       1013 L
61      1007       1003          7       1014       1015 R
62      1008       1004          8                       L
63      1009       1004          9       1018       1019 R
64      1010       1005         10                       L
65      1011       1005         11                       R
66      1012       1006         12                       L
67      1013       1006         13                       R
68      1014       1007         14                       L
69      1015       1007         15       1030       1031 L
70      1018       1009         18                       L
71      1019       1009         19                       R
72      1030       1015         30                       R
73      1031       1015         31       1062       1062 L
74      1062       1031         62                       R
75      1063       1031         63                       R
76
7721 rows selected

This image shows the downline visually:
Oracle SQL Hierarchy Sample
Alrighty then, let’s start answering some of the initial questions.

How to visually draw a hierarchy?

In a project I have worked on a bit the last little while, my task has been to generate a binary tree based on data in a different type of tree. There have been many factors involved, like which people should be placed before others in the tree, types of customers, etc., etc.
While you are working your way through the code and see them records actually be generated, it sometimes is nice to visually make a “right answer” to compare with the data you have generated. I was looking around on the Internet a while ago for different tools to use, and found two ways to make a visual tree electronically – in addition to the most obvious method to use.

Drawing hierarchy by hand

This is actually the method I have used most, and it works pretty well, but sometimes it can be pretty tight between the nodes if you do not plan well, and if you mess up…well…toss the paper and start over.

Drawing a hierarchy using Kap Lab’s Kalileo

In the beginning I used this online tool a LOT, since I had to visually show trees that had up to 250-300 nodes in it. The tool is actually a demo for an object you can use with Adobe Flashbuilder (Flex). You can find the demo site for Kalileo here: http://lab.kapit.fr/media/demo/kalileo/prod/index.php
One problem I found using this tool though was that it seems to use some sort of logic when it redraws the hierarchy you just used an hour to create…and what was on the left side is now on the right side, etc. That did not work well for me, and one time I used the tool for almost three hours and was hoping and praying that the power did not go out…since when you save it and reopen it everything looks different.

Drawing a hierarchy using Excel’s “SmartArt”

This tool is what I have been using most lately. You can use “SmartArt” in Word as well, but Excel is best because of the “space”. You can find this in the newer versions of Excel under “Insert” and then “SmartArt”. Here you can choose between several types of hierarchies, etc. I really like this method, and it is very simple to use.The picture of the downline above is made using Excel and “SmartArt”.
The only thing I could not figure out was how to get rid of the 3D look completely, but oh well…it works pretty well.

What is the difference between table count and downline count?

Ok, I guess this might be pretty obvious, but just a reminder. Let us say that you have generated 21 records in your node /hierarchy/downline table, like in the drawing above. If you would want to make sure that you got everything, you would want to count the number of records in the downline_test table:
1SQL> SELECT  COUNT(*)
2  FROM   downline_test
3  3  ;
4
5  COUNT(*)
6----------
7        21
…and then compare it to the count of records your CONNECT BY sql would give you:
1SQL> SELECT  COUNT(*)
2  FROM   downline_test dl
3  3  START WITH dl.node_id = 1001
4  CONNECT BY PRIOR dl.node_id = dl.parent_id
5  5  ;
6
7  COUNT(*)
8----------
9        21
This is of course just an example, and if you have many downlines in your tree, and you want to test counts for only a part of the tree, you would of course have to filter your records accordingly.
But, these numbers need to match. If they do not, something is wrong in your tree.

How to sort a CONNECT BY query best possible?

You can sort your CONNECT BY query in whatever way you want, but if you (like me) work for a company where the left side should be presented before the right side, etc….it is sometimes nice to sort your query like this:
1SQL> SELECT  LEVEL lvl
2  2         ,dl.node_id
3  3          ,dl.cust_id
4  4          ,c.cust_name
5  5          ,dl.parent_id
6  6          ,dl.left_id
7  7          ,dl.right_id
8  8          ,dl.side
9  FROM   cust c
10 10         ,downline_test dl
11 11  WHERE  c.cust_id = dl.cust_id
12 12  START WITH dl.node_id = 1001
13 13  CONNECT BY PRIOR dl.node_id = dl.parent_id
14 14  ORDER SIBLINGS BY dl.side
15 15  ;
16
17       LVL    NODE_ID    CUST_ID CUST_NAME  PARENT_ID    LEFT_ID   RIGHT_ID SIDE
18---------- ---------- ---------- --------- ---------- ---------- ---------- ----
19         1       1001          1 BILL            1000       1002       1003 R
20         2       1002          2 BOB             1001       1004       1005 L
21         3       1004          4 JALENE          1002       1008       1009 L
22         4       1008          8 JASMINE         1004                       L
23         4       1009          9 JOHN            1004       1018       1019 R
24         5       1018         18 MATT            1009                       L
25         5       1019         19 MARC            1009                       R
26         3       1005          5 LISA            1002       1010       1011 R
27         4       1010         10 CARL            1005                       L
28         4       1011         11 BRANDON         1005                       R
29         2       1003          3 BARRY           1001       1006       1007 R
30         3       1006          6 ROBERT          1003       1012       1013 L
31         4       1012         12 STAN            1006                       L
32         4       1013         13 STEVE           1006                       R
33         3       1007          7 CHRIS           1003       1014       1015 R
34         4       1014         14 MARSHALL        1007                       L
35         4       1015         15 JUSTIN          1007       1030       1031 R
36         5       1030         30 JOE             1015                       L
37         5       1031         31 CARLA           1015       1062       1062 R
38         6       1062         62 ANDY            1031                       L
39         6       1063         63 STUART          1031                       R
40
4121 rows selected
In a hierarchy like shown under the “Introduction” part in this tutorial, each person has two legs. Left leg, and right leg. So, for Mr. Top Guy – “BILL” , his left leg consists of “BOB”, “JALENE”, and “JASMINE”. BILL’s right leg consist of “BARRY”, “CHRIS”, “JUSTIN”, “CARLA”, and “STUART”.
So, in the query I just showed you above, the data will be presented left side all the way down, and then the right side. This is done using “ORDER SIBLINGS BY dl.side”, and “L” comes before “R” in the alphabet. Pretty handy. I had never used this until this year, even if I have worked with hierarchies many times during the years as an Oracle developer.

How to get to bottom left side or bottom right side for a given node?

Let us say that our Mr Cool Guy – “BILL” is about to sign up a new person, and he wants to sign this person up on his left side. Below is a query you can use for this:
1SQL> SELECT  dl.node_id
2  FROM   downline_test dl
3  3  START WITH dl.node_id = 1001
4  CONNECT BY PRIOR dl.node_id = dl.parent_id AND dl.side = 'L'
5  5  ;
6
7   NODE_ID
8----------
9      1001
10      1002
11      1004
12      1008
Since we actually want to place our new person on the left side of “JASMINE”, we are not totally done yet, fact being that more than one record is returned. This is a sample on how you can get the bottom left node under “BILL”:
1SQL> SELECT  x.node_id
2  FROM   (
3  3              SELECT LEVEL lvl
4  4                      ,dl.node_id
5  5                      ,MAX(LEVEL) OVER (PARTITION BY 1) max_lvl
6  6              FROM   downline_test dl
7  7              START WITH dl.node_id = 1001
8  8              CONNECT BY PRIOR dl.node_id = dl.parent_id AND dl.side = 'L'
9  9         ) x
10 10  WHERE  x.lvl = x.max_lvl
11 11  ;
12
13   NODE_ID
14----------
15      1008
There you have it. And, of course, if you want the bottom right side, just switch out the ‘L’ with an ‘R’ in the query above.
Also, to find the MAX level I am using an analytic function. You can read more about analytic functions here: How To Use Oracle Analytic Functions in Oracle SQL
And then, last..but least:

How to get the upline ids for a given node?

Ok, so you have been generating records like crazy in your downline table, and suddenly there is an error. You see that it appears under node “X”, but the problem is actually caused due to some placements under one of the upline customers of the customer under node “X”. So, a very nice thing to know then is: How do we get the upline list for “X”?
It is actually pretty simple:
1SQL> SELECT  LEVEL lvl
2  2         ,dl.node_id
3  3          ,dl.cust_id
4  4          ,c.cust_name
5  5          ,dl.parent_id
6  6          ,dl.left_id
7  7          ,dl.right_id
8  8          ,dl.side
9  FROM   cust c
10 10         ,downline_test dl
11 11  WHERE  c.cust_id = dl.cust_id
12 12  START WITH dl.node_id = 1015
13 13  CONNECT BY dl.node_id = PRIOR dl.parent_id
14 14  ;
15
16       LVL    NODE_ID    CUST_ID CUST_NAME  PARENT_ID    LEFT_ID   RIGHT_ID SIDE
17---------- ---------- ---------- --------- ---------- ---------- ---------- ----
18         1       1015         15 JUSTIN          1007       1030       1031 R
19         2       1007          7 CHRIS           1003       1014       1015 R
20         3       1003          3 BARRY           1001       1006       1007 R
21         4       1001          1 BILL            1000       1002       1003 R
The difference between showing downline, and upline is the keyword “PRIOR“. So, if we move “PRIOR” to the left side, we will get the downline:
1SQL> SELECT  LEVEL lvl
2  2         ,dl.node_id
3  3          ,dl.cust_id
4  4          ,c.cust_name
5  5          ,dl.parent_id
6  6          ,dl.left_id
7  7          ,dl.right_id
8  8          ,dl.side
9  FROM   cust c
10 10         ,downline_test dl
11 11  WHERE  c.cust_id = dl.cust_id
12 12  START WITH dl.node_id = 1015
13 13  CONNECT BY PRIOR dl.node_id = dl.parent_id
14 14  ;
15
16       LVL    NODE_ID    CUST_ID CUST_NAME  PARENT_ID    LEFT_ID   RIGHT_ID SIDE
17---------- ---------- ---------- --------- ---------- ---------- ---------- ----
18         1       1015         15 JUSTIN          1007       1030       1031 R
19         2       1030         30 JOE             1015                       L
20         2       1031         31 CARLA           1015       1062       1062 R
21         3       1062         62 ANDY            1031                       L
22         3       1063         63 STUART          1031                       R

Summary

In summary: Make sure you have no holes or missing nodes in your tree by comparing the raw table count agains the count of records of your CONNECT BY query. Also, by moving the PRIOR keyword between your left and right side of the “=” in your CONNECT BY line, you will be able to switch between showing a node’s downline, or upline.

No comments :

Post a Comment