Tuesday, October 7, 2008

Oracle Lag Analytic Function

It is rare that you may use the lag function, however, if you do need it, it can be a life saver. I've only had to use it once.... ever. The reason for it was so convoluted and asinine that I'll spare you those painful details. The lag function allows you to access data from previous rows. Let's take a look at a simple example:

  1. select product, order_date,   
  2. lag(order_date,1) over (order by order_date) as prev_order_date  
  3. from orders;  


And the result:


PRODUCT ORDER_DATE PREV_ORDER_DATE
------------------------- ------------------------- -------------------------
Basketball 07-OCT-07
Baseball 27-OCT-07 07-OCT-07
Football 16-DEC-07 27-OCT-07
Basketball 11-MAR-08 16-DEC-07
Football 04-AUG-08 11-MAR-08
Basketball 07-OCT-08 04-AUG-08

6 rows selected



So basically allowed me to query my orders table and get the previous order date, without some complicated inline query to try to figure that out. Well that isn't very interesting. So I got the previous order date on the current line, big deal. The cool part of this is what it now allows me to do. I can now answer questions like, "how many days is it between orders?". Let's take a look.

  1. select product, order_date,   
  2. nvl(order_date - lag(order_date,1) over (order by order_date),0) as days_since_last_order  
  3. from orders;  



And the resutls:


PRODUCT ORDER_DATE DAYS_SINCE_LAST_ORDER
------------------------- ------------------------- ----------------------
Basketball 07-OCT-07 0
Baseball 27-OCT-07 20
Football 16-DEC-07 50
Basketball 11-MAR-08 86
Football 04-AUG-08 146
Basketball 07-OCT-08 64

6 rows selected



Its look as if Nick's Basketball, Baseball and Football Emporium will be going out of business. Especially if there are 146 days between orders. Okay so all this looks cool on a report, but I really need to know the average time between orders. Ask and ye shall receive. Sorry to go all biblely on you.

  1. select avg(days_since_last_order)  
  2. from  
  3. (select   
  4. order_date - lag(order_date,1) over (order by order_date) as days_since_last_order  
  5. from orders);  


And the answer:



AVG(DAYS_SINCE_LAST_ORDER)
--------------------------
73.2

1 rows selected




(Note: Oracle doesn't include nulls in the average, which is precisely what we want. If we wanted the first record to show zero, I would have left the nvl function in there.)
So the average days between orders is 73.2. So apparently selling basketballs, baseballs and footballs is not my calling. One more thing, the number that appears inside the parenthesis tells Oracle how far back to go. So if you say 1 it looks at the previous row, 2 the row before the previous row and so on. Let's take a look.

  1. select product, order_date,  
  2. lag(order_date,2) over (order by order_date) as prev_prev_order_date  
  3. from orders  


And the results:


PRODUCT ORDER_DATE PREV_PREV_ORDER_DATE
------------------------- ------------------------- -------------------------
Basketball 07-OCT-07
Baseball 27-OCT-07
Football 16-DEC-07 07-OCT-07
Basketball 11-MAR-08 27-OCT-07
Football 04-AUG-08 16-DEC-07
Basketball 07-OCT-08 11-MAR-08

6 rows selected


Notice now it gives me the order date for the record two rows back. Pretty cool stuff. Again, you may never use it, but its nice to have in your arsenal.

No comments: