Say Hello To Partitioned Outer Joins
Recently, I was helping someone to improve a procedure which fills a table with zero where there is no value for some days.
Instead of cursoring and inserting, I wrote a single query using a partitioned outer join to achieve this. Later in the conversation, I was asked to explain how this was working and I posted about. You can find the original topic here (in french).
A simple example is very efficient to explain the concept of the partitioned outer join. I'm using an example very similar to the one provided by Oracle in their documentation, but I'm going step by step which may help to understand the concept.
First, I'm emulating two tables. A seven days calendar one and a sales one.
I'm doing this with CTE, and to keep the post clear I won't repeat it each time.
(
SELECT date '2009-05-01' + level-1 AS day
FROM dual connect BY level <= 7
),
sales AS
(
SELECT date '2009-05-01' AS day, 'Car' AS product, 100 AS amount FROM dual union ALL
SELECT date '2009-05-03', 'Car' , 75 FROM dual union ALL
SELECT date '2009-05-07', 'Car' , 80 FROM dual union ALL
SELECT date '2009-05-01', 'Bike', 8 FROM dual union ALL
SELECT date '2009-05-02', 'Bike', 5 FROM dual union ALL
SELECT date '2009-05-06', 'Bike', 15 FROM dual union ALL
SELECT date '2009-05-07', 'Bike', 3 FROM dual
)
SELECT * FROM calendar
-- SELECT * FROM sales
Now I want to construct a product daily sales report.
First, I'm trying the classic inner join :
to_char(c.day, 'yyyy-mm-dd') AS day,
s.product,
coalesce(s.amount, 0) AS amount
FROM
calendar c
INNER JOIN sales s
ON s.day = c.day
ORDER BY
s.product ASC,
c.day ASC
2009-05-02 Bike 5
2009-05-06 Bike 15
2009-05-07 Bike 3
2009-05-01 Car 100
2009-05-03 Car 75
2009-05-07 Car 80
There is no gain with this query against a regular select on the sales table.
So what about an outer join ? Let's check it :
to_char(c.day, 'yyyy-mm-dd') AS day,
s.product,
coalesce(s.amount, 0) AS amount
FROM
calendar c
LEFT OUTER JOIN sales s
ON s.day = c.day
ORDER BY
s.product ASC,
c.day ASC
2009-05-02 Bike 5
2009-05-06 Bike 15
2009-05-07 Bike 3
2009-05-01 Car 100
2009-05-03 Car 75
2009-05-07 Car 80
2009-05-04 {null} 0
2009-05-05 {null} 0
Well, it's quite disappointing. I had no sales on 4th and 5th of may so I just obtained two more lines with the outer join.
Fortunately, Oracle thought about this introducing partition outer join in 10g.
The semantic is very clear and easy to use :
to_char(c.day, 'yyyy-mm-dd') AS day,
s.product,
coalesce(s.amount, 0) AS amount
FROM
calendar c
LEFT OUTER JOIN sales s
PARTITION BY (s.product)
ON s.day = c.day
ORDER BY
s.product ASC,
c.day ASC
2009-05-02 Bike 5
2009-05-03 Bike 0
2009-05-04 Bike 0
2009-05-05 Bike 0
2009-05-06 Bike 15
2009-05-07 Bike 3
2009-05-01 Car 100
2009-05-02 Car 0
2009-05-03 Car 75
2009-05-04 Car 0
2009-05-05 Car 0
2009-05-06 Car 0
2009-05-07 Car 80
Bingo ! I can now make some nice sheets !
But this need have more or less always existed.
How one can achieve this if you don't have a partitioned outer join compliant RDBMS ?
By cross joining the calendar with all your products :
to_char(c.day, 'yyyy-mm-dd') AS day,
d.product,
coalesce(s.amount, 0) AS amount
FROM
calendar c
CROSS JOIN (SELECT DISTINCT product FROM sales) d
LEFT OUTER JOIN sales s
ON s.day = c.day
AND s.product = d.product
ORDER BY
d.product ASC,
c.day ASC
2009-05-02 Bike 5
2009-05-03 Bike 0
2009-05-04 Bike 0
2009-05-05 Bike 0
2009-05-06 Bike 15
2009-05-07 Bike 3
2009-05-01 Car 100
2009-05-02 Car 0
2009-05-03 Car 75
2009-05-04 Car 0
2009-05-05 Car 0
2009-05-06 Car 0
2009-05-07 Car 80
Partitioned outer joins are part of the ANSI standard, but i couldn't find in which revision.
Some links about Partitioned outer joins :
Oracle doc
OTN
DB Nemec
And Google as always.
This entry is filed under Datawarehousing, SQLing. And tagged with calendar, cte, Oracle, outer join, SQL. You can follow any responses to this entry through RSS 2.0. You can leave a response, or trackback from your own site.