You Should Like Analytics
I'm not speaking of Google Analytics - even if it's quite awesome - I'm speaking of SQL analytic functions, or window functions.
I first met them back in 2002, when the firm I was working in upgraded their database from Oracle 7.3.4 to Oracle 8i. At first, I just didn't understand the syntax. It felt so different from the regular SQL I was knowing about. At this time I wasn't running after the cool SQL tricks, I was still a beginner and most of my job was to adjust some sheets within Business Objects.
I'm not going to write a tutorial or even explain the rules of those function. I'll translate an easy problem in both classic-SQL and analytic SQL, that I did answer too on the french-speaking Développez.com forum.The analytic functions I'm going to use are only implemented in Oracle as far as I know. But they are into the SQL:2008 standard so I'm fine with that as other DBMS will adapt to a similar syntax.
The problem is the following : you want to know for each course who are the youngest and the oldest student ?
(
SELECT 1 AS stud_id, 'JAMES' AS name, date '1989-07-14' AS birthdate FROM dual union ALL
SELECT 2, 'JOHN' , date '1991-02-07' FROM dual union ALL
SELECT 3, 'ROBERT' , date '1986-08-23' FROM dual union ALL
SELECT 4, 'MICHAEL', date '1983-12-02' FROM dual union ALL
SELECT 5, 'WILLIAM', date '1988-03-02' FROM dual union ALL
SELECT 6, 'DAVID' , date '1984-05-12' FROM dual union ALL
SELECT 7, 'RICHARD', date '1984-06-06' FROM dual union ALL
SELECT 8, 'CHARLES', date '1984-09-07' FROM dual
),
course AS
(
SELECT 1 AS cour_id, 1 AS stud_id FROM dual union ALL
SELECT 2, 2 FROM dual union ALL
SELECT 1, 3 FROM dual union ALL
SELECT 3, 4 FROM dual union ALL
SELECT 2, 5 FROM dual union ALL
SELECT 2, 6 FROM dual union ALL
SELECT 1, 7 FROM dual union ALL
SELECT 1, 8 FROM dual
)
SELECT *
FROM
students st
INNER JOIN course co
ON co.stud_id = st.stud_id
Here are just some datas to work with. I won't repeat the CTEs in the following queries, but they are necessary.
The classic SQL makes us first find what are the min and max birthdate by course - via a subquery - and then finding the students born at those dates :
sq.cour_id,
sy.name AS youngest,
so.name AS oldest
FROM
(
SELECT
co.cour_id,
min(st.birthdate) AS bmin,
max(st.birthdate) AS bmax
FROM
students st
INNER JOIN course co
ON co.stud_id = st.stud_id
GROUP BY
co.cour_id
) sq
INNER JOIN students so
ON so.birthdate = sq.bmin
INNER JOIN students sy
ON sy.birthdate = sq.bmax
ORDER BY
sq.cour_id ASC
As you can see, the DMBS have to scan the student table three times.
Now with analytic SQL it's just one scan and a shorter query :
co.cour_id,
max(st.name) keep (dense_rank last ORDER BY st.birthdate ASC) AS youngest,
max(st.name) keep (dense_rank first ORDER BY st.birthdate ASC) AS oldest
FROM
students st
INNER JOIN course co
ON co.stud_id = st.stud_id
GROUP BY
co.cour_id
ORDER BY
co.cour_id ASC
Is'nt it very hot ?
This entry is filed under SQLing. And tagged with analytic functions, business objects, google, hot, Oracle, SQL. You can follow any responses to this entry through RSS 2.0. You can leave a response, or trackback from your own site.
[...] For those who don't have a clue about this function, I wrote a small post on this subject. [...]