My TSQL Challenge 10 Pivot-UnPivot Solution
This is the query I sent to the TC team. I published it when I sent it, but they prefer to keep solution secret as long as the closure date isn't in the past, which I can understand.
I had fun using PIVOT and UNPIVOT operators to reduce the query size.
PIVOT / UNPIVOT syntax isn't part of ANSI SQL but is supported in both SQL Server 2005+ and Oracle 11g.
First, I've got to assign an unique identifier to the rows of the table.
The easy way is to add an INT IDENTITY column to the table.
If not allowed or supported, one can use rownum pseudocolumn in Oracle or a row_number() trick on SQL Server :
DECLARE @T1 TABLE
(ID TINYINT IDENTITY, C1 CHAR(1), C2 CHAR(1), C3 CHAR(1));
INSERT INTO @T2 (C1, C2, C3) VALUES ('2','1','3');
INSERT INTO @T2 (C1, C2, C3) VALUES ('3','2','1');
INSERT INTO @T2 (C1, C2, C3) VALUES ('Z','X','Y');
INSERT INTO @T2 (C1, C2, C3) VALUES ('B','C','D');
INSERT INTO @T2 (C1, C2, C3) VALUES ('Y','Z','X');
INSERT INTO @T2 (C1, C2, C3) VALUES ('B','C','A');
INSERT INTO @T1 (C1, C2, C3) SELECT * FROM @T2;
SELECT * FROM @T1 ORDER BY ID ASC;
SELECT ROW_NUMBER() OVER(ORDER BY GETDATE() ASC) AS ID, C1, C2, C3 FROM @T2 ORDER BY 1 ASC;
ID C1 C2 C3
---- ---- ---- ----
1 2 1 3
2 3 2 1
3 Z X Y
4 B C D
5 Y Z X
6 B C A
As this is not a real problem I'm presuming to have an identity column to earn a step in the query.
This challenge is on a three column sized table, but in my opinion the query should be able to handle an m-columns table.
On T-SQL Challenge #10 official forum there is a "brute force" solution, but good luck using this method even with a five columns table.
I would rather use the sorting abilities of RDBMS to do this for me !
So i'm going to unpivot the data. For those who aren't familiar with this operator, unpivot is similar to a multi "union all" query.
The two following queries give similar results (@T is similar to previous @T1) :
SELECT ID, C2 , 'C2' FROM @T UNION ALL
SELECT ID, C3 , 'C3' FROM @T
ORDER BY 1 ASC, 3 ASC;
SELECT * FROM @T
UNPIVOT (VAL FOR COL IN (C1, C2, C3)) AS UPV
ORDER BY 1 ASC, 3 ASC;
ID VAL COL
---- ---- ----
1 2 C1
1 1 C2
1 3 C3
2 3 C1
2 2 C2
2 1 C3
3 Z C1
3 X C2
3 Y C3
4 B C1
4 C C2
4 D C3
5 Y C1
5 Z C2
5 X C3
6 B C1
6 C C2
6 A C3
Two things about this dataset.
Firstly, I don't really need the "COL" column. Being provided by default by the unpivot operator, I just have to not select it.
Secondly, I don't need to really sort it. I just need to know what are the rank of each value along the same id.
To compute this, I'm of course using an analytical function.
So, injecting a CTE for further use, my query is becoming :
(
SELECT
ID,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY VAL ASC),
VAL
FROM @T
UNPIVOT (VAL FOR COL IN (C1, C2, C3)) AS UPV
)
SELECT * FROM UNPVT
ORDER BY 1 ASC, 2 ASC;
ID RN VAL
---- ---- ----
1 1 1
1 2 2
1 3 3
2 1 1
2 2 2
2 3 3
3 1 X
3 2 Y
3 3 Z
4 1 B
4 2 C
4 3 D
5 1 X
5 2 Y
5 3 Z
6 1 A
6 2 B
6 3 C
We can see the duplicates between the couples (1,2) and (3, 5), but I'll filter them later.
From this dataset, I just need to re-pivot the data.
For those who aren't familiar with this operator, pivot is similar to a select with hard coded max(case) against a group by / pivot axis.
The two following queries give similar results :
ID,
MAX(CASE RN WHEN 1 THEN VAL END) AS [1],
MAX(CASE RN WHEN 2 THEN VAL END) AS [2],
MAX(CASE RN WHEN 3 THEN VAL END) AS [3]
FROM UNPVT
GROUP BY ID;
SELECT * FROM UNPVT
PIVOT (MAX(VAL) FOR RN IN ([1], [2], [3])) AS PVT;
ID 1 2 3
---- ---- ---- ----
1 1 2 3
2 1 2 3
3 X Y Z
4 B C D
5 X Y Z
6 A B C
And now, it's just a regular distinct, which can be applied to the same query :
[1] AS C1,
[2] AS C2,
[3] AS C3
FROM UNPVT
PIVOT (MAX(VAL) FOR RN IN ([1], [2], [3])) AS PVT
ORDER BY 1 ASC, 2 ASC, 3 ASC;
C1 C2 C3
---- ---- ----
1 2 3
A B C
B C D
X Y Z
Hurray, challenge solved !
Here is the complete query using PIVOT / UNPIVOT operators (not so verbose in fine) :
(
SELECT
ID,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY VAL ASC),
VAL
FROM @T
UNPIVOT (VAL FOR COL IN (C1, C2, C3)) AS UPV
)
SELECT DISTINCT
[1] AS C1,
[2] AS C2,
[3] AS C3
FROM UNPVT
PIVOT (MAX(VAL) FOR RN IN ([1], [2], [3])) AS PVT
ORDER BY 1 ASC, 2 ASC, 3 ASC;
Here is the complete query not using PIVOT / UNPIVOT operators :
MAX(CASE RN WHEN 1 THEN VAL END) AS C1,
MAX(CASE RN WHEN 2 THEN VAL END) AS C2,
MAX(CASE RN WHEN 3 THEN VAL END) AS C3
FROM
(
SELECT
ID,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY VAL ASC) AS RN,
VAL
FROM
(
SELECT ID, C1 AS VAL FROM @T UNION ALL
SELECT ID, C2 FROM @T UNION ALL
SELECT ID, C3 FROM @T
) AS UPVT
) AS PVT
GROUP BY ID
ORDER BY 1 ASC, 2 ASC, 3 ASC;
This entry is filed under SQLing. And tagged with analytic functions, Challenges, Pivot, TSQL, Unpivot. You can follow any responses to this entry through RSS 2.0. You can leave a response, or trackback from your own site.
SQL> select banner from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Product
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
For me it works on 10 xe.
But I think that relying on Set to eliminate duplicated Nested Tables components is risky.
Works fine in 11g Enterprise Edition, but I had error executing it on 10g Express, I suppose you use some recent features !
It seems I have some issue with my syntax highlighter into comments, I'll check that tonight.
Edit : I had to remove line numbers in code to fix the problem.
Thanks mnitu, I love how well you handle Oracle objects !
It’s not T SQL it’s not exactly what was demanded but I had fun writing it.
/
CREATE OR REPLACE type nt_nt_v2 AS TABLE of nt_v2
/
SELECT Cursor(SELECT column_value
FROM TABLE(b.column_value)
ORDER BY 1) nt
FROM (
SELECT SET(nc) snt
FROM (
SELECT Cast(Collect(nt) AS nt_nt_v2) nc
FROM (
SELECT nt_v2(c1,c2,c3) AS nt, least(c1,c2,c3)
FROM t
ORDER BY 2
)
)
), TABLE(snt) b