Fake Cross Join : Is Oracle Really Four Times Better Than MS SQL Server ?

2 Comments

During my researches about SQL Challenge, I had an idea which I think is a fairly good one.

I had to cross join a lot of time the same value, cross join being very expensive. So I thought about quite of "faking" it, using inner join with a not so hard joining predicate. In the problematic, I had to generate the distribution of faces, using a cross join and then filtering it, join by join.

In the yesterday's SQL, with k being the number of faces and N the number of throws, cross joins gives N^k rows, where i needed "only" (N+k-1)!/(N!*(k-1)!), which is way lower (for N=20 and k=6, it is 64.000.000 versus 53.130 rows).

To fake it, I began by identifying that my first column of data is the master. Every other one will have to be inferior or equal to it, with the sum of them inferior or equal to N. But other joins are quite cross joined among them but with a lower cardinality (because they are inferior to first column).

I can illustrate with my three sided-"dicentity", and with three throws. First cross join version :

WITH
-- Compute factorials
f AS
(
SELECT 0 AS l,
cast(1 AS float(53)) AS f
UNION ALL
SELECT l+1,
cast((l+1)*f AS float(53))
FROM f
WHERE l+1 <= 3
),
-- Compute multinomial coefficients
m AS
(
SELECT
f1.l AS l1,
f2.l AS l2,
f3.l AS l3
FROM
f f1
CROSS JOIN f f2
CROSS JOIN f f3
WHERE f1.l + f2.l <= 3
AND f1.l + f2.l + f3.l = 3
)
SELECT row_number() over(ORDER BY l1 ASC, l2 ASC, l3 ASC) rk, m.*
FROM m
**************
rk l1 l2 l3
1 0 0 3
2 0 1 2
3 0 2 1
4 0 3 0
5 1 0 2
6 1 1 1
7 1 2 0
8 2 0 1
9 2 1 0
10 3 0 0

My point is about rows (1, 4, 10) ; (2, 7, 8) ; (3, 5, 9) are very similar looking. I've then modified the compute multinomial coefficient part by (rest of query remains the same) :

SELECT
f1.l AS l1,
f2.l AS l2,
f3.l AS l3
FROM
f f1
INNER JOIN f f2 ON f2.l <= f1.l AND f1.l + f2.l <= 3
INNER JOIN f f3 ON f3.l <= f1.l AND f1.l + f2.l + f3.l = 3
**************
rk l1 l2 l3
1 1 1 1
2 2 0 1
3 2 1 0
4 3 0 0

And my idea stands where previous (1, 4, 10) being row #4 ; (2, 7, 8) being row #2 ; (3, 5, 9) being row #3.

But it still need the other lines. What about a nice shifting ?

-- Shifting strategy
s AS
(
SELECT l1, l2, l3 FROM m UNION
SELECT l2, l3, l1 FROM m UNION
SELECT l3, l1, l2 FROM m
)
SELECT row_number() over(ORDER BY l1 ASC, l2 ASC, l3 ASC) rk, s.*
FROM s
**************
rk l1 l2 l3
1 0 0 3
2 0 1 2
3 0 2 1
4 0 3 0
5 1 0 2
6 1 1 1
7 1 2 0
8 2 0 1
9 2 1 0
10 3 0 0

I've got the same result than with cross join but without any cross join !

But one last question remains... Is it efficient ? And my answer is, well, it depends. Of your RDMBS.

I did ran some test on both Oracle 11g and SQL Server :

  • on Oracle 11g, it's almost twice faster : nice improvement !
  • on SQL Server 2k5+, it's almost twice slower : bad improvement slowdow optimization <insert your own word here>

Here are the numbers i crushed :

Oracle 11g SQL Server 2005
Number of throws Left shifting Cross joining Ratio CJ / LS Left shifting Cross joining Ratio CJ / LS
20 1 2 2 3 1 0,33
24 2 5 2,5 10 4 0,4
28 4 11 2,75 23 12 0,52
32 9 22 2,44 49 25 0,51
36 18 39 2,17 86 49 0,57
40 34 66 1,94 155 84 0,54
44 54 105 1,94 250 136 0,54
48 83 161 1,94 393 214 0,54
52 124 245 1,98
56 179 363 2,03
60 254 525 2,07 1342 770 0,57
64 353 737 2,09
68 477 1022 2,14
72 646 1418 2,2 3374 2025 0,6

Cross join between the two RDBMS are rather in line  - Oracle computer may be a tad stronger than SQL Server's one - but the left-shifting strategy is definetly not !

Does any Oracle / SQL Server DBA out there can explain such discrepancies with similar queries ?

Share and Enjoy:
  • Digg
  • del.icio.us
  • Google Bookmarks
  • Facebook
  • TwitThis
  • blogmarks
  • email
  • Furl
  • LinkedIn

This entry is filed under SQLing. And tagged with , , , , . You can follow any responses to this entry through RSS 2.0. You can leave a response, or trackback from your own site.

2 Responses to “Fake Cross Join : Is Oracle Really Four Times Better Than MS SQL Server ?”


  1. Waldar

    I should edit my post. Both Oracle 11g and SQL Server 2005 run on Windows, but they don't have the same hardware.

    Thanks for the encouragement !

  2. laurentschneider

    Oracle is not supposed to be faster on Windows, but if you compare Oracle on a Linux Server with SQL Server on Windows running the same hardware, I would suppose Oracle is faster.

    The way each db save the segments and do the maths differs a lot. For instance you may get a 100x improvement by using binary_double instead of numbers...

    Welcome to the authoring community (do not worry about your French accent :) )