Archive for the 'Datawarehousing' Category

Track Source At Column Level Inside Your Datawarehouse

Add a comment

Every one pretty much knows that code have to be commented.
In C, C++, Java, whatever your favorite language is.

In most ETL dataflow I’ve encountered, I’ve always had to reverse everything to know where the data came from.

Sometimes I can find comments in the code, but often I can’t.

When working on feeding a datawarehouse (not speaking about datamarts here), the data are matched and cleansed, but except those crucial steps, most of the columns are going down straightforward.

Nevertheless, the same question remains : WHERE my data came from, WHAT is feeding them ?

So where do you comment your code ?
You can mostly comment inside :

  • documentation
  • the ETL process
  • the database

Continue reading ‘Track Source At Column Level Inside Your Datawarehouse’

Using Analytics To Handle Breaks

Add a comment

When trying to solve last T-SQL Challenge I did find a query which fits well in both SQL Server and Oracle (with some syntax tweaking) and seems quite effective to me even on a ten millions rows table.

I said I’ll try to use the more advanced analytics function of Oracle to solve the problem, and it was a bit harder that I first thought.
With the help of Tomas “MyMaster” Kytes I could find ideas and piece of code which are a more generic treatment of the issue, so it deserves it’s own post.
Continue reading ‘Using Analytics To Handle Breaks’

Say Hello To Partitioned Outer Joins

Add a comment

Recently, I was helping someone to improve a procedure which fills a table with zero where there is no value for some days.

Instead of cursoring and inserting, I wrote a single query using a partitioned outer join to achieve this. Later in the conversation, I was asked to explain how this was working and I posted about. You can find the original topic here (in french).

A simple example is very efficient to explain the concept of the partitioned outer join. I’m using an example very similar to the one provided by Oracle in their documentation, but I’m going step by step which may help to understand the concept.

First, I’m emulating two tables. A seven days calendar one and a sales one. Continue reading ‘Say Hello To Partitioned Outer Joins’

How A Monday Became A Sunday

Add a comment

Some weeks ago, one client of mine told me something I found really weird.
He told me that he updated the (weekly) day number of a couple of days to have a good count of working days in the month.
In fact, it was a quick and dirty way to take care of something that went in production with a flaw : the need for working days has been detected in a last minute reporting, and it was a faster solution that writing an additional interface. Continue reading ‘How A Monday Became A Sunday’