Sunday, 12 February 2012

Data Densification in an OBIEE 10/11g


In last blog I have tried to explain various modeling options by which we can implement the Outer Join can be implemented for OBIEE 11G. The Dashboard option is only applicable to single dimension while the option of adding measure to preserve dimension values, is suitable for one or more dimension.

The steps involved were

·   Add separate logical table source for each preservation measure to trigger fact-based partitioning outer joins; each LTS maps just that preservation column

·    Set Level-based at the All level except for dimensions to be preserved

·    Apply the hidden column format as the system-wide default for these preservation measures

This modeling method of preserving each dimension becomes more complicated if number of preservation columns is high and it is also cumbersome to generate reports by using individual measures for each preserve column ( year) or combination of preserve columns ( e.g. year- office).
Is there any way by which we can achieve Data Densification for OBIEE for all dimensions and dimension members? The solution/model should also offer an ease in generating report with or without data densification within the same subject area.

This is an attempt to discuss the data densification in oracle database in general and OBIEE 10/11g in particular.


Data in a relational data warehouse is usually sparse. Densities of 0.1-2% are very common. No data exists in the fact table if there is no value exists for a given combination of dimension values. However, the analyst may want to view the data in a dense form, which rows for all combination of dimension values displayed even when no fact data exists for them. For example, if a product did not sell during a particular time period, users may still want to see the product for that time period with zero sales value next to it.

Data densification is the process of converting spare data into dense form. For Oracle database the portioned join operator provides an easy, elegant and efficient option e.g.

SELECT V2.pid, V2.tid, SUM(SALES) OVER
(PARTITION BY V2.PID, V2.YEAR ORDER BY V2.TID) YTD sales
FROM SALES PARTITION BY(PID) V1 RIGHT OUTER JOIN
TIMES V2 ON (V1.TID=V2.TID)

Data Densification - OBIEE 11g

The approach for data densification which works across multiple dimensions for OBIEE 11g is as follows. For demonstration purpose, I have used following reporting schema.

Fact –Billed Revenue

Dimensions –Time Day Grain, Time Month Grain, Time Quarter Grain, Customers, Offices, Product Customer

The idea is to add a dummy fact table with single column, and then create joins between the dummy fact table and the all dimension tables using 1=1 as the join criteria. This should force the BI Server to do an outer stitch join between original query (from Billed Revenue) and the Cartesian query from dummy fact table.

Modeling Physical Layer
Create a new physical table in the physical layer called ‘DUMMY’, with a single column called ‘DUMMY’ with a SQL statement source ‘SELECT 1 AS DUMMY FROM DUAL’;


Create joins between ‘DUMMY’ table and the all the dimension tables using 1=1 as the join criteria. The Office Dimension is joined with the condition 1=1 with the fact table as shown below.

The physical diagram after setting up all joins between the dimensions and ‘DUMMY’ fact table should be as same as shown below.




Modeling Logical Layer
In the logical layer drag the dummy column from the dummy table as a new fact measure, called ‘Show All Rows’.

Modeling Presentation Layer
In the Presentation Layer drag the ‘Show All Rows’ column, it is just drag and drop.




Note – After completing the development, don’t forget to check consistency of the repository.
OBIEE Analysis (Answer/Report)
Login, load repository metadata and choose the subject area for new analyses. The first step is to apply hidden column format as the system-wide default for ‘Show All Rows’ column.

The reports with ‘Show All Rows’ column and standard report without it are generated as below.


The  Logical SQL generated by OBIEE Server for combined report is as below.
SET VARIABLE QUERY_SRC_CD='Report',PREFERRED_CURRENCY='USD';SELECT
   0 s_0,
   "A - Sample Sales"."Time"."T05 Per Name Year" s_1,
   "A - Sample Sales"."Base Facts"."1- Revenue" s_2,
   "A - Sample Sales"."Base Facts"."Show All Rows" s_3
FROM "A - Sample Sales"
ORDER BY 1, 2 ASC NULLS LAST

The physical SQL generated by OBIEE Server  - BI Server does an outer stitch join between original query (from Billed Revenue) and the Cartesian query from dummy fact table.


By adding/removing ‘Show All Rows’ column in report Analyst can easily switch report to and from standard report to all rows (outer join) report.  
The outer join for time and office dimension is illustrated below.


2 comments:

  1. Great Work,,I tried this in my system, but giving performance problem.

    ReplyDelete
  2. This is exactly what I need. I've followed these steps as best I can and the sql that is formed looks just like yours except for when it's time to join the two subqueries, it does a LEFT OUTER JOIN as opposed to a FULL OUTER JOIN as pictured above. I've checked the Database Features in the Physical pane of the RPD and FULL OUTER JOIN is supported.

    I can't seem to find a way to alter this behavior. Please reply if you know why this is happening.

    ReplyDelete