Thursday, 31 January 2013

Oracle BI Apps 7.9.6.3 Tips and Best Practice Guidelines for Customization (Informatica) – Part C

In my previous blogs I explained the OBIA customization types and their impact on the various components of the solution, and covered standard practices and guidelines for modifying database objects such as tables, columns and indexes in the second blog of the series.

You can find the previous blogs at: link to blogs

Oracle Business Intelligence Application 7.9.6.3 Customization Types, Tips and Best Practice Guidelines – Part A
  
Oracle BI Apps 7.9.6.3 Tips and Best Practice Guidelines for Customization ( Database Objects)– Part B
  
In this blog my focus is on Informatica modification standard practices and guidelines, covering mappings, mapplets and sessions. I may include ODI in a future blog; the focus of this blog is Informatica only.

Why Standards and Guidelines? – The standards and guidelines for Informatica folders, mappings, mapplets and sessions exist for better manageability and to simplify future upgrades. In the long term they help reduce the day-to-day cost of the application and of successive upgrades.

Informatica Folder Structures
The Informatica repository is organized into mapping folders.  There is a separate folder for each source adapter, prefixed with ‘SDE’.  These represent the Source Dependent Extracts from your source ERP system to the Business Analytics Warehouse staging tables.
There are two folders, ‘SIL_Vert’ and ‘SILOS’, that represent the Source Independent Load ETL mappings from staging to the Data Warehouse schema.
There is another folder, ‘PLP’, which contains the Post Load Process ETL mappings that perform aggregations, snapshots, cross-subject-area relationship updates, flag updates etc. after the Source Independent Load ETL mappings complete.

OOTB folders related to E-Business Suite source
The following folders are used for the typical E-Business implementation project:
SDE_ORAR11510_Adaptor - Source Dependent Extracts from the E-Business Source System 11.5.1.0

SILOS - Source Independent Loads from staging to Data Warehouse schema.

PLP - Post Load Process for aggregations, snapshots etc

No changes should be made in the OOTB folders. Treat this as a MUST requirement; any change should be carefully recorded (Best Practice ETL1)

Why? Keeping the OOTB content intact is very important: it gives better ETL content manageability and smoother future upgrades, since these folders may be overwritten by a future upgrade. Careful recording of customizations is useful for re-applying them if required.


Custom folders related to E-Business Suite source
So the best option is to create custom folders. The following custom folders are to be created for a typical E-Business implementation project:
Custom_SDE_ORAR11510_Adaptor - Source Dependent Extracts from the E-Business Source System 11.5.1.0 for custom processes (modified from the existing ones or newly designed)

Custom_SILOS - Source Independent Loads from staging to Data Warehouse schema for custom processes (modified from the existing one or newly designed)

Custom_PLP - Post Load Process for custom aggregations, snapshots etc (modified from the existing one or newly designed)



The naming conventions for folders defined in BI Data Warehouse should be followed if custom folders are created in the warehouse. (Best Practice ETL 2)


Informatica Mapping, Maplet, Session and Workflow

Customization of OOTB Components

Setting up session parameters such as the commit interval is not in the scope of these customization best practices, for obvious reasons (Best Practice ETL 3)


Mapping

Prefix the mapping with a custom identifier (e.g. customer identifier is ‘XYZ’), then folder type (“SDE_” or “SIL_”), and include a descriptive name.

For example, an ETL mapping to extract the Invoice Distribution fact from E-Business ERP may be named XYZ_SDE_ORA_APInvoiceDistributionFact

While copying a mapping from an OOTB folder to a custom folder there are two options, ‘Paste’ and ‘Paste Shortcut’. I prefer to use ‘Paste Shortcut’ only when a mapping reference is required for a workflow and there is no need to change the mapping at all.

Don’t make any change to the mapping in the OOTB folder; take a copy of the mapping, paste it into the equivalent custom folder, and follow the naming convention stated above (Best Practice ETL4)

Why? To protect mapping changes from future BI Apps upgrades. Adding a custom identifier to the mapping name in the custom folder also makes migration of Informatica metadata and upgrades more manageable.


Mapplet

Prefix the mapplet with a custom identifier (“XYZ”) and the object type (“mplt_” = mapplet), and include a descriptive name.

For example, a reusable part of a mapping to load Party Organisation may be: XYZ_mplt_PartyOrgDimension

Mapplets are copied when a mapping is copied to the custom folder; there is no need to rename the existing mapplets, but any changed or customized mapplet should follow the naming convention suggested. (Best Practice ETL5)

Why? Any customization in a mapplet can then be easily identified. Most of the data load logic sits within mapplets, and because a mapplet is the most reusable component, there is a danger that a change to an existing mapplet affects the functioning of other mappings.

Sessions

Prefix the session with a custom identifier (“XYZ”), the folder type (“SDE_” or “SIL_”) and include a descriptive name. 

For example, running a session to extract Invoice distribution fact from E-Business ERP may be: XYZ_SDE_ORA_APInvoiceDistributionFact

Create the session based on the modified mapping in Workflow Manager. I prefer to create multiple sessions from the same mapping if required, e.g. for incremental load and full load (Best Practice ETL6)

Why? The session allows overriding the SQL Override, which is useful if I want to change the data extraction logic from incremental load to full load or vice versa
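
For example, the incremental and full-load sessions can differ only in their session-level SQL Override. A sketch (the table and column names are illustrative; $$LAST_EXTRACT_DATE is the usual BI Apps mapping parameter holding the last extract date):

```sql
-- Incremental session: extract only rows changed since the last ETL run
SELECT ...
FROM   AP_INVOICE_DISTRIBUTIONS_ALL
WHERE  LAST_UPDATE_DATE >= TO_DATE('$$LAST_EXTRACT_DATE', 'MM/DD/YYYY HH24:MI:SS')

-- Full-load session: the same SELECT with the incremental filter removed
```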

Workflows

Prefix the Workflow with a custom identifier (“XYZ”), the folder type (“SDE_” or “SIL_”) and include a descriptive name. 

For example, running a workflow to extract Invoice distribution fact from E-Business ERP may be: XYZ_SDE_ORA_APInvoiceDistributionFact

Create a workflow that uses the session based on the modified mapping in Workflow Manager. I prefer to create multiple workflows from the same mapping if required, e.g. for incremental load and full load (Best Practice ETL7)

Why? A workflow is a fantastic option for managing data load dependencies and can invoke various operations such as SQL, shell scripts and, of course, Informatica sessions. Design once and reuse often is the mantra.


New Custom Components

Note: Based on my implementation experience, I believe most customers really don’t follow the customization practices for OOTB mappings/sessions. Copying the OOTB mapping into the custom folder and modifying it within the custom folder is essential, though I would go easy on someone who is not using the naming convention suggested for OOTB mapping customization.

The naming convention for new custom components is a must requirement. The best practices mentioned for customized OOTB components also apply to new custom components.

Mapping

Prefix the mapping with a custom identifier (e.g. customer identifier is ‘XYZ’), then the folder type (“SDE_”, “SIL_” or “PLP_”), and include a descriptive name.

For example, an ETL mapping for a post load process creating an invoice-on-hold snapshot may be:  XYZ_PLP_InvoiceOnHoldSnapshot

Mapplet

Prefix the mapplet with a custom identifier (e.g. customer identifier is ‘XYZ’) and the object type (“mplt_” = mapplet), and include a descriptive name.

For example, a reusable part of a mapping to load Meeting List may be:
XYZ_mplt_MeetingListDimension

Sessions

Prefix the session with a custom identifier (e.g. customer identifier is ‘XYZ’), the folder type (“SDE_”, “SIL_” or “PLP_”) and include a descriptive name.

For example, a session to run a post load process creating an invoice-on-hold snapshot may be:  XYZ_PLP_InvoiceOnHoldSnapshot

Workflows

Prefix the workflow with a custom identifier (e.g. customer identifier is ‘XYZ’), the folder type (“SDE_”, “SIL_” or “PLP_”) and include a descriptive name.

For example, a workflow for a post load process creating an invoice-on-hold snapshot may be:  XYZ_PLP_InvoiceOnHoldSnapshot

Transformation

Use Change Capture functionality for incremental updates. It differs from application to application, but the rule is to load the delta rather than doing a full load every time. (Best Practice ETL8)

Why? Incremental data load makes perfect sense, and I would recommend implementing incremental load for PLP/aggregate data loads as well. I know of five or six BI Apps implementations where a full load is done every night because it turned out faster than the incremental load. In practice, a partial full load for certain areas, e.g. Assignments/Positions, is a reality at many places.

After creating a dimension in Informatica, try to create a reusable expression to handle surrogate key resolution (Best Practice ETL9)

Why? Reusability is the basic principle; treat a transformation, mapplet etc. as you would a function in PL/SQL terms. Design once and reuse whenever possible.

Create a reusable lookup transformation LKP_W_dimension_D. It should be defined as follows:

Moreover, the mandatory conditions are:
o   INTEGRATION_ID=IN_INTEGRATION_ID
o   DATASOURCE_NUM_ID=IN_DATASOURCE_NUM_ID
And the following conditions are only added if the dimension is specified as a SCD Type 2, 3 or 6:
o   EFFECTIVE_FROM_DT<=IN_DATE
o   EFFECTIVE_TO_DT>IN_DATE

The Lookup SQL Override should end with a comment to suppress the ORDER BY clause in the query, and the lookup should be made cacheable if it is referenced multiple times within different mappings (Best Practice ETL10)

Why? Removing the ORDER BY clause improves the performance of the lookup query, and making the lookup cacheable improves the performance of the overall load process.
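
A common way to apply this is to end the Lookup SQL Override with a comment marker so the ORDER BY that Informatica appends at run time is commented out. A sketch for the reusable dimension lookup described above (W_XYZ_D and the column list are illustrative; ROW_WID is the standard BI Apps surrogate key column):

```sql
SELECT ROW_WID,
       INTEGRATION_ID,
       DATASOURCE_NUM_ID,
       EFFECTIVE_FROM_DT,
       EFFECTIVE_TO_DT
FROM   W_XYZ_D --
```

The trailing `--` turns the ORDER BY clause that Informatica appends to the override text into a comment, so the database never executes it.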


Setting up caching for lookup queries is a double-edged sword: an appropriate analysis of the storage space allocated to the ETL server is a must before putting caching in place. Also make sure cache purging is in place to avoid data refresh issues.

Using a lookup transformation based on ETL_PROC_WID is essential for restartability (Best Practice ETL11)

Why? To make sure that data will not be reloaded after a session failure.
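
One way to picture this: each load run stamps its rows with an ETL_PROC_WID, and a lookup against the target lets the mapping detect rows already written by the current process. A hypothetical lookup query (W_XYZ_FS stands in for your staging/target table):

```sql
-- If this lookup returns a row carrying the current ETL_PROC_WID, the
-- record was already loaded in this run and can be skipped or updated
-- on restart instead of being inserted again
SELECT INTEGRATION_ID,
       DATASOURCE_NUM_ID,
       ETL_PROC_WID
FROM   W_XYZ_FS
```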

Ensure that all INNER/OUTER joins performed in Source Qualifiers are correctly defined; resolve one-to-many relationships by using a lookup instead of the SQL override (Best Practice ETL12)

Why? Inner joins limit the dataset loaded into the DW database, while outer joins may bring unnecessary data into the DW. It is essential that exactly the required data is loaded, so use the right join for the job.

E.g. for multiple assignment records, apply the primary assignment condition within a lookup rather than within the SQL Override.

Note: I believe BI Apps Informatica data extraction logic is mostly defined within the SQL Override, followed by a few transformations, lookups and an update strategy; the OOTB content does not really use the functionality Informatica can offer as an integration platform. I believe this is under-utilization of the ETL tool.


Non-indexed join paths should be replaced with Informatica lookups, and non-index-able filters should be converted to index-able ones. Review all joins in the SQL Override (Best Practice ETL13)

Why? A flag column filter is better than a name NOT LIKE '%SP%' where clause; data extraction (SDE) time should be kept to a minimum to avoid any adverse impact on the source system.
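
To illustrate, compare a filter that defeats indexes with one that can use them (CUSTOM_EXTRACT_FLG is a hypothetical flag column added and indexed for this purpose):

```sql
-- Non-index-able: the leading wildcard forces a scan of the source table
WHERE  org.NAME NOT LIKE '%SP%'

-- Index-able alternative: filter on the dedicated, indexed flag column
WHERE  org.CUSTOM_EXTRACT_FLG = 'Y'
```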

Miscellaneous Informatica for BI Apps Best Practices

As a best practice, you should comment any custom code you introduce. Comments should include at least the developer's name and the date the code was added. (Best Practice ETL14)

Why? It is useful for traceability and accountability. The developer name, code reference and reason (for example a requirement or bug number) are useful for ongoing support of the reporting solution.

Any modification to a component must be checked-in and commented once the fix is validated (Best Practice ETL15).

Why? Be a good developer.    

Add documentation to identify an integration_id column (MD070, database comments, …) (Best Practice ETL16)

Why? The documentation to support development is useful.    

The flag 'Fail parent if this task fails' should be checked for all the sessions within a workflow (Best Practice ETL17).

The flag 'Fail parent if this task does not run' should be checked for all the sessions within a workflow (Best Practice ETL18).

Why? It is a standard Informatica development practice.    

The Stop on Errors parameter should be set to 1 for all sessions within a workflow. This parameter is located in the Error handling area of the Config Object tab in Informatica PowerCenter Workflow Manager (Best Practice ETL19).

Why? It is helpful to stop data processing on occurrence of first error.    

For all entities, such as dimension and fact tables, create two workflows, one to be used for a full load and the other to be used for an incremental load. Both workflows are based on the same mapping. The same mapping gets executed during both full and incremental loads. This provides an opportunity to tune each of these load scenarios(Best Practice ETL20).

Why? A separate handle for tuning each of the load scenarios is an absolute must.

Custom change capture image suffix: the image suffix should be three characters long. The recommended naming convention is to start with C; for example, use C01, C02 and so on (Best Practice ETL21).

Input fields of a mapplet are prefixed by ‘INP_’. Output fields of a mapplet are prefixed by ‘EXT_’ (Best Practice ETL22)

Why? This applies to Siebel-application-based BI Apps (BI CRM).

The surrogate keys related to the day date calendar (non-fiscal) are specified as NUMBER. The mask applied is YYYYMMDD where YYYY = year, MM = month and DD = day. For example, for a meeting start date, MEETING_START_DT_WID would be the result of the transformation TO_NUMBER(TO_CHAR(MEETING_START_DT,'YYYYMMDD')) (Best Practice ETL23).

Why? Date-to-WID conversion needs this special treatment because the Date dimension is referenced by WID from the fact tables.

I hope this blog, Oracle BI Apps 7.9.6.3 Tips and Best Practice Guidelines for Customization (Informatica), is good enough to answer a few of your questions. I would appreciate your feedback/comments if any.

Wednesday, 23 January 2013

Oracle Business Intelligence 11g – Usage Tracking – Part A (Setup and Administer)

The Oracle BI Server supports the collection of usage tracking data. It is becoming very important with OBIEE 11g, as the Oracle BI Summary Advisor feature works in conjunction with the usage tracking feature. There is significant value added between the previous version (10g) and the latest version of Usage Tracking in OBIEE 11g. It has changed for the better, but I still believe there are ample opportunities to get value out of this solution beyond a simple log of logical and physical queries.

This blog series on usage tracking attempts to explain how usage tracking can be enhanced to provide value to business users as well as to technical and backend support resources. A Business Intelligence solution should provide business-intelligent inputs. In Part A I focus on the setup and administration of usage tracking; the next blog will cover how to integrate usage tracking log data with human resources data within the organization.
 

Why Usage Tracking

There are many business challenges which you might come across when deploying an OBIEE/OBIA reporting solution.

The Oracle BI system may not be optimized for the querying that actually occurs:

  • End users' queries may not match what is expected, so the cache is not seeded with appropriate queries

  • Additional aggregate tables may need to be created to speed up query performance, or aggregate tables which are not being used may need to be removed
  
The DBA or Support may like to investigate the physical query for business logic as well as for database tuning activities.

The company may want to track reporting and database usage at user or department level; this might be to charge the user/department based on report/database usage, or there may be legal/regulatory requirements around usage tracking and data protection compliance.


Solution – Oracle Usage Tracking

To address the business challenges mentioned above, Oracle Usage Tracking provides a solution. The Usage Tracking mechanism tracks and stores Oracle BI Server usage at the detailed query (logical and physical) level.

The accumulated usage tracking statistics can be used in a variety of ways, such as database performance optimization, aggregation strategies (Summary Advisor), and user- or department-level billing for resource (report or database) consumption.

OBIEE also provides the ability to analyze the usage results.

Sample Usage Tracking Implementation

A sample usage tracking implementation is provided with the Oracle Business Intelligence installation at:

ORACLE_INSTANCE\bifoundation\OracleBIServerComponent\coreapplication_obisn\sample\usagetracking

NOTE – The repository and web catalog content provided in this folder works with the earlier OBIEE 10g version, so the upgrade utility is required to upgrade the repository and web catalog. I ran the upgrade utility as suggested; once the migration finished I found that usage tracking for logical queries does not reflect the new additions in OBIEE 11g, and the repository does not have the physical query table exposed for reporting purposes.


Setting up Direct Insertion

Direct insertion is the recommended method for setting up usage tracking, although the option to write to a file is also available.

Usage Tracking Statistics Database

The Oracle Business Intelligence database created during the installation of OBIEE 11g, which holds the RCU-created tables, can be used for usage tracking storage.

NOTE – Make sure that the XXX_BIPLATFORM schema has S_NQ_ACCT (used to store logical query details) and S_NQ_DB_ACCT (used to store physical query details).

Setting Direct Insertion Parameters

To set up direct insertion for new installations, use the System MBean Browser in Fusion Middleware Control (Enterprise Manager); this is different from what we used to do in OBIEE 10g.

  1. Open Enterprise Manager and log in with your username/password.

  2. Choose bifoundation_domain, which is part of the WebLogic Domain, right-click and choose System MBean Browser.

  3. Expand Application Defined MBeans > oracle.biee.admin > Domain:bifoundation_domain > BIDomain, and select the BIDomain MBean where group=Service.

  4. Lock the domain:
    1. Expand BIDomain and select the BIDomain MBean where Group=Service
    2. Display the Operations tab
    3. Click the lock link, ‘invoke’ and return

  5. Expand BIDomain.BIInstance.ServerConfiguration, select the BIDomain.BIInstance.ServerConfiguration MBean, and ensure that the UsageTrackingCentrallyManaged attribute is set to true.

NOTE – The usage tracking attributes can also be managed using the NQSConfig.INI file rather than the System MBean Browser; to do so, set the UsageTrackingCentrallyManaged attribute to false. I recommend the centrally managed option for ease and simplicity.

  6. Set the usage tracking attributes as below.

NOTE – Set the UsageTrackingPhysicalTableName attribute to the name of the fully qualified database table for collecting query statistics, as it appears in the Physical layer of the Oracle BI repository. For example: "My_DB"."DEV_BIPLATFORM"."S_NQ_ACCT"

Set the UsageTrackingConnectionPool attribute to the name of the fully qualified connection pool for the query statistics database, as it appears in the Physical layer of the Oracle BI repository. For example: "My_DB"."Usage Connection Pool"

  7. Release the lock for the domain and commit:
    1. Return to the BIDomain MBean where Group=Service
    2. Display the Operations tab
    3. Click the commit operation

  8. Go to the Overview page and click Restart to restart the Oracle Business Intelligence services.
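
For reference, when usage tracking is not centrally managed (UsageTrackingCentrallyManaged = false), the equivalent settings live in the usage tracking section of NQSConfig.INI. A rough sketch, reusing the example values above (verify the exact parameter names against your own NQSConfig.INI before relying on them):

```ini
[USAGE_TRACKING]
ENABLE = YES;
DIRECT_INSERT = YES;
PHYSICAL_TABLE_NAME = "My_DB"."DEV_BIPLATFORM"."S_NQ_ACCT";
CONNECTION_POOL = "My_DB"."Usage Connection Pool";
```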

Setting Optional Parameters

The optional parameters are in the Usage Tracking section of NQSConfig.INI, located at
$BI_INSTANCE/config/OracleBIServerComponent/coreapplication_obis1.

BUFFER_SIZE = 250 MB;
-       Memory allocated for buffering insert statements; improves insert throughput so normal analyses do not wait
BUFFER_TIME_LIMIT_SECONDS = 5;
-       Maximum amount of time an insert statement remains in the buffer; ensures inserts happen quickly
NUM_INSERT_THREADS = 5;
-       Number of insert threads; should equal the maximum connections setting in the connection pool
MAX_INSERTS_PER_TRANSACTION = 1;
-       Maximum number of inserts per transaction; a larger number increases throughput but risks statement failure due to deadlocks

Repository Metadata Design

Physical Layer

Import the usage tracking tables into the Physical layer of the Oracle BI repository, then expose them to the logical layer and the Presentation layer.


The physical data model and physical joins are as follows.


Logical Layer

Logical dimension tables and logical fact tables are built as below by dragging the physical tables into the Business Model layer.



Business Model diagram for usage tracking – two facts (Logical Query and Physical Query facts) and four degenerate dimensions – Logical Query, Physical Query, Time and User





I believe logical dimensions with level-based or value-based hierarchies are among the most important features we should use for analysis. Logical dimensions with hierarchies for the logical query dimension, the user dimension and time are very useful for reporting and analysis.




The important measures included for the analysis of logical and physical queries are shown below. Most of them are very straightforward and map directly from the physical layer to the logical layer.



Presentation Layer

Drag and drop from the logical layer to the Presentation layer.



Analysis (Report) Build 

The following are some sample reports one can build based on the usage tracking logical and physical statistical data.
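
As a starting point, the same questions can also be asked directly against the logical query table with SQL. A hypothetical example listing the ten slowest logical queries (column names follow the standard S_NQ_ACCT layout; adjust the schema name to your own BIPLATFORM schema):

```sql
-- Ten slowest logical queries, with the user and catalog path that ran them
SELECT *
FROM (
  SELECT USER_NAME,
         SAW_SRC_PATH,     -- web catalog path of the analysis
         TOTAL_TIME_SEC,
         ROW_COUNT,
         START_TS
  FROM   DEV_BIPLATFORM.S_NQ_ACCT
  ORDER  BY TOTAL_TIME_SEC DESC
)
WHERE ROWNUM <= 10
```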







I know there is nothing new in this blog, but I still hope you find it helpful, at least as a starting point for setting up usage tracking in OBIEE 11g.