In the previous blog I explained the OBIA customization types and their impact on the various components of the solution.
http://bidirect.blogspot.co.uk/2012/05/oracle-business-intelligence.html
In successive blogs of this series I would like to cover the naming conventions and development guidelines that should be followed when developing new attributes, tables, dimensions and facts in the Business Analytics Warehouse (BAW) and new/changed repository objects in the OBIEE repository.
In this blog my focus is on standard practices and guidelines for modifying database objects such as tables, columns and indexes.
Why Best Practices? – Standards and guidelines for the database schema, tables, columns and indexes give better manageability and simplify future upgrades. In the long term they help reduce the day-to-day cost of the application and of successive upgrades.
Database Schema and Objects
The Business Analytics Warehouse (BAW) is the pre-built data warehouse that ships with BI Apps. The database schema is decided by the installation team; the default is “BAW”. It contains both the staging tables and the DW target tables. All customized tables, columns, indexes and other database objects MUST be added to the BAW schema. (Best Practice DB 1)
Why add to the BAW schema? – For better manageability and to avoid cross-schema joins at the OBIEE or Informatica level; cross-schema joins within the repository and Informatica are more complicated than keeping everything in one schema. The mantra is: keep it simple.
Changes made to database objects should use the versioning and comments features of the underlying database for better traceability. (Best Practice DB 2)
Why Versions and Comments? Comments on tables and columns recording the project phase, the creating user and the date provide enough traceability of the customizations made.
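As a sketch of this practice, Oracle's COMMENT statement can record the traceability directly in the data dictionary (the table, column, user and dates below are illustrative, not part of the standard BAW schema):

```sql
-- Record project phase, creating user and date against each customized object
COMMENT ON TABLE WC_LOYALTY_TXN_F IS
  'Custom fact: loyalty transactions. Phase 2, created by jsmith, 2012-06-01';

COMMENT ON COLUMN W_ORG_D.X_REGION_CD IS
  'Custom column: sales region code. Phase 2, created by jsmith, 2012-06-01';
```

The comments can later be queried from ALL_TAB_COMMENTS and ALL_COL_COMMENTS to audit what was customized, by whom and when.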
In addition, make sure that whatever customization is done to database objects is imported into the DAC container. (Best Practice DB 3)
Why Import? DAC should be your central repository, used to migrate database object customizations from the development environment to test and then to production. Avoid using Word or text files as a storage option for database objects.
Tables
Naming of Tables
The naming conventions used for tables in the Oracle BI Data Warehouse are as follows, and should be followed whenever custom tables are created in the warehouse. (Best Practice DB 4)
Why Standard Naming Conventions? They help to identify the meaning of each table and its significance and usage in the overall solution.
The names of the Oracle Data Warehouse tables consist of three parts:
Prefix
Name
Suffix
The three components are capitalized and separated by an underscore (_), e.g. W_GL_OTHER_F. Each component has a meaning, as described in the table below:
Component | Value | Meaning
Prefix | W_ | Out-of-the-box warehouse table
Prefix | WC_ | Custom warehouse table
Suffix | _F | Fact table (base)
Suffix | _FS | Staging fact table (base)
Suffix | _A | Aggregate fact table (base)
Suffix | _D | Dimension table (base)
Suffix | _DS | Staging dimension table (base)
Suffix | _DH | Dimension hierarchy table (base)
Suffix | _SCD | Slowly changing dimension table (base)
Suffix | _MD | Mini-dimension table
Suffix | _H | Helper table
Suffix | _HS | Helper staging table
Suffix | _TMP | Pre-staging temporary table
Suffix | _G, _GS, _S | Internal table
Suffix | _UD | Unbounded dimension
Suffix | _DHLS | Staging for dimension helper
Suffix | _DHL | Dimension helper table
Suffix | _M | Map dimension
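Putting the conventions together, a custom fact table and its staging counterpart might be created as follows (the table and column names here are hypothetical, chosen only to illustrate the prefix and suffix rules):

```sql
-- WC_ marks a custom table; _FS is the staging table, _F the target fact
CREATE TABLE WC_LOYALTY_TXN_FS (
  INTEGRATION_ID     VARCHAR2(80 CHAR),
  DATASOURCE_NUM_ID  NUMBER(10),
  TXN_AMT            NUMBER
);

CREATE TABLE WC_LOYALTY_TXN_F (
  ROW_WID            NUMBER,
  INTEGRATION_ID     VARCHAR2(80 CHAR),
  DATASOURCE_NUM_ID  NUMBER(10),
  TXN_AMT            NUMBER
);
```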
The types of tables used by the Oracle Business Analytics Warehouse are listed below, and the developer should stick to these types. I believe the list covers almost all table types possible in a DW solution; any new addition should be treated as a new entrant to the family and welcomed. (Best Practice DB 5)
Table Type | Description
Aggregate tables (_A) | Contain summed (aggregated) data
Dimension tables (_D) | Star analysis dimensions
Staging tables for dimensions (_DS) | Hold dimension information that has not been through the final ETL transformations
Staging tables for Usage Accelerator (WS_) | Contain the necessary columns for the ETL transformations
Dimension hierarchy tables (_DH) | Store the dimension's hierarchical structure
Dimension helper tables (_DHL) | Store M:M relationships between two joining dimension tables
Staging tables for dimension helpers (_DHLS) | Staging tables for storing M:M relationships between two joining dimension tables
Fact tables (_F) | Contain the metrics being analyzed by dimensions
Fact staging tables (_FS) | Hold the metrics being analyzed by dimensions that have not been through the final ETL transformations
Internal tables (_G, _GS, _S) | Used primarily by ETL mappings for data transformation and for controlling ETL runs
Helper tables (_H) | Inserted between the fact and dimension tables to support a many-to-many (M:M) relationship between fact and dimension records
Map dimension tables (_M) | Store cross-referencing mappings between the external data ID and the internal ID
Mini dimension tables (_MD) | Include combinations of the most queried attributes of their parent dimensions; the database joins these small tables to the fact tables
Primary extract tables (_PE) | Support the soft delete feature. The table includes all the primary key columns (integration ID column) from the source system. When a delete event happens, the full extract from the source is compared against the data previously extracted into the primary extract table to determine whether a physical deletion occurred in the transactional database. The soft delete feature is disabled by default, so the primary extract tables are not populated until you enable it
Unbounded dimension tables (_UD) | Contain information that is not bounded in the transactional database data but should be treated as bounded data in the Oracle Business Analytics Warehouse
Columns
Naming of Columns
This section describes the naming conventions used for columns in the Oracle BI Data Warehouse; they should be followed whenever custom columns are created in the warehouse. (Best Practice DB 6)
Why Naming Conventions for Columns? They help to identify the meaning of each column and its significance and usage within the table.
The columns of the Oracle Data Warehouse tables consist of three parts:
Prefix
Name
Suffix
Component | Value | Meaning
Prefix | X_ | Custom warehouse column
Suffix | _AMT | The column contains a currency amount
Suffix | _CD | The column value is based on the contents of a List of Values (LOV)
Suffix | _DT | The column contains a date or datetime value
Suffix | _FLG | The column contains a Boolean value where Y indicates Yes/True and N indicates No/False
Suffix | _I | Language-independent code
Suffix | _ID | Used in staging tables; corresponds to the source identifier
Suffix | _WID | Identifier generated by Oracle BI linking dimension and fact tables, except for ROW_WID
Suffix | _NAME | Name corresponding to the code column (columns ending with _CODE)
Suffix | _DESC | Long description corresponding to the code column (columns ending with _CODE)
Suffix | _NUM | The column contains a number or an identifying alphanumeric value
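For example, custom columns added to an existing dimension would combine the X_ prefix with the appropriate suffix (the column names below are illustrative, not part of the delivered W_ORG_D definition):

```sql
ALTER TABLE W_ORG_D ADD (
  X_REGION_CD    VARCHAR2(30 CHAR),  -- custom code column based on an LOV
  X_REGION_NAME  VARCHAR2(80 CHAR),  -- name corresponding to the code column
  X_ONBOARD_DT   DATE,               -- custom date column
  X_PARTNER_FLG  CHAR(1)             -- custom Y/N flag
);
```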
All new tables, whether fact or dimension, MUST contain the following important fields: (Best Practice DB 7)
Why System Fields? They provide metadata about the data and the DML operations. Metadata such as the source identifier, source type, created by and creation date is useful in many ways, for example when setting up changed data capture or DML operations such as insert, update and upsert.
Column Name | Column Type | Required in Area | Description
INTEGRATION_ID | VARCHAR2(80 CHAR) | Staging / DW | Stores the primary key or unique identifier of a record in the source table
DATASOURCE_NUM_ID | NUMBER(10) | Staging / DW | Stores the data source from which the data is extracted; this is set up in DAC. E-Business Suite = 4; new external sources should have their own unique number
ETL_PROC_WID | NUMBER(10) | DW | Stores the ID of the ETL process; details of the ETL process are stored in W_ETL_RUN in the warehouse
ROW_WID | NUMBER | DW | Sequence number generated during the ETL process; the unique identifier of the table
CREATED_ON_DT | DATE | Staging / DW | Stores the creation date of a record in the primary source table
CHANGED_ON_DT, AUX1_CHANGED_ON_DT, AUX2_CHANGED_ON_DT, AUX3_CHANGED_ON_DT, AUX4_CHANGED_ON_DT | DATE | Staging / DW | Stores the last update date of a record in the primary source table; additional columns prefixed AUX# can be added to store the last update dates of auxiliary source tables
CREATED_BY_ID | VARCHAR2(80) | Staging | Stores the ID of the user who created the record in the primary source table
CREATED_BY_WID | NUMBER(10,0) | DW | Surrogate key of the user who created the record in the primary source table
CHANGED_BY_ID | VARCHAR2(80) | Staging | Stores the ID of the user who last updated the record in the primary source table
CHANGED_BY_WID | NUMBER(10,0) | DW | Surrogate key of the user who last updated the record in the primary source table
TENANT_ID | VARCHAR2(80) | Staging / DW | Distinguishes distinct source instances that share a common data source number identifier; the default value is ‘DEFAULT’
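A minimal sketch of a custom dimension carrying the mandatory system columns (the business columns REGION_CODE and REGION_NAME are hypothetical):

```sql
CREATE TABLE WC_REGION_D (
  ROW_WID            NUMBER,                         -- ETL-generated surrogate key
  INTEGRATION_ID     VARCHAR2(80 CHAR),              -- source primary key
  DATASOURCE_NUM_ID  NUMBER(10),                     -- source identifier set up in DAC
  ETL_PROC_WID       NUMBER(10),                     -- links to W_ETL_RUN
  CREATED_ON_DT      DATE,
  CHANGED_ON_DT      DATE,
  CREATED_BY_WID     NUMBER(10,0),
  CHANGED_BY_WID     NUMBER(10,0),
  TENANT_ID          VARCHAR2(80) DEFAULT 'DEFAULT',
  REGION_CODE        VARCHAR2(30 CHAR),              -- hypothetical business column
  REGION_NAME        VARCHAR2(80 CHAR)               -- hypothetical business column
);
```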
Miscellaneous Technical columns for new tables
All new tables MAY contain the following fields: (Best Practice DB 8)
Why These Columns? They provide additional metadata about the data and the DML operations, useful for example when setting up changed data capture, soft deletes or slowly changing dimensions.
Column Name | Column Type | Required in Area | Description
CURRENT_FLG | CHAR(1) | DW | For slowly changing dimension purposes
DELETE_FLG | CHAR(1) | Staging / DW | For soft/hard delete purposes
W_INSERT_DT | DATE | DW | Optional insertion date for ETL audit purposes
W_UPDATE_DT | DATE | DW | Optional update date for ETL audit purposes
SRC_EFF_FROM_DATE | DATE | Staging / DW | For Type 2, 3 or 6 slowly changing dimension purposes
SRC_EFF_TO_DATE | DATE | Staging / DW | For Type 2, 3 or 6 slowly changing dimension purposes
EFFECTIVE_FROM_DATE | DATE | Staging / DW | For Type 2, 3 or 6 slowly changing dimension purposes
EFFECTIVE_TO_DATE | DATE | Staging / DW | For Type 2, 3 or 6 slowly changing dimension purposes
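As a sketch of how these columns work together in a Type 2 slowly changing dimension: when a tracked attribute changes, the ETL closes the current version of the record and inserts a new one (the table name, key values and sequence below are hypothetical):

```sql
-- Close the current version of the record
UPDATE WC_REGION_D
   SET CURRENT_FLG       = 'N',
       EFFECTIVE_TO_DATE = SYSDATE
 WHERE INTEGRATION_ID    = '300100001'
   AND DATASOURCE_NUM_ID = 4
   AND CURRENT_FLG       = 'Y';

-- Insert the changed record as the new current version, open-ended
INSERT INTO WC_REGION_D
  (ROW_WID, INTEGRATION_ID, DATASOURCE_NUM_ID,
   CURRENT_FLG, EFFECTIVE_FROM_DATE, EFFECTIVE_TO_DATE)
VALUES
  (WC_REGION_D_SEQ.NEXTVAL, '300100001', 4,
   'Y', SYSDATE, TO_DATE('3714-01-01', 'YYYY-MM-DD'));
```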
Currency columns for new tables
The currency codes for related system columns are as follows:( Best Practice DB 9)
Why Currency Codes? Support for reporting in multiple currencies is one of the prime features of BI Apps implementations.
System Column | Description
DOC_CURR_CODE | Code for the currency in which the document was created in the source system
LOC_CURR_CODE | Usually the reporting currency code for the financial company in which the document was created
GRP_CURR_CODE | The primary group reporting currency code for the group of companies or organizations in which the document was created
LOC_EXCHANGE_RATE | Currency conversion rate from the document currency code to the local currency code
GLOBAL1_EXCHANGE_RATE | Currency conversion rate from the document currency code to the primary group (GLOBAL1) currency code
GLOBAL2_EXCHANGE_RATE | Currency conversion rate from the document currency code to the GLOBAL2 currency code
GLOBAL3_EXCHANGE_RATE | Currency conversion rate from the document currency code to the GLOBAL3 currency code
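These columns let a query convert a stored document amount into any of the configured currencies on the fly; for instance (the fact table and amount column are illustrative):

```sql
SELECT DOC_CURR_CODE,
       TXN_AMT                         AS DOC_AMT,     -- document currency
       TXN_AMT * LOC_EXCHANGE_RATE     AS LOC_AMT,     -- local currency
       TXN_AMT * GLOBAL1_EXCHANGE_RATE AS GLOBAL1_AMT  -- primary group currency
  FROM WC_LOYALTY_TXN_F;
```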
Indexes
Naming of Indexes
The naming conventions used for indexes in the Oracle BI Data Warehouse are as follows, and should be followed whenever custom indexes are created in the warehouse. (Best Practice DB 10)
Why Naming Conventions for Indexes? They help to identify custom indexes and their significance and usage in the overall solution.
The index names of the Oracle Data Warehouse tables consist of three parts:
Prefix
Name
Suffix
The three components are capitalized and separated by an underscore (_), e.g. W_GL_BALANCE_F_U1. Each component has a meaning, as described in the table below.
Component | Value | Description
Prefix | W_ | Standard data warehouse table
Prefix | WC_ | Custom table
Name | XXX | Table name (e.g. W_GL_BALANCE_F)
Suffix | _U1 | Unique index (e.g. a composite index on INTEGRATION_ID and DATASOURCE_NUM_ID)
Suffix | _F1 | Non-unique dimensional reference key index (e.g. a bitmap index such as BALANCE_DT_WID)
Suffix | _M1 | Non-unique index (e.g. a bitmap index, a normal index or a composite normal index)
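Applying the convention, indexes on a custom fact table might look like this (the table and column names are hypothetical):

```sql
-- _U1: unique index on the natural key
CREATE UNIQUE INDEX WC_LOYALTY_TXN_F_U1
  ON WC_LOYALTY_TXN_F (INTEGRATION_ID, DATASOURCE_NUM_ID);

-- _F1: bitmap index on a dimensional reference key
CREATE BITMAP INDEX WC_LOYALTY_TXN_F_F1
  ON WC_LOYALTY_TXN_F (TXN_DT_WID);
```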
Best Practices for Indexes
The best practices for the indexes are as below.
· The custom indexes added to the source system (e.g. Oracle EBS) for incremental load performance can be categorized into the following groups, and action needs to be taken based on the facts:
o Tables that do not have indexes on LAST_UPDATE_DATE in the latest EBS releases, and for which no performance implications have been reported from adding such indexes.
o Tables that have indexes on LAST_UPDATE_DATE columns, introduced in Oracle EBS Release 12.
o Tables that cannot have indexes on LAST_UPDATE_DATE because of serious performance degradation in the source EBS environments.
Indexes on the source system should have the prefix OBIEE_ and DO NOT follow the standard source system index naming conventions (e.g. OBIEE_OE_ORDER_HEADERS_ALL). (Best Practice DB 11)
Why? This is to make sure that these custom indexes are managed separately from the rest of the standard indexes.
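A source-side incremental-load index following this convention might be created as below (a sketch against a standard EBS order headers table):

```sql
-- The OBIEE_ prefix keeps the custom index separate from standard EBS indexes
CREATE INDEX OBIEE_OE_ORDER_HEADERS_ALL
  ON OE_ORDER_HEADERS_ALL (LAST_UPDATE_DATE);
```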
· Staging tables typically do not require indexes, but this is not a strict rule.
Don't hesitate to add indexes on a staging table if necessary. (Best Practice DB12)
Why? Let common sense prevail; I hate strict rules. If you can show that an index helps data load performance, go ahead and just do it.
· Create indices on all columns that the ETL uses for dimensions and facts. For example:
o ROW_WIDs of dimensions and facts
o INTEGRATION_ID
o DATASOURCE_NUM_ID
o Flags
o Dimensional Reference Keys of fact tables
Create indices on all system columns such as ROW_WID, INTEGRATION_ID etc. used within ETL processes. This is an absolute MUST for ETL performance. (Best Practice DB13)
Why? ETL performance; without these indices I can tell you that your ETL won't run efficiently.
· Create bitmap indexes for dimensional reference keys and flags; treat this as a MUST requirement. (Best Practice DB14)
Why? Bitmap indexes on dimensional reference keys and flags allow the optimizer to use STAR TRANSFORMATION, which I believe is absolutely essential to achieve the best query performance.
Make sure that the database parameter for STAR TRANSFORMATION and the repository database parameter settings are in place.
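A sketch of checking and enabling the parameter (requires DBA privileges; it can also be set at session level for testing):

```sql
-- Check the current value (SQL*Plus syntax)
SHOW PARAMETER star_transformation_enabled

-- Enable star transformation for the instance
ALTER SYSTEM SET star_transformation_enabled = TRUE SCOPE = BOTH;
```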
· Carefully consider which columns to put filter conditions on. (Best Practice DB15)
Why? An index on a column used in a report filter condition improves performance, for obvious reasons.
· Define indices to improve query performance. (Best Practice DB16)
o Inspect standard repository objects for guidance
Why? The index creation process should be driven by business requirements rather than physical structures and conventional wisdom. I believe the reporting system is mainly a business support system and should be driven by business requirements.
· Register new indices in the DAC; treat this as a MUST requirement. (Best Practice DB17)
Options within DAC are
o Local or Partitioned Index
o Query Index or ETL Index
o The DAC server drops and re-creates indices during a full load and /or incremental load
Why? As I said earlier, DAC should be your central repository for all database objects, including indexes. This is helpful during migration of code from the development instance to test and finally to production. DAC can also drop and re-create indices during full and/or incremental loads.
· Review/revisit the index strategy on a regular basis. (Best Practice DB18)
Why? Reviewing and revisiting the indices should be done on a regular basis, particularly with every new release, which might bring different usage patterns and query performance.
· The best practices for handling BI Applications indexes in Exadata Warehouse (Best Practice DB19):
Why? Exadata and Exalytics are specially designed engineered systems and require a different approach. Based on my experience in this area, the success mantra is: don't rush to any conclusion, and make sure proper comprehensive benchmarking is done before dropping or disabling any indices.
o Turn on Index usage monitoring to identify any unused indexes and drop / disable them in your environment.
o Consider pinning the critical target tables in smart flash cache
o Consider building custom aggregates to pre-aggregate more data and simplify queries.
o Drop selected query indexes and disable them in DAC to use Exadata Storage Indexes / Full Table Scans only after running comprehensive benchmarks and ensuring no impact on any other queries performance.
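The index usage monitoring mentioned above can be sketched as follows (the index name is illustrative):

```sql
-- Turn on usage monitoring for a candidate index
ALTER INDEX W_GL_BALANCE_F_F1 MONITORING USAGE;

-- After a representative query workload, check whether it was ever used
SELECT index_name, used, monitoring
  FROM V$OBJECT_USAGE
 WHERE index_name = 'W_GL_BALANCE_F_F1';
```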
I hope you find the second part of this series informative and useful. My plan is to write the next couple of blogs on guidelines for Informatica, DAC and repository customisation.