DW and ETL Challenges
In a conventional Business Intelligence Application ETL scenario, data is loaded from source (OLTP) schemas, which in many cases support 24 X 7 transactional systems with constant ongoing updates. In addition in case of ETL extract failures due to system/network or database issues, it is close to impossible to re-start the extraction as it may over run the ETL window allocated.
One additional thing which I have noticed over the years is that more and more customers are using Micro-ETL for intraday updates for more frequent data refresh. Frequent Intraday updates may cause OLTP system load on the source system resulting in poor application performance during the business hours.
There are number of challenges which needs to be addressed, few of them are listed below
- Contention can arise during complex extracts source systems, particularly in cases where significant OLTP data changes have occurred which must be processed and loaded by ETL processes
- Complex extracts may cause OLTP system load on the source system
- The source system has 24 X 7 usage a few small windows
- Different maintenance cycles for source system (OLTP) and DW is quite normal
- Most of organizations have multiple source (OLTP) systems
- The source application spans across multiple regions and time zones separated by a WAN may hamper extraction performance
To relieve OLTP contention and minimize impact on source system, in my view setting up a source dependent schemas which replicate OLTP schemas in the same database as the Oracle Business Analytics Warehouse schema is the most obvious solution. I have been part of number of implementation where we have tried different replication technologies (e.g. EMC’s BCV) to achieve un-interrupted access to OLTP source system for data extraction. The deployment of the solution resulted in a big customization efforts.
Is Oracle Offering of Source Dependent Schema (SDS) Architecture is a right solution? Is it different than customization efforts or is it a simple configuration? This blog is an attempt to discuss the SDS architecture and its usability to address the conventional ETL shortcomings.
Oracle Source Dependent Schema Architecture
Oracle Source Dependent Schema is a schema on the BI Apps DW database that is a replication of the source OLTP system tables, deletes and has additional optimization for incremental ETL.
Watch the space ..