DW and ETL Challenges
In a conventional Business
Intelligence Application ETL scenario, data is loaded from source (OLTP)
schemas, which in many cases support 24 X 7 transactional systems with constant
ongoing updates. In addition in case of
ETL extract failures due to system/network or database issues, it is close to
impossible to re-start the extraction as it may over run the ETL window
allocated.
One additional thing which I have
noticed over the years is that more and more customers are using Micro-ETL for
intraday updates for more frequent data refresh. Frequent Intraday updates may
cause OLTP system load on the source system resulting in poor application
performance during the business hours.
There are number of challenges
which needs to be addressed, few of them are listed below
- Contention can arise during complex extracts source systems, particularly in cases where significant OLTP data changes have occurred which must be processed and loaded by ETL processes
- Complex extracts may cause OLTP system load on the source system
- The source system has 24 X 7 usage a few small windows
- Different maintenance cycles for source system (OLTP) and DW is quite normal
- Most of organizations have multiple source (OLTP) systems
- The source application spans across multiple regions and time zones separated by a WAN may hamper extraction performance
To relieve OLTP contention and
minimize impact on source system, in my view setting up a source dependent
schemas which replicate OLTP schemas in the same database as the Oracle Business
Analytics Warehouse schema is the most obvious solution. I have been part of
number of implementation where we have tried different replication technologies
(e.g. EMC’s BCV) to achieve un-interrupted access to OLTP source system for
data extraction. The deployment of the
solution resulted in a big customization efforts.
Is Oracle Offering of Source
Dependent Schema (SDS) Architecture is a right solution? Is it different than customization efforts or
is it a simple configuration? This blog is an attempt to discuss the SDS
architecture and its usability to address the conventional ETL shortcomings.
Oracle Source Dependent Schema Architecture
Oracle Source Dependent Schema is
a schema on the BI Apps DW database that is a replication of the source OLTP system
tables, deletes and has additional optimization for incremental ETL.
Watch the space ..
No comments:
Post a Comment