The need to retain the existing Data Warehouse data set for BI Apps after a platform or content upgrade presents a number of challenges, and these challenges are not entirely technical in nature. The main challenge for the project is to de-risk this activity alongside the considerable work already known to be required to upgrade the ETL and reporting layers of the overall application.
Recently I read a customer document that outlines the possible upgrade options for the application and the data warehouse data. I believe it is one of the better thought-out and well-written documents on the OBIA application and warehouse data. This post is an attempt to share the content of that document with a wider audience; I am only the messenger here, and the credit goes to someone else.
Upgrade Options for Warehouse Data
The choice of upgrade option for the warehouse data depends on which application upgrade option is chosen. There are three main upgrade options for a BI Apps upgrade:
- Upgrade and Re-implement (Oracle Recommended)
- Rediscover and Re-Implement
- Re-implement and Retain
I have decided not to go into the details of these application upgrade options for now, and instead to talk about the upgrade options for the warehouse data itself; I will cover the application options another day. Which application upgrade option is taken does influence which approach should be taken here, as some go in tandem and some affect the options positively or negatively.
This post discusses three upgrade options for the warehouse data. For all options, a baseline activity at cutover would be to have a full copy of the current Production data warehouse reporting schema and a target data warehouse schema containing only the new data model structure. The data movement and transformation activity needs to be undertaken with PL/SQL scripts, either based on the Oracle-supplied scripts for the vanilla model or custom built for the complete data model. A minimal sketch of such a script is shown below.
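As a rough illustration of what one of these scripts might look like, here is a minimal per-table template that copies a fact table from the legacy reporting schema into the new model. The schema names (OLD_DW, NEW_DW), the table W_SAMPLE_F and the column list are placeholders of my own, not names from the document; a real script would be generated per table from the old-to-new model mapping.

-- Hypothetical per-table movement script; all names are placeholders.
INSERT /*+ APPEND */ INTO new_dw.w_sample_f (integration_id, amount, posted_on_dt)
SELECT integration_id, amount, posted_on_dt
FROM   old_dw.w_sample_f;
COMMIT;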
Option 1: Upgrade Warehouse Data
This option is all about migrating the existing data into the new model. It would entail creating a set of upgrade scripts to move the data into the new model. Following this activity, the ETL would be configured to run as an Incremental Load, extracting the new transactional data on an ongoing basis, as at present.
The new OBI Apps vanilla ETL extracts additional data from the source system beyond what exists in the current warehouse. Following migration of the existing data warehouse data, it would therefore be necessary to refresh that data with the latest attributes from the source system database. This would need to be achieved either via an additional set of PL/SQL scripts or via the Incremental Load ETL code undertaking a full refresh of the data.
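One way such a refresh could be scripted, assuming the new attributes are staged by the new SDE mappings, is a MERGE that back-fills the migrated rows. The names below (NEW_DW, W_SAMPLE_D, W_SAMPLE_DS, NEW_ATTRIBUTE) are my own placeholders, not part of the vanilla model.

-- Hypothetical back-fill of newly introduced attributes on migrated rows.
MERGE INTO new_dw.w_sample_d tgt
USING new_dw.w_sample_ds src   -- staging table assumed to be populated by the new SDE mappings
ON (tgt.integration_id = src.integration_id)
WHEN MATCHED THEN
  UPDATE SET tgt.new_attribute = src.new_attribute;
COMMIT;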
Some important observations for this option are as follows:
- This approach adds technical complexity to the project delivery, with the associated impact that entails.
- The option suggests a need to rebuild the ETL application code, meaning that a set of Full and Incremental SDE objects would need to be re-customised and tested.
- A proper testing strategy is required to accommodate the Full Load + Incremental Load scenarios, in addition to the Upgrade Warehouse Data + Incremental Load approach.
- This approach forces a dependency between the Full and Incremental activities as two streams of work. This is likely to elongate the build / unit test phase as well as the main testing phase, which would need to cover both load scenarios.
Option 2: Full Load ETL and Migrate Aged Data
This option proposes to populate the new, blank data warehouse from the source system database. The first step is a Full Load ETL to extract the data that is still extant in the source system; a further activity would then be required to migrate the aged, archived data from the existing data warehouse into the new data warehouse.
This would entail the creation of a set of upgrade scripts to effect the migration of the existing data into the new model. It is better to take a modular data migration approach, so that the migration can be scheduled independently for each identified entity. This would allow the data migration activities to be performed over one or more slots, as sketched below.
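A package along the following lines, with one procedure per entity, is one way to picture that modularity; the package and procedure names are placeholders I have invented for illustration.

-- Hypothetical migration package: one procedure per identified entity,
-- so each can be run and scheduled in its own slot.
CREATE OR REPLACE PACKAGE dw_data_migration AS
  PROCEDURE migrate_gl_balances;
  PROCEDURE migrate_ap_invoices;
  PROCEDURE migrate_projects;
END dw_data_migration;
/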
This approach would help in decoupling the
build / unit test activity of the application upgrade from the data migration,
allowing them to be built with some degree of independence, although they would
both need to deliver to the same milestone for testing of the overall
deliverable.
Option 3: Full Load ETL and Archive the Existing Data
This option is a variation of Option 2. The Full Load capability is still used to load the new data warehouse, and there would then be a migration of any Historically Significant Data, but no migration of the remaining archived data warehouse data would be performed. That data set (essentially, the current data warehouse) would instead be cloned to a new reporting schema in the new database and made available for reporting using the current reporting model, but would not be added to.
Each data set could be reported on independently, which satisfies the requirement to keep the data available. If it is necessary to report across the combined data sets, it would be possible to union the two subject areas together, but this does introduce some reporting complexity and performance considerations for federating two data sets. For simplicity of reporting, if it is required that this archived data area represent only the data that is *not* present in the active data set, then it would be necessary to trim the data in the archive area after loading, along the lines sketched below.
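One way the trim could be scripted is a delete that removes from the archive schema any row that also exists in the active warehouse, so the two data sets do not overlap when federated. The schema names (ARCHIVE_DW, NEW_DW), the table W_SAMPLE_F and the key column are placeholders of mine, not taken from the document.

-- Hypothetical trim of the archive area to remove overlap with the active warehouse.
DELETE FROM archive_dw.w_sample_f a
WHERE EXISTS (SELECT 1
              FROM   new_dw.w_sample_f n
              WHERE  n.integration_id = a.integration_id);
COMMIT;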
This option lowers the complexity of the overall delivery by removing the need to build, test and deploy the data migration code. To counter this, the need for federated reporting across the two data sets does increase the reporting difficulty and introduces a new requirement to trim the archived data set.
I hope this information, largely copied over from that document, proves useful for planning data warehouse data upgrade activities, if any.