The need to retain the existing Data Warehouse data-set, for BI Apps after platform or content upgrade does present a number of challenges. These challenges are not entirely technical in nature. The main challenge for the project is to de-risk this activity alongside the considerable work that is already known to be required, to upgrade the ETL and reporting layer aspects of the overall application.
Recently I have read a customer document which outlines the possible upgrade options for application and data warehouse data. I believe it is one of the well thought and well written documents around OBIA Application and warehouse data. This blog is an attempt to share the content of document to wider audience. I am a messenger here, credit goes to someone else.
Upgrade Options for Warehouse Data
The upgrade option choice for warehouse data depends on which application upgrade option chosen.
There are three main upgrade options for BI Apps upgrade
- Upgrade and Re-implement (Oracle Recommended)
- Rediscover and Re-Implement
- Re-implement and Retain
I have decided not go into details of these application upgrade options as of now, and talk about upgrade option for upgrade of warehouse data. I will definitely cover these options in some other day.
Which application upgrade option is taken does impact which approach should be taken here, as some go in tandem and some effect the options positively or negatively.
There are three upgrade options discussed in this blog for upgrade of warehouse data. For all options, a baseline activity at cutover would be to have a full copy of the current Production data warehouse reporting schema and a target data warehouse schema containing the new data model structure only. The data movement and transformation activity needs to be undertaken with PL/SQL scripts – either based on Oracle supplied scripts for vanilla, or custom built for the complete data model.
Option -1: Upgrade Warehouse Data
This option is all about migration of the existing data into the new model. This would entail the creation of a set of upgrade scripts for migration of data into new model. Following this activity, the ETL would be configured to run as an Incremental Load, to extract the new transactional data ongoing, as at present.
The new OBI Apps vanilla ETL extracts additional data from source system to that which exists in the current warehouse. Following migration of the existing data warehouse data; it would therefore be necessary to refresh that data with the latest attributes from the source system database. This would need to be achieved via an additional set of PL/SQL scripts or via the Incremental Load ETL code, undertaking a full refresh of data.
Some of the important observations for this option are as follows
- This approach adds technical complexity in the project delivery, with the associated impact that entails.
- The option suggest need to rebuild the ETL application code means that a set of Full and Incremental SDE objects would need to be re-customised and tested.
- A proper testing strategy is required to accommodate the Full Load + Incremental Load scenarios, in addition to the Upgrade Warehouse Data + Incremental Load approach.
- This approach forces a dependency on the Full and Incremental activities as the two streams of activities This is likely to elongate the build / unit test phase as well as the main testing phase, which would need to test both Load scenarios.
Option 2: Full Load ETL and Migrate Aged Data
This option proposes to populate the new blank data warehouse, from the data from source system database. In first step extract data that is extant in the source system, a further activity would be required to migrate the archived data from the existing data warehouse into the new data warehouse.
This would entail the creation of a set of upgrade scripts to effect the migration of the existing data into the new model. It is better to go with modular data migration approach to make sure that migration can be scheduled indecently for identified entities. This would help to schedule data migration activities performed over one or more slots.
This approach would help in decoupling the build / unit test activity of the application upgrade from the data migration, allowing them to be built with some degree of independence, although they would both need to deliver to the same milestone for testing of the overall deliverable.
Option 3: Full Load ETL and Archive the existing data
This option is a variation of Option-2. The Full Load capability is still used to load the new data warehouse and there would then be a migration of any Historically Significant Data, but no migration of the remaining archived data warehouse data would be performed. This data set (essentially, the current data warehouse), would be cloned to a new reporting schema in the new database and made available for reporting using the current reporting model, but would not be added to.
Each data set could be reported on independently, which accomplishes the requirement to have the data available. If it is necessary to report across the combined data sets then it would be possible to union the two subject areas together, but this does lead to some reporting complexity and performance considerations for federating two data sets.For simplicity of reporting, if it is required that this archived data area only represent the data that is *not* existing in the active data set, then it would be necessary to trim the data in the archive area after loading.
This option lowers the complexity of the overall delivery, by removing the need to build, test and deploy the data migration code. To counter this, the need to have federated reporting across the two data sets does increase the reporting difficulty and introduces a new requirement to trim the archived data set.
I hope this copy/pasted information would be useful for planning data warehouse data upgrade activities if any.