There is a Segmentation Engine hidden inside OBIEE 188.8.131.52 This is being used extensively with the Siebel Marketing application which is a Oracle’s application offering for campaign management since days of Siebel Analytics.
There is no harm in using Segmentation Engine in standalone mode. One can extend any repository to support segmentation. There is nothing extra need to install to use segmentation engine and it is very useful for SET Analysis.
I have already written a blog entry around how we can use the segmentation, the blog talks about earlier version of OBIEE 11g. I don’t see any reason why it is not applicable to the latest version of OBIEE 184.108.40.206.1.
Recently I have seen some specific questions around Marketing Cache and Sampling Factors. The questions are as follows
- How BI Server itself controls deleting of marketing cache records (in M_C_* tables)?
- After restart of the OBIEE we need to clear marketing cache, because OBIEE doesn’t know anything about old GUIDs. Is this right? Correct me if it’s wrong.
- Can we restrict our users to use only Sample Tables (10%) while creating segments?
- Is there a way to maintain “sampled” not only dimension (with target level), but also a fact table?
Marketing Cache Tables
Some segmentation criteria blocks are often reused. The Analytics server provides a way of storing the results of calculated segmentation criteria so that they can be reused in future requests without having to compute them again (within a time frame). This mechanism is called Marketing Cache (note that it is different from Analytics Query Caching). It requires physical tables to be created where the cached results are stored.
A caching table will be configured for each qualified list item , in sample application for a qualified list item Customers a caching table is defined SAMP_M_C_CUSTOMER as shown below. The column should exist in physical database schema.
Deleting Marketing Cache Tables
The cached results are based on criteria and might change once new data get loaded into the system, the old record should be deleted and new result should be loaded into to cache. This can be achieved by inserting physical SQL for deletion of the cached data based on GUID. In short with SQL configuration in place BI Server itself controls deleting of marketing cache records for chosen GUID. GUID in case of marketing cache is a global unique id for the cache entry; this ID is written to the cache header table for each entry. I believe this is nothing to do with the GUID related to User Access in OBIEE 11g
The following settings for the segmentation functions of the Oracle BI Presentation Services are contained in a configuration file (instanceconfig.xml). These configuration will control overall caching entries deletion and cleanups.
Marketing Sample Tables
OBIEE Marketing Server provides the functionality of using sampling tables for segmentation requests. This is useful for testing complex segmentation criteria that may take a long time to execute against the full data set. By using the sampling tables marketing users can fine-tune their criteria and only run requests against the full volume once they are satisfied with the segmentation criteria.
The sampling tables will be populated with a random selection of data from the base tables using the native database random selection algorithm. This will be done on a scheduled basis (the same refresh rate as the base tables).
A sampling table M_10C_PRODUCT_D could be defined, and is populated from PRODUCT dimension with 10% random chosen records. The sampling configuration is done at Target Level as shown below.
Sampling works on the principle that if the segmentation criteria are applied to the sampled subset of customers, and then subsequently to each of the stars accessed by the segment criteria, the final count is a good approximation of the actual (100 percent) counts and executes more quickly.
When you enable sampling, the Marketing Server continues to generate the same logical SQL. However, the BI Server generates physical SQL that queries against the configured sample tables using dynamic table names.
For the dimension table (such as the Contact dimension table) that contains all target-level IDs, a dynamic table name is associated with the target-level table. If the value of the dynamic table name session variable is set to the sampled table, then all queries executed in that session that include the customer dimension table are executed against the sampled table. The session variable is automatically set to the appropriate sampling table, depending on the sampling factor chosen in the user interface for all counting tasks.
Sample Factor Question / Answer
Can we restrict our users to use only Sample Tables (10%) while creating segments?
Answer – I don’t think we can restrict users not to use 100% and forced them to use 10%. I believe it is more like a training issue rather than access issue.
Is there a way to maintain “sampled” not only dimension (with target level), but also a fact table?
Answer - I believe sampling should be done on dimension and as dimension is linked with fact table, it will get only data which is linked with the dimension from sampled table. I don’t think there is a need to sample the fact tables.
I hope you find this discussion useful.