There is a Segmentation Engine hidden
inside OBIEE 11.1.7.1 This is being used extensively with the Siebel Marketing
application which is a Oracle’s application offering for campaign management
since days of Siebel Analytics.
There is no harm in using Segmentation
Engine in standalone mode. One can extend any repository to support
segmentation. There is nothing extra need to install to use segmentation engine
and it is very useful for SET Analysis.
I have already written a blog entry around
how we can use the segmentation, the blog talks about earlier version of OBIEE
11g. I don’t see any reason why it is not applicable to the latest version of
OBIEE 11.1.1.7.1.
Customer Questions
Recently I have seen some specific questions around
Marketing Cache and Sampling Factors. The questions are as follows
Marketing Cache:
- How BI Server itself controls deleting of marketing cache records (in M_C_* tables)?
- After restart of the OBIEE we need to clear marketing cache, because OBIEE doesn’t know anything about old GUIDs. Is this right? Correct me if it’s wrong.
Sampling factors:
- Can we restrict our users to use only Sample Tables (10%) while creating segments?
- Is there a way to maintain “sampled” not only dimension (with target level), but also a fact table?
Marketing Cache Tables
Some segmentation criteria blocks are often
reused. The Analytics server provides a way of storing the results of
calculated segmentation criteria so that they can be reused in future requests
without having to compute them again (within a time frame). This mechanism is
called Marketing Cache (note that it is different from Analytics Query
Caching). It requires physical tables to be created where the cached results
are stored.
A caching table will be configured for each
qualified list item , in sample application for a qualified list item
Customers a caching table is defined
SAMP_M_C_CUSTOMER as shown below. The column should exist in physical database
schema.
Deleting Marketing Cache Tables
The cached results
are based on criteria and might change once new data get loaded into the
system, the old record should be deleted and new result should be loaded into
to cache. This can be achieved by inserting physical SQL for deletion of the
cached data based on GUID. In short with SQL configuration in place BI Server
itself controls deleting of marketing cache records for chosen GUID. GUID in
case of marketing cache is a global unique id for the cache entry; this ID is
written to the cache header table for each entry. I believe this is nothing to
do with the GUID related to User Access in OBIEE 11g
The following settings for the segmentation functions of the Oracle BI
Presentation Services are contained in a configuration file
(instanceconfig.xml). These configuration will control overall caching entries
deletion and cleanups.
Marketing Sample Tables
OBIEE Marketing Server provides the functionality of using
sampling tables for segmentation requests. This is useful for testing complex
segmentation criteria that may take a long time to execute against the full
data set. By using the sampling tables marketing users can fine-tune their
criteria and only run requests against the full volume once they are satisfied
with the segmentation criteria.
The sampling tables will be populated with a random
selection of data from the base tables using the native database random
selection algorithm. This will be done on a scheduled basis (the same refresh
rate as the base tables).
A sampling table M_10C_PRODUCT_D could be defined, and is
populated from PRODUCT dimension with 10% random chosen records. The sampling
configuration is done at Target Level as shown below.
Sampling Factor
Sampling works on the principle that if the
segmentation criteria are applied to the sampled subset of customers, and then
subsequently to each of the stars accessed by the segment criteria, the final
count is a good approximation of the actual (100 percent) counts and executes
more quickly.
When
you enable sampling, the Marketing Server continues to generate the same
logical SQL. However, the BI Server generates physical SQL that queries against
the configured sample tables using dynamic table names.
For
the dimension table (such as the Contact dimension table) that contains all
target-level IDs, a dynamic table name is associated with the target-level
table. If the value of the dynamic table name session variable is set to the sampled
table, then all queries executed in that session that include the customer
dimension table are executed against the sampled table. The session variable is
automatically set to the appropriate sampling table, depending on the sampling
factor chosen in the user interface for all counting tasks.
Sample Factor Question / Answer
Can we restrict our users to use only Sample Tables
(10%) while creating segments?
Answer – I don’t
think we can restrict users not to use 100% and forced them to use 10%. I
believe it is more like a training issue rather than access issue.
Is there a way to maintain “sampled” not only
dimension (with target level), but also a fact table?
Answer - I believe
sampling should be done on dimension and as dimension is linked with fact
table, it will get only data which is linked with the dimension from sampled
table. I don’t think there is a need to sample the fact tables.
I hope you find this discussion useful.
No comments:
Post a Comment