Pages

Wednesday, 29 January 2014

OBIEE 11.1.1.7.1 - Improve Performance of Hadoop Queries


Oracle offers dedicated BI Server DB GB to support HiveQL query generation against Hadoop Hive interface. Specific DB Features are tuned to the capabilities of HiveQL. An example of using Hadoop sources in BI Model is shown as below.  I have taken this from the Oracle published document.


Hive Queries – High Latency

One of the biggest challenges with Hive Queries is that these queries will have high query latency; hence these queries are not suitable for ad hoc analysis. This blog is an attempt to explain number of measures to make sure that ad hoc analysis should generated Hive Queries which will perform. 

Tools to Improve Performance of Hive Queries

Summary Advisor and Aggregate Persistence can be used to vastly improve overall query performance.  Some federated queries are not well suited for Hadoop sources, particularly those requiring large volumes of data to be “stitched” by the BI Server, instead aggregate source can vastly improve performance of federated query.

Query latency can be reduced from minutes to sub-seconds when leveraging Exalytics in-memory cache. By loading data into Times Ten can improve the performance of the queries. E.g. a report directly against Hadoop and federating relational table takes multiple minutes, but same report against in-memory cache has sub-second response.

I believe for Hive Queries Exalytics might be a better option.

OBIEE 11g Infrastructure Performance Tuning Guide (Doc ID 1333049.1)

Performance tuning of Oracle Business Intelligence Enterprise Edition is very complex process. It involves monitoring, analyzing and tuning of all the Fusion Middleware/BI components along with database and operating system parameters. It is iterative process and requires a significant contribution from performance architect.

Recently Oracle released a latest version of guide that describes the tools which would help to monitor performance and techniques for optimizing the performance of BI and Database tier components. Intension is to provide guidance to quick start for performance tuning activities. 

This blog is an attempt to provide information on released document, location to download the document and other logistic stuffs. I have read the document in totality and my recommendation is that it is indeed a document which would provide a better insight to kick start performance tuning activities. I would like to thank you author of this document for his contribution.


Document Details
Best Practices Guide for Infrastructure Tuning
Oracle Business Intelligence Enterprise Edition
11g Release 1 (11.1.1.6, 11.1.1.7)
An Oracle® White Paper
January 2014

Oracle Support Web URL

Components Covered
Operating Systems parameters
WebLogic Server parameters
64bit Java Virtual Machines (JVM).
32bit Java Virtual Machines (JVM).
HTTP Server parameters.
HTTP Server Compression / Caching.
Web Browser Settings.
Database Parameters.
IBM WebSphere Server (IBM WAS) Parameters.


It is an interesting read.

Monday, 27 January 2014

OBIEE 11.1.1.7.1 – Size of Repository – Is there any Restriction on Size? When it is too large to load via Enterprise Manager?

Is there a size limitation or any other limitations for the OBIEE 11g RPD?  Is it depends on operating system allowed file size or is there any limitation from Middleware?  How the sizes of the repository affect online access or start and stop processes?  Is it different in 32 bit than 64 bit? Does OBIEE 11g allow bigger size repository than OBIEE 10g?

As mentioned there are numbers of questions regarding maximum size for OBIEE 11g repository. This blog is an attempt to answer all these questions and share answers to larger audience.

Does OBIEE 11g Support Multiple Repositories? Can we use multiple repositories with smaller sizes?

OBIEE 11g does not support more than one repository per installation. The intent is that you cannot have multiple repositories deployed in a domain. This does not mean you cannot use a vertical or horizontal scale to have more than one BI Server loading the repository. So we have to use one repository per installation, hence there is no option to use multiple repositories with smaller sizes, it is shame.

Impact of Size on Online Mode Access

Opening repository in on-line mode loads repository metadata in RAM.  The slow opening of repository is a reflection not just the size of the RPD, but the power of the server BI is running on and the network connection between the client and the server. In short size of repository is a factor but not only factor.

Can Repository Size – X TiB or YPiB?

I haven’t come across any documentation which talks about the actual limit on the size of RPD for OBIEE 10g and/or OBIEE 11g. It is very normal that it can get bigger and bigger in case of multiple BI Apps implementations. I am assuming that one can design repository with XTiB or YPiB, there is no restriction. Good luck for those we would like to do so.

Is Operating System File Size Limitation Affect RPD Size?

The limit of size of file depends on the flavor of UNIX and also the file system and addressable memory size. The general perception about UNIX is that maximum size on UNIX supported is 2.0 GB, which is based on old hardware and kernel version.

The latest limitations for Solaris UNIX are as follows

UFS1: 4GiB to 256TiB
UFS2 : 512GiB to 32PiB
ZFS: 16Exbyte Max file size

In short it seems that there is no restriction on file size for repository based on which operating system on OBIEE 11g is running.

How Much Enterprise Manager Can Handle for Upload RPD ?  

For repository file is bigger than 40MB, Enterprise Manager throws an error while repository upload process, as 40MB is the maximum size that Enterprise Manager can handle.  This is an Enterprise Manager limit and is specified in web.xml file. So this limit is not a limit to Size of repository, it is a limitation to middleware.

One option to overcome this limit is to check if it is feasible to increase the maximum file size specified in web.xml. There is a work-around, which is to manually deploy the repository. OBIEE 11g RPD can be deployed by using MBean via Enterprise Manager and A WLST Script.  An extract from Oracle Support document which talks about RPD upload via Enterprise Manager and WLST Script are as follows.

Please refer tech- notes listed below for more information

  1. How to Deploy an OBIEE 11g RPD Using An MBean Via Enterprise Manager And A WLST Script (Doc ID 1319333.1)
  1. Using WLST to Manage the Deployment of Repository/Catalog at Business Intelligence Server (Doc ID 1319303.1)

How to Deploy RPD using MBean in Enterprise Manager

Steps to deploy RPD using MBean in Enterprise Manager are as follows, this is used for Repository size is bigger than 40MB, assuming that it is limit specified within web.xml

  • Navigate to and login to EM Fusion Middleware Control URL (http://host.domain:7001/em)
  • Expand the Web Logic Domain from the menu located at the left. Continue expanding until finding the Managed Server, usually called bi_server1 by default
  • In the right pane, you will see a new page. From the Web Logic Server menu at the top, choose System M Bean Browser
  • Expand 'Application Defined MBeans' > oracle.biee.admin > Domain: bifoundation_domain > BIDomain > and select the first BIDomain , it could be the second, depending upon your version, but it will have an 'Operations' tab
  • Navigate to the Operations Tab > Select the ’lock’ link > Press the 'Invoke' button.
  • Navigate to BIDomain.BIInstance.ServerConfiguration (which is in the same path) and select BIDomain.BIInstance.ServerConfiguration Navigate to the 'Operations' Tab Choose " uploadRepository
  • Enter a path for the RPD and its password. For example: C:\OBI11\instances\instance3\bifoundation\OracleBIServerComponent\coreapplication_obis1\repository\repositoryname.rpd
  • Press "Invoke"
  • Release the lock on the domain. Navigate back to BIDomain > and select the first BIDomain Navigate to the 'Operations' tab and select the first "Commit" option and then press "Invoke"
  • Re-start the BI Server (mandatory)
How to Deploy RPD Using WLST

This section includes deployment of the repository/catalog with WLST and script to change the default repository file with Jython script. To carry out these steps Web Logic Admin Server should be up and running, new repository should located in machine where BI server is running and user should have administrative rights
  • Save the content bellow as updateRepository.py
  • Script
     
    username = "weblogic"
    password = "weblogic1"
    newRPDlocation = "<local-path-to-RPD>"
    newRPDpassword = "<new-RPD-password>"
    connect(username, password, "t3://host:port");
    print 'Connecting to Domain ...'
    domainCustom()
    cd ('oracle.biee.admin')
    print 'Connecting to BIDomain MBean ...'
    cd ('oracle.biee.admin:type=BIDomain,group=Service')
    objs = jarray.array([], java.lang.Object)
    strs = jarray.array([], java.lang.String)
    invoke('lock', objs, strs)
    cd ('..')
    cd ('oracle.biee.admin:type=BIDomain.BIInstance.ServerConfiguration,biInstance=coreapplication,group=Service')
    objs=jarray.array([newRPDlocation,newRPDpassword],Object)
    strs=jarray.array(['java.lang.String', 'java.lang.String'],String)
    invoke('uploadRepository', objs, strs)
    cd ('..')
    cd ('oracle.biee.admin:type=BIDomain,group=Service')
    objs = jarray.array([], java.lang.Object)
    strs = jarray.array([], java.lang.String)
    invoke('commit', objs, strs)
    disconnect()
    exit()
  • Modify the parameters for the user, connection string to the Admin Server and new RPD information.
  • Set the domain environment using the script. $ WL_HOME/user_projects/domains/<DOMAIN>/. bin/setDomainEnv.sh 
  • Run the script as follow: $ java weblogic.WLST updateRepository.py,
  • Sample Script Output

    >>:~/WLS/10.3.3.0/user_projects/domains/3-2825937801/repo$ java weblogic.WLST repositoryCreator.py
    Initializing WebLogic Scripting Tool (WLST) ...
    Welcome to WebLogic Server Administration Scripting Shell
    Type help() for help on available commands
    Connecting to t3://XXXXXXX:ZZZZ with userid weblogic ...
    Successfully connected to Admin Server 'AdminServer' that belongs to domain 'bifoundation_domain'.
    Warning: An insecure protocol was used to connect to the
    server. To ensure on-the-wire security, the SSL port or
    Admin port should be used instead.
    Connecting to Domain ...
    Location changed to domain custom tree. This is a writable tree with No root.
    For more help, use help(domainCustom)
    Connecting to BIDomain MBean ...
    No stack trace available.
    Disconnected from weblogic server: AdminServer
  • Restart the BI Server
Inference
  • There is no limit set for Repository Size
  • Online Access is bound to be slower for bigger sized repositories
  • Smaller Sizes Repositories are easy to manage and develop
  • The Operating System Limit is too big to worry about
  • Enterprise Manager has work around to load bigger size repositories
  • Manual Load is recommended for bigger repository
  • It is a shame that Oracle does not offer any solution around multiple repositories within one BI Domain

Sunday, 26 January 2014

OBIEE 11.1.1.7.1 – What is difference between FIRST & FIRST_PERIOD Function


There are two repository functions available to select first returned value of the expression argument, FIRST AND FIRST_PERIOD. This is blog is an attempt to explain the difference between these functions and elaborate on when and where to use these functions.

FIRST

This function selects the first non-null returned value of the expression argument. It uses primary level key not based on the chronological key.

The FIRST function is limited to defining dimension-specific aggregation rules in a repository. It cannot be used in SQL statements. The FIRST function operates at the most detailed level specified in your explicitly defined dimension. For example, if you have a time dimension defined with hierarchy levels day, month, and year, the FIRST function returns the first day in each level.

One should not use the FIRST function as the first dimension-specific aggregate rule. It might cause queries to bring back large numbers of rows for processing in the Oracle BI Server, causing poor performance. In addition there is one more limitation that one cannot nest PERIODROLLING, FIRST, FIRST_PERIOD, LAST, and LAST_PERIOD functions.

Syntax
FIRST (expr):- expr is any expression that references at least one measure column

Example
A measure is aggregated as FIRST based on HO Time Dimension, Data is dense is not selected.




FIRST("01 - Sample App"."F4 Headcount Base Measures"."9- Full Time Empl  (Beg Period)"

FIRST_PERIOD

This function selects the first returned value of the expression argument. For example, the FIRST_PERIOD function can calculate the value of the first day of the year.

Use the FIRST_PERIOD function instead of the FIRST function whenever you want to compute the first value based on the chronological key rather than the primary level key. Also, FIRST returns the first non-null value, whereas FIRST_PERIOD returns the first value, regardless of whether it is null.

The FIRST_PERIOD function operates at the most detailed level specified in your explicitly defined dimension. For example, if you have a time dimension defined with hierarchy levels day, month, and year, the FIRST_PERIOD function returns the first day in each level.

One should not use the FIRST function as the first dimension-specific aggregate rule. It might cause queries to bring back large numbers of rows for processing in the Oracle BI Server, causing poor performance. In addition there is one more limitation that one cannot nest PERIODROLLING, FIRST, FIRST_PERIOD, LAST, and LAST_PERIOD functions.

Syntax
FIRST_PERIOD (expr):- expr is any expression that references at least one measure column

Example




FIRST_PERIOD("01 - Sample App"."F4 Headcount Base Measures"."9- Full Time Empl  (Beg Period)")


Inference

In short FIRST returns the first non-null value, whereas FIRST_PERIOD returns the first value. FIRST_PERIOD requires "data is dense" flag to be checked in the Logical Column Aggregation tab in Admin Tool. If that flag is not check, these function just become First.

 I hope you find this information useful.