Wednesday, 29 January 2014

OBIEE 11.1.1.7.1 - Improve Performance of Hadoop Queries


Oracle offers a dedicated BI Server database type to support HiveQL query generation against the Hadoop Hive interface. Its database features are tuned to the capabilities of HiveQL. An example of using Hadoop sources in a BI model is shown below; I have taken it from the Oracle published document.


Hive Queries – High Latency

One of the biggest challenges with Hive queries is their high latency, which makes them unsuitable for ad hoc analysis. This post explains a number of measures to make sure that the Hive queries generated by ad hoc analysis perform well.

Tools to Improve Performance of Hive Queries

Summary Advisor and Aggregate Persistence can be used to vastly improve overall query performance. Some federated queries are not well suited to Hadoop sources, particularly those requiring large volumes of data to be “stitched” by the BI Server; in those cases an aggregate source can vastly improve the performance of the federated query.
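To illustrate why an aggregate source helps, here is a minimal Python sketch. It is not OBIEE or Hive code: it uses an in-memory SQLite database as a stand-in, and the table names (sales_detail, sales_agg_region_month) and the sample rows are hypothetical. The idea is only that a report can be answered from a small pre-computed aggregate table instead of scanning the detail data every time.

```python
import sqlite3


def build_demo():
    """Create a detail table and a pre-aggregated table (both hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_detail (region TEXT, month TEXT, revenue REAL)"
    )
    rows = [
        ("East", "2014-01", 100.0),
        ("East", "2014-01", 50.0),
        ("West", "2014-01", 75.0),
        ("East", "2014-02", 25.0),
    ]
    conn.executemany("INSERT INTO sales_detail VALUES (?, ?, ?)", rows)
    # Persist the aggregate once; in a real setup this is the expensive step
    # that runs ahead of time rather than at report time.
    conn.execute(
        "CREATE TABLE sales_agg_region_month AS "
        "SELECT region, month, SUM(revenue) AS revenue "
        "FROM sales_detail GROUP BY region, month"
    )
    return conn


def revenue_by_region(conn, table):
    """Run the same report against either the detail or the aggregate table."""
    # The table name is interpolated only because it comes from this script.
    cur = conn.execute(
        f"SELECT region, SUM(revenue) FROM {table} "
        "GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


conn = build_demo()
detail = revenue_by_region(conn, "sales_detail")
agg = revenue_by_region(conn, "sales_agg_region_month")
print(detail == agg)
```

Both calls return the same totals, but the aggregate table holds one row per region and month, so the report scans far fewer rows. With a high-latency source the saving comes from not going back to the detail data at all.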

Query latency can be reduced from minutes to sub-second when leveraging the Exalytics in-memory cache. Loading data into TimesTen can improve query performance. For example, a report that runs directly against Hadoop and federates a relational table takes multiple minutes, but the same report against the in-memory cache has a sub-second response.

I believe Exalytics might be a better option for Hive queries.
