The term Big Data refers to high-variety, high-volume, high-complexity, high-velocity forms of data. Big Data is typically unstructured. It may come in the form of web logs, social media logs, email, equipment sensor output, machine-generated data such as GPS or smart meter readings, videos, photographs, and so on.
One of the biggest challenges is integrating this unstructured data with the enterprise-wide structured data stored in existing operational and data warehouse IT systems. The idea is to extract and load big data (unstructured or semi-structured) into a database alongside the rest of the enterprise's structured data. The next step is to find a way to analyze the data in order to make important business decisions, grow in new areas, and better understand the information acquired. A number of analytical applications are available in the marketplace. Most of them require transforming data and moving it back and forth between the database and the application.
In-database analytics allows data processing to be conducted within the database by building analytic logic into the database itself. This blog post is an attempt to list the in-database analytics offerings of DBMSs in general and of Oracle in particular.
In-database Analytics
In-database
analytics is a technology that allows data processing to be conducted within
the database by building analytic logic into the
database itself. Doing so eliminates the time and effort required to transform
data and move it back and forth between a database and a separate analytics
application.
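The difference between the two approaches can be sketched in a few lines. The example below is only an illustration, not Oracle-specific: it assumes Python's built-in sqlite3 module as a stand-in for the analytic database, and the `transactions` table and its rows are hypothetical. The first function moves every row to the application before aggregating; the second lets the database aggregate, so only one summary row per account leaves it.

```python
import sqlite3

# Hypothetical table and data; SQLite stands in for the analytic database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (account TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("A", 120.0), ("A", 80.0), ("B", 40.0), ("B", 60.0), ("B", 500.0)],
)


def totals_outside_database(conn):
    """Move every row to the application, then aggregate there."""
    totals = {}
    for account, amount in conn.execute("SELECT account, amount FROM transactions"):
        totals[account] = totals.get(account, 0.0) + amount
    return totals


def totals_in_database(conn):
    """Let the database aggregate; only one row per account leaves it."""
    rows = conn.execute(
        "SELECT account, SUM(amount) FROM transactions GROUP BY account"
    )
    return dict(rows)


print(totals_outside_database(conn))
print(totals_in_database(conn))
```

Both functions return the same totals. What changes is how much data crosses the boundary between database and application, which is the cost in-database analytics aims to remove.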
An in-database analytics system consists
of an enterprise data warehouse and acquired big data built on an analytic database platform. Such platforms
provide parallel processing, partitioning, scalability and optimization features geared
toward analytic functionality.
In-database analytics allows analytical data marts to be consolidated in the enterprise
data warehouse. Data retrieval and analysis are much faster and corporate
information is more secure because it doesn’t leave the database.
This approach helps companies make better predictions about future business risks and opportunities, identify trends, and spot anomalies, so that they can make informed decisions more efficiently and affordably.
Companies use in-database analytics for applications requiring intensive
processing – for example, fraud detection, credit scoring, risk management, trend and pattern recognition,
and balanced scorecard analysis. In-database
analytics also facilitates ad hoc analysis, allowing business users to
create reports that do not already exist or drill deeper into a static report
to get details about accounts, transactions, or records.
Oracle In-database Analytics
Once big data and enterprise data are loaded into Oracle Database, end users can use a number of easy-to-use tools for in-database advanced analytics. The following are some of the in-database advanced analytics capabilities supported by Oracle:
Oracle R Enterprise - Oracle’s version of the widely used R statistical environment, which enables statisticians to use R on very large data sets without any modification to the end-user experience. Examples of R usage include predicting airline delays at particular airports and submitting clinical trial analyses and results.
In-Database Data Mining - the ability to
create complex models and deploy these on very large data volumes to drive
predictive analytics. End-users can leverage the results of these predictive
models in their BI tools without the need to know how to build the models. For
example, regression models can be used to predict customer age based on
purchasing behavior and demographic data.
In-Database Text Mining - the ability to mine text from microblogs, CRM system comment fields, and review sites by combining Oracle Text and Oracle Data Mining. An example of text mining is sentiment analysis based on comments. Sentiment analysis tries to show how customers feel about certain companies, products, or activities.
In-Database Graph Analysis – the ability to create graphs and connections between various data points and data sets. Graph analysis creates, for example, networks of relationships that determine the value of a customer’s circle of friends. When looking at customer churn, customer value is then based on the value of the customer’s network rather than on the value of the customer alone.
In-Database Spatial – the ability to add a
spatial dimension to data and show data plotted on a map. This ability enables
end users to understand geospatial relationships and trends much more
efficiently. For example, spatial data can visualize a network of people and
their geographical proximity. Customers who are in close proximity can readily
influence each other’s purchasing behavior, an opportunity which can be easily
missed if spatial visualization is left out.
In-Database MapReduce – the ability to write procedural logic and seamlessly leverage Oracle Database parallel execution. In-database MapReduce allows data scientists to create high-performance routines with complex logic, and these routines can be exposed via SQL. Examples of leveraging in-database MapReduce are sessionization of weblogs and organization of Call Detail Records (CDRs).
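To make the sessionization example more concrete, here is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It only illustrates the logic and is not Oracle's in-database MapReduce API: the function names, the 30-minute inactivity threshold, and the sample weblog are all assumptions made for this illustration.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity threshold, not from Oracle


def map_phase(log_records):
    """Map: emit (user_id, timestamp) pairs keyed by user."""
    for user_id, timestamp in log_records:
        yield user_id, timestamp


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(user_id, timestamps, gap=SESSION_GAP_SECONDS):
    """Reduce: split one user's sorted clicks into sessions at gaps > `gap`."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)  # still within the current session
        else:
            sessions.append([ts])  # first click, or gap too long: new session
    return user_id, sessions


def sessionize(log_records):
    """Run map, shuffle, and reduce over the whole log."""
    grouped = shuffle(map_phase(log_records))
    return dict(reduce_phase(user, stamps) for user, stamps in grouped.items())


# Hypothetical weblog: (user_id, seconds since an arbitrary start time)
weblog = [("alice", 0), ("bob", 100), ("alice", 600), ("alice", 4000), ("bob", 200)]
print(sessionize(weblog))
```

With this sample data, alice's clicks fall into two sessions and bob's into one. Because each user's clicks are reduced independently, the reduce step can run in parallel per user, which is the property a database's parallel execution can exploit.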
Conclusion
Every one of the analytical components in Oracle Database is valuable. Combining these components creates even more value to the business. Leveraging SQL or a BI Tool to expose the results of these analytics to end users gives an organization an edge over others who do not leverage the full potential of analytics in Oracle Database.