Because data discovery is central to analysis, DVCS includes several features that facilitate the data analysis process. One simple data discovery tool is the filter: the user can filter Attributes by value or Measures by range.
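To make the two kinds of filter concrete, here is a minimal Python sketch of the idea. It is not DVCS code: the rows, the column names (`region`, `sales`) and the helper functions are all hypothetical and only illustrate filtering an Attribute by value and a Measure by range.

```python
# Hypothetical sample data: "region" plays the role of an Attribute,
# "sales" plays the role of a Measure.
rows = [
    {"region": "North", "sales": 120.0},
    {"region": "South", "sales": 45.0},
    {"region": "East", "sales": 300.0},
    {"region": "North", "sales": 80.0},
]


def filter_attribute(rows, column, allowed_values):
    """Keep rows whose attribute value is one of the allowed values."""
    allowed = set(allowed_values)
    return [r for r in rows if r[column] in allowed]


def filter_measure(rows, column, low, high):
    """Keep rows whose measure lies in the inclusive range [low, high]."""
    return [r for r in rows if low <= r[column] <= high]


# Attribute filter: only the "North" region.
north_rows = filter_attribute(rows, "region", ["North"])

# Measure filter: only sales between 50 and 150.
mid_sales = filter_measure(rows, "sales", 50.0, 150.0)
```

In DVCS the same choices are made through the filter controls rather than in code; the sketch only shows what each filter keeps.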
Alongside filters, Reference Lines and Trend Lines are available in DVCS out of the box. Beyond these features, more Advanced Analytics tools are available in combination with R. For this reason, DVCS provides an Oracle R Distribution (version 3.1.1) installer executable once DVCS itself has been installed. Once R and the required libraries are installed, we can use Clustering, Outlier Detection and Forecasting, as well as custom R scripts.
In the example below we use Clusters to identify how the number of branches by region affects sales. In addition, we have a Reference Line to analyze the average sales across branches. Finally, using Trend Lines, we can see that the relationship between the minimum number of branches and sales has been increasing month by month over the fiscal year.
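For readers who want to see what these two lines compute, the following Python sketch may help. It assumes the Reference Line is a plain average and the Trend Line is an ordinary least-squares straight line; DVCS may offer other options, and the monthly figures below are invented purely for illustration.

```python
def reference_line(values):
    """Average of the measure: the height of a horizontal reference line."""
    return sum(values) / len(values)


def trend_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: fiscal month number and sales for that month.
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 15.0, 15.0, 19.0, 22.0]

average_sales = reference_line(sales)
slope, intercept = trend_line(months, sales)

# A positive slope means sales rise over the fiscal months.
is_increasing = slope > 0
```

A positive slope corresponds to the upward trend described above, while the average gives the level against which individual branches can be compared.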
Oracle R Distribution (ORD) is an Oracle-supported redistribution of open source R that comes pre-installed with Oracle applications. ORD is part of Oracle’s overall strategy for Data Science. It is built on R, a programming language and software environment for statistical analysis, graphical representation and reporting. ORD facilitates enterprise acceptance of R, since the lack of a major corporate sponsor has made some companies wary of fully adopting it.
Why Oracle R Distribution?
- Improved scalability and performance on the R client and in the database through Oracle R Enterprise embedded R execution
- Dynamic loading of high-performance linear algebra libraries (Intel’s Math Kernel Library (MKL), AMD’s ACML, and Sun Performance Library for Solaris), whose optimized, multi-threaded math routines give the relevant R functions maximum performance on the targeted hardware
- Oracle-provided enterprise support for customers of the Oracle Advanced Analytics option, Oracle Linux, and Oracle Applications
Data mining with ORD: automatically works through large data volumes to find hidden patterns, discover insights and make predictions
- Identify the most important factors (Attribute Importance)
- Predict customer behavior (Classification)
- Predict or estimate a value (Regression)
- Find profiles of targeted people or items (Decision Trees)
- Segment a population (Clustering)
- Find fraudulent or “rare events” (Anomaly Detection)
- Determine co-occurring items in a “basket” (Associations)
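As a small illustration of one item in the list above, finding “rare events”, here is a Python sketch that flags values far from the mean using a z-score. This is a deliberately simple stand-in, not the algorithm ORD or Oracle Advanced Analytics uses; the function name, the threshold of 2.0 and the transaction amounts are all assumptions made for the example.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold.

    The z-score is the distance from the mean measured in
    population standard deviations.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts with one unusually large value.
amounts = [52.0, 48.0, 50.0, 51.0, 49.0, 50.0, 47.0, 53.0, 500.0]

rare_events = find_anomalies(amounts)
```

A z-score is easy to follow but is itself distorted by extreme values, so a production anomaly detection model would normally rely on a more robust method.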
How much bigger is data going to get with the Internet of Things (IoT), and how will this change the way businesses gather, store, compute, and consume data? There are a few ways companies can leverage the massive, unstructured data of the IoT with DVCS. We will talk about this in our next blog post.