Book Review: Predictive Analytics Using Oracle Data Miner

Uli Bethke data mining, Oracle

My friend and colleague, ACE Director Brendan Tierney, has recently published the reference book Predictive Analytics Using Oracle Data Miner. It is the first comprehensive book on the subject. The book is primarily aimed at the Oracle Data Scientist/Data Miner. A second target audience is Oracle developers who implement the data mining models created by the Data Scientists in their applications, e.g. OBIEE. Some of the areas covered are also relevant for Oracle DBAs, as they represent typical tasks a DBA would perform.

While I have a high-level understanding of predictive analytics, I was hoping that the book would give me some new insights. Here are my impressions.

In the first chapter, Brendan provides us with an overview of the various data mining options in Oracle. We learn that one of the key differentiators between Oracle Data Mining and other tools is that it runs inside the Oracle database. This has the advantage that the data resides in one place and does not have to be moved into other desktop-style tools. Brendan also stresses the point that data mining has been around for decades and does not necessarily require Big Data-size volumes to derive insights.

In chapter two, Brendan introduces us to the data mining lifecycle and methodology. He discusses CRISP-DM in detail. We learn that this is the most widely used data mining lifecycle. The key points in this chapter are, on the one hand, that the lifecycle is iterative and as such lends itself to an agile methodology. On the other hand, it is important to have well-defined business questions and objectives. Equally important are data profiling and data preparation.

Chapters three and four then talk about how to install, set up and use the Oracle Data Miner client tools. It gets more interesting again in chapters five (Exploring your Data) and six (Data Preparation). We learn that both of these stages are extremely important for the success of a data mining project and take up significant time: data needs to be profiled, cleansed and put into a suitable format, relevant features (data mining speak for attributes) need to be extracted, and new features need to be derived to feed the algorithms. Some of the nice features that ODM provides are Automated Feature Selection and Automatic Data Preparation (ADP). Automated Feature Selection lets ODM algorithmically determine which features of your data set are relevant for your supervised machine learning algorithms.

In chapters seven to eleven, Brendan introduces us to the various types of machine learning and algorithms that are supported through the Oracle Data Miner GUI. These include various types of supervised (a target variable exists) and unsupervised algorithms such as Association Rules, Classification, Clustering, Regression, and Anomaly Detection. For each of the types of analytics, Brendan gives us some high-level background information, the use cases, and how to build, evaluate, and deploy the model. While chapters four to eleven show us how to conduct data mining using the Oracle Data Miner GUI, chapters twelve to eighteen guide us on how we can achieve the same using SQL or PL/SQL. These chapters likewise cover data preparation, Association Rules, Classification, Clustering, Regression, and Anomaly Detection.

The last two chapters are dedicated to model deployment and how you can make use of the models in tools such as OBIEE dashboards.

Predictive Analytics Using Oracle Data Miner is an excellent introduction to the world of data mining in general and data mining on Oracle in particular. You don't need any prior knowledge to read this book. Brendan manages to explain all of the relevant points really well. Throughout the book he gives invaluable advice, tips and tricks that will allow you to quickly master data mining on Oracle. After reading the book you will have an in-depth understanding of how to apply machine learning to various use cases using Oracle Data Miner. The book also introduces us to the types of data mining and algorithms that are supported in the tool. If you want to learn more about the various machine learning algorithms mentioned, I would recommend reading it side by side with a data mining reference book such as Data Mining Techniques by Linoff/Berry. Having read Predictive Analytics Using Oracle Data Miner, I feel confident that I can successfully implement a first data mining project on my own. I wish everyone happy hunting for signals and insights.