Archives

Oracle

What’s the Big Deal about Big Data? Hear me speak at OUG Ireland. 11 March 2014. Convention Centre Dublin.

So what’s the big deal about Big Data? Oil fueled the Industrial Revolution. Data will fuel the Information Revolution.

Not convinced? Did you know that Amazon recently patented a technology, based on a Big Data algorithm, that starts the shipping process before you have completed your order? That’s right: Amazon knows that you will buy some stuff from their website before you know it yourself. How amazing (or scary) is that?

In my upcoming presentation on 11 March at the Convention Centre Dublin I will explore this topic further and talk about:

- What Big Data is (and what it is not)
- Some interesting use cases that show what is possible
- Why the traditional data warehouse framework and technology don’t work for Big Data
- A Big Data architecture for the Information Revolution
- The Oracle Big Data armoury

Registration for the event is now open.

Hope to see you there and talk about your Big Data project.

Exadata, Exalytics, SAP HANA and the multi billion dollar question.

Recently there has been a lot of noise around in-memory databases and how Exadata, Exalytics and SAP HANA compare to each other. Here are my two cents on the debate.

Why you can’t compare Exalytics to SAP HANA

SAP’s vision for HANA is to serve both OLTP and analytics. As the name already suggests, Exalytics caters for analytics only. Exalytics needs to be loaded from the data warehouse or from transactional systems; with HANA that is not needed, as everything already sits in memory. While Exalytics is near realtime, HANA is realtime. However, there are currently not many OLTP applications running on HANA, because applications need to be adapted or rewritten to make full use of HANA’s new architecture. This seems about to change: SAP has recently released a service pack for HANA that is intended to allow just that. Still, the claim of being able to run OLTP and analytics in the same in-memory database remains somewhat unproven.

Why you can’t compare Exadata to SAP HANA

Exadata X3 now ships with 4 TB of RAM. Contrary to Oracle’s claims, this does not make it an in-memory database. It lacks two important things:

- Most of the data still sits on disk (mostly SSD, but still disk).
- It lacks the optimized algorithms and design features (in-memory indexes etc.) that have been specifically designed for in-memory access.

How does HANA compare to Exadata/Exalytics then?

Well, it doesn’t. Oracle (or anyone else for that matter) still has to come up with a product that can be compared to HANA.

Use cases of SAP HANA

The number one use case for SAP HANA is realtime operational business intelligence. True realtime BI now seems a distinct possibility. In this use case HANA represents the Interactive Sector as defined by Inmon et al. in their DW 2.0 book.

The other use case is similar to what Exalytics is used for: a performance booster for the data warehouse. In this scenario we load or replicate data into the in-memory database, either directly from our OLTP systems or from the data warehouse. This is what we have predominantly seen HANA being used for so far. However, this falls well short of the vision and long-term strategy, namely to change the way we do databases.

The third use case, and of course the end game for HANA, is to run OLTP, the data warehouse, and analytics all on HANA. This is not very realistic at the moment; DRAM is still too costly.

What happens next?

This is the billion dollar question. Curt Monash thinks:

“Putting all that together, the analytic case for SAP HANA seems decently substantiated, there are years of experience with the technology and its antecedents, and column stores (including in-memory) are well-established for analytics via multiple vendors. The OLTP case for HANA, however, remains largely unproven. It will be interesting to see how it plays out.”

It will indeed be interesting to see how this plays out. SAP are currently heavily promoting HANA. There is a developer license available. You can rent an instance on AWS. SAP are pushing it hard for start-ups. I am sure that it will replace Oracle in many SAP implementations as the underlying database. It remains to be seen, however, if it can eat into the wider database market (the ultimate SAP objective). The other interesting question is: when will Oracle come up with a product that can compete head on with HANA? I believe there is plenty of time. The game hasn’t really started yet.

In the meantime Microsoft have also announced an in-memory database, named Hekaton, for release in 2014/15. It seems to be for OLTP only and, from what I read, a bit of a joke. Interesting times indeed.

The wider picture and Google

At the moment we are seeing tectonic shifts in the technology space of a kind we haven’t seen in a generation. New technologies like Hadoop, Impala, IMDBs, NoSQL, Cloud etc. are emerging that can handle ever bigger data at ever faster speeds. These innovations will have knock-on effects on the way we architect information systems and on enterprise architecture in general. I even believe that they will ultimately change the way we do business. Unknown to many, a lot of these innovations were pioneered by Google. Papers on MapReduce and the Google File System kicked off Hadoop. Google BigTable inspired a lot of the NoSQL databases we have seen recently. You may have heard of Impala from Cloudera; again a brainchild of Google, based on Google Spanner and probably more so on their in-house RDBMS F1.

I would expect more innovation to come from that corner. After all, Google is at the epicentre of the Big Data problem. They already have their own offering named BigQuery, which recently left the beta phase. It doesn’t look like much right now, but I expect them to up their game.

Of course you can ignore those trends, but you do so at your own peril.

If you want to find out more about IMDBs and SAP HANA, I recommend reading In-Memory Data Management: Technology and Applications by Hasso Plattner, one of the co-founders of SAP.

Uses of the modulo operator: How Oracle mod can make your life easier.

You may ask, what the heck is modulo? Well, below is an easy-to-understand definition.

“Modulo basically means keep taking the second number away from the first number. When you can’t do it any more without going into negative numbers, whatever’s left is the answer”

SELECT MOD(6,3) FROM DUAL;  -- 6-3=3, 3-3=0: the remainder is 0
SELECT MOD(3,6) FROM DUAL;  -- 3-6 is negative, so the remainder is 3
SELECT MOD(5,3) FROM DUAL;  -- 5-3=2, 2-3 is negative, so the remainder is 2

Ok. That is great. But how can we actually benefit from this?

What are common use cases?

Use case 1: Wrap values around, as in a clock, e.g. convert seconds to hours, minutes and seconds.

SELECT FLOOR(10000/3600) || ':' || FLOOR(MOD(10000/60,60)) || ':' || MOD(10000,60) FROM DUAL;  -- 10,000 seconds = 2:46:40
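
If you want zero-padded minutes and seconds, here is one possible variant using TO_CHAR (the arithmetic is exactly the same):

SELECT FLOOR(10000/3600) || ':' || TO_CHAR(FLOOR(MOD(10000/60,60)),'FM00') || ':' || TO_CHAR(MOD(10000,60),'FM00') FROM DUAL;  -- 2:46:40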

Use case 2: Finding even or odd numbers

SELECT 10, CASE WHEN MOD(10,2) = 0 THEN 'even' ELSE 'odd' END FROM DUAL
UNION
SELECT 9, CASE WHEN MOD(9,2) = 0 THEN 'even' ELSE 'odd' END FROM DUAL;

Use case 3: Expressing an improper fraction as a mixed number

SELECT FLOOR(7/5) || ' + ' || MOD(7,5) || '/5' FROM DUAL;  -- 7/5 = 1 + 2/5

Use case 4: Distribute a dataset into buckets in a round robin fashion

An example would be to update only every second or third record in a table that contains a sequenced list of items. If you don’t have a sequence you can use rownum instead (see the UPDATE sketch after the examples below).

SELECT MOD(object_id,2), x.* FROM all_objects x WHERE MOD(object_id,2) = 1 ORDER BY object_id;

Selecting two thirds of the records:

SELECT MOD(object_id,3), x.* FROM all_objects x WHERE MOD(object_id,3) IN (1,2) ORDER BY object_id;
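
And here is a minimal sketch of the update itself when there is no sequence, using rownum to assign the buckets (table and column names are made up for illustration):

UPDATE my_table
SET    flag = 'Y'
WHERE  rowid IN (
          SELECT rid
          FROM   (SELECT rowid rid, ROWNUM rn FROM my_table)
          WHERE  MOD(rn, 3) = 0);   -- every third row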

Use case 5: Get last M digits from a number

Get the last digit:

SELECT MOD(98,10) FROM DUAL;

Get the last two digits:

SELECT MOD(980,100) FROM DUAL;

and so on.
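
More generally, the last m digits of a number n are MOD(n, POWER(10, m)):

SELECT MOD(987654, POWER(10, 3)) FROM DUAL;  -- 654, the last three digits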

Let me know how you use the mod operator to make your life easier.

 

Tom Kyte in Dublin takeaways

Today I had the pleasure of listening to Tom Kyte talking about:

- Big Data and the Oracle perspective and tool set on it
- Database stuff: Statistics
- More database stuff: compression & partitioning

The most interesting part of the talk was the Oracle view on Big Data. The strategy is coherent and makes a lot of sense. However, I am not convinced that you would ever want to pay licenses for a NoSQL database or Hadoop. But as I said, the strategy itself is sound from my point of view.

There were also one or two things I took away from the database agenda:

- For the best compression results you want rows that share column values to be stored beside each other, e.g. by sorting on those columns when loading. That way the repeated values are more likely to end up on the same database block, and compression will be applied to more of them. This is really only something to take into account for huge tables with a lot of repeating values, where you want to squeeze out the last bit of performance.
- CTAS is less expensive than DELETE when a high enough number of rows is deleted from a table (see the sketch after this list, which also illustrates the compression point above).
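
Here is a minimal sketch combining both points, assuming a hypothetical SALES table from which old rows are to be removed (table and column names are illustrative): rebuild with CTAS instead of deleting in place, and sort on low-cardinality columns so that repeated values land on the same blocks and compress well.

-- Rebuild the table keeping only the rows we want, instead of a big DELETE.
-- The ORDER BY clusters repeating values, which helps block-level compression.
CREATE TABLE sales_new COMPRESS AS
   SELECT *
   FROM   sales
   WHERE  sale_date >= DATE '2013-01-01'
   ORDER  BY region, product;

DROP TABLE sales;
RENAME sales_new TO sales;
-- Indexes, constraints and grants have to be re-created afterwards.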

The other good thing about the event was meeting up with some of the Dublin Oracle ACEs, Marcin Przepiorowski and Brendan Tierney. And of course I also had a pint or two, as it was a lovely day in Dublin.

LISTAGG with CLOB? String aggregation exceeding 4000 characters with XMLAGG.

We all know that the Oracle LISTAGG function does not support the CLOB datatype. In my opinion this is a severe limitation of LISTAGG (and BTW also of PIVOT and UNPIVOT etc.).

So what are your options? One option is to create your own user defined aggregate function. I found such an example on the Oracle forums.
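
For reference, here is a minimal sketch of what such a user-defined aggregate can look like, based on the standard ODCIAggregate interface. This is an illustration, not the forum version: the names clob_agg_t and clob_agg are made up, the delimiter is hard-coded to a comma, and NULL/edge-case handling is kept deliberately simple.

CREATE OR REPLACE TYPE clob_agg_t AS OBJECT (
   buf CLOB,
   STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT clob_agg_t)
      RETURN NUMBER,
   MEMBER FUNCTION ODCIAggregateIterate(self IN OUT clob_agg_t, val IN VARCHAR2)
      RETURN NUMBER,
   MEMBER FUNCTION ODCIAggregateTerminate(self IN clob_agg_t, ret OUT CLOB, flags IN NUMBER)
      RETURN NUMBER,
   MEMBER FUNCTION ODCIAggregateMerge(self IN OUT clob_agg_t, ctx2 IN clob_agg_t)
      RETURN NUMBER
);
/

CREATE OR REPLACE TYPE BODY clob_agg_t IS
   STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT clob_agg_t) RETURN NUMBER IS
   BEGIN
      sctx := clob_agg_t(NULL);                  -- start with an empty buffer
      RETURN ODCIConst.Success;
   END;

   MEMBER FUNCTION ODCIAggregateIterate(self IN OUT clob_agg_t, val IN VARCHAR2) RETURN NUMBER IS
   BEGIN
      IF self.buf IS NULL THEN
         self.buf := val;                        -- first value: no delimiter
      ELSE
         self.buf := self.buf || ',' || val;     -- append with delimiter
      END IF;
      RETURN ODCIConst.Success;
   END;

   MEMBER FUNCTION ODCIAggregateTerminate(self IN clob_agg_t, ret OUT CLOB, flags IN NUMBER) RETURN NUMBER IS
   BEGIN
      ret := self.buf;                           -- hand back the aggregated CLOB
      RETURN ODCIConst.Success;
   END;

   MEMBER FUNCTION ODCIAggregateMerge(self IN OUT clob_agg_t, ctx2 IN clob_agg_t) RETURN NUMBER IS
   BEGIN
      self.buf := self.buf || ',' || ctx2.buf;   -- combine results from parallel execution
      RETURN ODCIConst.Success;
   END;
END;
/

CREATE OR REPLACE FUNCTION clob_agg(val VARCHAR2) RETURN CLOB
   PARALLEL_ENABLE AGGREGATE USING clob_agg_t;
/

You would then call it like any other aggregate, e.g. SELECT deptno, clob_agg(ename) FROM emp GROUP BY deptno.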

However, if you want to use pure SQL to achieve the same, you can leverage the XMLAGG SQL/XML functionality built into the Oracle database.

As per the documentation: “You use SQL/XML standard function XMLAgg to construct a forest of XML elements from a collection of XML elements.”

Below is an example to demonstrate how this works. For demonstration purposes I don’t exceed the 4K character limit in the example below. However, I guarantee you that it works with >4K strings as well.

SELECT
   table_row_id,
   DBMS_XMLGEN.CONVERT(
      EXTRACT(
         XMLTYPE('<?xml version="1.0"?><document>'
            || XMLAGG(XMLTYPE('<V>' || DBMS_XMLGEN.CONVERT(data_value) || '</V>')).getclobval()
            || '</document>'),
         '/document/V/text()').getclobval(),
      1) AS data_value
FROM (
   SELECT 1 table_row_id, 'abcdefg>' data_value FROM dual
   UNION ALL
   SELECT 1 table_row_id, 'hijklmn' data_value FROM dual)
GROUP BY
   table_row_id;

Let’s have a closer look at the nested stuff in this query.

In a first step we escape the data into its XML equivalent

DBMS_XMLGEN.CONVERT(data_value)

‘abcdefg>’ becomes ‘abcdefg&gt;’

Next we create an XML element (<V>) for each row and convert the result to the XMLTYPE data type

XMLTYPE('<V>' || DBMS_XMLGEN.CONVERT(data_value) || '</V>')

Then we do the XML aggregation

XMLAGG(XMLTYPE('<V>' || DBMS_XMLGEN.CONVERT(data_value) || '</V>')).getclobval()

In a last step we create a well-formed XML document, extract the text value from the XML elements, and unescape the text

DBMS_XMLGEN.CONVERT(EXTRACT(XMLTYPE('<?xml version="1.0"?><document>' || XMLAGG(XMLTYPE('<V>' || DBMS_XMLGEN.CONVERT(data_value) || '</V>')).getclobval() || '</document>'), '/document/V/text()').getclobval(),1)

Are Oracle’s whitepapers really for the bin?

According to the MOS search they are. whitepaper = wastepaper ??

Exporting & Importing Oracle Data Miner (11gR2) Workflows

As with all development environments there will be need to move your code from one schema to another or from one database to another.

With Oracle Data Miner 11gR2, we have the same requirement. In our case it is not just individual procedures or packages, we have a workflow consisting of a number of nodes. With each node we may have a number of steps or functions that are applied to the data.

Exporting an ODM (11gR2) Workflow

In the Data Miner navigator, right-click the name of the workflow that you want to export.

The Save dialog opens. Specify a location on your computer where the workflow will be saved as an XML file.

The default name for the file is workflow_name.xml, where workflow_name is the name of the workflow. You can change the name and location of the file.

Importing an ODM (11gR2) Workflow

Before you import your ODM workflow, you need to make sure that you have access to the same data that is specified in the workflow.

All tables/views are prefixed with the schema where the table/view resides.

You may want to import the data into the new schema or ensure that the new schema has the necessary grants.
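
If grants are the way you go, the idea is simply this (schema and view names are illustrative, borrowing the DMUSER example from further down):

GRANT SELECT ON dmuser.mining_data_build_v TO dmuser2;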

Open the connection in ODM.

Select the project under which you want to import the workflow, or create a new project.

Right-click the project and select Import Workflow.

Search for the XML export file of the workflow.

Preserve the objects during the import.

When you have all the data and the ODM workflow imported, you will need to run the entire workflow to ensure that you have everything set up correctly.

It will also create the models in the new schema.

Data encoding in Workflow

All of the tables and views used as data sources in the exported workflow must reside in the new account.

The account from which the workflow was exported is encoded in the exported workflow. For example, say the workflow was exported from the account DMUSER and contains a data source node using MINING_DATA_BUILD_V. If you import the workflow into a different account (that is, an account that is not DMUSER) and try to run it, the data source node fails because the workflow is looking for DMUSER.MINING_DATA_BUILD_V.

To solve this problem, right-click the data node (MINING_DATA_BUILD_V in this example) and select Define Data Wizard. A message appears indicating that DMUSER.MINING_DATA_BUILD_V does not exist in the available tables/views. Click OK and then select MINING_DATA_BUILD_V in the current account.

Video

I have created a video for this blog post. It illustrates how you can export a workflow and import it into a new schema.

Exporting and Importing Oracle Data Miner (11gR2) Workflows

Make sure to check out my other Oracle Data Miner (11gR2) videos.

http://www.youtube.com/user/btierney70

Brendan Tierney

Creating ODM Schemas & Repository for ODM 11g R2

Before you can start using the Oracle Data Miner features that are now available in SQL Developer 3, there are a few steps you need to perform. This post will walk you through these steps and I have put together a video which goes into more detail. The video is available on my YouTube channel.

http://www.youtube.com/user/btierney70

I have also created a video version of this blog post that contains more details of the topics covered. So check out the video.

Creating ODM Schemas & Repository video – YouTube

I will be posting more How To type videos over the coming weeks and months. Each video will focus on one particular feature of the new Oracle Data Mining tool.

The following steps are necessary before you can start using the ODM tool.

Set up of Oracle Data Miner tabs

To get the ODM tabs to display in SQL Developer, you need to go to the View menu and select the following from the Data Miner submenu:

  • Data Miner Connections
  • Workflow Jobs
  • Property Inspector

Create an ODM Schema

There are two main ways to create a schema. The first and simplest way is to use SQL Developer. To do this you need to create a connection to SYS. Right-click on the Other Users option and select Create User.

The second option is to use SQL*Plus to create the user. With either method you need to grant Connect & Resource privileges to the user.
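
For reference, a minimal SQL*Plus sketch (username, password and tablespaces are illustrative):

CREATE USER dmuser2 IDENTIFIED BY dmuser2_pwd
   DEFAULT TABLESPACE users
   TEMPORARY TABLESPACE temp
   QUOTA UNLIMITED ON users;

GRANT CONNECT, RESOURCE TO dmuser2;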

Create the Repository

Before you can start using Oracle Data Mining, you need to create an Oracle Data Miner Repository in the database. Again there are two ways to do this. The simplest is to use the inbuilt functionality in SQL Developer. In the Oracle Data Miner Connections tab, double-click on the ODM schema you have just created. SQL Developer will check whether the ODM Repository exists in the database. If it does not, SQL Developer will create the repository for you, but you will need to provide the SYS password.

The other way to create the repository is to run the installodmr.sql script that is available in the ‘datamining’ directory.

@installodmr.sql <default tablespace> <temp tablespace>

example:   @installodmr.sql USER TEMP

 

Create another ODM Schema

It is typical that you will need more than one schema for your data mining work. After creating the Oracle schema, the next step is to grant the schema the privileges to use the Data Mining Repository. The script for this is called usergrants.sql:

@usergrants.sql <DM Schema>

example:    @usergrants.sql DMUSER

Hint: The schema name needs to be in upper case.

IMPORTANT: The last grant statement in the script may give an error. If this occurs, it is due to an invalid hidden character on the line. If you cut and paste the grant statement and execute it again, everything should run fine.

If you want demo data to be created for this new ODM schema then you need to run:

@instdemodata.sql <DM Schema>

example:    @instdemodata.sql DMUSER

All of these scripts can be found in the SQL Developer directory:

\sqldeveloper\dataminer\scripts

 
