Subselect, derived tables, and subqueries in ODI 11G

Uli Bethke Oracle Data Integrator (ODI)

I think one of the best new features in ODI 11G is the subselect/subquery feature. In ODI 10 this could only be achieved by a workaround, as outlined in a previous post, "Using subqueries in Oracle Data Integrator".

What is the advantage of a subquery?

In previous versions of ODI we had to physically set down the data for each individual query. For example, if we wanted to rank a dataset and then select the top ten rows from it, we had to create one interface for the ranking operation and another for the top ten operation. At each step we had to set down the data, which increased I/O and reduced overall performance. In ODI 11 this has changed. While we still need to create two temp interfaces for the operation as a whole, we can now tell ODI to treat the first temp interface as a derived table. ODI 11G will then use it to generate a subquery.
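
To make the difference concrete, here is a minimal sketch of the two approaches. It is not ODI output: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the table names cust_sales and stg_sales_rank are made up for illustration. The first variant writes the ranked rows to a staging table and reads them back, the second embeds the same ranking query as a derived table.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the Oracle source
# (assumption: SQLite 3.25 or later, which supports window functions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cust_sales (cust_id INTEGER, amount_sold INTEGER);
    INSERT INTO cust_sales VALUES (1, 150), (2, 300), (3, 100), (4, 10);
""")

# The ranking step, shared by both variants below.
RANK_SQL = """
    SELECT cust_id, amount_sold,
           DENSE_RANK() OVER (ORDER BY amount_sold DESC) AS sales_rank
    FROM cust_sales
"""

# Staged approach: the ranked rows are physically written to a table,
# then a second statement reads them back and filters them.
conn.execute("CREATE TABLE stg_sales_rank AS " + RANK_SQL)
staged = conn.execute(
    "SELECT cust_id, sales_rank FROM stg_sales_rank "
    "WHERE sales_rank <= 2 ORDER BY sales_rank"
).fetchall()

# Derived-table approach: the ranking query is embedded as a subquery,
# so no intermediate table is written.
derived = conn.execute(
    "SELECT cust_id, sales_rank FROM (" + RANK_SQL + ") ranked "
    "WHERE sales_rank <= 2 ORDER BY sales_rank"
).fetchall()

print(staged)
print(derived)
```

Both variants return the same rows. The second one simply skips the intermediate write and read.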

How does the subquery work in ODI 11?

The way this works is fairly simple. For each subquery/derived table in your query you create a temp interface. You embed your various subqueries by simply telling ODI that you want to use the temp interface as a subquery.

A step by step guide

We will look at an example from the SH schema. The task at hand is to load a table with the top ten customers based on sales amount.

To accomplish this task we need three temp interfaces.

The first temp interface (INT_SALES_CUST) will aggregate the sales amount from the sales table by cust_id.

The second temp interface (INT_SALES_RANK) will take the result set from the interface in the previous step and apply a dense rank to the customers' sales data.
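
As a side note on the ranking step, the sketch below shows how DENSE_RANK treats ties compared with RANK. It again uses Python's sqlite3 module as a stand-in for Oracle, and the sample table sales_cust with its rows is invented for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle
# (assumption: SQLite 3.25 or later for window functions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_cust (cust_id INTEGER, amount_sold INTEGER);
    INSERT INTO sales_cust VALUES (1, 500), (2, 500), (3, 400), (4, 300);
""")

# Customers 1 and 2 tie on amount_sold. DENSE_RANK continues with 2
# after the tie, whereas RANK leaves a gap and continues with 3.
rows = conn.execute("""
    SELECT cust_id,
           DENSE_RANK() OVER (ORDER BY amount_sold DESC) AS dense_rnk,
           RANK()       OVER (ORDER BY amount_sold DESC) AS rnk
    FROM sales_cust
    ORDER BY cust_id
""").fetchall()

for row in rows:
    print(row)
```

Because tied customers share a rank and no ranks are skipped, a filter on a dense rank of ten or less can return more than ten rows when there are ties.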

The third interface (INT_TOP_TEN) will then select the top ten customers, join to the customers table, and physically set down the data.

So far so good. These are the same steps we took in ODI 10.

Next we need to enable the interfaces for subqueries. We open interface INT_SALES_RANK and click on the INT_SALES_CUST data store. In the Source Properties section you will find a checkbox named "Use Temporary Interface as Derived Table". Select this checkbox.

We also need to perform the same step for our third interface INT_TOP_TEN: open it, click on the INT_SALES_RANK data store, and select the same checkbox.

Once this has been done we can execute interface INT_TOP_TEN in Simulation mode.

This will generate the required query with the embedded subqueries from the temp interfaces. What a great feature.
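
For readers who want to see the shape of such a query, here is a hand-written sketch with the two temp interfaces nested as derived tables. It is not the code ODI generates, which will differ in aliases and other details. It runs against Python's sqlite3 module rather than Oracle, uses a cut-down version of the SH sales and customers tables with invented rows, and takes the top two customers (TOP_N) instead of the top ten to keep the sample small.

```python
import sqlite3

TOP_N = 2  # the article uses ten; two keeps the sample data small

# In-memory SQLite database as a stand-in for the SH schema
# (assumption: SQLite 3.25 or later for window functions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (cust_id INTEGER, amount_sold INTEGER);
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_last_name TEXT);
    INSERT INTO sales VALUES
        (1, 100), (1, 50), (2, 300), (3, 75), (3, 25), (4, 10);
    INSERT INTO customers VALUES
        (1, 'Adams'), (2, 'Baker'), (3, 'Clark'), (4, 'Davis');
""")

top_customers = conn.execute("""
    SELECT c.cust_id, c.cust_last_name, r.amount_sold, r.sales_rank
    FROM (
        -- logic of INT_SALES_RANK, embedded as a derived table
        SELECT s.cust_id, s.amount_sold,
               DENSE_RANK() OVER (ORDER BY s.amount_sold DESC) AS sales_rank
        FROM (
            -- logic of INT_SALES_CUST, embedded as a derived table
            SELECT cust_id, SUM(amount_sold) AS amount_sold
            FROM sales
            GROUP BY cust_id
        ) s
    ) r
    JOIN customers c ON c.cust_id = r.cust_id
    WHERE r.sales_rank <= ?
    ORDER BY r.sales_rank
""", (TOP_N,)).fetchall()

for row in top_customers:
    print(row)
```

The innermost SELECT corresponds to INT_SALES_CUST, the middle one to INT_SALES_RANK, and the outer query to INT_TOP_TEN with its join to the customers table.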

About the author

Uli has 18 years' hands-on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a not-for-profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.