Converting XML documents from an Oracle CLOB to a relational format

Uli Bethke Oracle

In this post we will show you how easy it is to convert XML data from Oracle CLOB to a relational format. We will be using Enterprise Flexter which ships with a feature that enables us to use Oracle CLOB as source data and to convert it.

Flexter is available in three editions. An enterprise edition, which can be installed on-premise. A managed SaaS edition Flexter as a Service. You can try Flexter for free.

Oracle CLOB

CLOB stands for “Character Large Object”. CLOB stores large amounts of character data, e.g. in the form of XML documents. The CLOB data type is similar to a BLOB, but includes character encoding, which defines a character set and the way each character is represented.

Flexter can also convert XML documents that are stored in an Oracle XMLType to relational tables.

Inserting XML data to Oracle CLOB

First we populate a table with XML documents.

We create the table with two columns, an ID column and an XML_COL column where we will insert XML files.

 

We can now insert the XML documents to the CLOB column. We will be using our standard soundlib XML sample files files for the conversion.

Inserting XML can be done with this simple SQL Query.

Converting Oracle CLOB data with Enterprise Flexter

Flexter allows us to convert Oracle CLOB with just using two additional switches (options). To specify which table will be converted we will use the “-T” switch, and to specify the column we use the “-C” switch.

Converting XML/JSON data can be performed in a couple of simple steps

Step 1 - Define Source data, collect Statistics (Information such as data type and relationships) and Data Flow (Mapping data points in the source to the data points in target schema)

Step 2 - Define Source Schema (XSD)

Step 3 - Convert the XML documents

Step 1 - Define Source, collect statistics and create data flow

In this step we will read XML data, collect metadata, and create a data flow (a logical target schema and the mappings from source to target). The -T switch tells Flexter which table to read from. The -C switch provides the column name containing the XML documents.

Example of output

Step 2 - Define Source Schema (XSD)

In the next step we will define the source schema.

Example of output

Step 3 - Convert data

In the final step we will convert the data to the output folder which we created. We also define the output format, e.g. TSV, ORC, Parquet, Avro files, or a relational database such as Oracle. In our example we convert the XML documents to TSV files.

Example of output:

Er Diagram

Conclusion.

With Flexter it is very easy and simple to convert XML documents stored in an Oracle CLOB data type to a relational format. You can query your data in a matter of minutes with SQL.

Why did we create Flexter?

Flexter was born from our own frustration of converting XML and JSON documents into a data warehouse. We saw many projects fail or run over budget. Why waste weeks of converting XML/JSON and risking failure instead of focusing on delivering real value to your business users? We think that the time spent writing manual code to convert the data to It just takes too much time of converting the data

Our customers are delighted with Flexter. Hear what they say

“This will save us weeks”

“You did in one day what we could not achieve in 3 years”

And this is what Ralph Kimball, one of the fathers of data warehousing has to say about XML conversion:

“The construction of an XML parser is a project in itself - not to be attempted by the data warehouse team”. Ralph Kimball, ETL Toolkit

We provide Flexter in three editions:

You can try out the free version of Flexter for JSON or XML.

Our enterprise edition can be installed on a single node or for very large volumes of XML on a cluster of servers.

Our SaaS edition, Flexter as a Service (FaaS), is managed by Sonra. You can use the Flexter API to convert your data.

If you have any questions please refer to the Flexter FAQ section. You can also request a demo of Flexter or reach out to us directly.

Last but not least we have open sourced Paranoid, our solution to obfuscate data in XML and JSON documents

Who uses Flexter?

Companies in travel, finance, healthcare, automotive, finance, and insurance industry use Flexter.

 

About the author

Uli Bethke LinkedIn Profile

Uli has 18 years’ hands on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books and chairs the the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a non for profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.