Converting TVAnytime XML to Impala and Parquet

Chinmay Sinha February 1, 2018

In this example we will use Flexter to convert an XML file to Parquet. We then query and analyse the output with Impala (using the Cloudera VM). Flexter can generate a target schema from an XML file or from a combination of XML and XML schema (XSD) files. In our example we process the ContentCS.xml file from the ...

Converting XML to Hive

Chinmay Sinha January 27, 2018

In this example we will use the Flexter XML converter to generate a Hive schema and parse an XML file into a Hive database. We will then use the spark-sql interface to query the generated tables. For the example we will use the TVAnytime XML standard. You can download sample XML files and ...

Convert Oracle XMLTYPE to Oracle tables

Chinmay Sinha January 27, 2018

In this example we will read XML data from a table with an XMLTYPE column in the Oracle database and convert the XML to tables in the same Oracle database. We will use Flexter to read the XML from, and write the tables to, the same database. We need to give Flexter table and column ...

Convert XML with Spark to Parquet

Chinmay Sinha January 25, 2018

It can be very easy to use Spark to convert XML to Parquet and then query and analyse the output data. As I have outlined in a previous post, XML processing can be painful, especially when you need to convert large volumes of complex XML files. Apache Spark has various features that make it a ...
