Converting XML to TSV on HDInsight

Uli Bethke April 5, 2018

In this post we walk through the detailed steps for converting XML files on HDInsight to text (TSV/CSV). We will use Flexter, our ETL tool for XML and JSON, to convert the XML files. HDInsight is Microsoft Azure's managed Hadoop service, based on the Hortonworks Hadoop distribution. Create HDInsight Cluster: To create an HDInsight cluster we add a new resource ...


Converting FHIR JSON to CSV with Flexter

Uli Bethke March 16, 2018

In this post we convert FHIR JSON files to text (CSV). FHIR: Fast Healthcare Interoperability Resources (FHIR, pronounced “fire”) is a draft standard describing data formats and elements (known as “resources”) and an application programming interface (API) for exchanging electronic health records. The standard was created by the Health Level Seven International (HL7) ...


Flexter 1.2.0, our ETL tool for JSON/XML, has been released: 20x faster, with support for very large XML and JSON

Uli Bethke February 23, 2018

We released Flexter 1.2.0 this week, with some significant new features and improvements. Flexter now supports conversion of JSON to a database, text (CSV/TSV), and Hadoop/Spark (ORC, Parquet, Avro). We now support the conversion of very large, multi-GB XML files without any pre-processing. We have made performance improvements to the ...


Liberating data from spreadmarts and Excel (aka OOXML)

Uli Bethke February 23, 2018

In this blog post we liberate data and metadata from the shackles of Excel. We convert all of the content of an Excel file to a relational database and then query the output to determine data lineage, formulas used, formatting used, tables and pivot tables inside Excel, errors in formulas and their dependencies etc. Few ...
