You might be surprised to hear, but Hadoop is a poor choice for a data lake

Uli Bethke May 3, 2018

As there are a lot of definitions of what constitutes a data lake, let’s first define what we actually mean by it. A data lake is similar to the staging area of a data warehouse, with a couple of core differences. Let’s look at the commonalities first. Just like the staging area of a data ...

Read More

Flexter 1.2.0, our ETL tool for JSON/XML, has been released. 20x faster. Supports very large XML and JSON

Uli Bethke February 23, 2018

We released Flexter 1.2.0 this week. We have added some significant new features and improvements. Flexter now supports conversion of JSON to a database, text (CSV/TSV), and Hadoop/Spark (ORC, Parquet, Avro). We now support the conversion of very large, multi-GB XML files without any pre-processing. We have made performance improvements to the ...

Read More

Are Data Lakes Fake News?

Uli Bethke August 8, 2017

The problem with the data lake: Are data lakes fake news? The quick answer is yes, and in this post I will show you why. Before we get started, make sure to download our checklist for a successful data lake implementation. The biggest problem I have with data lakes is that the term has been ...

Read More

[Video] Convert ACORD Insurance XML to Talend Data Preparation

Uli Bethke July 11, 2017

In this video we will show you how easy it is to convert complex XML files that are based on ACORD, an industry data standard in the insurance sector. We first use Flexter’s powerful XML Converter to parse the files into text (CSV, TSV). We then load the data into Talend Data Preparation for ...

Read More