You might be surprised to hear, but Hadoop is a poor choice for a data lake
As there are many definitions of what constitutes a data lake, let’s first define what we actually mean by it. A data lake is similar to the staging area of a data warehouse, with a couple of core differences. Let’s look at the commonalities first. Just like the staging area of a data ...
Flexter 1.2.0, our ETL tool for JSON/XML, has been released. 20x faster. Supports very large XML and JSON
We released Flexter 1.2.0 this week. We have added some significant new features and improvements. Flexter now supports conversion of JSON to a database, text (CSV/TSV), and Hadoop/Spark (ORC, Parquet, Avro). We now support the conversion of very large, multi-GB XML files without any pre-processing. We have made performance improvements to the ...
Are Data Lakes Fake News?
The problem with the data lake. Are data lakes fake news? The quick answer is yes, and in this post I will show you why. Before we get started, make sure to download our checklist for a successful data lake implementation. The biggest problem I have with data lakes is that the term has been ...
[Video] Convert ACORD Insurance XML to Talend Data Preparation
In this video we will show you how easy it is to convert complex XML files that are based on ACORD, an industry data standard in the insurance sector. We first use Flexter’s powerful XML Converter to parse the files into text (CSV, TSV). We then load the data into Talend Data Preparation for ...