Recent Articles

Converting ACORD XML to Avro row storage

In this example we will use Flexter to convert an XML file to the Apache Avro format. We then query and analyse the output in the Spark shell. Flexter can generate…

Kristijan Berta April 26, 2018

Using Apache Airflow to build reusable ETL on AWS Redshift

Building a data pipeline on Apache Airflow to populate AWS Redshift. In this post we will introduce you to the most popular workflow management tool - Apache Airflow. Using Python as…

Dorian Beganovic January 1, 2018