Convert MISMO XML to Hive and Parquet

Anvesh Gali XML

In this walkthrough, we convert MISMO (Mortgage Industry Standards Maintenance Organization) XML files to Parquet and query them in Hive. The XML files are converted to Parquet using the enterprise version of Flexter, which provides numerous additional features that aren’t available in the free version. In the following steps, ...

Let's talk Kafka Streams

Anvesh Gali Uncategorized

Look Ma, no Code! Building Streaming Data Pipelines with Apache Kafka. Robin Moffatt, Technology Evangelist, Confluent. Companies new and old are all recognising the importance of a low-latency, scalable, fault-tolerant data backbone in the form of the Apache Kafka streaming platform. With Kafka, developers can integrate multiple sources and systems, which enables low-latency analytics, event-driven architectures and the ...

Comparing Window Function Features by Database Vendors

Jiří Mauritz Data Warehouse, Redshift, SQL for Analysis, Window Functions

We round off the series on window functions with a comparison of what database vendors offer. There are many variants of window functions, and every vendor supports a different subset of features. Some also add extra window functions or features beyond standard ANSI SQL. One of the most powerful features is user-defined aggregate functions (UDAF), which some databases allow using ...

Convert IRS XML to PostgreSQL using Flexter Enterprise

Anvesh Gali XML

Loading IRS data into a PostgreSQL database using Flexter Enterprise. In this walkthrough, we load IRS XML files into a PostgreSQL database using the Flexter Enterprise edition. Flexter Enterprise gives users access to all features, including numerous additional ones that aren’t available in the free version of Flexter. Feature comparison (Free Online Trial vs. Enterprise Version): max daily data limit ...

Window Function ROWS and RANGE on Redshift and BigQuery

Jiří Mauritz Data Warehouse, Redshift, Window Functions

Frames in window functions allow us to operate on subsets of a partition by breaking it into even smaller sequences of rows. SQL provides syntax to express very flexible definitions of a frame. We described the syntax in the first post on window functions and demonstrated some basic use cases in the post on Data Exploration with Window Functions ...