Learn Window Functions on Snowflake. Become a cloud data warehouse superhero.

Dorian Beganovic Snowflake, Window Functions

In a recent post we compared Window Function Features by Database Vendors. In this post we will give you an overview of the support for various window function features on Snowflake. Window functions are essential for data warehousing. They are the base of data warehousing workloads for many reasons. First of all, they are very similar to the GROUP ...

Converting XML to Hive

Chinmay Sinha XML

In this example we will use the Flexter XML converter to generate a Hive schema and parse an XML file into a Hive database. We will then use the spark-sql interface to query the generated tables. For the example we will use the TVAnytime XML standard. You can download sample XML files and an XSD for this standard ...

Convert Oracle XMLTYPE to Oracle tables

Chinmay Sinha Oracle, XML

In this example we will read XML data from a table with an XMLTYPE column in the Oracle database and convert the XML to tables in the same Oracle database. We will use Flexter to read the data from, and write it back to, the same database. We need to give Flexter the names of the table and column that contain the XML ...

Convert XML with Spark to Parquet

Chinmay Sinha Spark, XML

It can be very easy to use Spark to convert XML to Parquet and then query and analyse the output data. As I have outlined in a previous post, XML processing can be painful, especially when you need to convert large volumes of complex XML files. Apache Spark has various features that make it a perfect fit for processing XML ...

Why is concurrency overrated to measure performance of data warehouse platforms?

Uli Bethke Redshift

The difference between making a good and a bad decision often comes down to the quality of the pre-defined metrics. If the metric is poor, so will be the decision. When comparing performance between different technologies such as Google BigQuery (based on a distributed file system, Colossus to be precise) and MPP technologies such as Redshift, people tend ...