Sampling in Snowflake. Approximate Query Processing for fast Data Visualisation

Martin Goebbels Snowflake

Introduction: Decision making can be defined as the process of achieving a desirable result by gathering and comparing all available information. The ideal situation would be to have all the necessary data (in both quantity and quality), all the necessary time and all the necessary resources (processing power, including brain power) to make the best decision. In reality, however, we usually don't ...

Connecting a Zeppelin Notebook to Snowflake in under 5 minutes

Dorian Beganovic Snowflake

About Notebooks: In this blog post we will show you how easy it is to connect Zeppelin notebooks to the Snowflake Cloud Data Warehouse. We will also execute some queries and visualize the results using Zeppelin’s built-in tools. Notebooks are useful for exploratory data analysis, data discovery and data storytelling. Because they are web-based, they can be shared between different ...

Exploding polygons in Snowflake. KaBooom! Visualising Dublin property data in Tableau.

Dorian Beganovic Snowflake, Tableau

Introduction: In this blog post we will show the power of Snowflake UDFs to prepare data for visualisation in Tableau. We take the Irish Census data from the Central Statistics Office of Ireland and use Snowflake to prepare the dataset. As we have already loaded and pre-processed the Irish Census dataset for this blog post we are ...

Loading data into Snowflake and performance of large joins

Dorian Beganovic Snowflake

Introduction: In this blog post we will load a large dataset into Snowflake and then evaluate the performance of joins in Snowflake. Loading large data into Snowflake. Dataset: The dataset we will load is hosted on Kaggle and contains checkouts from the Seattle library from 2006 to 2017. You can also download the data and see some samples here. The dataset ...

Deep dive on caching in Snowflake

Dorian Beganovic Snowflake

In this post we will explain the caching strategies Snowflake uses for performance optimization. In the process we will also cover related Snowflake internals. Much of the information comes from the research paper written by the Snowflake authors, which explains the architecture of Snowflake in depth. Caching in virtual warehouses: Snowflake strictly separates the storage layer from ...