Deep dive on caching in Snowflake

Dorian Beganovic March 5, 2018

In this post we will explain the caching strategies Snowflake uses for performance optimization. In the process we will also cover related Snowflake internals. Much of the information comes from the official research paper by the Snowflake authors, which explains the architecture of Snowflake in depth. Caching in virtual warehouses: Snowflake strictly ...


The top 10+1 things we love about Snowflake

Dorian Beganovic February 14, 2018

Introduction: I have been familiarising myself with Snowflake over the last couple of months, and these are my impressions of the top 10+1 features that really make Snowflake stand out compared to other cloud-based data warehouse solutions. 1. Results of queries are stored and can be viewed in query history: The fact that Snowflake ...


SpaceX Performance for Snowflake with Clustering Keys

Dorian Beganovic February 8, 2018

Introduction: Snowflake stores tables by dividing their rows across multiple micro-partitions (horizontal partitioning). Each micro-partition automatically gathers metadata about all rows stored in it, such as the range of values (min/max etc.) for each of the columns. This is a standard feature of column store technologies. For example, the Apache ORC format (Optimized Row Columnar) keeps ...


Create your own custom aggregate (UDAF) and window functions in Snowflake

Dorian Beganovic February 4, 2018

In this post we will show you how to create your own aggregate functions in the Snowflake cloud data warehouse. This type of feature is known as a user-defined aggregate function (UDAF). Most big data frameworks, such as Spark, Hive, and Impala, let you create your own UDAFs. Traditional databases such as Oracle or SQL ...
