Optimal Usage of Snowflake Virtual Warehouses: Breaking Down Parallelism & Concurrency

September 29, 2023

If you aim to harness the full potential of Snowflake’s virtual warehouses, it’s crucial to grasp how Snowflake datalake architecture handles concurrency and parallelism. Here’s a simplified rundown:

🔍 Basics First:

Virtual warehouses (VWH) in Snowflake = Compute.
They range in size: XS, S, M … up to 6XL.
An individual VWH houses 8 virtual cores (vCore). An XL virtual warehouse boasts 128 vCores.
XS size has just 1 server (VM). Each successive size doubles the number of VMs (e.g., S has 2, M has 4).
Multiple VMs make up a cluster. Several clusters can run simultaneously (multi cluster).

🔧 Diving into Concurrency:

Default maximum concurrency level (degree of parallelism) for a warehouse = 8, matching the number of vCores. But the parameter max_concurrency_level can be adjusted.
This level remains consistent irrespective of the VM count in a cluster. So, XS and 6XL both can run 8 concurrent queries.
Additional queries? Unless they are simple with minimal resource requirements they’re queued.
On an XS warehouse, 1 query is given 1 vCore. But, on a 6XL one, a query has access to up to a whopping 4096 vCores. So, bigger clusters = more resources per query = generally faster execution. Fewer queries get queued, and those that do, get dequeued quicker.
A larger virtual warehouse feels speedier not because it raises the concurrency level, but because it reduces the chances of queries being queued. It also increases the speed at which queries come off the queue.

✨ Fine-tuning Concurrency?

You can modify the max level of concurrency. By doing so, you’ll allocate more resources to each query but at the cost of reduced concurrency, and vice versa.

My advice? Tweak max_concurrency_level with caution. For compute intensive queries, utilize Snowflake’s SQL query acceleration. To bolster concurrency, operate multiple concurrent clusters or virtual warehouses.

Final note

Smaller queries execute on a subset of available resources, and so will only count as a fraction towards the max concurrency level.