Top 5 performance best practices with Snowflake


By Mike Sargo

Chief Data and Analytics Officer and Co-Founder of Data Ideology  

The Snowflake Data Platform is designed for scale, efficiency, and ease of use.

It supports a virtually unlimited number of virtual data warehouses (independent compute clusters) that share access to the same data. The platform requires almost no management or tuning: there are no indexes, and only a few options are available to tune the database platform. Instead of describing technical tuning options, this article summarizes the top five best practices to maximize query performance.

Even though the platform offers few tuning options, following best practices makes a real difference to query performance.

Here are the top 5 performance best practices for the Snowflake data platform: 

Optimize File Size

One of the best ways to maximize performance during data loading is to optimize the size of the files being loaded. Make sure to:

  • Split the data into multiple smaller files so that Snowflake can load them in parallel.
  • Use a separate virtual data warehouse for loading large files.

The number and capacity of the servers in the virtual data warehouse determine how many data files can be loaded in parallel.
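As a rough illustration of splitting data before loading, the sketch below groups rows into chunks that stay under a size limit, so each chunk can be written out as its own file. The function name `split_into_chunks` and the 20-byte limit are assumptions made for this example only; a real load would use a much larger target, and the sketch ignores header rows and compression.

```python
from typing import Iterable, Iterator, List


def split_into_chunks(lines: Iterable[str], max_bytes: int) -> Iterator[List[str]]:
    """Group lines into chunks whose UTF-8 size stays at or under max_bytes.

    A single line larger than max_bytes still gets a chunk of its own.
    """
    chunk: List[str] = []
    size = 0
    for line in lines:
        line_bytes = len(line.encode("utf-8"))
        # Start a new chunk when adding this line would exceed the limit.
        if chunk and size + line_bytes > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += line_bytes
    if chunk:
        yield chunk


# Tiny demonstration with an artificially small limit (illustrative only).
rows = ["1,alpha\n", "2,beta\n", "3,gamma\n", "4,delta\n"]
chunks = list(split_into_chunks(rows, max_bytes=20))
print(len(chunks))  # number of files that would be written
```

Each chunk can then be written to a separate file and staged for loading, which gives the warehouse several files to work on at once.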

Segment Data 

Snowflake caches data in the virtual data warehouse, but it's still essential to segment data. Consider these best practices for data query performance:

  • Group users with common queries in the same virtual data warehouse to optimize data retrieval and use.  
  • Use the Snowflake Query Profile to analyze queries and to identify and address performance concerns.

Because every virtual data warehouse draws from the same shared data, complex data science operations, business intelligence queries, and ELT data integration can each run on a warehouse suited to that workload.
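One way to picture this grouping is a simple mapping from user groups to warehouses. The sketch below is only an illustration: the role names, the warehouse names, and the helper `grant_statements` are hypothetical, and it merely builds `GRANT USAGE ON WAREHOUSE` statements as text rather than running them.

```python
from typing import Dict, List

# Hypothetical user groups (roles) and the warehouse each one should share.
WORKLOAD_WAREHOUSES: Dict[str, str] = {
    "BI_ANALYSTS": "BI_WH",
    "DATA_SCIENTISTS": "DS_WH",
    "ELT_JOBS": "ELT_WH",
}


def grant_statements(mapping: Dict[str, str]) -> List[str]:
    """Build one GRANT statement per role so each group uses its own warehouse."""
    return [
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role};"
        for role, warehouse in sorted(mapping.items())
    ]


for statement in grant_statements(WORKLOAD_WAREHOUSES):
    print(statement)
```

Keeping users who run similar queries on the same warehouse means they are more likely to benefit from data that warehouse has already cached.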

Scale-Up  

Snowflake lets you scale up a virtual data warehouse, that is, move it to a larger size, to better handle large workloads. When using scale-up to improve performance, make note of the following:

  • Snowflake supports fast and easy adjustments to the warehouse size to handle the workload.
  • It can also automatically suspend an idle warehouse and resume it when queries arrive, with complete transparency for the user.

Snowflake's scale-up functionality supports continually changing processing requirements.
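The two points above correspond to warehouse settings that can be changed with `ALTER WAREHOUSE`. The following sketch only builds the statements as text. The helper names, the warehouse name `REPORTING_WH`, the 300-second idle time, and the subset of sizes are assumptions chosen for illustration, not recommendations.

```python
# Warehouse sizes accepted by this sketch (a subset chosen for illustration).
VALID_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE", "XXLARGE")


def resize_statement(warehouse: str, size: str) -> str:
    """Build the statement that changes a warehouse to a new size."""
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unsupported warehouse size: {size}")
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}';"


def auto_suspend_statement(warehouse: str, idle_seconds: int) -> str:
    """Build the statement that suspends an idle warehouse and resumes it on demand."""
    if idle_seconds <= 0:
        raise ValueError("idle_seconds must be positive")
    return (
        f"ALTER WAREHOUSE {warehouse} "
        f"SET AUTO_SUSPEND = {idle_seconds} AUTO_RESUME = TRUE;"
    )


# Hypothetical warehouse name and idle time.
print(resize_statement("REPORTING_WH", "large"))
print(auto_suspend_statement("REPORTING_WH", 300))
```

Validating the size before building the statement catches typos early, before anything is sent to the platform.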

Scale-Out 

To handle concurrency, Snowflake can scale out by deploying additional clusters of the same size. Keep these points in mind for how scale-out can help performance optimization:

  • As users execute queries, the virtual data warehouse automatically adds clusters up to a configured limit.
  • It can scale out in a more controlled way instead of deploying one or more clusters of larger machines, as legacy data platforms do.

Snowflake automatically adjusts based on user queries, adding clusters during peak hours and removing them during off hours as needed.
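The cluster limit described above is set on the warehouse itself through the `MIN_CLUSTER_COUNT` and `MAX_CLUSTER_COUNT` parameters. The sketch below builds such a statement as text. The helper name `multi_cluster_statement`, the warehouse name, and the counts are illustrative assumptions, and multi-cluster warehouses may not be available in every Snowflake edition.

```python
def multi_cluster_statement(warehouse: str, min_clusters: int, max_clusters: int) -> str:
    """Build the statement that sets the cluster range for a multi-cluster warehouse."""
    if min_clusters < 1:
        raise ValueError("min_clusters must be at least 1")
    if max_clusters < min_clusters:
        raise ValueError("max_clusters must not be smaller than min_clusters")
    return (
        f"ALTER WAREHOUSE {warehouse} SET "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters} "
        f"SCALING_POLICY = 'STANDARD';"
    )


# Hypothetical warehouse that may grow from one to four same-size clusters.
print(multi_cluster_statement("BI_WH", 1, 4))
```

Setting the minimum to 1 lets the warehouse shrink back to a single cluster when demand drops, while the maximum caps how far it can grow.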

Database Design 

Without proper oversight and planning, database design and development can become a nightmare. Here are the best practices for database design:

  • Plan for any changes in advance, and meet with the team to map out your data model.  
  • Configure and test changes in a development environment first so that nothing is deployed untested.
  • Communicate with the team to make sure everyone is on the same page.  

With proper planning and communication, you can avoid most design issues in Snowflake.

Implement Best Practices to Improve Performance with Snowflake

Snowflake is designed to be efficient and effective, so it is easy to overlook simple best practices that improve performance. As shown above, optimizing file size, segmenting data, scaling up, scaling out, and planning database design can all improve performance with Snowflake.

At Data Ideology, we will work with you to determine the best practices and how to implement them as part of your data engineering process with Snowflake's data platform. We help you work smarter, not harder.

Written by Mike Sargo
Mike Sargo is Chief Data and Analytics Officer and Co-Founder of Data Ideology with over 18 years of experience leading, architecting, implementing, and delivering enterprise analytics, business intelligence, and enterprise data management solutions.
