Mastering Data Storage in Snowflake: Key Concepts and Best Practices

Snowflake offers a unique approach to data storage that provides exceptional query performance, near-instant recovery options, and a flexible cost model. However, to get the most out of Snowflake—and to avoid common pitfalls—it’s important to understand how data is physically stored, how updates are handled, and how Time Travel and Failsafe affect storage costs.
In this post, we’ll break down the core concepts and share practical best practices for optimizing data storage in your Snowflake environment.
How Snowflake Stores Data
Snowflake stores data in a compressed columnar format inside immutable micro-partitions.
Each micro-partition typically holds 50–500 MB of uncompressed data, which compresses down to roughly 16 MB of physical storage.
Data is automatically organized in columns, allowing Snowflake to read only the columns needed for a given query.
This architecture delivers fast query performance and high compression rates, particularly for analytic workloads.
The diagram above illustrates how Snowflake stores data in columnar format within micro-partitions.
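Because storage is columnar, a query that references only a subset of columns reads just those columns from each micro-partition. As a quick sketch, assuming a hypothetical sales table:

-- Only the sale_date and amount columns are read from storage;
-- the table's other columns are never fetched
select sale_date, amount
from sales
where amount > 100;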
Why Micro-Partitions Matter
Micro-partitions are immutable—once written, they cannot be modified. This influences how Snowflake handles inserts and updates:
Inserts: New micro-partitions are appended to the table.
Updates/Deletes: Instead of modifying data in place, Snowflake creates new versions of affected micro-partitions.
The diagram above illustrates how Snowflake creates new micro-partition versions to handle UPDATE operations. This is critical to understand, as it underpins many Snowflake features, including Snowflake Streams, Dynamic Tables, and Time Travel.
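To see how this versioning surfaces in practice, consider Snowflake Streams, which expose the changes between micro-partition versions. A minimal sketch, assuming a hypothetical sales table with a region column:

-- Create a stream to track changes to the sales table
create or replace stream sales_stream on table sales;

-- An update creates new micro-partition versions behind the scenes
update sales set amount = amount * 1.1 where region = 'EMEA';

-- The stream exposes those row-level changes for downstream processing
select * from sales_stream;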
This architecture also enables powerful partition elimination during queries:
Snowflake stores the minimum and maximum values for each column within each micro-partition.
The query optimizer can skip irrelevant partitions, dramatically reducing query time.
The diagram above illustrates how this works. Assuming transactions are loaded at the end of each day, Snowflake stores the data in micro-partitions, and the data tends to be naturally clustered by date. For example, all the sales for January sit in the first two micro-partitions, while February and March occupy the third and fourth.
Let’s assume the user executes the following query:
select * from sales where sale_date = '14-Feb-2025'
Then Snowflake needs to scan only micro-partition 3, because the Cloud Services layer records the minimum and maximum dates for each micro-partition. Notice there is actually no data for 14-Feb-2025, yet the fact that the query was narrowed down to a single micro-partition still delivers excellent performance.
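Pruning effectiveness depends on how well the data is clustered on the filter column. One way to check is Snowflake's SYSTEM$CLUSTERING_INFORMATION function; a sketch against the hypothetical sales table:

-- Reports clustering depth and overlap statistics for sale_date
select system$clustering_information('sales', '(sale_date)');

You can also confirm pruning after the fact by comparing "Partitions scanned" against "Partitions total" in the query profile.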
Metadata and Query Optimization
Snowflake automatically captures metadata about every micro-partition:
Column min/max values
Row counts
NULL counts
Pointers to physical data
This metadata is managed by the Cloud Services layer, not the virtual warehouse, and plays a critical role in query planning and execution.
The diagram above illustrates how, when data is loaded, Snowflake automatically captures this metadata, including pointers to the physical storage. The diagram below illustrates what happens when a query is executed.
The Cloud Services metadata is distributed to the warehouse nodes (in this case, a MEDIUM-sized warehouse); each node fetches data in parallel and computes the results, which are then returned to the user.
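A useful side effect is that some aggregate queries can typically be answered from this metadata alone, without scanning the table data at all. For example, against the hypothetical sales table:

-- Usually resolved from micro-partition metadata, without a table scan
select count(*) from sales;
select min(sale_date), max(sale_date) from sales;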
Time Travel & Failsafe — Benefits and Costs
Snowflake’s data storage model enables automatic data versioning:
Time Travel: Retains previous versions of data for 1 to 90 days (configurable; retention beyond one day requires Enterprise Edition or higher). Enables fast recovery from mistakes.
Failsafe: After Time Travel expires, Snowflake retains data for an additional 7 days, accessible by Snowflake Support.
Updates and deletes increase storage usage, since old data versions remain in storage until retention expires. Long Time Travel periods can triple effective storage costs in heavily updated tables.
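In practice, retention is controlled per object with the DATA_RETENTION_TIME_IN_DAYS parameter, and historical data is queried using the AT or BEFORE clauses. A minimal sketch, again assuming the hypothetical sales table:

-- Set Time Travel retention to 30 days for this table
alter table sales set data_retention_time_in_days = 30;

-- Query the table as it looked one hour ago
select * from sales at (offset => -3600);

-- Recover a table dropped within the retention period
undrop table sales;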
Choosing the Right Table Type
Because default Snowflake tables retain prior micro-partition versions (even when only a single row is updated), plus an additional seven days of Failsafe data, storage costs can increase significantly. To reduce this, Snowflake provides three table types, each with different retention and cost characteristics:
| Table Type | Time Travel | Failsafe | Use Case |
| --- | --- | --- | --- |
| Default Table | 1–90 days | 7 days | Standard production tables |
| Transient Table | 1 day | None | Staging and intermediate results |
| Temporary Table | Session only | None | Temporary results |
Best practice: Use Transient Tables for easily rebuildable data (e.g. staging, ETL intermediate results) to avoid unnecessary Failsafe storage costs.
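For example, a staging table that can always be reloaded from the source system is a good candidate for a transient table, while session-scoped scratch data suits a temporary table. A sketch with assumed names:

-- No Failsafe and minimal Time Travel: ideal for rebuildable staging data
create transient table stg_sales like sales;

-- Exists only for the current session, then disappears
create temporary table tmp_sales_summary as
select sale_date, sum(amount) as total_sales
from sales
group by sale_date;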
Storage Costs — Small but Manageable
Snowflake’s storage costs are very affordable: roughly $23 per terabyte per month (capacity pricing; exact rates vary by region and cloud provider).
However, poor management of Time Travel retention and table types can still lead to unnecessary costs.
One real-world example: a customer reduced storage costs by $200,000/year simply by tuning data retention.
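To find where the storage is going, the SNOWFLAKE.ACCOUNT_USAGE.TABLE_STORAGE_METRICS view breaks each table’s footprint into active, Time Travel, and Failsafe bytes. A sketch for spotting the worst offenders:

-- Tables carrying the most Time Travel and Failsafe storage
select table_catalog, table_schema, table_name,
       active_bytes      / power(1024, 3) as active_gb,
       time_travel_bytes / power(1024, 3) as time_travel_gb,
       failsafe_bytes    / power(1024, 3) as failsafe_gb
from snowflake.account_usage.table_storage_metrics
where deleted = false
order by time_travel_bytes + failsafe_bytes desc
limit 20;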
Best Practices for Snowflake Data Storage
✅ Understand how micro-partitions and columnar storage impact performance
✅ Use partition elimination to accelerate queries
✅ Set Time Travel retention appropriately for each table or environment (see the sketch after this list)
✅ Use Transient and Temporary tables where appropriate to reduce storage costs
✅ Monitor storage growth and versioning on frequently updated tables
✅ Educate users on how Time Travel and Failsafe work—accidental long-term retention can be costly
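On the retention point above, the DATA_RETENTION_TIME_IN_DAYS parameter can also be set at the database or schema level, so every table beneath it inherits the value. A sketch, assuming a database named dev:

-- Development data rarely needs a long recovery window
alter database dev set data_retention_time_in_days = 1;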
Conclusion
Snowflake’s storage architecture provides incredible flexibility, performance, and resilience — but without careful management, it can also lead to unexpected costs. By understanding how Snowflake handles micro-partitions, updates, Time Travel, and Failsafe, you can ensure that your environment remains both fast and cost-effective. Applying these best practices will help you optimise storage usage, improve query performance, and avoid the common pitfalls that lead to storage bloat. With just a few proactive adjustments, you can unlock the full potential of Snowflake’s unique architecture.
Takeaways
Snowflake stores data in compressed columnar format within immutable micro-partitions, enabling fast analytic queries.
Partition elimination (pruning) helps Snowflake read only relevant data, delivering excellent query performance.
Updates and deletes create new micro-partition versions, increasing storage — understanding this is key for managing costs.
Time Travel enables recovery of previous data versions (1–90 days), but can significantly impact storage costs on frequently updated tables.
Failsafe provides an additional 7-day recovery buffer — but this can also add hidden costs.
Choose table types wisely:
Default tables for standard production
Transient tables for ETL and intermediate results
Temporary tables for session-based work
Setting appropriate Time Travel retention is critical — excessive retention on heavily updated tables can triple storage costs.
Regularly monitor storage usage and educate teams on how Snowflake’s storage model works to prevent accidental storage bloat.
Next Steps
This article is based on a training course developed by a former Snowflake Architect.
Written by John Ryan

After 30 years of experience building multi-terabyte data warehouse systems, I spent five years at Snowflake as a Senior Solution Architect, helping customers across Europe and the Middle East deliver lightning-fast insights from their data. In 2023, I joined Altimate.AI, which uses generative artificial intelligence to provide Snowflake performance and cost optimization insights and maximize customer return on investment. My certifications include Snowflake Data Superhero, Snowflake Subject Matter Expert, SnowPro Core, and SnowPro Advanced Architect.