When a Snowflake warehouse cannot fit an operation in memory, it starts spilling (storing) data, first to the local disk of a warehouse node and then to remote storage.
Spilling to local disk means extra I/O operations, so any query that spills will take longer than a similar query on similar data that can fit its operations in memory.
If the local disk is not large enough to hold the spilled data, Snowflake then writes to remote cloud storage, which is shown in the query profile as "Bytes spilled to remote storage".
Spilling can have a profound effect on query performance, especially when remote storage is used.
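To find the queries that spill, one option is to look at the spill counters Snowflake records per query. The sketch below is only an illustration: it assumes your role can read the SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY view, and the 7-day window, the 50-row limit, and the helper name classify_spill are arbitrary choices made for this example.

```python
# SQL to list recent queries that spilled, worst remote spillers first.
# Assumption: the session's role can read SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY.
# The 7-day window and LIMIT 50 are illustrative values, not recommendations.
SPILL_QUERY = """
SELECT query_id,
       warehouse_name,
       bytes_spilled_to_local_storage,
       bytes_spilled_to_remote_storage
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND (bytes_spilled_to_local_storage > 0
       OR bytes_spilled_to_remote_storage > 0)
ORDER BY bytes_spilled_to_remote_storage DESC,
         bytes_spilled_to_local_storage DESC
LIMIT 50
"""


def classify_spill(local_bytes, remote_bytes):
    """Label one query row by the most expensive kind of spilling it did.

    Hypothetical helper for this example. None is treated as zero because
    the counters can be NULL for some rows.
    """
    if (remote_bytes or 0) > 0:
        return "remote"  # slowest case: local disk was not enough
    if (local_bytes or 0) > 0:
        return "local"   # extra I/O, but stayed on the warehouse node
    return "none"
```

Run SPILL_QUERY with whatever client you already use, then pass the two spill columns of each row to classify_spill to separate the remote spillers (usually the first to fix) from the local ones.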
Spilling can't always be avoided, especially for large batches of data, but it can be reduced by:
Reviewing the query for optimization opportunities, especially if it is a new query.
Reducing the amount of data processed. For example, by trying to improve partition pruning, or projecting only the columns that are needed in the output.
Decreasing the number of parallel queries running in the warehouse.
Trying to split the processing into several steps (for example by replacing the CTEs with temporary tables). In other words, processing data in smaller batches.
Using a larger warehouse, which effectively means more memory and more local disk space.
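As a rough illustration of processing data in smaller batches, the sketch below generates one INSERT per date range into a temporary table instead of one large statement. Everything named here is hypothetical: the orders table, its order_date and amount columns, the tmp_daily_totals table, and the batch size are made up for the example and need to be adapted to your own schema.

```python
from datetime import date, timedelta


def date_batches(start, end, days):
    """Yield half-open (batch_start, batch_end) ranges covering [start, end)."""
    if days <= 0:
        raise ValueError("days must be positive")
    current = start
    while current < end:
        nxt = min(current + timedelta(days=days), end)
        yield current, nxt
        current = nxt


def batch_statements(start, end, days):
    """Build SQL that fills a temporary table one date range at a time.

    Hypothetical schema: orders(order_date, amount). Only the columns that
    are needed are selected, and each batch filters on order_date so that
    less data is processed per statement.
    """
    statements = [
        "CREATE TEMPORARY TABLE tmp_daily_totals (order_date DATE, total NUMBER)"
    ]
    for lo, hi in date_batches(start, end, days):
        # Dates come from date objects, so the literals are always ISO dates.
        statements.append(
            "INSERT INTO tmp_daily_totals "
            "SELECT order_date, SUM(amount) FROM orders "
            f"WHERE order_date >= '{lo.isoformat()}' "
            f"AND order_date < '{hi.isoformat()}' "
            "GROUP BY order_date"
        )
    return statements
```

For example, batch_statements(date(2024, 1, 1), date(2024, 1, 10), 4) produces one CREATE statement followed by three INSERT statements. Whether batching like this actually reduces spilling depends on the query and the data, so compare the spill figures in the query profile before and after.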
community.snow...