Download Snowflake Table to Local File System (Windows | Linux | Mac)

Since Snowflake is a cloud data warehouse, you can use the data unloading SQL statement COPY INTO to unload/download/export data from a Snowflake table to a flat file on the local file system, Amazon S3, Microsoft Azure, Google GCP, and more. In this article, I will explain how to export to a local file system.

It doesn’t matter whether you are downloading the flat file to Windows, Linux, or macOS; the process is the same except for the download path you use. Just make sure you install the SnowSQL CLI appropriate for your OS.

At the time of writing this article, Snowflake supports unloading/exporting data into the following formats.

  • CSV, TSV (Any delimiter file)
  • JSON
  • Parquet

Prerequisite

Install Snowflake CLI to run SnowSQL commands.

The URL below takes you to the Snowflake download index page; navigate to the OS you are using, then download and install the binary.


https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/index.html

Downloading a Snowflake table to the local file system is a two-step process

  • First, use the “COPY INTO” statement to copy the table to a Snowflake internal stage, external stage, or external location.
  • Then, use the “GET” statement to download the file from the staged folder to the local file system.
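
Putting the two steps together, a minimal sketch looks like this (the table name EMP, stage path result/, and local path c:/tmp/ are the examples used later in this article; adjust them for your environment):


-- Step 1: unload the table to its internal table stage
COPY INTO '@%EMP/result/' from EMP;
-- Step 2: download the staged file(s) to the local machine
GET @%EMP/result/ file://c:/tmp/;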

Connect to Snowflake database and schema

From the command prompt, go to the location of the SnowSQL CLI install and enter snowsql; this displays the usage syntax. Below is a simple syntax to connect to a Snowflake database. When you run this statement, it prompts for a password; after you enter it correctly, it provides a shell to interact with Snowflake.

Syntax:


user>snowsql -a accountname -u user -d databasename -s schemaname

The SnowSQL statement below connects to account “oea82.us-east-1” as user “nkmr”, using database “employee” and schema “public”.


user>snowsql -a oea82.us-east-1 -u nkmr -d employee -s public

Password:
* SnowSQL * v1.2.5
Type SQL statements or !help
nkmr#COMPUTE_WH@EMPLOYEE.PUBLIC>

Unload to Snowflake internal stage

Using COPY INTO, you can unload a Snowflake table to an internal stage or an internal table stage. In the example below, we export the table EMP and store it at the internal table stage @%EMP/result/. By default, it creates a file named data_0_0_0.csv.gz.

By default, COPY INTO unloads the data to a CSV file without a header row and compresses the file with gzip.


COPY INTO '@%EMP/result/' from EMP;
+---------------+-------------+--------------+
| rows_unloaded | input_bytes | output_bytes |
|---------------+-------------+--------------|
|             5 |         122 |          115 |
+---------------+-------------+--------------+
1 Row(s) produced. Time Elapsed: 1.300s

As you can see above, I have 5 records in the EMP table and all records have been unloaded. Now use GET to download the file.
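
Before downloading, you can optionally verify what was staged with the LIST command (shown here against the same table stage used above); it returns the name and size of each staged file:


LIST @%EMP/result/;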

Download from the internal stage

Use the GET SnowSQL command to download the unloaded file from the internal stage to the local file system. By default, all unloaded data files are compressed using gzip, unless compression is explicitly disabled or one of the other supported compression methods is explicitly specified.


GET @%EMP/result/ file://c:/tmp/;

data_0_0_0.csv.gz(0.00MB): [##########] 100.00% Done (0.204s, 0.00MB/s).
+------------------------+------+------------+---------+
| file                   | size | status     | message |
|------------------------+------+------------+---------|
| data_0_0_0.csv.gz      |  115 | DOWNLOADED |         |
+------------------------+------+------------+---------+
1 Row(s) produced. Time Elapsed: 2.013s

This downloads the file data_0_0_0.csv.gz to the c:/tmp directory.

Include column names on header

Use the optional HEADER parameter to specify whether to include the table column headings in the output files. By default, it is set to FALSE; set it to TRUE if you want the column names as a header row in the output file. For example:


COPY INTO '@%EMP/result/'
  from EMP HEADER = TRUE 
  
1 Row(s) produced. Time Elapsed: 1.300s

Change compression

Use COMPRESSION to change the default compression. By default, it unloads the file in GZIP format; however, you can change it to any of the following compressions: AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE.


COPY INTO '@%EMP/result/'
  from EMP  
  file_format = (type=CSV COMPRESSION = BZ2) 
  
1 Row(s) produced. Time Elapsed: 1.300s

Use NONE if you want to unload the file as a plain raw CSV without the default compression.
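
For example, the following unloads the same EMP table as a plain, uncompressed CSV (a minimal sketch; the stage path matches the earlier examples):


COPY INTO '@%EMP/result/'
  from EMP
  file_format = (type=CSV COMPRESSION = NONE)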

Change delimiter

Use FIELD_DELIMITER to change the default field delimiter. By default, it uses the ‘,’ character, but you can change it to any character; for example, let’s see how to use a pipe delimiter. (RECORD_DELIMITER, by contrast, controls the separator between rows, which defaults to a newline.)


COPY INTO '@%EMP/result/'
  from EMP  
  file_format = (type=CSV COMPRESSION = NONE FIELD_DELIMITER='|') 
  
1 Row(s) produced. Time Elapsed: 1.300s

You can find more options to use with CSV in the Snowflake file format documentation.
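
As an illustration, several CSV format options can be combined in a single file_format clause. The values below are hypothetical choices for demonstration, not recommendations; the options shown (COMPRESSION, FIELD_DELIMITER, FIELD_OPTIONALLY_ENCLOSED_BY, NULL_IF) are standard Snowflake CSV file format options:


COPY INTO '@%EMP/result/'
  from EMP
  file_format = (type=CSV
                 COMPRESSION = NONE
                 FIELD_DELIMITER = '|'
                 FIELD_OPTIONALLY_ENCLOSED_BY = '"'
                 NULL_IF = ('NULL'))
  HEADER = TRUE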

Download Snowflake table to Parquet file

Here, we will see how to unload or download the Snowflake table to the Parquet file.

Use file_format = (type = parquet) to export in the Parquet file format.


COPY INTO '@%EMP/result/' from EMP file_format = (type = parquet);

+---------------+-------------+--------------+
| rows_unloaded | input_bytes | output_bytes |
|---------------+-------------+--------------|
|             5 |        1445 |         1445 |
+---------------+-------------+--------------+
1 Row(s) produced. Time Elapsed: 0.841s
nkmr#COMPUTE_WH@EMPLOYEE.PUBLIC>GET '@%EMP/result/' file://c:/tmp/;

data_0_0_0.snappy.parquet(0.00MB): [##########] 100.00% Done (0.213s, 0.01MB/s)
.
+----------------------------+------+------------+---------+
| file                       | size | status     | message |
|----------------------------+------+------------+---------|
| data_0_0_0.snappy.parquet  | 1445 | DOWNLOADED |         |
+----------------------------+------+------------+---------+
1 Row(s) produced. Time Elapsed: 1.654s
nkmr#COMPUTE_WH@EMPLOYEE.PUBLIC>

Change compression

Use COMPRESSION to change the default compression. By default, it unloads the Parquet file using Snappy compression, but you can change it to any of the following: AUTO | LZO | SNAPPY | NONE.


COPY INTO '@%EMP/result/'
  from EMP 
  file_format = (type = parquet COMPRESSION=LZO) 

1 Row(s) produced. Time Elapsed: 1.300s

Use NONE if you want to unload a raw Parquet file without the default compression.
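
For example, to unload the EMP table as an uncompressed Parquet file (same stage path as the earlier examples):


COPY INTO '@%EMP/result/'
  from EMP
  file_format = (type = parquet COMPRESSION = NONE)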

You can find more options to use with Parquet in the Snowflake file format documentation.

Conclusion

In this article, you learned how to download a Snowflake table to the local file system in two steps: unload the table to an internal stage with COPY INTO, then download the staged file with the GET command. You also saw how to control the output with options such as HEADER, COMPRESSION, and FIELD_DELIMITER, and how to unload in both CSV and Parquet formats.


Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive, and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen @ LinkedIn and Medium