In this Snowflake article, you will learn how to upload a CSV data file from the local filesystem (Windows, Linux, or macOS) to a Snowflake internal stage using the PUT SQL command, and then load the CSV file from the internal stage into a Snowflake database table using the COPY INTO SQL command.
Related: Unload Snowflake table to CSV file
Loading a CSV data file into a Snowflake database table is a two-step process:
- First, use the PUT command to upload the data file to a Snowflake internal stage.
- Second, use the COPY INTO command to load the file from the internal stage into the Snowflake table.
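As a sketch, this two-step flow is typically driven from a client such as the snowflake-connector-python package. The helper functions below are hypothetical and only assemble the SQL text of the two statements; they are not part of any Snowflake API.

```python
# Hypothetical helpers that assemble the two statements of the load flow.
# They are illustration only, not part of the Snowflake connector API.

def build_put(local_path: str, table: str) -> str:
    # Step 1: PUT uploads a local file to the table stage (@%<table>)
    return f"PUT file://{local_path} @%{table}"

def build_copy(table: str, staged_file: str) -> str:
    # Step 2: COPY INTO loads the staged (gzip-compressed) file into the table
    return f"COPY INTO {table} FROM '@%{table}/{staged_file}'"

put_sql = build_put("/apps/sparkbyexamples/emp.csv", "EMP")
copy_sql = build_copy("EMP", "emp.csv.gz")
print(put_sql)   # PUT file:///apps/sparkbyexamples/emp.csv @%EMP
print(copy_sql)  # COPY INTO EMP FROM '@%EMP/emp.csv.gz'
```

With snowflake-connector-python, each string would be passed to `cursor.execute(...)` on an open connection.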
PUT – Upload the file to Snowflake internal stage
Using the PUT command, you can upload the CSV file to a Snowflake internal stage. Snowflake supports three types of internal stage:
- Named stage
- Table stage
- User stage
Note that PUT works only with internal stages; files in an external stage (for example, Amazon S3) are referenced directly by COPY INTO without a PUT.
@~
– Is used to upload to the Snowflake user stage
@%<tablename>
– Is used to upload to the table stage of the specified table
@<name>
– Is used to upload to a named stage
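A tiny illustrative helper (hypothetical, not a Snowflake API) that produces the three internal stage references described above:

```python
def stage_ref(kind: str, name: str = "") -> str:
    # Build the stage reference string used by PUT, LIST, and COPY INTO
    if kind == "user":
        return "@~"            # current user's stage
    if kind == "table":
        return f"@%{name}"     # stage that belongs to the given table
    if kind == "named":
        return f"@{name}"      # explicitly created named stage
    raise ValueError(f"unknown stage kind: {kind}")

print(stage_ref("table", "EMP"))  # @%EMP
```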
The below example uploads the emp.csv file to the internal table stage of the EMP table.
PUT file:///apps/sparkbyexamples/emp.csv @%EMP;
emp.csv_c.gz(0.00MB): [##########] 100.00% Done (0.224s, 0.00MB/s).
+---------+------------+-------------+-------------+--------------------+--------------------+----------+---------+
| source | target | source_size | target_size | source_compression | target_compression | status | message |
|---------+------------+-------------+-------------+--------------------+--------------------+----------+---------|
| emp.csv | emp.csv.gz | 129 | 123 | NONE | GZIP | UPLOADED | |
+---------+------------+-------------+-------------+--------------------+--------------------+----------+---------+
1 Row(s) produced. Time Elapsed: 1.483s
By default, the PUT command compresses the file using GZIP. To upload the file as-is, disable compression with AUTO_COMPRESS=FALSE; if the file is already compressed, declare its format with the SOURCE_COMPRESSION option.
PUT file:///apps/sparkbyexamples/emp.csv @%EMP AUTO_COMPRESS=FALSE;
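The effect of PUT's default compression is equivalent to gzip-compressing the file client-side before upload; a quick local illustration using Python's standard library:

```python
import gzip

# A few rows like emp.csv; PUT's default behaviour amounts to
# gzip-compressing these bytes before uploading them to the stage.
csv_bytes = b"James,Smith,10000,10,M\nMichael,Jones,20000,10,M\n"
compressed = gzip.compress(csv_bytes)

# Gzip streams start with the 0x1f 0x8b magic bytes,
# and decompressing restores the original content.
assert compressed[:2] == b"\x1f\x8b"
assert gzip.decompress(compressed) == csv_bytes
```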
Let’s see an example with the user stage.
put file:///apps/sparkbyexamples/emp.csv @~;
Another example with the user stage along with a path.
put file:///apps/sparkbyexamples/emp.csv @~/tmp;
LIST – List the files in an internal stage
To verify that the file was uploaded successfully, use the LIST command with the stage as a parameter.
LIST @%EMP;
+------------+------+----------------------------------+------------------------------+
| name | size | md5 | last_modified |
|------------+------+----------------------------------+------------------------------|
| emp.csv.gz | 128 | 5f973d95b017ec3eb416389e43fc778c | Thu, 5 Mar 2020 03:40:42 GMT |
+------------+------+----------------------------------+------------------------------+
1 Row(s) produced. Time Elapsed: 0.522s
COPY INTO – Load the CSV file to Snowflake table
The COPY INTO SQL command loads the file from the internal stage into the table. Before looking into COPY INTO, let’s first create a Snowflake table.
CREATE TABLE EMP
(FNAME VARCHAR, LNAME VARCHAR,
SALARY VARCHAR, DEPARTMENT varchar,
GENDER varchar);
+---------------------------------+
| status |
|---------------------------------|
| Table EMP successfully created. |
+---------------------------------+
Now use the COPY INTO <tablename> command to load the compressed CSV file into the Snowflake database table.
COPY INTO EMP from '@%EMP/emp.csv.gz';
+------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
| file | status | rows_parsed | rows_loaded | error_limit | errors_seen | first_error | first_error_line | first_error_character | first_error_column_name |
|------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------|
| emp.csv.gz | LOADED | 5 | 5 | 1 | 0 | NULL | NULL | NULL | NULL |
+------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
1 Row(s) produced. Time Elapsed: 2.096s
Finally, let’s run a SELECT and confirm the data loaded successfully.
SELECT * FROM EMP;
+---------+----------+----------+------------+--------+
| FNAME | LNAME | SALARY | DEPARTMENT | GENDER |
|---------+----------+----------+------------+--------|
| James | Smith | 10000 | 10 | M |
| Michael | Jones | 20000 | 10 | M |
| Robert | Williams | 3000.4 | 10 | M |
| Maria | Jones | 23476.5 | 20 | F |
| Jen | Mary | 123445 | 20 | F |
+---------+----------+----------+------------+--------+
5 Row(s) produced. Time Elapsed: 0.646s
Options with COPY INTO
COMPRESSION – Load file with specified compression
Use COMPRESSION to specify the compression format of the file you want to load. By default, Snowflake detects the compression automatically (AUTO); you can set it explicitly to one of AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE.
COPY INTO EMP from '@%EMP/data_0_0_0.csv.bz2'
file_format = (type=CSV COMPRESSION = BZ2);
+--------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
| file | status | rows_parsed | rows_loaded | error_limit | errors_seen | first_error | first_error_line | first_error_character | first_error_column_name |
|--------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------|
| data_0_0_0.csv.bz2 | LOADED | 5 | 5 | 1 | 0 | NULL | NULL | NULL | NULL |
+--------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
1 Row(s) produced. Time Elapsed: 1.902s
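A BZ2-compressed file like the one above would have been produced outside Snowflake (and uploaded with AUTO_COMPRESS=FALSE so PUT does not gzip it again). A local sketch of creating such a file's content with Python's bz2 module:

```python
import bz2

rows = [
    "James,Smith,10000,10,M",
    "Maria,Jones,23476.5,20,F",
]
data = ("\n".join(rows) + "\n").encode("utf-8")

# Compress as the hypothetical data_0_0_0.csv.bz2 content would be
compressed = bz2.compress(data)

# BZ2 streams start with the 'BZh' magic bytes,
# and decompressing restores the original rows.
assert compressed[:3] == b"BZh"
assert bz2.decompress(compressed) == data
```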
RECORD_DELIMITER – Change the record delimiter
If your file uses a record separator other than '\n', specify it with RECORD_DELIMITER. For example, the below query uses '\r\n', the Windows line ending.
COPY INTO EMP from '@%EMP/emp.csv.gz'
file_format = (type=CSV RECORD_DELIMITER='\r\n');
1 Row(s) produced. Time Elapsed: 1.300s
FIELD_DELIMITER – Specify the field delimiter of the loading file
Use FIELD_DELIMITER to change the default field delimiter. By default it is the ',' character, but you can change it to any character; for example, let’s see how to use the pipe delimiter.
COPY INTO EMP from '@%EMP/emp.csv.gz'
file_format = (type=CSV FIELD_DELIMITER='|');
1 Row(s) produced. Time Elapsed: 1.300s
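The same pipe-delimited splitting can be illustrated locally with Python's csv module (a sketch of how such a row parses, not Snowflake's own parser):

```python
import csv
import io

# One record using '|' as the field delimiter, as in the COPY INTO above
buf = io.StringIO("James|Smith|10000|10|M\n")
rows = list(csv.reader(buf, delimiter="|"))
print(rows)  # [['James', 'Smith', '10000', '10', 'M']]
```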
SKIP_HEADER – Ignore column names on header
Use the SKIP_HEADER option to skip a number of rows at the start of the file; by default it is 0. If the first row of your CSV file contains column names, set it to 1 so the header row is not loaded as data. For example:
COPY INTO EMP from '@%EMP/emp.csv.gz'
file_format = (type=CSV SKIP_HEADER=1);
DATE_FORMAT – Specify input date format
By default, date columns are expected in 'YYYY-MM-DD' format. If your loading file has a different format, specify the input date format with the DATE_FORMAT option.
COPY INTO EMP from '@%EMP/emp.csv.gz'
file_format = (type=CSV DATE_FORMAT='MM-DD-YYYY');
1 Row(s) produced. Time Elapsed: 1.300s
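As an illustration of the mismatch that DATE_FORMAT resolves, Snowflake's 'MM-DD-YYYY' corresponds to Python's '%m-%d-%Y' strptime pattern:

```python
from datetime import datetime

# A value written as MM-DD-YYYY does not match the default
# YYYY-MM-DD expectation, so the input format must be declared.
raw = "03-05-2020"
parsed = datetime.strptime(raw, "%m-%d-%Y").date()
print(parsed.isoformat())  # 2020-03-05
```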
TIME_FORMAT – Specify input time format
By default, time columns are expected in 'HH24:MI:SS' format. If your loading file has a different format, use the TIME_FORMAT option to specify the input format.
COPY INTO EMP from '@%EMP/emp.csv.gz'
file_format = (type=CSV TIME_FORMAT='HH24:MI:SS')
1 Row(s) produced. Time Elapsed: 1.300s
TIMESTAMP_FORMAT – Specify input timestamp format
By default, timestamp columns are expected in 'YYYY-MM-DD HH24:MI:SS.FF3 TZHTZM' format. If your loading file has a different format, use the TIMESTAMP_FORMAT option to specify the input format.
COPY INTO EMP from '@%EMP/emp.csv.gz'
file_format = (type=CSV TIMESTAMP_FORMAT='MM-DD-YYYY HH24:MI:SS.FF3 TZHTZM')
1 Row(s) produced. Time Elapsed: 1.300s
Conclusion
To load a CSV file into a Snowflake table, you first upload the data file to a Snowflake internal stage using PUT and then load it from the internal stage into the table using COPY INTO. Along the way, you can control the compression and CSV file format options such as delimiters, headers, and date/time formats.
Related Articles
- How to Load JSON file into Snowflake table
- How to Load Parquet file into Snowflake table
- Snowflake – CREATE TABLE LIKE
- Download Snowflake table to local WINDOWS | Linux | MAC
- Load file from Amazon S3 into Snowflake table
- Snowflake Create, Clone, Drop Database
References
- https://docs.snowflake.net/manuals/sql-reference/sql/put.html
- https://docs.snowflake.net/manuals/sql-reference/sql/copy-into-table.html
Happy Learning !!