
In this article, you will learn how to load the JSON file from the local file system into the Snowflake table and from Amazon S3 into the Snowflake table.

Related: Unload Snowflake table into JSON file

Loading JSON file into Snowflake table

Loading a JSON data file to the Snowflake Database table is a two-step process.

  • First, use the PUT command to upload the data file to a Snowflake internal stage.
  • Second, use the COPY INTO command to load the file from the internal stage into the Snowflake table.

First, let’s create a table with a single VARIANT column, since Snowflake loads the JSON file contents into one column.


CREATE TABLE ZIPCODE_JSON (ZIPCODE_RAW VARIANT);
+------------------------------------------+
| status                                   |
|------------------------------------------|
| Table ZIPCODE_JSON successfully created. |
+------------------------------------------+
1 Row(s) produced. Time Elapsed: 9.657s

PUT – Upload the file to Snowflake internal stage

Using the PUT command, you can upload the JSON file to a Snowflake internal stage: a user stage, a table stage, or a named stage.

@~ – Uploads to the Snowflake user stage

@%<tablename> – Uploads to the Snowflake table stage

@<stagename> – Uploads to a named internal stage

The example below uploads the simple_zipcodes.json file to the table stage of the ZIPCODE_JSON table. You can find the JSON file on GitHub; download it to a folder on your system. I’ve downloaded it to the /tmp folder.


PUT file:///tmp/simple_zipcodes.json @%ZIPCODE_JSON;
simple_zipcodes.json_c.gz(0.00MB): [##########] 100.00% Done (0.212s, 0.00MB/s).
+----------------------+-------------------------+-------------+-------------+--------------------+--------------------+----------+---------+
| source               | target                  | source_size | target_size | source_compression | target_compression | status   | message |
|----------------------+-------------------------+-------------+-------------+--------------------+--------------------+----------+---------|
| simple_zipcodes.json | simple_zipcodes.json.gz |         768 |         251 | NONE               | GZIP               | UPLOADED |         |
+----------------------+-------------------------+-------------+-------------+--------------------+--------------------+----------+---------+
1 Row(s) produced. Time Elapsed: 1.531s
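Notice in the output above that PUT gzip-compresses the file by default (target_compression is GZIP). A sketch of overriding that behavior, using the standard PUT options AUTO_COMPRESS and OVERWRITE:


-- Upload the file as-is (no gzip) and replace any existing staged copy
PUT file:///tmp/simple_zipcodes.json @%ZIPCODE_JSON
    AUTO_COMPRESS = FALSE
    OVERWRITE = TRUE;

With AUTO_COMPRESS = FALSE, the staged file keeps its original name and size instead of getting a .gz suffix.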

LIST – To list the files from internal stages

To verify that the file was uploaded successfully, run the LIST command with the table stage as a parameter.


LIST @%ZIPCODE_JSON;
+-------------------------+------+----------------------------------+------------------------------+
| name                    | size | md5                              | last_modified                |
|-------------------------+------+----------------------------------+------------------------------|
| simple_zipcodes.json.gz |  256 | 0a16f25cef1f079f1f0b3b62a34868db | Sun, 8 Mar 2020 06:31:44 GMT |
+-------------------------+------+----------------------------------+------------------------------+
1 Row(s) produced. Time Elapsed: 0.505s
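If you need to re-upload the file, for example after fixing its contents, you can delete the staged copy first. A minimal sketch using Snowflake’s REMOVE command:


-- Delete the staged file; LIST @%ZIPCODE_JSON should then return no rows
REMOVE @%ZIPCODE_JSON/simple_zipcodes.json.gz;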

Let’s see an example with the user stage.


PUT file:///tmp/simple_zipcodes.json @~;

Another example with the user stage, this time including a path.


PUT file:///tmp/simple_zipcodes.json @~/tmp;
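To upload to a named internal stage instead, create the stage first and then PUT to it. A sketch, where the stage name my_json_stage is an assumption for illustration:


-- Create a named internal stage and upload the file to it
CREATE STAGE my_json_stage;
PUT file:///tmp/simple_zipcodes.json @my_json_stage;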

COPY INTO – Load the JSON file into Snowflake table

Once the JSON file is in the internal stage, use the COPY INTO <tablename> command to load it into the Snowflake database table.


COPY INTO ZIPCODE_JSON from '@%ZIPCODE_JSON/simple_zipcodes.json' file_format = (type=JSON);
+-------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
| file                    | status | rows_parsed | rows_loaded | error_limit | errors_seen | first_error | first_error_line | first_error_character | first_error_column_name |
|-------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------|
| simple_zipcodes.json.gz | LOADED |          10 |          10 |           1 |           0 | NULL        |             NULL |                  NULL | NULL                    |
+-------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
1 Row(s) produced. Time Elapsed: 1.259s
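The sample file holds one JSON object per record. If your file instead wraps all records in a single outer array, Snowflake’s STRIP_OUTER_ARRAY file format option loads each array element as its own row; a sketch:


-- Load a file shaped like [ {...}, {...} ] as one row per array element
COPY INTO ZIPCODE_JSON
  FROM '@%ZIPCODE_JSON/simple_zipcodes.json'
  FILE_FORMAT = (TYPE = JSON STRIP_OUTER_ARRAY = TRUE);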

Now, let’s run a SELECT and see if the data loaded successfully.


SELECT * FROM ZIPCODE_JSON;
+----------------------------------+
| ZIPCODE_RAW                      |
|----------------------------------|
| {                                |
|   "City": "PARC PARQUE",         |
|   "State": "PR",                 |
|   "ZipCodeType": "STANDARD",     |
|   "Zipcode": 704                 |
| }                                |
| {                                |
|   "City": "PASEO COSTA DEL SUR", |
|   "State": "PR",                 |
|   "ZipCodeType": "STANDARD",     |
|   "Zipcode": 704                 |
| }                                |
| {                                |
|   "City": "BDA SAN LUIS",        |
|   "State": "PR",                 |
|   "ZipCodeType": "STANDARD",     |
|   "Zipcode": 709                 |
| }                                |
| {                                |
|   "City": "CINGULAR WIRELESS",   |
|   "State": "TX",                 |
|   "ZipCodeType": "UNIQUE",       |
| ...                              |
+----------------------------------+

Note that the above example loads the JSON file into a single column: each record in the JSON file becomes one row, with the entire record stored in the single VARIANT column.
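Because the whole record lands in the VARIANT column, you can pull out individual attributes with Snowflake’s colon path notation and :: casts. A sketch using the field names from the sample output above:


-- Flatten the VARIANT column into typed columns
SELECT ZIPCODE_RAW:City::STRING    AS city,
       ZIPCODE_RAW:State::STRING   AS state,
       ZIPCODE_RAW:Zipcode::INT    AS zipcode
FROM   ZIPCODE_JSON;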

Conclusion

To load a JSON file into a Snowflake table, you upload the data file to a Snowflake internal stage with the PUT command and then load it from the stage into the table with COPY INTO. You have also learned how to verify staged files with LIST and how each JSON record lands in a single VARIANT column.

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive, and Machine Learning. Naveen’s journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen @ LinkedIn and Medium
