Pandas – Convert JSON to CSV

  • Post author:
  • Post category:Pandas
  • Post last modified:January 9, 2024
  • Reading time:16 mins read

Convert JSON to CSV using Pandas, Pandas is a library in Python that can be used to convert JSON (String or file) to CSV file, all you need is first read the JSON into a pandas DataFrame and then write pandas DataFrame to CSV file.

The JSON stands for JavaScript Object Notation which is used to store and transfer the data between two applications. To use JSON in Python you have to use Python supports JSON through a built-in package called JSON. To use this feature, we import the JSON package in Python script. The text in JSON is done through quoted-string which contains the value in key-value mapping within { }. It is similar to the dictionary in Python.

CSV stands for Comma Separated Values that are used to store tabular data in a text format. Each file line is a row consisting of several fields and each field is separated by a delimiter. This is a very common format in the industry to exchange data between two organizations or different groups in the same organization.

Following are the detailed steps involved in converting JSON to CSV in pandas. In this article, I will cover these steps with several examples.

  1. Create a JSON file
  2. Install pandas
  3. Load the JSON into pandas DataFrame
  4. Apply any transformations you want
  5. Convert JSON to CSV file

1. Quick Examples of Convert JSON to CSV

If you are in a hurry, below are some quick examples of how to convert JSON string or file to CSV file.


# Below are the quick examples

# Example 1: Convert JSON File to CSV File
# pandas read JSON File
df = pd.read_json('courses_data.json')
df.to_csv('courses.csv')


# Example 2: Convert JSON String to CSV File
# Read json from String
json_str = '{"Courses":{"r1":"Spark"},"Fee":{"r1":"25000"},"Duration":{"r1":"50 Days"}}'
df = pd.read_json(json_str)
df.to_csv('courses.csv')

Now let’s follow the steps specified above to convert JSON to CSV file using the Python pandas library.

2. Create a JSON file

First, let’s create a JSON file that you want to convert to a CSV file. Pandas by default support JSON in single lines or in multiple lines. The following file contains JSON in a Dict-like format.

pandas convert json to csv
JSON File

Find this JSON file on GitHub. In case you want to use the JSON string, let’s use the below.


# Read json from String
json_str = '{"Courses":{"r1":"Spark"},"Fee":{"r1":"25000"},"Duration":{"r1":"50 Days"}}'

3. Install Pandas

If you don’t have pandas on your system, install Python Pandas using the pip command. skip this step. If you already have pandas installed.


# Install pandas using pip
pip install pandas
(or)
pip3 install pandas

4. Load the JSON file

You can represent data in a JSON in multiple ways, I have written a complete article on how to read JSON files into DataFrame with several JSON types.

In the below example, I am loading JSON from a file courses_data.json file.


# Pandas read JSON File
df = pd.read_json('courses_data.json')
print(df)

# Output:
#  Courses    Fee Duration
# 0   Spark  25000  50 Days
# 1  Pandas  20000  35 Days
# 2    Java  15000 

To read a JSON from a string variable.


# Read json from String
json_str = '{"Courses":{"r1":"Spark"},"Fee":{"r1":"25000"},"Duration":{"r1":"50 Days"}}'
df = pd.read_json(json_str)
print(df)

# Output:
#   Courses    Fee Duration
# r1   Spark  25000  50 Days

5. Apply Transformations

This is an optional step. In real-time applications, we are often required to transform the data and write the DataFrame result to a CSV file.

The transformation can be changing the data on the DataFrame that was created from JSON for example, replacing NaN with string, replacing empty with NaN, converting one value to another e.t.c

6. Convert JSON to CSV

Now write the Pandas DataFrame to CSV file, with this, we have converted the JSON to a CSV file.


# Write DataFrame to CSV file
df.to_csv()

7. Complete Example


import pandas as pd
# Convert JSON File to CSV File
# pandas read JSON File
df = pd.read_json('courses_data.json')
df.to_csv('courses.csv')

# Convert JSON String to CSV File
# Read json from String
json_str = '{"Courses":{"r1":"Spark"},"Fee":{"r1":"25000"},"Duration":{"r1":"50 Days"}}'
df = pd.read_json(json_str)
df.to_csv('courses.csv')

Frequently Asked Questions on Pandas Convert JSON to CSV

How can I convert a JSON file to a CSV file using Pandas?

You can use the pandas.read_json() function to read the JSON file into a DataFrame and then use the to_csv() method to write the DataFrame to a CSV file

How can I load JSON data into a Pandas DataFrame?

To load JSON data into a Pandas DataFrame, you can use the pd.read_json() function. If your JSON data is in the form of a JSON string, you can use pd.read_json() with the orient parameter.

What is the purpose of the index = False parameter in to_csv()?

The index parameter in the to_csv() method of a Pandas DataFrame is used to control whether the index column should be included in the output CSV file. When index is set to False, it means that the index column will not be written to the CSV file.

What if my JSON data contains missing values (NaN)?

If your JSON data contains missing values represented as NaN (Not a Number), Pandas will handle them appropriately during the conversion to a DataFrame. When you use the pd.read_json() function to read JSON data, missing values will be represented as NaN in the resulting DataFrame.

Can I select specific columns to include in the CSV file?

You can select specific columns to include in the CSV file when using the to_csv() method in Pandas. If you only want to export a subset of columns from your DataFrame, you can do so by providing a list of column names to the columns parameter.

What if my JSON data is nested?

If your JSON data is nested, meaning it contains hierarchical or nested structures, you may need to flatten it before converting it to a Pandas DataFrame. The pd.json_normalize() function in Pandas can be used to flatten nested JSON structures.

Conclusion

In this article, you have learned steps on how to convert JSON to CSV in pandas using the pandas library. steps include installing pandas, loading JSON files, applying transformations (optional), and finally converting to CSV files.

References

Naveen (NNK)

Naveen (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, He has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen journey in the field of data engineering has been a continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with the data as he come across. Follow Naveen @ LinkedIn and Medium

Leave a Reply