Pandas Convert JSON to DataFrame

  • Post category:Pandas / Python
  • Post last modified:January 22, 2023

You can convert JSON to a pandas DataFrame by using the json_normalize(), read_json(), and DataFrame.from_dict() functions. read_json() can also load a JSON file directly into a DataFrame. JSON stands for JavaScript Object Notation, a text format commonly used to exchange data between servers and web applications.

In this article, I will cover how to convert JSON to DataFrame by using json_normalize(), read_json() and DataFrame.from_dict() functions.

1. Quick Examples of Converting JSON to DataFrame

If you are in a hurry, below are some quick examples of how to convert JSON to a DataFrame.


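# Use json_normalize() to convert JSON to DataFrame
# (data is a JSON string defined later in this article)
data_dict = json.loads(data)
df = json_normalize(data_dict['technologies'])

# Convert JSON to DataFrame using read_json()
# Wrap the string in io.StringIO (from io import StringIO);
# passing a literal JSON string is deprecated since pandas 2.1
df2 = pd.read_json(StringIO(jsonStr), orient='index')

# Use pandas.DataFrame.from_dict() to convert JSON to DataFrame
data_dict = json.loads(data)
df3 = pd.DataFrame.from_dict(data_dict, orient="index")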

Now let’s see this with an example. First, create a string that contains JSON.


import pandas as pd
import json
from pandas import json_normalize
data = '''
{
"technologies":
         [
         { "Courses": "Spark", "Fee": 22000,"Duration":"40Days"},
         { "Courses": "PySpark","Fee": 25000,"Duration":"60Days"},
         { "Courses": "Hadoop", "Fee": 23000,"Duration":"50Days"}
         ],
"status": ["ok"]
}
'''
print(data)

2. Pandas Convert JSON String to DataFrame

The json_normalize() function converts parsed JSON (a dict or a list of dicts) into a flat DataFrame. It does not take the raw JSON string, so first parse the string with json.loads() from Python's built-in json module, then pass the resulting object to json_normalize(), which returns a pandas DataFrame.


# Use json_normalize() to convert JSON to DataFrame
# Parse the JSON string first; data_dict avoids shadowing the built-in dict
data_dict = json.loads(data)
df2 = json_normalize(data_dict['technologies'])
print(df2)

Yields below output.


   Courses    Fee Duration
0    Spark  22000   40Days
1  PySpark  25000   60Days
2   Hadoop  23000   50Days
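The example above uses flat records, but json_normalize() is most useful when records contain nested objects. Below is a minimal sketch of that case. The "Details" sub-object and its "Level" key are hypothetical additions for illustration and are not part of the data used elsewhere in this article.

```python
import json

import pandas as pd

# Hypothetical nested variant of the sample data: "Details" is an assumed
# sub-object added only to illustrate flattening.
nested = '''
{
"technologies": [
    {"Courses": "Spark", "Fee": 22000,
     "Details": {"Duration": "40Days", "Level": "Basic"}},
    {"Courses": "PySpark", "Fee": 25000,
     "Details": {"Duration": "60Days", "Level": "Advanced"}}
]
}
'''

# Parse the string, then hand the list of records to json_normalize()
records = json.loads(nested)["technologies"]
df_nested = pd.json_normalize(records)

# Nested keys are joined to their parent key with "." by default
print(df_nested.columns.tolist())
print(df_nested)
```

Each nested key should come out as its own column with a dotted name such as Details.Duration. Passing a different sep argument changes the separator.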

3. Read JSON File into DataFrame

You can also convert JSON to a pandas DataFrame with read_json(). It accepts a file path, a URL, or a file-like object, so it can read JSON files directly. To read a JSON string, wrap it in io.StringIO first, because passing a literal string is deprecated since pandas 2.1. The function takes many parameters; here I am using orient, which specifies the layout of the JSON. With orient='index', the outer keys become the row labels.


import pandas as pd
from io import StringIO
jsonStr = '''{"Index0":{"Courses": "Pandas","Discount": "1200"},
           "Index1":{"Courses": "Hadoop","Discount": "1500"},
           "Index2":{"Courses": "Spark","Discount": "1800"}
          }'''
# Convert JSON to DataFrame using read_json()
# StringIO makes the string look like a file to read_json()
df2 = pd.read_json(StringIO(jsonStr), orient='index')
print(df2)

Yields below output.


       Courses  Discount
Index0  Pandas      1200
Index1  Hadoop      1500
Index2   Spark      1800
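Since this section is about files, here is a minimal sketch of reading the same JSON from disk. The file name courses.json and the temporary directory are assumptions made so the example is self-contained. In practice you would pass the path of your own file.

```python
import os
import tempfile

import pandas as pd

json_text = '''{"Index0": {"Courses": "Pandas", "Discount": "1200"},
 "Index1": {"Courses": "Hadoop", "Discount": "1500"},
 "Index2": {"Courses": "Spark", "Discount": "1800"}}'''

# Write the JSON to a temporary file (hypothetical name) so there is
# something on disk to read back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "courses.json")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(json_text)

    # read_json() accepts a file path directly
    df_file = pd.read_json(path, orient="index")

print(df_file)
```

The result should match the DataFrame shown above, with Index0 to Index2 as row labels.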

4. Use DataFrame.from_dict() to Convert JSON to DataFrame

First load the JSON string into a dict object, then use pd.DataFrame.from_dict(data, orient="index") to create a DataFrame in which the dict keys become the index. Setting the orient param to "columns" (the default) instead uses the keys as column names. In that case the values must be list-like, because a dict of plain scalars raises a ValueError.


import pandas as pd
import json
json_string = '{ "Courses": "Spark", "Fee": 22000,"Duration":"40Days"}'
data = json.loads(json_string)

# Use pandas.DataFrame.from_dict() to Convert JSON to DataFrame
df2 = pd.DataFrame.from_dict(data, orient="index")
print(df2)

Yields below output.


               0
Courses    Spark
Fee        22000
Duration  40Days
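To show the orient="columns" case mentioned above, here is a minimal sketch that reuses the same record. Wrapping each value in a one-element list is my own workaround for the scalar-values restriction, and the column label "Value" is an arbitrary name chosen for illustration.

```python
import json

import pandas as pd

json_string = '{ "Courses": "Spark", "Fee": 22000,"Duration":"40Days"}'
record = json.loads(json_string)

# orient="columns" needs list-like values, so wrap each scalar in a list.
# This gives one row with the dict keys as column names.
wide = pd.DataFrame.from_dict(
    {key: [value] for key, value in record.items()}, orient="columns"
)
print(wide)

# With orient="index", the columns argument names the value column
# instead of leaving it as 0.
tall = pd.DataFrame.from_dict(record, orient="index", columns=["Value"])
print(tall)
```

For a single record like this one, pd.DataFrame([record]) is a shorter way to get the same one-row result as wide.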

Conclusion

In this article, you have learned how to convert JSON to a DataFrame by using the json_normalize(), read_json(), and DataFrame.from_dict() functions, with an example of each.

Happy Learning !!
