You can convert JSON to a pandas DataFrame by using the json_normalize(), read_json(), and DataFrame.from_dict() functions. Some of these methods are also used to extract data from JSON files and store it as a DataFrame. JSON stands for JavaScript Object Notation. It is commonly used for sharing data between servers and web applications.
In this article, I will cover how to convert JSON to a DataFrame by using the json_normalize(), read_json(), and DataFrame.from_dict() functions.
1. Quick Examples of Converting JSON to DataFrame
If you are in a hurry, below are some quick examples of how to convert JSON to DataFrame.
# Use json_normalize() to convert JSON to DataFrame
data_dict = json.loads(data)
df = json_normalize(data_dict['technologies'])
# Convert JSON to DataFrame using read_json()
# (recent pandas versions expect a path or file-like object, hence StringIO)
from io import StringIO
df2 = pd.read_json(StringIO(jsonStr), orient='index')
# Use pandas.DataFrame.from_dict() to convert JSON to DataFrame
data_dict = json.loads(data)
df3 = pd.DataFrame.from_dict(data_dict, orient="index")
Now let’s look at an example. First, create a string that contains JSON.
import pandas as pd
import json
from pandas import json_normalize
data = '''
{
"technologies":
[
{ "Courses": "Spark", "Fee": 22000,"Duration":"40Days"},
{ "Courses": "PySpark","Fee": 25000,"Duration":"60Days"},
{ "Courses": "Hadoop", "Fee": 23000,"Duration":"50Days"}
],
"status": ["ok"]
}
'''
print(data)
2. Pandas Convert JSON String to DataFrame
The json_normalize() function converts parsed JSON (a dict or a list of dicts) into a DataFrame. It does not accept a raw JSON string, so first load the string with the json.loads() function from Python's built-in json library. Then pass the resulting object to json_normalize(), which returns a pandas DataFrame.
# Use json_normalize() to convert JSON to DataFrame
data_dict = json.loads(data)  # avoid shadowing the built-in dict
df2 = json_normalize(data_dict['technologies'])
print(df2)
This yields the output below.
# Output:
   Courses    Fee Duration
0    Spark  22000   40Days
1  PySpark  25000   60Days
2   Hadoop  23000   50Days
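When the records themselves contain nested objects, json_normalize() flattens them into separate columns. The sketch below is only an illustration: the "Details" key and the nested payload are made up for this example and are not part of the data used above. It also uses the sep parameter to choose the separator in the flattened column names.

```python
import json
import pandas as pd

# Hypothetical payload: the nested "Details" object is invented for illustration
nested_json = '''
[
  {"Courses": "Spark",   "Details": {"Fee": 22000, "Duration": "40Days"}},
  {"Courses": "PySpark", "Details": {"Fee": 25000, "Duration": "60Days"}}
]
'''
records = json.loads(nested_json)

# Nested keys are flattened into columns named <parent><sep><child>
df_nested = pd.json_normalize(records, sep="_")
print(df_nested.columns.tolist())
print(df_nested)
```

With the default sep=".", the flattened columns would be named Details.Fee and Details.Duration instead.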
3. Read JSON File into DataFrame
You can also convert JSON to a pandas DataFrame with read_json(). It takes several parameters; here I am using orient, which specifies the expected layout of the JSON. With orient='index', the top-level keys become the row labels. The same function reads JSON files when you pass it a file path. Recent pandas versions deprecate passing a literal JSON string directly, so the string is wrapped in StringIO here.
import pandas as pd
from io import StringIO
jsonStr = '''{"Index0":{"Courses": "Pandas","Discount": "1200"},
"Index1":{"Courses": "Hadoop","Discount": "1500"},
"Index2":{"Courses": "Spark","Discount": "1800"}
}'''
# Convert JSON to DataFrame using read_json()
# StringIO makes the string look like a file to read_json()
df2 = pd.read_json(StringIO(jsonStr), orient='index')
print(df2)
This yields the output below.
# Output:
       Courses  Discount
Index0  Pandas      1200
Index1  Hadoop      1500
Index2   Spark      1800
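Since read_json() also reads JSON files, here is a minimal sketch of that case. It writes a small JSON document to a temporary file first so that the example is self-contained; the file name courses.json and the variable names are assumptions made for this illustration.

```python
import os
import tempfile
import pandas as pd

json_text = '''{"Index0": {"Courses": "Pandas", "Discount": 1200},
"Index1": {"Courses": "Hadoop", "Discount": 1500}}'''

# Write the JSON to a temporary file; the file name is arbitrary
with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "courses.json")
    with open(json_path, "w", encoding="utf-8") as f:
        f.write(json_text)

    # read_json() accepts a file path as well as a file-like object
    df_file = pd.read_json(json_path, orient="index")

print(df_file)
```

In practice you would skip the temporary file and pass the path of your existing JSON file to read_json().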
4. Use DataFrame.from_dict() to Convert JSON to DataFrame
First load the JSON string into a dict object, and then use pd.DataFrame.from_dict(data, orient="index") to create a DataFrame in which the keys of the dict become the index. Setting the orient parameter to "columns" (the default) instead creates a DataFrame with the keys of the dict as its column names; in that case each value should be a list of column values rather than a single scalar.
# Use DataFrame.from_dict() to convert JSON to DataFrame
import pandas as pd
import json
json_string = '{ "Courses": "Spark", "Fee": 22000,"Duration":"40Days"}'
data = json.loads(json_string)
# The keys of the dict become the index
df2 = pd.DataFrame.from_dict(data, orient="index")
print(df2)
This yields the output below.
# Output:
               0
Courses    Spark
Fee        22000
Duration  40Days
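For comparison, the following sketch shows orient="columns", where each key becomes a column name. The JSON string here is a made-up variation of the example above in which every key maps to a list of values, which is the shape this orientation expects.

```python
import json
import pandas as pd

# Hypothetical column-oriented JSON: each key maps to a list of values
json_columns = '{"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]}'
data_by_column = json.loads(json_columns)

# Keys of the dict become the column names
df_cols = pd.DataFrame.from_dict(data_by_column, orient="columns")
print(df_cols)
```

Passing the single-record dict from the previous example with orient="columns" would fail, because pandas cannot infer an index from scalar values alone.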
Conclusion
In this article, you have learned how to convert JSON to a DataFrame by using the json_normalize(), read_json(), and DataFrame.from_dict() functions, with an example of each.
Happy Learning !!