Pandas – Convert DataFrame to JSON String

  • Post category:Pandas / Python
  • Post last modified:January 28, 2023

You can convert a pandas DataFrame to a JSON string by using the DataFrame.to_json() method. Its most important parameter is orient, which accepts the values 'columns', 'records', 'index', 'split', 'table', and 'values'. JSON stands for JavaScript Object Notation. It is a text format for representing structured data and is commonly used to exchange data between servers and web applications.

In this article, I will cover how to convert a pandas DataFrame to a JSON string. You can use DataFrame.to_json() either to return the DataFrame as a JSON string or to write it to an external JSON file. The layout of the JSON depends on the value you pass to the orient parameter.
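As a minimal sketch of the two modes mentioned above: calling to_json() without a path returns the JSON as a string, while passing a path writes the file instead. The small DataFrame, the temporary directory, and the file name courses.json are assumptions made only for this illustration.

```python
import json
import os
import tempfile

import pandas as pd

# Small illustrative DataFrame (not the one used in the rest of the article)
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# With no path argument, to_json() returns the JSON text as a Python str
json_str = df.to_json()

# With a path argument, to_json() writes the file and returns None
out_path = os.path.join(tempfile.mkdtemp(), "courses.json")  # hypothetical file name
result = df.to_json(out_path)

# Read the file back to compare it with the returned string
with open(out_path) as f:
    file_text = f.read()

print(json_str)
print(result)
```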

1. Quick Examples of Converting DataFrame to JSON String

If you are in a hurry, below are some quick examples of how to convert DataFrame to JSON String.


# Below are quick examples

# Convert Pandas DataFrame To JSON Using orient = 'columns'
df2 = df.to_json(orient = 'columns')  

# Convert Pandas DataFrame To JSON Using orient = 'records' 
df2 = df.to_json(orient = 'records')

# Convert Pandas DataFrame To JSON Using orient = 'index'
df2 = df.to_json(orient ='index')

# Convert Pandas DataFrame To JSON Using orient = 'split'
df2 = df.to_json(orient = 'split')

# Convert Pandas DataFrame To JSON Using orient = 'table'
df2 = df.to_json(orient = 'table')

# Convert Pandas DataFrame To JSON Using orient ='values'
df2 = df.to_json(orient ='values')

Now, let’s create a DataFrame with a few rows and columns, execute these examples and validate results. Our DataFrame contains column names Courses, Fee, Duration, and Discount.


import pandas as pd
technologies = [
            ("Spark", 22000,'30days',1000.0),
            ("PySpark",25000,'50days',2300.0),
            ("Hadoop",23000,'55days',1500.0)
            ]
df = pd.DataFrame(technologies,columns = ['Courses','Fee','Duration','Discount'])
print(df)

Yields below output.


   Courses    Fee Duration  Discount
0    Spark  22000   30days    1000.0
1  PySpark  25000   50days    2300.0
2   Hadoop  23000   55days    1500.0

2. Use DataFrame.to_json() with orient = ‘columns’

orient='columns' is the default value for a DataFrame. When you don't specify orient, DataFrame.to_json() uses 'columns' and returns a JSON string in the dict-like format {column -> {index -> value}}.


# Convert Pandas DataFrame To JSON Using orient = 'columns'
df2 = df.to_json(orient = 'columns')
print(df2)

Yields below output.


{"Courses":{"0":"Spark","1":"PySpark","2":"Hadoop"},"Fee":{"0":22000,"1":25000,"2":23000},"Duration":{"0":"30days","1":"50days","2":"55days"},"Discount":{"0":1000.0,"1":2300.0,"2":1500.0}}
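To see the {column -> {index -> value}} nesting more concretely, the following sketch parses the string with Python's standard json module. Using json.loads here is an addition for illustration, not part of the original examples. Note that the index labels come back as string keys, because JSON object keys are always strings.

```python
import json

import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# Parse the JSON string into nested Python dicts
parsed = json.loads(df.to_json(orient="columns"))

# Outer keys are column names; inner keys are index labels as strings
column_names = list(parsed.keys())
first_course = parsed["Courses"]["0"]

# Omitting orient gives the same result for a DataFrame
same_as_default = df.to_json() == df.to_json(orient="columns")

print(column_names)
print(first_course)
```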

3. Convert DataFrame to JSON Using orient = ‘records’

Use orient='records' to convert the DataFrame to JSON in the list-like format [{column -> value}, … , {column -> value}]. The index is not included in this format.


# Convert Pandas DataFrame To JSON Using orient = 'records' 
df2 = df.to_json(orient = 'records')
print(df2)

Yields below output.


[{"Courses":"Spark","Fee":22000,"Duration":"30days","Discount":1000.0},{"Courses":"PySpark","Fee":25000,"Duration":"50days","Discount":2300.0},{"Courses":"Hadoop","Fee":23000,"Duration":"55days","Discount":1500.0}]
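The records layout maps naturally onto a Python list of dicts, one per row. The sketch below, which again relies on the standard json module as an illustrative assumption, parses the string and loops over the rows.

```python
import json

import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# Parse the JSON array into a list with one dict per row
records = json.loads(df.to_json(orient="records"))

# Each dict holds only column -> value pairs; there is no index entry
for row in records:
    print(row["Courses"], row["Fee"])

row_keys = list(records[0].keys())
```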

4. Using orient = ‘index’

Use orient='index' to get a JSON string in the dict-like format {index -> {column -> value}}.


# Convert Pandas DataFrame To JSON Using orient = 'index'
df2 = df.to_json(orient ='index')
print(df2)

Yields below output.


{"0":{"Courses":"Spark","Fee":22000,"Duration":"30days","Discount":1000.0},"1":{"Courses":"PySpark","Fee":25000,"Duration":"50days","Discount":2300.0},"2":{"Courses":"Hadoop","Fee":23000,"Duration":"55days","Discount":1500.0}}
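Because this layout is keyed by index label, it is convenient for looking up a whole row at once. The following is a small sketch under the assumption that the string is parsed with the standard json module; the labels must be given as strings after parsing.

```python
import json

import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# Parse into a dict keyed by index label
by_index = json.loads(df.to_json(orient="index"))

# Index label 1 becomes the string key "1" after the JSON round trip
second_row = by_index["1"]
print(second_row)
```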

5. Using orient = ‘split’

You can use orient='split' to convert the DataFrame to JSON in the dict-like format {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}.


# Convert Pandas DataFrame To JSON Using orient = 'split'
df2 = df.to_json(orient = 'split')
print(df2)

Yields below output.


{"columns":["Courses","Fee","Duration","Discount"],"index":[0,1,2],"data":[["Spark",22000,"30days",1000.0],["PySpark",25000,"50days",2300.0],["Hadoop",23000,"55days",1500.0]]}
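Since the split layout keeps columns, index, and data separately, it can be read back with pandas.read_json() using the same orient. The sketch below is an illustration rather than part of the original examples. It wraps the string in io.StringIO, which is the form newer pandas versions expect for in-memory JSON. Values should survive the round trip, but read_json() infers dtypes again, so the dtypes may not match the original exactly.

```python
from io import StringIO

import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

json_str = df.to_json(orient="split")

# Read the JSON back into a DataFrame, telling read_json which layout to expect
restored = pd.read_json(StringIO(json_str), orient="split")
print(restored)
```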

6. Using orient = ‘table’

You can use orient='table' to convert the DataFrame to JSON in the dict-like format {'schema': {schema}, 'data': {data}}.


# Convert Pandas DataFrame To JSON Using orient = 'table'
df2 = df.to_json(orient = 'table')
print(df2)

Yields below output.


{"schema":{"fields":[{"name":"index","type":"integer"},{"name":"Courses","type":"string"},{"name":"Fee","type":"integer"},{"name":"Duration","type":"string"},{"name":"Discount","type":"number"}],"primaryKey":["index"],"pandas_version":"0.20.0"},"data":[{"index":0,"Courses":"Spark","Fee":22000,"Duration":"30days","Discount":1000.0},{"index":1,"Courses":"PySpark","Fee":25000,"Duration":"50days","Discount":2300.0},{"index":2,"Courses":"Hadoop","Fee":23000,"Duration":"55days","Discount":1500.0}]}
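The schema part of this output describes each field, including the index, which is stored as a regular field and listed as the primary key. The sketch below parses the string with the standard json module (an assumption for illustration) and pulls out the pieces. The pandas_version value inside the schema depends on the installed pandas version, so it is not inspected here.

```python
import json

import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

parsed = json.loads(df.to_json(orient="table"))

# The schema lists every field by name, starting with the index
field_names = [field["name"] for field in parsed["schema"]["fields"]]
primary_key = parsed["schema"]["primaryKey"]

# The data part is a list of row dicts that also carry the index value
rows = parsed["data"]

print(field_names)
print(primary_key)
print(rows[0])
```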

7. Using orient = ‘values’

You can also use orient='values' to get just the DataFrame values as a nested JSON array. Column names and index labels are not included.


# Convert Pandas DataFrame To JSON Using orient ='values'
df2 = df.to_json(orient ='values')
print(df2)

Yields below output.


[["Spark",22000,"30days",1000.0],["PySpark",25000,"50days",2300.0],["Hadoop",23000,"55days",1500.0]]
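Because this layout carries no labels, whoever consumes it has to supply the column names. The sketch below parses the array with the standard json module and rebuilds a DataFrame by passing the column names explicitly. This rebuild step is an illustration added here, not something to_json() does for you.

```python
import json

import pandas as pd

columns = ["Courses", "Fee", "Duration", "Discount"]
technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=columns)

# Parse into a plain list of row lists; no labels are present
rows = json.loads(df.to_json(orient="values"))

# Rebuild a DataFrame by supplying the column names again
rebuilt = pd.DataFrame(rows, columns=columns)
print(rebuilt)
```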

8. Complete Example of Converting DataFrame to JSON


import pandas as pd
technologies = [
            ("Spark", 22000,'30days',1000.0),
            ("PySpark",25000,'50days',2300.0),
            ("Hadoop",23000,'55days',1500.0)
            ]
df = pd.DataFrame(technologies,columns = ['Courses','Fee','Duration','Discount'])
print(df)
 
# Use DataFrame.to_json() to orient = 'columns' 
df2 = df.to_json(orient = 'columns')
print(df2)   

# Convert Pandas DataFrame To JSON Using orient = 'records' 
df2 = df.to_json(orient = 'records')
print(df2)

# Convert Pandas DataFrame To JSON Using orient = 'index'
df2 = df.to_json(orient ='index')
print(df2)

# Convert Pandas DataFrame To JSON Using orient = 'split'
df2 = df.to_json(orient = 'split')
print(df2)

# Convert Pandas DataFrame To JSON Using orient = 'table'
df2 = df.to_json(orient = 'table')
print(df2)

# Convert Pandas DataFrame To JSON Using orient ='values'
df2 = df.to_json(orient ='values')
print(df2)

Conclusion

In this article, you have learned how to convert a pandas DataFrame to JSON by using the DataFrame.to_json() method with different orient values, along with examples. For more parameters, refer to the to_json() method in the pandas reference.

Happy Learning !!
