  • Post last modified:March 27, 2024
Pandas – Convert DataFrame to Dictionary (Dict)

The pandas.DataFrame.to_dict() method converts a DataFrame to a Python dictionary (dict) object. By default, the column names become the keys, and the data in each column becomes the values, stored as a nested dict of {index -> value}.

This method takes the orient parameter, which specifies the output format. It accepts the values 'dict', 'list', 'series', 'split', 'records', and 'index'. In this article, I will explain each of these with examples.

Related: You can create DataFrame using a Python dictionary.

Syntax of the pandas.DataFrame.to_dict() method:


# to_dict() method syntax
DataFrame.to_dict(orient='dict', into=<class 'dict'>)
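
The into parameter in the syntax above controls which mapping type is returned. The following is a minimal sketch, assuming a small two-column DataFrame (the variable names ordered and records are only illustrative). It passes collections.OrderedDict as a class and collections.defaultdict as an initialized instance.

```python
from collections import OrderedDict, defaultdict

import pandas as pd

# Small illustrative DataFrame (assumed data, similar to the one used in this article)
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# into accepts a mapping class; the nested mappings use the same class
ordered = df.to_dict(into=OrderedDict)
print(type(ordered).__name__, type(ordered["Fee"]).__name__)

# A defaultdict has to be passed as an initialized instance, not as the class
records = df.to_dict("records", into=defaultdict(list))
print(records[0]["Courses"])
```

When into is omitted, plain dict objects are returned, which is what the rest of the examples rely on.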

1. Quick Examples of Converting DataFrame to Dictionary

If you are in a hurry, below are some quick examples of how to convert a Pandas DataFrame to a dictionary (dict).


# Below are some quick examples
# Results are stored in 'result' rather than 'dict' so the built-in dict() stays callable

# Example 1: Use DataFrame.to_dict() to convert DataFrame to dictionary
result = df.to_dict()

# Example 2: Use dict as orient
result = df.to_dict('dict')

# Example 3: Convert DataFrame to dictionary using dict() and zip()
result = dict([(i, [x, y, z]) for i, x, y, z in zip(df.Courses, df.Fee, df.Duration, df.Discount)])

# Example 4: Same conversion, selecting the columns with bracket notation
result = dict([(i, [x, y, z]) for i, x, y, z in zip(df['Courses'], df['Fee'], df['Duration'], df['Discount'])])

Now, let’s create a DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains column names Courses, Fee, Duration, and Discount.


# Create DataFrame
import pandas as pd
technologies = [
            ("Spark", 22000,'40days',1500.0),
            ("PySpark",25000,'50days',3000.0),
            ("Hadoop",23000,'30days',2500.0)
            ]
df = pd.DataFrame(technologies,columns = ['Courses','Fee','Duration','Discount'])
print("Create DataFrame:\n", df) 

This prints the DataFrame: three rows (index 0 to 2) with the columns Courses, Fee, Duration, and Discount.

2. Use DataFrame.to_dict() to Convert DataFrame to Dictionary

To convert a Pandas DataFrame to a dictionary, use the to_dict() method. Its orient parameter defaults to 'dict', which returns the data in the format {column -> {index -> value}}. Calling to_dict() with no arguments therefore gives the same result as passing 'dict' explicitly.


# Use DataFrame.to_dict() to convert DataFrame to dictionary
# (stored in 'result' so the built-in dict is not shadowed)
result = df.to_dict()
print("After converting a DataFrame to dictionary:\n", result)

# Using orient as dict
result = df.to_dict('dict')
print("After converting a DataFrame to dictionary:\n", result)

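Yields below output.


# Output:
# After converting a DataFrame to dictionary:
{'Courses': {0: 'Spark', 1: 'PySpark', 2: 'Hadoop'},
 'Fee': {0: 22000, 1: 25000, 2: 23000},
 'Duration': {0: '40days', 1: '50days', 2: '30days'},
 'Discount': {0: 1500.0, 1: 3000.0, 2: 2500.0}}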
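
To make the {column -> {index -> value}} layout concrete, here is a minimal sketch that reads single values out of the nested dictionary. It assumes a cut-down, two-column version of the DataFrame, and the variable names by_column and by_course are only illustrative.

```python
import pandas as pd

# Cut-down version of the article's DataFrame (two columns only)
df = pd.DataFrame(
    [("Spark", 22000), ("PySpark", 25000), ("Hadoop", 23000)],
    columns=["Courses", "Fee"],
)

# Outer key is the column label, inner key is the index label
by_column = df.to_dict()
fee_of_row_1 = by_column["Fee"][1]
print(fee_of_row_1)

# With a custom index, the inner keys are the index labels instead of 0, 1, 2
by_course = df.set_index("Courses").to_dict()
print(by_course["Fee"]["Hadoop"])
```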

3. Convert DataFrame to Dictionary With Column as Key

list orient – Each column is converted to a list, and the lists are stored in a dictionary keyed by column label. To get the dict in the format {column -> [values]}, pass the string literal “list” for the orient parameter.

Related: You can convert a list of dictionaries to a DataFrame.


# Using orient as list
# (stored in 'result' so the built-in dict is not shadowed)
result = df.to_dict('list')
print("After converting a DataFrame to dictionary:\n", result)

Yields below output.


# Output:
# After converting a DataFrame to dictionary:
{'Courses': ['Spark', 'PySpark', 'Hadoop'], 
 'Fee': [22000, 25000, 23000], 
 'Duration': ['40days', '50days', '30days'], 
 'Discount': [1500.0, 3000.0, 2500.0]}
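
The {column -> [values]} format is also one of the shapes that the DataFrame constructor accepts, so it can serve as a simple round trip. The sketch below assumes the same sample data as this article; rebuilt and columns_as_lists are illustrative names.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, "40days", 1500.0),
    ("PySpark", 25000, "50days", 3000.0),
    ("Hadoop", 23000, "30days", 2500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# Convert to {column -> [values]} and build a new DataFrame from it
columns_as_lists = df.to_dict("list")
rebuilt = pd.DataFrame(columns_as_lists)

# Compare the two frames value by value
print(rebuilt.to_dict("list") == columns_as_lists)
```

Note that the list orient does not store index labels, so a custom index would be replaced by the default 0, 1, 2 after such a round trip.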

4. Convert DataFrame to Dictionary of Series

series orient – Each column is converted to a Pandas Series, and each Series is stored as the value for its column label.

To get the dict in the format {column -> Series(values)}, pass the string literal “series” for the orient parameter.


# Using orient as series
df2 = df.to_dict('series')
print("After converting a DataFrame to dictionary:\n", df2)

Yields below output.


# Output:
# After converting a DataFrame to dictionary:
{'Courses': 0      Spark
1    PySpark
2     Hadoop
Name: Courses, dtype: object, 'Fee': 0    22000
1    25000
2    23000
Name: Fee, dtype: int64, 'Duration': 0    40days
1    50days
2    30days
Name: Duration, dtype: object, 'Discount': 0    1500.0
1    3000.0
2    2500.0
Name: Discount, dtype: float64}
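
Because the values are Series objects rather than plain lists, Series methods remain available on them. A minimal sketch, assuming a two-column version of the sample data (series_by_column and total_fee are illustrative names):

```python
import pandas as pd

df = pd.DataFrame(
    [("Spark", 22000), ("PySpark", 25000), ("Hadoop", 23000)],
    columns=["Courses", "Fee"],
)

# Each value in the resulting dict is a pandas Series
series_by_column = df.to_dict("series")
print(type(series_by_column["Fee"]).__name__)

# Series methods such as sum() can be called directly on the values
total_fee = series_by_column["Fee"].sum()
print(total_fee)
```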

5. Convert DataFrame to Dictionary of Split

split orient – Each row is converted to a list, and the row lists are wrapped in another list stored under the key “data”. The index labels and column labels are stored separately under the keys “index” and “columns”.

To get the dict in the format {‘index’ -> [index], ‘columns’ -> [columns], ‘data’ -> [values]}, pass the string literal “split” for the orient parameter.


# Using orient as split
df2 = df.to_dict('split')
print("After converting a DataFrame to dictionary:\n", df2)

Yields below output.


# Output:
# After converting a DataFrame to dictionary:
{'index': [0, 1, 2], 'columns': ['Courses', 'Fee', 'Duration', 'Discount'], 'data': [['Spark', 22000, '40days', 1500.0], ['PySpark', 25000, '50days', 3000.0], ['Hadoop', 23000, '30days', 2500.0]]}
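
The three keys of the split format match the index, columns, and data arguments of the DataFrame constructor, so the dictionary can be unpacked straight back into a DataFrame. This is a sketch under that assumption, using a two-column version of the sample data; parts and rebuilt are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    [("Spark", 22000), ("PySpark", 25000), ("Hadoop", 23000)],
    columns=["Courses", "Fee"],
)

# parts has exactly the keys 'index', 'columns' and 'data'
parts = df.to_dict("split")
print(sorted(parts))

# Unpack the dict as keyword arguments: pd.DataFrame(index=..., columns=..., data=...)
rebuilt = pd.DataFrame(**parts)
print(rebuilt)
```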

6. Convert DataFrame to Dictionary of Records

records orient – Each row is converted to a dictionary where the column name is the key and the row's value in that column is the value. The result is a list with one dictionary per row, and the index labels are not included.

To get the list-like format [{column -> value}, … , {column -> value}], pass the string literal “records” for the orient parameter.


# Using orient as records
df2 = df.to_dict('records')
print("After converting a DataFrame to dictionary:\n", df2)

Yields below output.


# Output:
# After converting a DataFrame to dictionary:
[{'Courses': 'Spark', 'Fee': 22000, 'Duration': '40days', 'Discount': 1500.0}, {'Courses': 'PySpark', 'Fee': 25000, 'Duration': '50days', 'Discount': 3000.0}, {'Courses': 'Hadoop', 'Fee': 23000, 'Duration': '30days', 'Discount': 2500.0}]
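
Since the records format is an ordinary list of per-row dictionaries, it can be processed with plain Python loops and comprehensions. A minimal sketch, assuming a two-column version of the sample data and an arbitrary fee threshold chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    [("Spark", 22000), ("PySpark", 25000), ("Hadoop", 23000)],
    columns=["Courses", "Fee"],
)

# One dictionary per row, in row order
records = df.to_dict("records")

# Filter rows with plain Python: courses with a fee above 22000
expensive = [row["Courses"] for row in records if row["Fee"] > 22000]
print(expensive)
```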

7. Convert DataFrame to Dictionary by Index

index orient – Each row is converted to a dictionary of {column -> value}, and that dictionary is stored against the row's index label.

To get the dict in the format {index -> {column -> value}}, pass the string literal “index” for the orient parameter.


# Using orient as index
df2 = df.to_dict('index')
print("After converting a DataFrame to dictionary:\n", df2)

Yields below output.


# Output:
# After converting a DataFrame to dictionary:
{0: {'Courses': 'Spark', 'Fee': 22000, 'Duration': '40days', 'Discount': 1500.0}, 1: {'Courses': 'PySpark', 'Fee': 25000, 'Duration': '50days', 'Discount': 3000.0}, 2: {'Courses': 'Hadoop', 'Fee': 23000, 'Duration': '30days', 'Discount': 2500.0}}
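
The index orient becomes a convenient lookup table when the index holds meaningful labels. The sketch below assumes the course names are unique and uses set_index() to make them the index first; by_course is an illustrative name.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, "40days", 1500.0),
    ("PySpark", 25000, "50days", 3000.0),
    ("Hadoop", 23000, "30days", 2500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# Use the (assumed unique) course name as the index, then key the dict by it
by_course = df.set_index("Courses").to_dict("index")

# Look up a whole row, or a single field, by course name
print(by_course["PySpark"])
print(by_course["Hadoop"]["Duration"])
```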

8. Convert DataFrame to Dictionary Using dict() and zip() Methods


# Convert DataFrame to dictionary using dict() and zip() methods
df2 = dict([(i,[x,y,z ]) for i, x,y,z in zip(df.Courses, df.Fee,df.Duration,df.Discount)])
print("After converting a DataFrame to dictionary:\n", df2)

# Using dict() and zip() methods
df2 = dict([(i,[x,y,z]) for i,x,y,z in zip(df['Courses'], df['Fee'],df['Duration'],df['Discount'])])
print("After converting a DataFrame to dictionary:\n", df2)

Yields below output.


# Output:
# After converting a DataFrame to dictionary:
{'Spark': [22000, '40days', 1500.0], 'PySpark': [25000, '50days', 3000.0], 'Hadoop': [23000, '30days', 2500.0]}
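
The same {course -> [values]} shape can also be produced with to_dict() itself, by making Courses the index, transposing, and using the list orient (this combination appears in the complete example below). The following sketch compares the two approaches on the sample data; via_zip and via_transpose are illustrative names.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, "40days", 1500.0),
    ("PySpark", 25000, "50days", 3000.0),
    ("Hadoop", 23000, "30days", 2500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# Approach 1: build the dictionary by hand with dict() and zip()
via_zip = dict(
    (course, [fee, duration, discount])
    for course, fee, duration, discount in zip(
        df["Courses"], df["Fee"], df["Duration"], df["Discount"]
    )
)

# Approach 2: after set_index() and transpose, each course is a column,
# so the list orient yields {course -> [Fee, Duration, Discount]}
via_transpose = df.set_index("Courses").T.to_dict("list")

print(via_zip == via_transpose)
```

With dict() and zip(), a repeated course name would silently keep only the last matching row, so both approaches are best suited to a column with unique values.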

9. Complete Example of Converting DataFrame to Dictionary


import pandas as pd
technologies = [
            ("Spark", 22000,'40days',1500.0),
            ("PySpark",25000,'50days',3000.0),
            ("Hadoop",23000,'30days',2500.0)
            ]
df = pd.DataFrame(technologies,columns = ['Courses','Fee','Duration','Discount'])
print(df)

# Use DataFrame.to_dict() to convert DataFrame to dictionary
# (results are stored in 'result' so the built-in dict() stays callable)
result = df.to_dict()
print(result)

# Use dict orient
result = df.to_dict('dict')
print(result)

# Use Courses as keys and the remaining values of each row as a list
result = df.set_index('Courses').T.to_dict('list')
print(result)

# Use series orient
result = df.to_dict('series')
print(result)

# Use split orient
result = df.to_dict('split')
print(result)

# Use list orient
result = df.to_dict('list')
print(result)

# Use records orient
result = df.to_dict('records')
print(result)

# Use index orient
result = df.to_dict('index')
print(result)

# Convert DataFrame to dictionary using dict() and zip()
result = dict([(i, [x, y, z]) for i, x, y, z in zip(df.Courses, df.Fee, df.Duration, df.Discount)])
print(result)

# Same conversion, selecting the columns with bracket notation
result = dict([(i, [x, y, z]) for i, x, y, z in zip(df['Courses'], df['Fee'], df['Duration'], df['Discount'])])
print(result)

Conclusion

You have learned that the pandas.DataFrame.to_dict() method converts a DataFrame to a Python dictionary (dict) object, using the column names as keys and the column data as values by default. You have also learned how the orient parameter changes the output format between 'dict', 'list', 'series', 'split', 'records', and 'index'.

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.
