Pandas – Convert DataFrame to Dictionary (Dict)

  • Post category:Pandas / Python
  • Post last modified:January 20, 2023

The pandas.DataFrame.to_dict() method converts a DataFrame to a Python dictionary (dict) object. Use it when you have a DataFrame and want a dictionary in which, by default, the column names are the keys and the values hold each column's data keyed by row index.

This method takes an orient parameter that specifies the output format. It accepts the values 'dict', 'list', 'series', 'split', 'records', and 'index'. In this article, I will explain each of these with examples.

Syntax of pandas.DataFrame.to_dict() method –


# to_dict() method syntax
DataFrame.to_dict(orient='dict', into=<class 'dict'>)
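
The into parameter in the syntax above is not used elsewhere in this article, so here is a minimal sketch of what it does. It assumes a small two-row DataFrame invented for the illustration and passes collections.OrderedDict by keyword, so that the returned mappings are built with that type instead of a plain dict.

```python
from collections import OrderedDict

import pandas as pd

# Small sample DataFrame, invented for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# into= selects the mapping type used for the returned dictionaries
ordered = df.to_dict(orient="dict", into=OrderedDict)

print(type(ordered).__name__)
print(type(ordered["Fee"]).__name__)
print(ordered["Fee"][0])
```

With the default 'dict' orient, both the outer mapping and the inner per-column mappings should come back as the type given in into.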

1. Quick Examples of Convert DataFrame to Dictionary

If you are in a hurry, below are some quick examples of how to convert pandas DataFrame to the dictionary (dict).


# Below are quick examples
# The result is stored in `result`, not `dict`, so the built-in dict() stays usable

# Use DataFrame.to_dict() to convert DataFrame to dictionary
result = df.to_dict()

# Use dict as orient
result = df.to_dict('dict')

# Convert DataFrame to dictionary using dict() and zip() methods
result = dict([(i, [x, y, z]) for i, x, y, z in zip(df.Courses, df.Fee, df.Duration, df.Discount)])

# Using dict() and zip() methods with bracket-style column access
result = dict([(i, [x, y, z]) for i, x, y, z in zip(df['Courses'], df['Fee'], df['Duration'], df['Discount'])])

Now, let’s create a DataFrame with a few rows and columns, execute these examples and validate results. Our DataFrame contains column names Courses, Fee, Duration, and Discount.


import pandas as pd
technologies = [
            ("Spark", 22000,'40days',1500.0),
            ("PySpark",25000,'50days',3000.0),
            ("Hadoop",23000,'30days',2500.0)
            ]
df = pd.DataFrame(technologies,columns = ['Courses','Fee','Duration','Discount'])
print(df) 

Yields below output.


   Courses    Fee Duration  Discount
0    Spark  22000   40days    1500.0
1  PySpark  25000   50days    3000.0
2   Hadoop  23000   30days    2500.0

2. Use DataFrame.to_dict() to Convert DataFrame to Dictionary

To convert a pandas DataFrame to a dictionary object, use the to_dict() method. Its orient parameter defaults to 'dict', which returns the DataFrame in the format {column -> {index -> value}}. Calling to_dict() without an orient therefore returns this format.


# Use DataFrame.to_dict() to convert DataFrame to dictionary
# (named `result` to avoid shadowing the built-in dict)
result = df.to_dict()
print(result)

# Using orient as dict
result = df.to_dict('dict')
print(result)

Yields below output.


{'Courses': {0: 'Spark', 1: 'PySpark', 2: 'Hadoop'}, 'Fee': {0: 22000, 1: 25000, 2: 23000}, 'Duration': {0: '40days', 1: '50days', 2: '30days'}, 'Discount': {0: 1500.0, 1: 3000.0, 2: 2500.0}}
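
To make the {column -> {index -> value}} layout concrete, the following sketch looks up single values in the nested dictionary. It rebuilds the same sample DataFrame so it can run on its own; the variable names nested and fee_of_second_row are chosen only for this illustration.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, '40days', 1500.0),
    ("PySpark", 25000, '50days', 3000.0),
    ("Hadoop", 23000, '30days', 2500.0),
]
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration', 'Discount'])

nested = df.to_dict()

# The outer key is the column label, the inner key is the index label
fee_of_second_row = nested['Fee'][1]
print(fee_of_second_row)

# The inner dictionaries can be iterated like any other dict
for index_label, course in nested['Courses'].items():
    print(index_label, course)
```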

3. Convert DataFrame to Dictionary With Column as Key

list orient – Each column is converted to a list, and the lists are stored in a dictionary as values keyed by the column labels.

To get the dict in the format {column -> [values]}, pass the string literal 'list' for the orient parameter.


# Using orient as list
result = df.to_dict('list')
print(result)

Yields below output.


{'Courses': ['Spark', 'PySpark', 'Hadoop'], 
 'Fee': [22000, 25000, 23000], 
 'Duration': ['40days', '50days', '30days'], 
 'Discount': [1500.0, 3000.0, 2500.0]}
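
Because the 'list' orient yields plain Python lists, ordinary built-ins work on the values directly, and the dictionary can be handed back to the DataFrame constructor. The sketch below shows both under the same sample data; the names columns, highest_fee, and rebuilt are made up for this example.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, '40days', 1500.0),
    ("PySpark", 25000, '50days', 3000.0),
    ("Hadoop", 23000, '30days', 2500.0),
]
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration', 'Discount'])

columns = df.to_dict('list')

# Each value is a regular Python list, so built-ins such as max() apply
highest_fee = max(columns['Fee'])
print(highest_fee)

# A {column -> [values]} dictionary is also accepted by the DataFrame constructor
rebuilt = pd.DataFrame(columns)
print(rebuilt)
```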

4. Convert DataFrame to Dictionary of Series

series orient – Each column is converted to a pandas Series, and the Series are stored as the dictionary values keyed by the column labels.

To get the dict in the format {column -> Series(values)}, pass the string literal 'series' for the orient parameter.


# Using orient as series
df2 = df.to_dict('series')
print(df2)

Yields below output.


{'Courses': 0      Spark
1    PySpark
2     Hadoop
Name: Courses, dtype: object, 'Fee': 0    22000
1    25000
2    23000
Name: Fee, dtype: int64, 'Duration': 0    40days
1    50days
2    30days
Name: Duration, dtype: object, 'Discount': 0    1500.0
1    3000.0
2    2500.0
Name: Discount, dtype: float64}
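
Since the values in this format are still pandas Series, Series methods remain available on them. The short sketch below illustrates this on the same sample data; series_by_column and total_fee are names assumed for the example.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, '40days', 1500.0),
    ("PySpark", 25000, '50days', 3000.0),
    ("Hadoop", 23000, '30days', 2500.0),
]
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration', 'Discount'])

series_by_column = df.to_dict('series')

# Each value is a pandas Series, so Series methods can be called on it
total_fee = series_by_column['Fee'].sum()
largest_discount = series_by_column['Discount'].max()
print(total_fee, largest_discount)
```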

5. Convert DataFrame to Dictionary With Split Orient

split orient – Each row is converted to a list, and the rows are wrapped in another list stored under the key 'data'. The index labels and column labels are stored separately under the keys 'index' and 'columns'.

To get the dict in the format {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}, pass the string literal 'split' for the orient parameter.


# Using orient as split
df2 = df.to_dict('split')
print(df2)

Yields below output.


{'index': [0, 1, 2], 'columns': ['Courses', 'Fee', 'Duration', 'Discount'], 'data': [['Spark', 22000, '40days', 1500.0], ['PySpark', 25000, '50days', 3000.0], ['Hadoop', 23000, '30days', 2500.0]]}
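
One practical use of the 'split' format is that its three keys line up with the data, index, and columns arguments of the DataFrame constructor. The following is a sketch of that round trip on the sample data; parts and rebuilt are names assumed for the illustration.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, '40days', 1500.0),
    ("PySpark", 25000, '50days', 3000.0),
    ("Hadoop", 23000, '30days', 2500.0),
]
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration', 'Discount'])

parts = df.to_dict('split')

# Feed the three pieces back into the DataFrame constructor
rebuilt = pd.DataFrame(
    data=parts['data'],
    index=parts['index'],
    columns=parts['columns'],
)
print(rebuilt)
```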

6. Convert DataFrame to List of Dictionaries (Records)

records orient – Each row is converted to a dictionary where the column name is the key and the row's value for that column is the value. The row dictionaries are collected in a list, so the result is a list rather than a dict.

To get the list-like format [{column -> value}, … , {column -> value}], pass the string literal 'records' for the orient parameter.


# Using orient as records
df2 = df.to_dict('records')
print(df2)

Yields below output.


[{'Courses': 'Spark', 'Fee': 22000, 'Duration': '40days', 'Discount': 1500.0}, {'Courses': 'PySpark', 'Fee': 25000, 'Duration': '50days', 'Discount': 3000.0}, {'Courses': 'Hadoop', 'Fee': 23000, 'Duration': '30days', 'Discount': 2500.0}]
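
The 'records' format is convenient when each row should be processed as its own dictionary. As a hedged illustration, the sketch below loops over the records and computes a net fee per course; net_fees and the "fee minus discount" calculation are assumptions made for the example, not part of the original data.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, '40days', 1500.0),
    ("PySpark", 25000, '50days', 3000.0),
    ("Hadoop", 23000, '30days', 2500.0),
]
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration', 'Discount'])

records = df.to_dict('records')

# Each record is one row as a {column -> value} dictionary
net_fees = {}
for row in records:
    net_fees[row['Courses']] = row['Fee'] - row['Discount']

print(net_fees)
```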

7. Convert DataFrame to Dictionary by Index

index orient – Each row is converted to a dictionary of {column -> value}, and these row dictionaries are stored against the row's index label.

To get the dict in the format {index -> {column -> value}}, pass the string literal 'index' for the orient parameter.


# Using orient as index
df2 = df.to_dict('index')
print(df2)

Yields below output.


{0: {'Courses': 'Spark', 'Fee': 22000, 'Duration': '40days', 'Discount': 1500.0}, 1: {'Courses': 'PySpark', 'Fee': 25000, 'Duration': '50days', 'Discount': 3000.0}, 2: {'Courses': 'Hadoop', 'Fee': 23000, 'Duration': '30days', 'Discount': 2500.0}}
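
The 'index' orient becomes more useful when the index carries meaningful labels. The sketch below assumes the Courses column holds unique values and moves it into the index with set_index() before converting, so rows can be looked up by course name; by_course is a name chosen for this example.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, '40days', 1500.0),
    ("PySpark", 25000, '50days', 3000.0),
    ("Hadoop", 23000, '30days', 2500.0),
]
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration', 'Discount'])

# Use the (unique) course names as index labels, then convert by index
by_course = df.set_index('Courses').to_dict('index')

print(by_course['Spark'])
print(by_course['Hadoop']['Duration'])
```

Note that this orient expects unique index labels; pandas raises a ValueError when the index contains duplicates.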

8. Convert DataFrame to Dictionary Using dict() and zip() Methods


# Convert DataFrame to dictionary using dict() and zip() methods
df2 = dict([(i, [x, y, z]) for i, x, y, z in zip(df.Courses, df.Fee, df.Duration, df.Discount)])
print(df2)

# Using dict() and zip() methods with bracket-style column access
df2 = dict([(i, [x, y, z]) for i, x, y, z in zip(df['Courses'], df['Fee'], df['Duration'], df['Discount'])])
print(df2)

Yields below output.


{'Spark': [22000, '40days', 1500.0], 'PySpark': [25000, '50days', 3000.0], 'Hadoop': [23000, '30days', 2500.0]}
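
The dict() and zip() approach builds a {course -> [remaining values]} mapping by hand. The complete example in this article also uses set_index('Courses').T.to_dict('list'), which appears to give the same shape; the sketch below compares the two on the sample data. The names via_zip and via_transpose are introduced only for this comparison.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, '40days', 1500.0),
    ("PySpark", 25000, '50days', 3000.0),
    ("Hadoop", 23000, '30days', 2500.0),
]
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration', 'Discount'])

# Manual construction: pair each course with a list of its other values
via_zip = dict(
    (course, [fee, duration, discount])
    for course, fee, duration, discount in zip(
        df['Courses'], df['Fee'], df['Duration'], df['Discount']
    )
)

# Alternative: make Courses the index, transpose so each course becomes
# a column, then collect every column as a list
via_transpose = df.set_index('Courses').T.to_dict('list')

print(via_zip)
print(via_transpose == via_zip)
```

In both variants, a course name that occurs more than once would keep only one entry, because dictionary keys are unique.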

9. Complete Example For Convert DataFrame to Dictionary


import pandas as pd
technologies = [
            ("Spark", 22000,'40days',1500.0),
            ("PySpark",25000,'50days',3000.0),
            ("Hadoop",23000,'30days',2500.0)
            ]
df = pd.DataFrame(technologies,columns = ['Courses','Fee','Duration','Discount'])
print(df)                 

# Results are stored in `result`, not `dict`, so the built-in dict()
# is still callable in the dict()/zip() examples at the end

# Use DataFrame.to_dict() to convert DataFrame to dictionary
result = df.to_dict()
print(result)

# Use dict orient
result = df.to_dict('dict')
print(result)

# Use list orient on the transposed DataFrame, keyed by Courses
result = df.set_index('Courses').T.to_dict('list')
print(result)

# Use series orient
result = df.to_dict('series')
print(result)

# Use split orient
result = df.to_dict('split')
print(result)

# Use list orient
result = df.to_dict('list')
print(result)

# Use records orient
result = df.to_dict('records')
print(result)

# Use index orient
result = df.to_dict('index')
print(result)

# Convert DataFrame to dictionary using dict() and zip() methods
result = dict([(i, [x, y, z]) for i, x, y, z in zip(df.Courses, df.Fee, df.Duration, df.Discount)])
print(result)

# Using dict() and zip() methods with bracket-style column access
result = dict([(i, [x, y, z]) for i, x, y, z in zip(df['Courses'], df['Fee'], df['Duration'], df['Discount'])])
print(result)

Conclusion

You have learned that the pandas.DataFrame.to_dict() method converts a DataFrame to a Python dictionary (dict) object. By default, the column names become the keys and each column's data, keyed by row index, becomes the values.

This method takes an orient parameter that specifies the output format. It accepts the values 'dict', 'list', 'series', 'split', 'records', and 'index', each of which was explained in this article with an example.

Happy Learning !!
