A Python dict (dictionary) stores key-value pairs and can be used to create a pandas DataFrame. In practice, most DataFrames are created by reading a CSV file or another data source, but sometimes you need to create one from a dict (dictionary) object.
pandas is widely used for data science, data analysis, and machine learning applications. It is built on top of NumPy, another popular package that provides scientific computing in Python. A pandas DataFrame is a 2-dimensional labeled data structure with rows and columns, where the columns can hold different types (integers, strings, floats, None, Python objects, etc.). You can think of it as an Excel spreadsheet or an SQL table.
In my last article, I explained how easy it is to create a DataFrame from a list object. Similarly, in this article I will explain how to create a pandas DataFrame from different types of dict (dictionary) objects.
Table of contents
- Create pandas DataFrame from dict (dictionary)
- Create from dict with selected columns
- Create dataframe from nested dict object
- Create using pandas.DataFrame.from_dict()
- Create DataFrame from Dict by using Values as Rows
1. Create pandas DataFrame from Dict (Dictionary)
By using the pandas DataFrame constructor, you can create a DataFrame from a dict (dictionary) object. For each key-value pair in the dict, the key becomes the column name and the value (a list) supplies that column's values.
import pandas as pd

# Dict object: each key is a column name, each list holds that column's values
courses = {'Courses':['Spark','PySpark','Java','PHP'],
           'Fee':[20000,20000,15000,10000],
           'Duration':['35days','35days','40days','30days']}
# Create DataFrame from dict using the DataFrame constructor
df = pd.DataFrame(courses)
print(df)
This yields a DataFrame with the columns Courses, Fee, and Duration and a default integer index from 0 to 3.
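One caveat with a dict of lists is that every list needs the same length. The following is a minimal sketch of what happens otherwise, assuming a hypothetical dict named bad_courses that is not part of the example above:

```python
import pandas as pd

# Hypothetical dict (not from the article): 'Courses' has 4 values, 'Fee' only 3
bad_courses = {'Courses': ['Spark', 'PySpark', 'Java', 'PHP'],
               'Fee': [20000, 20000, 15000]}

try:
    pd.DataFrame(bad_courses)
    created = True
except ValueError as err:
    # pandas refuses to build columns of different lengths
    created = False
    print("Could not create DataFrame:", err)
```

If your lists can differ in length, you would need to pad the shorter ones (for example with None) before passing the dict to the constructor.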

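You can set a custom index on the DataFrame, either while creating it or afterwards on an existing DataFrame.
index=['r0','r1','r2','r3']
# Create DataFrame with a custom index (the constructor accepts index; from_dict() does not)
df = pd.DataFrame(courses, index=index)
# Set index on an existing DataFrame by assigning the labels directly
# (set_index() would treat the list as column names)
df.index = index
This yields the same DataFrame with the row labels r0, r1, r2, and r3 instead of 0 to 3.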
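A custom index is mostly useful for label-based lookups. The sketch below reuses the same data as above and shows, as one possible illustration, how rows and single values can be selected with loc; the variable names row and course are only illustrative.

```python
import pandas as pd

courses = {'Courses': ['Spark', 'PySpark', 'Java', 'PHP'],
           'Fee': [20000, 20000, 15000, 10000],
           'Duration': ['35days', '35days', '40days', '30days']}
df = pd.DataFrame(courses, index=['r0', 'r1', 'r2', 'r3'])

# Select a whole row by its label (returns a Series)
row = df.loc['r2']
# Select a single value by row label and column name
course = df.loc['r2', 'Courses']
print(row)
print(course)
```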

2. Create from Dict with Selected Columns
If you want to use only selected keys from the dict as columns, use the columns param and specify the names as a list.
# Create for selected columns
df = pd.DataFrame(courses, columns = ['Courses', 'Fee'])
print(df)
This creates a DataFrame with only the Courses and Fee columns.
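It may also help to know what happens when a name in columns does not match any key in the dict. In the sketch below, Discount is a hypothetical column name used only for illustration; pandas still creates the column, filled with NaN.

```python
import pandas as pd

courses = {'Courses': ['Spark', 'PySpark', 'Java', 'PHP'],
           'Fee': [20000, 20000, 15000, 10000],
           'Duration': ['35days', '35days', '40days', '30days']}

# 'Discount' is a hypothetical name that is not a key in the dict
df = pd.DataFrame(courses, columns=['Courses', 'Fee', 'Discount'])
print(df)

# The unmatched column exists but contains only missing values
all_missing = bool(df['Discount'].isna().all())
print(all_missing)
```

A misspelled column name therefore does not raise an error, so it is worth checking the result for unexpected all-NaN columns.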
3. Create DataFrame From Nested Dict Object
You can also create a DataFrame from a nested dictionary, such as one parsed from JSON. By default, the outer keys become the columns and the inner keys become the index. When the outer keys are meant to be row labels, as in the example below, this is the wrong way around, so use transpose() to convert rows into columns and columns into rows.
# Creating from nested dictionary
courses = {'r0':{'Courses':'Spark','Fee':'20000','Duration':'35days'},
'r1':{'Courses':'PySpark','Fee':'20000','Duration':'35days'},
'r2':{'Courses':'Java','Fee':'15000','Duration':'40days'},
'r3':{'Courses':'PHP','Fee':'10000','Duration':'30days'}}
df=pd.DataFrame(courses).transpose()
print(df)
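As an alternative to transpose(), from_dict() with orient='index' should give the same layout in one step. The sketch below compares both approaches on the nested dict above, and also converts Fee to integers, since the nested dict stores the fees as strings. The names df_t and df_i are only illustrative.

```python
import pandas as pd

courses = {'r0': {'Courses': 'Spark', 'Fee': '20000', 'Duration': '35days'},
           'r1': {'Courses': 'PySpark', 'Fee': '20000', 'Duration': '35days'},
           'r2': {'Courses': 'Java', 'Fee': '15000', 'Duration': '40days'},
           'r3': {'Courses': 'PHP', 'Fee': '10000', 'Duration': '30days'}}

# Approach from the article: build, then swap rows and columns
df_t = pd.DataFrame(courses).transpose()

# Alternative: treat the outer keys as row labels directly
df_i = pd.DataFrame.from_dict(courses, orient='index')

# Fee values are strings in this dict, so convert them before doing arithmetic
df_i['Fee'] = df_i['Fee'].astype(int)
print(df_i)
print(df_i['Fee'].sum())
```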
4. Create using pandas.DataFrame.from_dict()
pandas.DataFrame.from_dict() can be used to create a pandas DataFrame from a dict (dictionary) object. This method takes the parameters data, orient, dtype, and columns and returns a DataFrame. Note that this is a class method, which means you can call it on the DataFrame class without creating an instance.
# Syntax of from_dict()
DataFrame.from_dict(data, orient='columns', dtype=None, columns=None)
Now pass the dict object to the from_dict() method to create the DataFrame. By default it uses orient='columns', which treats each dict key as a column name.
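# Create DataFrame from dict using from_dict()
# (courses is the dict of lists and index is the list of row labels from section 1)
df = pd.DataFrame.from_dict(courses)
# Set index on the existing DataFrame
df.index = index
print(df)
This yields the same DataFrame as in section 1, with the row labels r0 to r3.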
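One detail worth noting is that the columns param of from_dict() is only meant for orient='index'. The sketch below shows, under that assumption, that combining it with the default orient='columns' raises a ValueError, and that selecting columns after creation is one possible workaround.

```python
import pandas as pd

courses = {'Courses': ['Spark', 'PySpark', 'Java', 'PHP'],
           'Fee': [20000, 20000, 15000, 10000],
           'Duration': ['35days', '35days', '40days', '30days']}

try:
    # columns cannot be combined with the default orient='columns'
    pd.DataFrame.from_dict(courses, columns=['Courses', 'Fee'])
    raised = False
except ValueError as err:
    raised = True
    print("from_dict() rejected columns:", err)

# Workaround: create the DataFrame first, then select the columns you need
df = pd.DataFrame.from_dict(courses)[['Courses', 'Fee']]
print(df)
```

If you need to pick columns at creation time from a dict of lists, the DataFrame constructor with the columns param, as shown in section 2, is the simpler choice.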
5. Create DataFrame from Dict by using Values as Rows
If you have a dict in which each value is a list that should become a row in the DataFrame, use orient='index'.
Note that with the 'index' orientation, the dict keys become the row labels, so the column names must be specified manually through the columns param. If you do not specify them, pandas assigns the default names 0, 1, 2, etc.
# Dict object
courses = {'r0':['Spark',20000,'35days'],
'r1':['PySpark',20000,'35days'],
'r2':['Java',15000,'40days'],
'r3':['PHP',10000,'30days'],}
columns=['Courses','Fee','Duration']
# Create from from_dict() using orient=index
df = pd.DataFrame.from_dict(courses, orient='index', columns=columns)
print(df)
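To illustrate the note about default names, the sketch below omits the columns param and then assigns the column names afterwards. The variable default_names is only illustrative.

```python
import pandas as pd

courses = {'r0': ['Spark', 20000, '35days'],
           'r1': ['PySpark', 20000, '35days'],
           'r2': ['Java', 15000, '40days'],
           'r3': ['PHP', 10000, '30days']}

# Without columns, pandas numbers the columns 0, 1, 2
df = pd.DataFrame.from_dict(courses, orient='index')
default_names = list(df.columns)
print(default_names)

# Column names can still be assigned after creation
df.columns = ['Courses', 'Fee', 'Duration']
print(df)
```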
6. Complete Example of pandas create DataFrame from Dict
Below is a complete example of how to create a DataFrame from a dictionary.
import pandas as pd
# Dict object
courses = {'Courses':['Spark','PySpark','Java','PHP'],
'Fee':[20000,20000,15000,10000],
'Duration':['35days','35days','40days','30days']}
# Create DataFrame from dict using the DataFrame constructor
df = pd.DataFrame(courses)
print(df)
# Create for selected columns
df = pd.DataFrame(courses, columns = ['Courses', 'Fee'])
print(df)
# Create from from_dict()
df = pd.DataFrame.from_dict(courses)
print(df)
# Dict object
courses = {'r0':['Spark',20000,'35days'],
'r1':['PySpark',20000,'35days'],
'r2':['Java',15000,'40days'],
'r3':['PHP',10000,'30days'],}
columns=['Courses','Fee','Duration']
# Create from from_dict() using orient=index
df = pd.DataFrame.from_dict(courses, orient='index', columns=columns)
print(df)
# Creating from nested dictionary
courses = {'r0':{'Courses':'Spark','Fee':'20000','Duration':'35days'},
'r1':{'Courses':'PySpark','Fee':'20000','Duration':'35days'},
'r2':{'Courses':'Java','Fee':'15000','Duration':'40days'},
'r3':{'Courses':'PHP','Fee':'10000','Duration':'30days'}}
df=pd.DataFrame(courses).transpose()
print(df)
Conclusion
In this article, you have learned how to create a DataFrame from a dict by using the DataFrame constructor and the from_dict() method. You have also learned how to select columns and set an index while creating a DataFrame, and how to set an index on an existing one.