How do you create a Pandas DataFrame from a list? Most of the time, you create a pandas DataFrame by reading a CSV file or another data source; however, sometimes you need to create one from a single list, multiple lists, or a list of lists. In this article, I will cover each of these approaches with examples.
Table of contents
- Create DataFrame from list
- Create from multiple lists
- Create from list of lists
- Create from Dict of lists
- Complete Example
Key Points –
- Use the DataFrame() constructor from the Pandas library to create a DataFrame from a list.
- Pass the list as an argument to the data parameter within the DataFrame() constructor.
- Ensure the consistency of data dimensions and alignment when constructing DataFrames from lists to avoid errors.
- Ensure that the length and structure of the nested lists align with the desired DataFrame structure.
- Optionally, provide additional parameters such as column names using the columns parameter to customize the DataFrame’s structure.
- The list should typically contain nested lists or other iterable objects representing rows of data.
1. Create Pandas DataFrame from List
One simple way to create a Pandas DataFrame from a list is to use the DataFrame constructor, which takes several optional parameters that specify the characteristics of the DataFrame.
First, let’s create a list with some values and pass it to the DataFrame constructor as the data argument. Because data is the first positional parameter, you don’t have to name it explicitly.
import pandas as pd
technologies = ['Spark','PySpark','Java','PHP']
# Create DataFrame from list
df=pd.DataFrame(technologies)
print(df)
This yields a DataFrame with a single column labeled 0 and rows labeled 0 through 3.
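A flat list is one-dimensional, so each element becomes a row in a single column. The following is a minimal sketch, assuming you want a descriptive label for that column; the name 'Courses' is borrowed from the later examples in this article and is only illustrative.

```python
import pandas as pd

technologies = ['Spark', 'PySpark', 'Java', 'PHP']

# Default: one column labeled 0, one row per list element
df_default = pd.DataFrame(technologies)

# Same data, but with a descriptive column name (illustrative choice)
df_named = pd.DataFrame(technologies, columns=['Courses'])

print(df_default.shape)        # (4, 1)
print(list(df_named.columns))  # ['Courses']
```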
2. Create Pandas DataFrame from Multiple Lists
Now let’s see how to create a Pandas DataFrame from multiple lists. Combining the lists with zip() turns the i-th element of each list into one row. Since we are not giving labels to the columns or rows (index), the DataFrame assigns incremental sequence numbers as labels to both by default.
# Create DataFrame from multiple lists
technologies = ['Spark','PySpark','Java','PHP']
fee = [20000,20000,15000,10000]
duration = ['35days','35days','40days','30days']
df = pd.DataFrame(list(zip(technologies,fee,duration)))
print(df)
This yields a DataFrame whose three columns are labeled 0, 1, and 2 and whose rows are labeled 0 through 3.
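To make the default labeling concrete, here is a small sketch that inspects the row tuples produced by zip() and the labels pandas assigns when none are supplied. The variable name rows is introduced here only for illustration.

```python
import pandas as pd

technologies = ['Spark', 'PySpark', 'Java', 'PHP']
fee = [20000, 20000, 15000, 10000]
duration = ['35days', '35days', '40days', '30days']

# zip() groups the i-th element of every list into one row tuple
rows = list(zip(technologies, fee, duration))
print(rows[0])  # ('Spark', 20000, '35days')

# Without explicit labels, columns and rows are numbered from 0
df = pd.DataFrame(rows)
print(list(df.columns))  # [0, 1, 2]
print(list(df.index))    # [0, 1, 2, 3]
```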
Sequence numbers make poor column names because they don’t tell you what data each column holds, so it is best practice to provide descriptive column names. Use the columns parameter and the index parameter to provide column and row labels, respectively.
Alternatively, you can assign column names after creating the DataFrame and turn an existing column into the index using the pandas.DataFrame.set_index() method.
# Create from multiple lists
columns=['Courses','Fee','Duration']
index=['r0','r1','r2','r3']
df = pd.DataFrame(list(zip(technologies, fee,duration)),
columns=columns,index=index )
print(df)
This yields a DataFrame with the columns Courses, Fee, and Duration and the row labels r0 through r3.
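The alternative mentioned above, labeling the DataFrame after it has been created, might look like the following sketch. It assumes you either want to assign row labels directly through the index attribute or promote the Courses column to the index; df_by_course is a name introduced here for illustration.

```python
import pandas as pd

technologies = ['Spark', 'PySpark', 'Java', 'PHP']
fee = [20000, 20000, 15000, 10000]
duration = ['35days', '35days', '40days', '30days']

df = pd.DataFrame(list(zip(technologies, fee, duration)))

# Assign column names after creation
df.columns = ['Courses', 'Fee', 'Duration']

# Option 1: assign row labels directly
df.index = ['r0', 'r1', 'r2', 'r3']

# Option 2: promote an existing column to the index.
# set_index() returns a new DataFrame and leaves df unchanged.
df_by_course = df.set_index('Courses')

print(list(df_by_course.columns))  # ['Fee', 'Duration']
print(list(df_by_course.index))    # ['Spark', 'PySpark', 'Java', 'PHP']
```

Note that set_index() uses values already stored in a column, so it is not a way to attach arbitrary new labels such as r0 through r3; for that, assign to the index attribute or pass index to the constructor.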
3. Create DataFrame from List of Lists
When your records are spread across multiple lists, with each list representing one row, you can combine them into a two-dimensional list (a list of lists) and create a DataFrame from it, as shown in the example below.
# Create from a list of lists (each inner list is one row)
courses = [['Spark',20000,'35days'],['PySpark',20000,'35days'],
['Java',15000,'40days'],['PHP',10000,'30days']]
df = pd.DataFrame(courses,columns=columns,index=index )
print(df)
This results in the same output as above.
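When rows are assembled by hand or read from text, numeric values can end up as strings such as '20000'. The printed DataFrame looks the same, but the column is not numeric. The sketch below, which uses a shortened two-row version of the data, shows one way to check for this and convert the column with pd.to_numeric().

```python
import pandas as pd

# Fee values are strings here, as they might be after parsing text
rows = [['Spark', '20000', '35days'], ['Java', '15000', '40days']]
df = pd.DataFrame(rows, columns=['Courses', 'Fee', 'Duration'])

# The Fee column is not numeric yet
numeric_before = pd.api.types.is_numeric_dtype(df['Fee'])

# Convert the strings to numbers
df['Fee'] = pd.to_numeric(df['Fee'])
numeric_after = pd.api.types.is_numeric_dtype(df['Fee'])

print(numeric_before, numeric_after)  # False True
print(df['Fee'].sum())                # 35000
```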
4. Create from Dict of Lists
The example below demonstrates how to create a DataFrame from a dictionary whose values are lists. Each key becomes a column name, and each list supplies that column’s values.
# Creating from dict of list
courses = {'Courses':['Spark','PySpark','Java','PHP'],
'Fee':[20000,20000,15000,10000],
'Duration':['35days','35days','40days','30days']}
df = pd.DataFrame(courses,index=index )
print(df)
Yields the same output as above.
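With a dict of lists, the keys supply the column names, so the columns parameter takes on a different role: it selects and orders columns rather than naming them. A short sketch of that behavior, using a two-row version of the data and the illustrative name df_subset:

```python
import pandas as pd

courses = {'Courses': ['Spark', 'PySpark'],
           'Fee': [20000, 20000],
           'Duration': ['35days', '35days']}

# Keys become column names, in insertion order
df = pd.DataFrame(courses)
print(list(df.columns))  # ['Courses', 'Fee', 'Duration']

# With dict input, columns= picks and orders the keys to keep
df_subset = pd.DataFrame(courses, columns=['Fee', 'Courses'])
print(list(df_subset.columns))  # ['Fee', 'Courses']
```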
5. Complete Example of Creating a DataFrame from a List
Below is a complete example of how to create a DataFrame from a list, multiple lists, and a list of lists.
import pandas as pd
# Create DataFrame from list
technologies = ['Spark','PySpark','Java','PHP']
df=pd.DataFrame(technologies)
print(df)
# Create DataFrame from multiple lists
technologies = ['Spark','PySpark','Java','PHP']
fee = [20000,20000,15000,10000]
duration = ['35days','35days','40days','30days']
df = pd.DataFrame(list(zip(technologies, fee,duration)))
print(df)
# Add column names and index labels
columns=['Courses','Fee','Duration']
index=['r0','r1','r2','r3']
df = pd.DataFrame(list(zip(technologies, fee,duration)),
columns=columns,index=index )
print(df)
# Creating from multi list (list of list)
courses = [['Spark',20000,'35days'],['PySpark',20000,'35days'],
['Java',15000,'40days'],['PHP',10000,'30days']]
df = pd.DataFrame(courses,columns=columns,index=index )
print(df)
Frequently Asked Questions on Pandas Create DataFrame From List
The list should typically contain nested lists or other iterable objects representing rows of data. These nested lists align with the rows and columns of the desired DataFrame.
You can provide additional parameters such as column names using the columns parameter to customize the DataFrame’s structure.
It’s essential to ensure the consistency of data dimensions and alignment. The length and structure of nested lists should align with the desired DataFrame structure to avoid errors during creation.
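What happens when lengths do not match depends on how the DataFrame is built. The sketch below uses deliberately mismatched sample data (made up for illustration) to show three cases: zip() silently truncates, a dict of lists raises an error, and ragged nested lists are padded with missing values.

```python
import pandas as pd

names = ['Spark', 'PySpark', 'Java']
fee = [20000, 25000]  # one value short on purpose

# zip() stops at the shortest list, so 'Java' is silently dropped
df_zip = pd.DataFrame(list(zip(names, fee)), columns=['Courses', 'Fee'])
print(len(df_zip))  # 2

# A dict of lists with unequal lengths is rejected with ValueError
dict_error = None
try:
    pd.DataFrame({'Courses': names, 'Fee': fee})
except ValueError as exc:
    dict_error = exc
print(type(dict_error).__name__)  # ValueError

# Ragged nested lists are accepted; short rows get missing values
df_ragged = pd.DataFrame([['Spark', 20000], ['Java']],
                         columns=['Courses', 'Fee'])
print(int(df_ragged['Fee'].isna().sum()))  # 1
```

The silent truncation by zip() is the easiest case to miss, so comparing list lengths before building the DataFrame is a reasonable precaution.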
Besides using the DataFrame() constructor, you can also use functions like from_records() or from_dict() to create DataFrames from lists or dictionaries, respectively.
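As a rough illustration of these two alternatives, the sketch below builds small DataFrames with DataFrame.from_records() and DataFrame.from_dict(). The sample data and variable names are made up for this example.

```python
import pandas as pd

# from_records(): a sequence of row tuples (or lists) plus column names
rows = [('Spark', 20000, '35days'), ('Java', 15000, '40days')]
df_records = pd.DataFrame.from_records(
    rows, columns=['Courses', 'Fee', 'Duration'])

# from_dict(): by default each key is a column name
data = {'Courses': ['Spark', 'Java'], 'Fee': [20000, 15000]}
df_dict = pd.DataFrame.from_dict(data)

# orient='index' treats each key as a row label instead
by_row = {'r0': ['Spark', 20000], 'r1': ['Java', 15000]}
df_rows = pd.DataFrame.from_dict(
    by_row, orient='index', columns=['Courses', 'Fee'])

print(list(df_records.columns))  # ['Courses', 'Fee', 'Duration']
print(list(df_rows.index))       # ['r0', 'r1']
```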
Conclusion
In this article, you have learned how to create a Pandas DataFrame from a list, multiple lists, a list of lists, and a dict of lists by using the DataFrame constructor. You have also learned how to add column names and index labels while creating a DataFrame.