In this article, I will explain how to create an empty DataFrame in pandas, with or without column names and indices. Below is one of the many scenarios where you would need to create an empty DataFrame.
While working with files, we sometimes do not receive a file for processing, yet we still need to create a DataFrame manually with the column names we expect. If we don't create it with the same column names, our operations and transformations on the DataFrame (such as unions) fail because they refer to columns that may not be present.
To handle situations like these, we always need to create a DataFrame with the same schema, meaning the same column names and data types, regardless of whether the file exists or is empty.
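The scenario above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name load_or_empty, the EXPECTED_DTYPES schema, and the file name are all hypothetical, and the sketch assumes the input is a CSV file.

```python
import os
import pandas as pd

# Hypothetical schema we expect from the incoming file (an assumption for this sketch)
EXPECTED_DTYPES = {
    "Courses": "object",
    "Fee": "int64",
    "Duration": "object",
    "Discount": "float64",
}

def load_or_empty(path):
    """Read the CSV if it exists; otherwise return an empty DataFrame with the expected schema."""
    if os.path.exists(path):
        return pd.read_csv(path, dtype=EXPECTED_DTYPES)
    # Build one empty, typed Series per expected column
    return pd.DataFrame(
        {name: pd.Series(dtype=dtype) for name, dtype in EXPECTED_DTYPES.items()}
    )

# The file name below is made up and is assumed not to exist
df = load_or_empty("no_such_input_file_for_demo.csv")
print(df.columns.tolist())
print(df.dtypes)
print(df.empty)
```

Because the fallback DataFrame has the same column names and data types as a real file would produce, later steps that select or combine columns do not need a special case for the missing file.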
Note: A DataFrame whose rows contain only NaN values is not considered empty. pandas treats a DataFrame as empty when either axis has length 0, so a shape of (0, n) is empty, and so is a shape of (n, 0), which has index labels but no columns and therefore no values.
Key Points –
- An empty DataFrame can be created using pd.DataFrame() without passing any data.
- Columns can be added to an empty DataFrame by assigning new column names or using assign().
- You can define columns during the creation of the empty DataFrame using the columns parameter.
- You can reindex an empty DataFrame using .reindex() to add new rows or columns.
- Use the dtype parameter to set a single data type for all columns when creating an empty DataFrame.
- Creating an empty DataFrame and adding data later is slower than initializing it with the data; reserve it for cases where rows really do arrive dynamically.
1. Quick Examples of Creating Empty DataFrame in pandas
If you are in a hurry, below are some quick examples of how to create an empty DataFrame in pandas.
# Quick examples of creating an empty DataFrame
import pandas as pd
# Create an empty DataFrame using the constructor
df = pd.DataFrame()
# Create an empty DataFrame with column names
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
# Create a DataFrame with index and columns
# Note this is not considered an empty DataFrame (it has one all-NaN row)
df_nan = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"],index=['index1'])
# Add a row to the empty DataFrame
# (DataFrame.append() was removed in pandas 2.0; use pd.concat() instead)
new_row = pd.DataFrame([{"Courses":"Spark","Fee":20000,"Duration":'30days',"Discount":1000}])
df2 = pd.concat([df, new_row], ignore_index = True)
# Check if the DataFrame is empty
print("Empty DataFrame : "+ str(df.empty))
Read on for a detailed explanation of each example.
2. Create Empty DataFrame Using Constructor
One simple way to create an empty pandas DataFrame is by using its constructor. The below example creates a DataFrame with zero rows and columns (empty).
# Create empty DataFrame using constructor
df = pd.DataFrame()
print(df)
print("Empty DataFrame : "+str(df.empty))
Yields below output. Notice that the columns and Index have no values.
# Output:
Empty DataFrame
Columns: []
Index: []
Empty DataFrame : True
3. Creating Empty DataFrame with Column Names
The column labels can also be added while creating an empty DataFrame. In this case, the DataFrame contains only columns and no rows or index labels. To do this, use the DataFrame constructor with the columns param, which accepts a list of column labels.
# Creating Empty DataFrame with Column Names
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
print(df)
print("Empty DataFrame : "+str(df.empty))
Yields below output.
# Output:
Empty DataFrame
Columns: [Courses, Fee, Duration, Discount]
Index: []
Empty DataFrame : True
All columns in the above DataFrame have type object; you can change this by assigning a custom data type to each column.
# Create empty DataFrame with specific column types
df = pd.DataFrame({'Courses': pd.Series(dtype='str'),
'Fee': pd.Series(dtype='int'),
'Duration': pd.Series(dtype='str'),
'Discount': pd.Series(dtype='float')})
print(df.dtypes)
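Yields below output. The integer type shown depends on the platform; on many systems int resolves to int64 rather than int32.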
# Output:
Courses object
Fee int32
Duration object
Discount float64
dtype: object
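An alternative way to reach the same result is to create the DataFrame with column names first and then cast the columns with astype(). This is a sketch of one option rather than the only approach; the explicit int64 and float64 type names are chosen here to avoid platform-dependent integer widths.

```python
import pandas as pd

# Create the empty DataFrame with column names only (all columns start as object)
df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])

# Cast each column to the desired type with a dictionary of column -> dtype
df = df.astype({
    "Courses": "object",
    "Fee": "int64",
    "Duration": "object",
    "Discount": "float64",
})
print(df.dtypes)
print(df.empty)
```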
4. Add Columns and Index While Creating DataFrame
Let's see how to create a DataFrame with columns and rows that hold NaN values. Note that this is not considered an empty DataFrame because it has rows with NaN; you can check this with the df.empty attribute, which returns False. Use DataFrame.dropna() to drop the rows that contain NaN values. To add an index (rows), use the index param along with the columns param for column labels.
# Add columns and index while creating empty DataFrame
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"],index=['index1'])
print(df)
print("Empty DataFrame : "+str(df.empty))
Yields below output. Note that this is not an empty DataFrame because it has a row with NaN values.
# Output:
Courses Fee Duration Discount
index1 NaN NaN NaN NaN
Empty DataFrame : False
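To see the effect of DataFrame.dropna() on such a DataFrame, here is a short sketch. The variable name cleaned is only illustrative, and how="all" is used so that only rows made up entirely of NaN are dropped.

```python
import pandas as pd

# One row labelled 'index1', filled with NaN
df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"], index=["index1"])

# Drop rows in which every value is NaN
cleaned = df.dropna(how="all")

print(df.empty)       # False: one row exists, even though it holds only NaN
print(cleaned.empty)  # True: the all-NaN row has been removed
print(cleaned.shape)  # (0, 4)
```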
5. Check if DataFrame is Empty
The DataFrame.empty property is used to check whether a DataFrame is empty. It returns True when the DataFrame is empty and False otherwise. A DataFrame is considered non-empty if it contains at least one row and at least one column. A DataFrame whose rows hold only NaN values is still non-empty.
# Check if DataFrame is Empty
if df.empty:
    print("Empty DataFrame")
else:
    print("Non Empty DataFrame")
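The pandas documentation describes .empty as True when any axis has length 0. The sketch below checks three shapes to make that concrete; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

no_rows = pd.DataFrame(columns=["Courses", "Fee"])                # shape (0, 2)
no_cols = pd.DataFrame(index=["r1", "r2"])                        # shape (2, 0)
nan_rows = pd.DataFrame({"Courses": [np.nan], "Fee": [np.nan]})   # shape (1, 2)

for name, frame in [("no_rows", no_rows), ("no_cols", no_cols), ("nan_rows", nan_rows)]:
    print(name, frame.shape, frame.empty)

# len() counts rows only, so it differs from .empty when there are no columns
print(len(no_cols))  # 2
```

If you only care about the number of rows, len(df) == 0 is the more direct check; .empty also reports True for a DataFrame that has index labels but no columns.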
6. Create Empty DataFrame From Another DataFrame
You can also create a zero-record DataFrame from another existing DataFrame. This gives you a blank DataFrame with the same columns as the existing one but without any rows.
# Create empty DataFrame from another DataFrame
columns_list = df.columns
df2 = pd.DataFrame(columns = columns_list)
print(df2)
Yields below output.
# Output:
Empty DataFrame
Columns: [Courses, Fee, Duration, Discount]
Index: []
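Copying only the column labels resets every column to the object type. If the data types of the source DataFrame also matter, slicing zero rows is one way to keep them. This is a sketch with made-up sample data; the names source, by_columns, and by_slice are illustrative.

```python
import pandas as pd

source = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Discount": [1000.0, 2300.0],
})

# Copies the column labels only; every column becomes object
by_columns = pd.DataFrame(columns=source.columns)

# Takes zero rows from the source, so column names and dtypes are preserved
by_slice = source.iloc[0:0].copy()

print(by_columns.dtypes)
print(by_slice.dtypes)
print(by_slice.empty)
```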
7. Add Rows to Empty DataFrame
Older pandas versions used the DataFrame.append() method to add rows to an empty DataFrame, but it was deprecated in pandas 1.4 and removed in 2.0. Use pd.concat() instead, and only when you want to add a few rows, because each call copies the data. To add hundreds or thousands of rows to a DataFrame, use a constructor with data in a list collection.
# Add rows to empty DataFrame
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
new_row = pd.DataFrame([{"Courses":"Spark","Fee":20000,"Duration":'30days',"Discount":1000}])
df2 = pd.concat([df, new_row], ignore_index = True)
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount
0 Spark 20000 30days 1000
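Another option for adding a handful of rows in place is label-based assignment with .loc, which creates the row if the label does not exist yet. This is a sketch of that idiom, not a recommendation for large volumes; the second row's values are sample data.

```python
import pandas as pd

df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])

# len(df) is the next unused integer label, so each assignment adds one row
df.loc[len(df)] = ["Spark", 20000, "30days", 1000]
df.loc[len(df)] = ["PySpark", 25000, "40days", 2300]

print(df)
print(df.empty)
```

As with pd.concat(), this grows the DataFrame one row at a time, so it suits small, occasional additions only.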
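To add more rows, collect them in a list and use a constructor.
# Collect rows into a list.
import pandas as pd
def get_data():
    # Placeholder for a real data source such as a database query
    return [("Spark", 20000, "30days", 1000), ("PySpark", 25000, "40days", 2300)]
data = []
db_data = get_data()
for courses, fee, duration, discount in db_data:
    data.append([courses, fee, duration, discount])
# Fill DataFrame with rows.
df = pd.DataFrame(data, columns=["Courses", "Fee", "Duration","Discount"])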
8. Add Rows From Another DataFrame
If you have an empty DataFrame and want to fill it with data from one or more DataFrames, you can do this as below.
# Create a new empty DataFrame
df = pd.DataFrame()
# df2 and df3 are existing DataFrames
# (DataFrame.append() was removed in pandas 2.0; use pd.concat() instead)
df = pd.concat([df, df2, df3], ignore_index = True)
Complete Example of Creating an Empty DataFrame in Pandas
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# Create empty DataFrame using constructor
df2 = pd.DataFrame()
print(df2)
# Add column names/labels to empty DataFrame
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
print(df)
# Add columns and index while creating DataFrame
# (not empty: it has one all-NaN row)
index_labels=['index1']
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"],index=index_labels)
print(df)
# Create empty DataFrame from another DataFrame
columns_list = df.columns
df2 = pd.DataFrame(columns = columns_list)
print(df2)
# Add rows to empty DataFrame
# (DataFrame.append() was removed in pandas 2.0; use pd.concat() instead)
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
new_row = pd.DataFrame([{"Courses":"Spark","Fee":20000,"Duration":'30days',"Discount":1000}])
df2 = pd.concat([df, new_row], ignore_index = True)
print(df2)
FAQ on Pandas Create Empty DataFrame
To create a completely empty DataFrame in pandas, call the pd.DataFrame() constructor without passing any arguments. This creates a DataFrame with no rows or columns.
To create an empty DataFrame with specific column names, pass a list of column names to the columns parameter of the pd.DataFrame() constructor.
You can create a DataFrame with a predefined index by passing the index parameter. This sets the row labels up front; if you also pass columns, the rows are filled with NaN and the DataFrame is not considered empty.
It is possible to create an empty DataFrame from a dictionary. Initialize it with a dictionary whose keys are the column names and whose values are empty lists or arrays.
You can check whether a DataFrame is empty using the .empty attribute. It returns True if the DataFrame has no elements, that is, if it has no rows or no columns, and False otherwise.
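The dictionary approach mentioned above can be sketched in a couple of lines. The column names reuse the ones from this article; the resulting column types depend on the pandas version, so they are printed rather than assumed.

```python
import pandas as pd

# Keys become column names; empty lists give zero rows
df = pd.DataFrame({"Courses": [], "Fee": [], "Duration": [], "Discount": []})

print(df.shape)
print(df.empty)
print(df.dtypes)
```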
Conclusion
In this article, I have explained how to create a DataFrame with zero rows, with or without columns, add rows to the DataFrame, and many more with examples.
Happy Learning !!