This article explains how to create an empty DataFrame in pandas, with or without column names and an index. Below is one of the many scenarios where you would need to create an empty DataFrame.
While working with files, sometimes we may not receive a file for processing, yet we still need to create a DataFrame manually with the column names we expect. If we don’t create it with the same column names, our operations/transformations (like unions) on the DataFrame fail because they refer to columns that are not present.
To handle situations like these, we always need to create a DataFrame with the same schema, meaning the same column names and data types, regardless of whether the file exists or is empty.
Note: A DataFrame whose rows contain only NaN values is not considered empty. The DataFrame.empty property returns True when at least one axis has length 0, so a DataFrame of shape (0, n) is empty. A DataFrame of shape (n, 0), which has an index but no columns, is also reported as empty because it holds no values.
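The scenario above can be sketched as a small loader that falls back to an empty DataFrame when the input file is missing or zero bytes long. The function name `load_or_empty`, the `EXPECTED_COLUMNS` list, and the file name are assumptions made for illustration, not part of any pandas API.

```python
import os
import pandas as pd

# Assumed schema for illustration; adjust to the columns your pipeline expects.
EXPECTED_COLUMNS = ["Courses", "Fee", "Duration", "Discount"]

def load_or_empty(path):
    """Read a CSV if it exists and has content; otherwise return an
    empty DataFrame that still carries the expected column names."""
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return pd.read_csv(path)
    return pd.DataFrame(columns=EXPECTED_COLUMNS)

# Hypothetical path that is not expected to exist.
df = load_or_empty("no_such_file_for_demo.csv")
print(list(df.columns))
print(df.empty)
```

Downstream code can then refer to the same column names whether or not a file arrived.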
1. Quick Examples of Creating Empty DataFrame in pandas
If you are in a hurry, below are some quick examples of how to create an empty DataFrame in pandas.
# Below are quick examples
import pandas as pd

# Create empty DataFrame using constructor
df = pd.DataFrame()
# Create empty DataFrame with column names
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
# Create DataFrame with index and columns
# Note this is not considered an empty DataFrame
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"],index=['index1'])
# Add rows to the DataFrame
# (DataFrame.append() was removed in pandas 2.0; use pd.concat() instead)
new_row = pd.DataFrame([{"Courses":"Spark","Fee":20000,"Duration":'30days',"Discount":1000}])
df2 = pd.concat([df, new_row], ignore_index = True)
# Check if DataFrame is empty
print("Empty DataFrame : " + str(df.empty))
To understand these in detail, continue reading the article.
2. Create Empty DataFrame Using Constructor
One simple way to create an empty pandas DataFrame is by using its constructor. The below example creates a DataFrame with zero rows and columns (empty).
import pandas as pd

# Create empty DataFrame using constructor
df = pd.DataFrame()
print(df)
print("Empty DataFrame : "+str(df.empty))
Yields below output. Notice that the columns and Index have no values.
# Output:
Empty DataFrame
Columns: []
Index: []
Empty DataFrame : True
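As a quick check, the following sketch inspects the shape and both axes of a DataFrame built with the bare constructor. It only uses standard DataFrame attributes.

```python
import pandas as pd

df = pd.DataFrame()

print(df.shape)         # (0, 0): zero rows and zero columns
print(len(df.columns))  # 0
print(len(df.index))    # 0
print(df.empty)         # True
```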
3. Creating Empty DataFrame with Column Names
Column labels can also be added while creating an empty DataFrame. In this case, the DataFrame contains only columns but no rows/index. To do this, use the DataFrame constructor with the columns param, which accepts a list of column labels.
# Creating Empty DataFrame with Column Names
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
print(df)
print("Empty DataFrame : "+str(df.empty))
Yields below output.
# Output:
Empty DataFrame
Columns: [Courses, Fee, Duration, Discount]
Index: []
Empty DataFrame : True
All columns in the above DataFrame have type object; you can change this by assigning a custom data type to each column.
# Create empty DataFrame with specific column types
df = pd.DataFrame({'Courses': pd.Series(dtype='str'),
'Fee': pd.Series(dtype='int'),
'Duration': pd.Series(dtype='str'),
'Discount': pd.Series(dtype='float')})
print(df.dtypes)
Yields below output. Note that 'int' maps to int32 or int64 depending on your platform and NumPy version.
# Output:
Courses object
Fee int32
Duration object
Discount float64
dtype: object
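Another way to get typed columns, shown here as a sketch rather than the only approach, is to create the empty DataFrame from column names and then call `astype()` with a column-to-dtype mapping. The `schema` dictionary is an illustrative name; spelling out `int64` avoids the platform-dependent width of plain `int`.

```python
import pandas as pd

# Column name -> dtype mapping (illustrative schema).
schema = {
    "Courses": "object",
    "Fee": "int64",
    "Duration": "object",
    "Discount": "float64",
}

# Build the empty frame from the column names, then cast each column.
df = pd.DataFrame(columns=list(schema)).astype(schema)
print(df.dtypes)
print(df.empty)
```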
4. Add Columns and Index While Creating DataFrame
Let’s see how to create a DataFrame with columns and a row of NaN values. Note that this is not considered an empty DataFrame because it has a row; you can check this with the df.empty attribute, which returns False. Use DataFrame.dropna() to drop rows that contain NaN values. To add an index/row, use the index param along with the columns param for column labels.
# Add columns and index while creating empty DataFrame
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"],index=['index1'])
print(df)
print("Empty DataFrame : "+str(df.empty))
Yields below output. Note that this is not an empty DataFrame, as it has a row with NaN values.
# Output:
Courses Fee Duration Discount
index1 NaN NaN NaN NaN
Empty DataFrame : False
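To illustrate the dropna() remark, here is a minimal sketch that removes the all-NaN row and checks the result. Passing `how="all"` drops only rows in which every value is NaN.

```python
import pandas as pd

df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"], index=["index1"])
print(df.empty)  # False: there is one row, even though it only holds NaN

# Drop rows where every value is NaN; the column labels are kept.
cleaned = df.dropna(how="all")
print(cleaned.empty)  # True
print(list(cleaned.columns))
```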
5. Check if DataFrame is Empty
The DataFrame.empty property is used to check whether a DataFrame is empty. It returns True when the DataFrame is empty and False otherwise. A DataFrame is considered non-empty if it contains at least one row and at least one column. A DataFrame whose rows are all NaN is still considered non-empty.
# Check if DataFrame is empty
if df.empty:
    print("Empty DataFrame")
else:
    print("Non Empty DataFrame")
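The sketch below compares three shapes to show what the empty property reports in each case. pandas returns True whenever either axis has length zero, so a frame with an index but no columns is also empty; if you specifically need to know whether there are rows, checking `len(df.index)` is one option. The variable names are illustrative.

```python
import pandas as pd

no_rows = pd.DataFrame(columns=["Courses", "Fee"])                 # shape (0, 2)
no_cols = pd.DataFrame(index=["r1", "r2"])                         # shape (2, 0)
nan_rows = pd.DataFrame(columns=["Courses", "Fee"], index=["r1"])  # shape (1, 2), all NaN

for name, frame in [("no_rows", no_rows), ("no_cols", no_cols), ("nan_rows", nan_rows)]:
    # Print the shape, the empty flag, and the row count side by side.
    print(name, frame.shape, frame.empty, len(frame.index))
```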
6. Create Empty DataFrame From Another DataFrame
You can also create a zero-record DataFrame from another existing DataFrame. This gives you a blank DataFrame with the same columns as the existing one but without any rows.
# Create empty DataFrame from another DataFrame
columns_list = df.columns
df2 = pd.DataFrame(columns = columns_list)
print(df2)
Yields below output.
# Output:
Empty DataFrame
Columns: [Courses, Fee, Duration, Discount]
Index: []
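Passing only the column labels gives every column the object type. If the data types of the existing DataFrame should carry over as well, one alternative (not the approach shown above) is to take zero rows with `head(0)`. The sample `source` frame is made up for illustration.

```python
import pandas as pd

# Small sample frame, made up for this example.
source = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000], "Discount": [1000.0]})

# head(0) keeps the column labels and dtypes but selects no rows.
empty_like = source.head(0).copy()
print(empty_like.dtypes)
print(empty_like.empty)
```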
7. Add Rows to Empty DataFrame
Older pandas versions provided the DataFrame.append() method to add rows, but it was deprecated in pandas 1.4 and removed in pandas 2.0; use pandas.concat() instead. Use concat() only if you want to add a few rows, because each call copies the data into a new DataFrame, which makes repeated calls slow. To add hundreds or thousands of rows to a DataFrame, use a constructor with the data in a list collection.
# Add rows to empty DataFrame
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
new_row = pd.DataFrame([{"Courses":"Spark","Fee":20000,"Duration":'30days',"Discount":1000}])
df2 = pd.concat([df, new_row], ignore_index = True)
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount
0 Spark 20000 30days 1000
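For adding a single row in place, another option is label-based assignment with `df.loc`, sketched below. This is an alternative technique rather than the method used above, and like any row-by-row approach it is only suitable for a handful of rows.

```python
import pandas as pd

df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])

# Assigning to a label that does not exist yet enlarges the DataFrame.
df.loc[len(df)] = ["Spark", 20000, "30days", 1000]
df.loc[len(df)] = ["PySpark", 25000, "40days", 2300]
print(df)
```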
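To add many rows, collect them in a list first and then use a constructor.
# Collect rows into a list.
# get_data() is a placeholder for your own data source;
# it should return an iterable of 4-item records.
data = []
db_data = get_data()
for Courses, Fee, Duration, Discount in db_data:
    data.append([Courses, Fee, Duration, Discount])
# Fill DataFrame with rows.
df = pd.DataFrame(data, columns=["Courses", "Fee", "Duration","Discount"])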
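Here is a self-contained version of the same idea. The `get_data()` function below is a hypothetical stand-in for a database query; it simply yields two hard-coded records so the example can run.

```python
import pandas as pd

def get_data():
    """Hypothetical stand-in for a database query; yields one tuple per row."""
    yield ("Spark", 20000, "30days", 1000)
    yield ("PySpark", 25000, "40days", 2300)

# Collect all rows first, then build the DataFrame in a single call.
rows = [list(record) for record in get_data()]
df = pd.DataFrame(rows, columns=["Courses", "Fee", "Duration", "Discount"])
print(df.shape)  # (2, 4)
```

Building the DataFrame once from a list avoids copying the data on every added row.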
8. Add Rows From Another DataFrame
If you have an empty DataFrame and want to fill it with data from one or more other DataFrames (df2 and df3 here), you can do this as below.
# Create a new empty DataFrame
df = pd.DataFrame()
# DataFrame.append() was removed in pandas 2.0; pd.concat() combines the frames
df = pd.concat([df, df2, df3], ignore_index = True)
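When the list of input DataFrames may itself be empty, `pd.concat()` raises a ValueError. The sketch below wraps the call in a helper that falls back to an empty DataFrame with the expected columns. The names `combine` and `COLUMNS` are assumptions made for illustration.

```python
import pandas as pd

# Assumed schema for illustration.
COLUMNS = ["Courses", "Fee", "Duration", "Discount"]

def combine(frames, columns=COLUMNS):
    """Concatenate the non-empty frames; return an empty DataFrame
    with the expected columns if there is nothing to combine."""
    frames = [f for f in frames if not f.empty]
    if not frames:
        return pd.DataFrame(columns=columns)
    return pd.concat(frames, ignore_index=True)

a = pd.DataFrame([["Spark", 20000, "30days", 1000]], columns=COLUMNS)
b = pd.DataFrame([["PySpark", 25000, "40days", 2300]], columns=COLUMNS)

print(combine([a, b]))
print(combine([]).empty)
```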
9. Complete Example of Create Empty DataFrame in pandas
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# Create empty DataFrame using constructor
df2 = pd.DataFrame()
print(df2)
# Add column names/labels to empty DataFrame
df2 = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
print(df2)
# Add columns and index while creating DataFrame
index_labels=['index1']
df2 = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"],index=index_labels)
print(df2)
# Create empty DataFrame from another DataFrame
columns_list = df.columns
df2 = pd.DataFrame(columns = columns_list)
print(df2)
# Add rows to empty DataFrame (DataFrame.append() was removed in pandas 2.0)
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
new_row = pd.DataFrame([{"Courses":"Spark","Fee":20000,"Duration":'30days',"Discount":1000}])
df2 = pd.concat([df, new_row], ignore_index = True)
print(df2)
Conclusion
In this article, you have learned how to create a DataFrame with zero rows, with or without columns, how to check whether a DataFrame is empty, and how to add rows to it, with examples.
Happy Learning !!