How to Append Two Pandas DataFrames

Post last modified: April 29, 2024

There are multiple ways to append two pandas DataFrames. In this article, I will explain how to append two or more pandas DataFrames by using several functions, with examples.


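To append one DataFrame to another, you can use the DataFrame.append() method. To append two or more DataFrames, pass them all to this method as a list. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the append() examples in this article require an older pandas version; on current versions, use pd.concat() instead.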

Key Points –

  • Use the append() function to concatenate two DataFrames vertically, adding rows from one DataFrame to the end of the other.
  • Specify the ignore_index=True parameter to reset the index of the resulting DataFrame after appending, ensuring a continuous index.
  • Consider pd.concat() for more complex concatenation operations, such as combining many DataFrames at once or joining along columns; it is also required on pandas versions where append() is no longer available.
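The key points above can be sketched with pd.concat(), which performs the same row-wise append on any pandas version. This is a minimal illustration, not the only way to do it; the variable names top, bottom, stacked, and renumbered are made up for this sketch.

```python
import pandas as pd

# Two small frames using course data from this article
top = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
bottom = pd.DataFrame({"Courses": ["Python", "pandas"], "Fee": [22000, 24000]})

# pd.concat() stacks the frames row-wise (axis=0 is the default),
# adding the rows of bottom to the end of top
stacked = pd.concat([top, bottom])

# ignore_index=True renumbers the rows instead of keeping each frame's labels
renumbered = pd.concat([top, bottom], ignore_index=True)

print(stacked)
print(renumbered)
```

Without ignore_index, each frame keeps its original row labels, so the labels 0 and 1 appear twice in stacked.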

Quick Examples of Append Two DataFrames

If you are in a hurry, below are some quick examples of how to append two DataFrames in pandas.


# Below are some quick examples
# (DataFrame.append() requires pandas < 2.0)

# Append two DataFrames with the same columns
df3 = df1.append(df2)

# Append two DataFrames with different columns
df3 = df1.append(df2)

# Using append() with ignore_index
df3 = df1.append(df2, ignore_index=True)

# Append three DataFrames
df4 = df1.append([df2, df3], ignore_index=True)

Create Two DataFrames to Append

To run some examples of appending two pandas DataFrames, let’s create two pandas DataFrames from dictionaries.


# Create two DataFrames that share the Courses and Fee columns
# (df2 also has an extra Duration column)
import pandas as pd

df1 = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
                    'Fee' : [20000,25000,22000,24000]})
print("First DataFrame:\n", df1)

df2 = pd.DataFrame({'Courses': ["Pandas","Hadoop","Hyperion","Java"],
                    'Fee': [25000,25200,24500,24900],
                    'Duration': ['30days','35days','40days','45days']})
print("Second DataFrame:\n", df2)

This yields the output below.

[Figure: pandas append two DataFrames]

To append two DataFrames that share columns in pandas, you can use the append() function. It adds the rows of the second DataFrame to the end of the first. Any column that exists in only one of the DataFrames is kept, and the rows coming from the other DataFrame get NaN in that column.


# Append two DataFrames of same columns
# using append() function
df3 = df1.append(df2)
print("After appending DataFrames:\n", df3)

This yields the output below.

[Figure: pandas append two DataFrames]
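On pandas 2.0 and later, the same result can be obtained with pd.concat(). The sketch below reuses the two DataFrames created earlier and checks which rows received NaN in the Duration column; the variable name missing is introduced here only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "pandas"],
                    'Fee': [20000, 25000, 22000, 24000]})
df2 = pd.DataFrame({'Courses': ["Pandas", "Hadoop", "Hyperion", "Java"],
                    'Fee': [25000, 25200, 24500, 24900],
                    'Duration': ['30days', '35days', '40days', '45days']})

# Row-wise append: the rows of df2 are added after the rows of df1
df3 = pd.concat([df1, df2])

# Rows that came from df1 have no Duration value, so pandas fills them with NaN
missing = df3['Duration'].isna().tolist()

print(df3)
print(missing)
```

The first four entries of missing are True (rows from df1) and the last four are False (rows from df2).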

Append Two DataFrames With Different Columns

You can also use the append() function to append two DataFrames that have no columns in common. The result contains the columns of both DataFrames, and each row holds NaN in the columns that its source DataFrame did not have. Note that append() only adds rows; it does not combine DataFrames side by side.


# Create DataFrames with different columns
import pandas as pd
df1 = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
                    'Fee' : [20000,25000,22000,24000]})
print("First DataFrame:\n", df1)

df2 = pd.DataFrame({'discount': [2500,2520,2450,2490],
                    'Duration': ['30days','35days','40days','45days']})
print("Second DataFrame:\n", df2)

# Append two DataFrames of different columns
# using append() function
df3 = df1.append(df2)
print("After appending DataFrames:\n", df3)

This yields the output below.


# Output:
First DataFrame:
    Courses    Fee
0    Spark  20000
1  PySpark  25000
2   Python  22000
3   pandas  24000
Second DataFrame:
    discount Duration
0      2500   30days
1      2520   35days
2      2450   40days
3      2490   45days

After appending DataFrames:
    Courses      Fee  discount Duration
0    Spark  20000.0       NaN      NaN
1  PySpark  25000.0       NaN      NaN
2   Python  22000.0       NaN      NaN
3   pandas  24000.0       NaN      NaN
0      NaN      NaN    2500.0   30days
1      NaN      NaN    2520.0   35days
2      NaN      NaN    2450.0   40days
3      NaN      NaN    2490.0   45days
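The output above shows Fee and discount as floating-point values (20000.0, 2500.0) even though the inputs were integers. This happens because NaN cannot be stored in a plain integer column. The sketch below reproduces this with pd.concat() and, as one possible option, converts the column to the nullable Int64 dtype; the name restored is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "pandas"],
                    'Fee': [20000, 25000, 22000, 24000]})
df2 = pd.DataFrame({'discount': [2500, 2520, 2450, 2490],
                    'Duration': ['30days', '35days', '40days', '45days']})

df3 = pd.concat([df1, df2])

# Fee is an integer column in df1, but becomes float once NaN is introduced
print(df1['Fee'].dtype)
print(df3['Fee'].dtype)

# Optional: the nullable Int64 dtype keeps whole numbers and marks gaps as <NA>
restored = df3['Fee'].astype('Int64')
print(restored)
```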

Append Two DataFrames Ignore Index

To append two Pandas DataFrames while ignoring the index, you can use the ignore_index=True parameter with the append() function.


# Using append() with ignore_index
df3 = df1.append(df2, ignore_index=True)
print(df3)

This yields the output below.


# Output:
   Courses      Fee  discount Duration
0    Spark  20000.0       NaN      NaN
1  PySpark  25000.0       NaN      NaN
2   Python  22000.0       NaN      NaN
3   pandas  24000.0       NaN      NaN
4      NaN      NaN    2500.0   30days
5      NaN      NaN    2520.0   35days
6      NaN      NaN    2450.0   40days
7      NaN      NaN    2490.0   45days
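To see why a continuous index matters, the following sketch compares the two variants using pd.concat(), which accepts the same ignore_index parameter. The names kept and fresh are illustrative assumptions, not from the original examples.

```python
import pandas as pd

df1 = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "pandas"],
                    'Fee': [20000, 25000, 22000, 24000]})
df2 = pd.DataFrame({'discount': [2500, 2520, 2450, 2490],
                    'Duration': ['30days', '35days', '40days', '45days']})

# Original labels are kept, so 0..3 appear twice
kept = pd.concat([df1, df2])

# Labels are replaced with a new continuous index 0..7
fresh = pd.concat([df1, df2], ignore_index=True)

print(kept.index.is_unique)
print(fresh.index.is_unique)

# With duplicate labels, selecting label 0 returns two rows instead of one
print(kept.loc[0])
```

With duplicate labels, label-based lookups such as kept.loc[0] return several rows, which is often not what later code expects.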

Append Three DataFrames

Similarly, if you have three DataFrames, call append() on the first one and pass the other two as a list. You can use the ignore_index=True parameter so that the index of the resulting DataFrame starts from zero and runs continuously.


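# Recreate the first two DataFrames (the previous section redefined df2)
df1 = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
                    'Fee' : [20000,25000,22000,24000]})
df2 = pd.DataFrame({'Courses': ["Pandas","Hadoop","Hyperion","Java"],
                    'Fee': [25000,25200,24500,24900],
                    'Duration': ['30days','35days','40days','45days']})

# Create third DataFrame
df3 = pd.DataFrame({'Courses':['PHP','GO'],
                    'Duration':['30day','40days'],
                    'Fee':[10000,23000]})

# Appending multiple DataFrames
df4 = df1.append([df2, df3], ignore_index=True)
print(df4)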

This yields the output below.


# Output:
    Courses    Fee Duration
0     Spark  20000      NaN
1   PySpark  25000      NaN
2    Python  22000      NaN
3    pandas  24000      NaN
4    Pandas  25000   30days
5    Hadoop  25200   35days
6  Hyperion  24500   40days
7      Java  24900   45days
8       PHP  10000    30day
9        GO  23000   40days
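When there are several DataFrames to combine, a common pattern is to collect them in a list and concatenate once, rather than appending one at a time. The sketch below shows this with pd.concat(); the names first, second, third, frames, and combined are chosen for illustration only.

```python
import pandas as pd

first = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "pandas"],
                      'Fee': [20000, 25000, 22000, 24000]})
second = pd.DataFrame({'Courses': ["Pandas", "Hadoop", "Hyperion", "Java"],
                       'Fee': [25000, 25200, 24500, 24900],
                       'Duration': ['30days', '35days', '40days', '45days']})
third = pd.DataFrame({'Courses': ['PHP', 'GO'],
                      'Duration': ['30day', '40days'],
                      'Fee': [10000, 23000]})

# Collect every frame first, then concatenate in a single call
frames = [first, second, third]
combined = pd.concat(frames, ignore_index=True)

print(combined)
```

Columns are matched by name, so the different column order in the third DataFrame does not matter.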

Complete Example of Append Two DataFrames


import pandas as pd
# Complete Example of Append Two DataFrames
df = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
                    'Fee' : [20000,25000,22000,24000]})

df1 = pd.DataFrame({'Courses': ["Pandas","Hadoop","Hyperion","Java"],
                    'Fee': [25000,25200,24500,24900],
                    'Duration': ['30days','35days','40days','45days']})

# Using append() method (available only in pandas versions before 2.0)
df3 = df.append(df1)
print(df3)

# Using append() with ignore_index
df3 = df.append(df1, ignore_index=True)
print(df3)

# Create third DataFrame  
df2 = pd.DataFrame({'Courses':['PHP','GO'],
                    'Duration':['30day','40days'],
                    'Fee':[10000,23000]})
  
# Appending multiple DataFrame
df3 = df.append([df1, df2], ignore_index=True)
print(df3)

FAQ on Append Two DataFrames

What is the purpose of appending two DataFrames in Pandas?

Appending two DataFrames in Pandas is a common operation used to combine two DataFrames vertically, stacking one on top of the other. This can be useful when you have data split across multiple DataFrames and want to consolidate them into a single DataFrame for analysis.

How can I append two DataFrames in Pandas?

You can use the pd.concat() function to append two DataFrames.

Can I append DataFrames with different column names?

You can append DataFrames with different column names. By default, pd.concat() will keep all columns from both DataFrames. However, if columns in one DataFrame do not exist in the other, the missing columns will be filled with NaN values.

How can I reset the index after appending DataFrames?

You can reset the index of the resulting DataFrame by calling the reset_index() method on it with drop=True. The drop=True argument prevents the old index from being added as a new column in the resulting DataFrame.
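As a small illustration of this answer, the sketch below resets the index both with and without drop=True so the difference is visible. The variable names with_old_index and clean are assumptions made for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'Courses': ["Spark", "PySpark"], 'Fee': [20000, 25000]})
df2 = pd.DataFrame({'Courses': ["Python", "pandas"], 'Fee': [22000, 24000]})

combined = pd.concat([df1, df2])   # index is 0, 1, 0, 1

# Without drop=True, the old labels are kept as a new column named "index"
with_old_index = combined.reset_index()

# With drop=True, the old labels are discarded
clean = combined.reset_index(drop=True)

print(with_old_index)
print(clean)
```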

What is the difference between appending DataFrames and merging them in Pandas?

Appending DataFrames (using pd.concat()) stacks DataFrames vertically, combining rows. Merging DataFrames (using pd.merge()) combines DataFrames based on common columns, aligning rows based on common values in those columns.
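The difference is easiest to see on the same pair of inputs. In this sketch, the DataFrames fees and durations are hypothetical examples that share a Courses column.

```python
import pandas as pd

fees = pd.DataFrame({'Courses': ['Spark', 'Python'], 'Fee': [20000, 22000]})
durations = pd.DataFrame({'Courses': ['Spark', 'Python'],
                          'Duration': ['30days', '40days']})

# Appending stacks the rows: 4 rows, with NaN where a column is absent
stacked = pd.concat([fees, durations], ignore_index=True)

# Merging matches rows on the Courses value: 2 rows, no NaN
merged = pd.merge(fees, durations, on='Courses')

print(stacked)
print(merged)
```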

Conclusion

In this article, you have learned how to append two DataFrames, with examples. You have also learned how to append three DataFrames by passing all the DataFrames you want to append as a list.


Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.