How to Append Two pandas DataFrames

There are multiple ways to append two pandas DataFrames. In this article, I will explain how to append two or more pandas DataFrames using several functions, with examples.

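To append two DataFrames, you can use the DataFrame.append() method. To append more than one DataFrame, pass them as a list to this method. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the append() examples in this article run only on pandas versions earlier than 2.0.

Alternatively, you can use pandas.concat(), a top-level pandas function rather than a DataFrame method, to concatenate DataFrames. It is the recommended replacement for append() and also works on current pandas versions.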
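As a minimal sketch of the pandas.concat() approach, the snippet below stacks two small DataFrames row-wise. The sample data and the name combined are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
df = pd.DataFrame({'Courses': ["Spark", "PySpark"],
                   'Fee': [20000, 25000]})
df1 = pd.DataFrame({'Courses': ["Hadoop", "Java"],
                    'Fee': [25200, 24900],
                    'Duration': ['35days', '45days']})

# pd.concat() takes a list of DataFrames and stacks them row-wise by default
combined = pd.concat([df, df1])
print(combined)
```

Rows that come from df have NaN in the Duration column, because df has no such column, and each DataFrame keeps its original index labels.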

1. Quick Examples of Append Two DataFrames

If you are in a hurry, below are some quick examples of how to append two DataFrames in pandas.


# Append two DataFrames
df2 = df.append(df1)

# Using append() with ignore_index
df2 = df.append(df1, ignore_index=True)

# Appending three DataFrames (df2 here must be a third DataFrame)
df3 = df.append([df1, df2], ignore_index=True)

2. Append Two DataFrames

append() is a DataFrame method, meaning you need to call it on a DataFrame. To append df1 to df, call it as df.append(df1).

When the two DataFrames have different columns, the result contains all columns from both, with NaN filled in for rows whose source DataFrame does not have that column. Let’s create pandas DataFrames from dicts to explore this with an example.


import pandas as pd

# First DataFrame: no Duration column
df = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
                   'Fee' : [20000,25000,22000,24000]})

# Second DataFrame: has an extra Duration column
df1 = pd.DataFrame({'Courses': ["Pandas","Hadoop","Hyperion","Java"],
                    'Fee': [25000,25200,24500,24900],
                    'Duration': ['30days','35days','40days','45days']})

# Using append() method (requires pandas < 2.0)
df2 = df.append(df1)
print(df2)

This yields the following output.


    Courses    Fee Duration
0     Spark  20000      NaN
1   PySpark  25000      NaN
2    Python  22000      NaN
3    pandas  24000      NaN
0    Pandas  25000   30days
1    Hadoop  25200   35days
2  Hyperion  24500   40days
3      Java  24900   45days

Using this method, you can also append a list of rows to the DataFrame.
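Appending a list of rows can be sketched as follows. This version uses pandas.concat() so that it also runs on current pandas versions. The rows are given as one dict per row, and the names new_rows and df_with_rows are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["Spark", "PySpark"],
                   'Fee': [20000, 25000]})

# Hypothetical new rows, one dict per row
new_rows = [{'Courses': "Hadoop", 'Fee': 25200},
            {'Courses': "Java", 'Fee': 24900}]

# Wrap the rows in a DataFrame, then stack it under the original
df_with_rows = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)
print(df_with_rows)
```

Building one DataFrame from all the new rows and concatenating once is usually cheaper than adding rows one at a time in a loop, since every concatenation copies the data.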

3. Ignore Index While Appending

By default, append() keeps the original index of each DataFrame, so index labels can repeat in the result. If you want to discard the original indexes and assign a new index starting from zero, use the ignore_index=True param.


# Using append() with ignore_index
df2 = df.append(df1, ignore_index=True)
print(df2)

This yields the following output.


    Courses    Fee Duration
0     Spark  20000      NaN
1   PySpark  25000      NaN
2    Python  22000      NaN
3    pandas  24000      NaN
4    Pandas  25000   30days
5    Hadoop  25200   35days
6  Hyperion  24500   40days
7      Java  24900   45days
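The effect of ignore_index can be checked directly by comparing the index with and without it. The sketch below uses pandas.concat(), which accepts the same ignore_index parameter. The names kept and renumbered are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "pandas"],
                   'Fee': [20000, 25000, 22000, 24000]})
df1 = pd.DataFrame({'Courses': ["Pandas", "Hadoop", "Hyperion", "Java"],
                    'Fee': [25000, 25200, 24500, 24900],
                    'Duration': ['30days', '35days', '40days', '45days']})

# Default: each DataFrame keeps its own index labels, so 0..3 repeats
kept = pd.concat([df, df1])
print(kept.index.tolist())        # [0, 1, 2, 3, 0, 1, 2, 3]

# ignore_index=True: the result gets a fresh index 0..7
renumbered = pd.concat([df, df1], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Repeated index labels matter when you select by label: kept.loc[0] returns two rows, while renumbered.loc[0] returns a single row.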

4. Append Three DataFrames

Similarly, if you have three DataFrames, call append() on the first one and pass the other two as a list. Use the ignore_index=True param to reset the index on the resulting pandas DataFrame so that it starts from zero.


# Create third DataFrame  
df2 = pd.DataFrame({'Courses':['PHP','GO'],
                    'Duration':['30day','40days'],
                    'Fee':[10000,23000]})
  
# Appending multiple DataFrame
df3 = df.append([df1, df2], ignore_index=True)
print(df3)

This yields the following output.


    Courses    Fee Duration
0     Spark  20000      NaN
1   PySpark  25000      NaN
2    Python  22000      NaN
3    pandas  24000      NaN
4    Pandas  25000   30days
5    Hadoop  25200   35days
6  Hyperion  24500   40days
7      Java  24900   45days
8       PHP  10000    30day
9        GO  23000   40days
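For pandas versions where append() is not available, the same three-way append can be sketched with pandas.concat(), passing all three DataFrames in one list. The name all_courses is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "pandas"],
                   'Fee': [20000, 25000, 22000, 24000]})
df1 = pd.DataFrame({'Courses': ["Pandas", "Hadoop", "Hyperion", "Java"],
                    'Fee': [25000, 25200, 24500, 24900],
                    'Duration': ['30days', '35days', '40days', '45days']})
# Third DataFrame lists its columns in a different order
df2 = pd.DataFrame({'Courses': ['PHP', 'GO'],
                    'Duration': ['30day', '40days'],
                    'Fee': [10000, 23000]})

# Columns are matched by name, not by position
all_courses = pd.concat([df, df1, df2], ignore_index=True)
print(all_courses)
```

Because columns are aligned by name, the different column order in df2 does not mix up the Fee and Duration values.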

5. Complete Example of Append Two DataFrames


import pandas as pd

df = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
                    'Fee' : [20000,25000,22000,24000]})

df1 = pd.DataFrame({'Courses': ["Pandas","Hadoop","Hyperion","Java"],
                    'Fee': [25000,25200,24500,24900],
                    'Duration': ['30days','35days','40days','45days']})

# Using append() method
df2 = df.append(df1)
print(df2)

# Using append() with ignore_index
df2 = df.append(df1, ignore_index=True)
print(df2)

# Create third DataFrame  
df2 = pd.DataFrame({'Courses':['PHP','GO'],
                    'Duration':['30day','40days'],
                    'Fee':[10000,23000]})
  
# Appending multiple DataFrame
df3 = df.append([df1, df2], ignore_index=True)
print(df3)

Conclusion

In this article, you have learned how to append two DataFrames, with examples. You have also learned how to append three DataFrames by passing the DataFrames you want to append as a list.


NNK
