The pandas.DataFrame.append() method appends the rows of one DataFrame to the end of another; columns that exist in only one of them are added to the result. It can also be used to append multiple (three or more) DataFrames. This method takes other (the DataFrame you want to append), ignore_index, verify_integrity, and sort as parameters and returns a new DataFrame with the combined result.
In this article, I will explain how to append pandas DataFrames with examples such as appending rows and columns, ignoring the index while appending, and more by using its parameters.
1. pandas append() Syntax
Below is the syntax of pandas.DataFrame.append() method.
# Syntax of append()
DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=False)
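other – DataFrame or Series/dict-like object, or list of these.
ignore_index – bool, default False. When set to True, the result is relabeled with an incremental numeric index (0, 1, ..., n-1).
verify_integrity – bool, default False. When set to True, raises a ValueError if the result would contain duplicate index values.
sort – bool, default False. When set to True, sorts the columns if the columns of the caller and other are not aligned.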
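Alternatively, you can use pandas.concat() to concatenate DataFrames, which can also be used to append. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions pandas.concat() is the only one of the two that is available.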
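As a minimal sketch of the concat alternative, the snippet below stacks two small DataFrames with the top-level pd.concat() function. The two-row frames are illustrative stand-ins, not the article's full data, and the sketch assumes a pandas version where pd.concat() is available (it is in both 1.x and 2.x).

```python
import pandas as pd

# Small illustrative frames (assumed sample data)
df = pd.DataFrame({'Courses': ["Spark", "PySpark"], 'Fee': [20000, 25000]})
df1 = pd.DataFrame({'Courses': ["Hadoop", "Java"], 'Fee': [25200, 24900]})

# pd.concat() takes a list of DataFrames and stacks them row-wise by default;
# ignore_index=True relabels the rows 0..n-1
result = pd.concat([df, df1], ignore_index=True)
print(result)
```

Unlike append(), pd.concat() is a function on the pandas module rather than a DataFrame method, so the caller DataFrame goes first in the list.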
2. append() DataFrames Example
By default, the append() method appends the rows of the other pandas DataFrame at the end of the caller DataFrame. For example, the snippet below appends the rows of df1 to the end of df and returns a new DataFrame.
When one of the DataFrames has an additional column, the result includes that column and fills it with NaN for the rows that come from the DataFrame where the column does not exist. Let's create a pandas DataFrame from a dict to explore this with an example.
import pandas as pd
df = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "pandas"],
                   'Fee': [20000, 25000, 22000, 24000]})
df1 = pd.DataFrame({'Courses': ["Pandas", "Hadoop", "Hyperion", "Java"],
                    'Fee': [25000, 25200, 24500, 24900],
                    'Duration': ['30days', '35days', '40days', '45days']})
# Using append() method
df2 = df.append(df1)
print(df2)
Yields below output.
    Courses    Fee  Duration
0     Spark  20000       NaN
1   PySpark  25000       NaN
2    Python  22000       NaN
3    pandas  24000       NaN
0    Pandas  25000    30days
1    Hadoop  25200    35days
2  Hyperion  24500    40days
3      Java  24900    45days
Using this method, you can also append a list of rows to the DataFrame.
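Appending a list of rows could look like the sketch below. It is written with pd.concat() so that it also runs on pandas versions where append() is unavailable; the rows list and its values are hypothetical sample data, with each row given as a dict keyed by column name.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["Spark", "PySpark"], 'Fee': [20000, 25000]})

# Hypothetical rows to add, one dict per row
rows = [{'Courses': 'Hadoop', 'Fee': 25200},
        {'Courses': 'Java', 'Fee': 24900}]

# Turn the list into a DataFrame, then stack it below df
df2 = pd.concat([df, pd.DataFrame(rows)], ignore_index=True)
print(df2)
```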
3. Reindex the DataFrame While Appending
In the above result DataFrame, the index has duplicate values because each input keeps its own labels. You can assign a new index while appending by using the ignore_index=True param.
# Using append() with ignore_index
df2 = df.append(df1, ignore_index=True)
print(df2)
Yields below output.
    Courses    Fee  Duration
0     Spark  20000       NaN
1   PySpark  25000       NaN
2    Python  22000       NaN
3    pandas  24000       NaN
4    Pandas  25000    30days
5    Hadoop  25200    35days
6  Hyperion  24500    40days
7      Java  24900    45days
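The duplicate-index situation and the verify_integrity check can be illustrated together. This is a sketch using pd.concat(), which accepts the same ignore_index and verify_integrity parameters; the small frames are assumed sample data.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["Spark", "PySpark"], 'Fee': [20000, 25000]})
df1 = pd.DataFrame({'Courses': ["Hadoop", "Java"], 'Fee': [25200, 24900]})

# Without ignore_index, both inputs keep their own labels 0 and 1
stacked = pd.concat([df, df1])
has_duplicates = stacked.index.has_duplicates

# verify_integrity=True refuses to build a result with repeated labels
try:
    pd.concat([df, df1], verify_integrity=True)
    raised = False
except ValueError as err:
    raised = True
    print("Duplicate index rejected:", err)

# ignore_index=True relabels the rows 0..n-1, so no duplicates remain
clean = pd.concat([df, df1], ignore_index=True)
print(clean)
```

Using verify_integrity=True is a way to fail early when unique row labels matter, while ignore_index=True simply discards the old labels.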
4. Append Dict as Row to DataFrame
Sometimes you need to append a dict as a row to a DataFrame. The example below demonstrates how to do this. First, create a dict and append it to the df object. Note that ignore_index=True is required here; appending a dict without it raises a TypeError.
# Append Dict as row to DataFrame
new_row = {'Courses': 'Hyperion', 'Fee': 24000}
df2 = df.append(new_row, ignore_index=True)
print(df2)
Yields below output.
    Courses    Fee
0     Spark  20000
1   PySpark  25000
2    Python  22000
3    pandas  24000
4  Hyperion  24000
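On pandas versions without append(), the same dict-as-row result can be sketched with pd.concat(). The approach below, wrapping the dict in a list to build a one-row DataFrame, is one common way to do it rather than the only one.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "pandas"],
                   'Fee': [20000, 25000, 22000, 24000]})

new_row = {'Courses': 'Hyperion', 'Fee': 24000}

# Wrapping the dict in a list makes pandas treat it as a single row
df2 = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
print(df2)
```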
5. Append Multiple DataFrames
To append multiple pandas DataFrames, pass the DataFrames you want to append as a list to the append() method. Use the ignore_index=True param to reset the index on the resulting DataFrame so it starts from zero.
# Create third DataFrame
df2 = pd.DataFrame({'Courses': ['PHP', 'GO'],
                    'Duration': ['30day', '40days'],
                    'Fee': [10000, 23000]})
# Appending multiple DataFrame
df3 = df.append([df1, df2], ignore_index=True)
print(df3)
Yields below output.
    Courses    Fee  Duration
0     Spark  20000       NaN
1   PySpark  25000       NaN
2    Python  22000       NaN
3    pandas  24000       NaN
4    Pandas  25000    30days
5    Hadoop  25200    35days
6  Hyperion  24500    40days
7      Java  24900    45days
8       PHP  10000     30day
9        GO  23000    40days
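Combining three DataFrames whose columns differ in presence and order also shows what the sort parameter does. The sketch below uses pd.concat() with small assumed sample frames; sort is the same-named parameter described in the syntax section.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["Spark", "PySpark"], 'Fee': [20000, 25000]})
df1 = pd.DataFrame({'Courses': ["Hadoop"], 'Fee': [25200],
                    'Duration': ['35days']})
df2 = pd.DataFrame({'Courses': ['PHP'], 'Duration': ['30days'],
                    'Fee': [10000]})

# Default sort=False: columns keep the order in which they first appear
df3 = pd.concat([df, df1, df2], ignore_index=True)
print(df3)

# sort=True: columns that are not aligned across inputs are sorted by name
df3_sorted = pd.concat([df, df1, df2], ignore_index=True, sort=True)
print(df3_sorted)
```

Rows from df have no Duration value, so that column holds NaN for them in both results.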
6. Complete Example of pandas append()
import pandas as pd
df = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "pandas"],
                   'Fee': [20000, 25000, 22000, 24000]})
df1 = pd.DataFrame({'Courses': ["Pandas", "Hadoop", "Hyperion", "Java"],
                    'Fee': [25000, 25200, 24500, 24900],
                    'Duration': ['30days', '35days', '40days', '45days']})
# Using append() method
df2 = df.append(df1)
print(df2)
# Using append() with ignore_index
df2 = df.append(df1, ignore_index=True)
print(df2)
# Create third DataFrame
df2 = pd.DataFrame({'Courses': ['PHP', 'GO'],
                    'Duration': ['30day', '40days'],
                    'Fee': [10000, 23000]})
# Appending multiple DataFrame
df3 = df.append([df1, df2], ignore_index=True)
print(df3)
# Append Dict as row to DataFrame
new_row = {'Courses': 'Hyperion', 'Fee': 24000}
df2 = df.append(new_row, ignore_index=True)
print(df2)
Conclusion
By using the append() method, you can append the rows of one DataFrame to another, with any extra columns carried into the result. This method takes other (pass a list for multiple DataFrames), ignore_index, verify_integrity, and sort as parameters, and returns a new DataFrame with the combined result. Note that when one of the DataFrames has an additional column, the result fills that column with NaN for the rows where the column does not exist.