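The pandas.DataFrame.append() method appends the rows of one DataFrame to another; it can also append several (three or more) DataFrames at once. Columns that exist in only one of the DataFrames are added to the result. This method takes other (the DataFrame you want to append), ignore_index, verify_integrity, and sort as parameters and returns a new DataFrame with the combined result. Note that append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the append() calls in this article require an older pandas version; on newer versions, use pandas.concat() instead.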
In this article, I will explain how to append pandas DataFrames with examples like appending rows, and columns, ignoring the index while appending, and more by using its parameters.
Key Points –
- The pandas append() method concatenates (appends) new rows of data to an existing DataFrame.
- append() returns a new DataFrame containing the combined data from the original DataFrame and the appended data; the original DataFrame is not modified.
- It is useful for combining multiple datasets vertically when they have the same columns.
- append() can be less efficient than other methods such as concat() when dealing with large datasets, because every call creates a new DataFrame.
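The efficiency point above can be sketched as follows. This is a minimal illustration, assuming pandas is installed; it uses pd.concat() rather than append(), and the make_chunk helper and its data are hypothetical, invented only for this example.

```python
import pandas as pd

def make_chunk(i):
    # Hypothetical helper: builds a small two-row DataFrame for illustration
    return pd.DataFrame({"Courses": [f"Course{i}a", f"Course{i}b"],
                         "Fee": [1000 * i, 1000 * i + 500]})

# Collect the pieces in a plain Python list first ...
chunks = [make_chunk(i) for i in range(1, 4)]

# ... then combine them in a single call, so the rows are copied only once
combined = pd.concat(chunks, ignore_index=True)
print(combined)
```

Appending inside a loop would instead build a new, larger DataFrame on every iteration, which is why collecting first and combining once tends to scale better.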
Pandas append() Syntax
Below is the syntax of pandas.DataFrame.append() method.
# Syntax of append()
DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=False)
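- other – DataFrame, Series/dict-like object, or a list of these.
- ignore_index – bool, default False. When set to True, the original index labels are discarded and the result is labeled 0, 1, …, n-1.
- verify_integrity – bool, default False. When set to True, raises a ValueError if the result would contain duplicate index labels.
- sort – bool, default False. When set to True, sorts the columns if the columns of the two DataFrames are not aligned.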
Alternatively, you can use pandas.concat() to concatenate DataFrames, which can also be used to append.
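As a rough sketch of that alternative, the snippet below stacks two small DataFrames with pd.concat() and then shows the effect of verify_integrity=True. The names left and right and their data are made up for illustration; pd.concat() accepts ignore_index, verify_integrity, and sort keyword arguments with the same names as append().

```python
import pandas as pd

left = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
right = pd.DataFrame({"Courses": ["Hadoop", "Java"], "Fee": [25200, 24900]})

# Counterpart of left.append(right): stack the rows, keep the index labels
stacked = pd.concat([left, right])
print(stacked.index.tolist())  # [0, 1, 0, 1]

# verify_integrity=True raises ValueError because labels 0 and 1 repeat
try:
    pd.concat([left, right], verify_integrity=True)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc
print("Duplicate index rejected:", duplicate_error is not None)
```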
Append Two DataFrames With the Same Columns
To run some examples of the pandas append() function, let’s create two DataFrames from dicts.
# Create two DataFrames (the second has an extra Duration column)
import pandas as pd
df1 = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
                    'Fee' : [20000,25000,22000,24000]})
print("First DataFrame:\n", df1)
df2 = pd.DataFrame({'Courses': ["Pandas","Hadoop","Hyperion","Java"],
                    'Fee': [25000,25200,24500,24900],
                    'Duration': ['30days','35days','40days','45days']})
print("Second DataFrame:\n", df2)
The append() method appends the rows of the second DataFrame to the end of the first. Columns are aligned by name; where a column is missing from one of the DataFrames (Duration in the first DataFrame here), the corresponding rows are filled with NaN. This allows for combining DataFrames even if they do not have exactly the same set of columns.
# Append the rows of df2 to df1
# using append() function
df3 = df1.append(df2)
print("After appending DataFrames:\n", df3)
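To make the column alignment concrete, here is a small sketch using shortened versions of the two DataFrames. It relies on pd.concat(), which runs on current pandas releases where append() is no longer available; the variable sorted_cols is introduced only for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
df2 = pd.DataFrame({"Courses": ["Pandas", "Hadoop"],
                    "Fee": [25000, 25200],
                    "Duration": ["30days", "35days"]})

# Row-wise combination; the default outer join keeps the union of columns
df3 = pd.concat([df1, df2])

print(df3.columns.tolist())          # ['Courses', 'Fee', 'Duration']
print(df3["Duration"].isna().sum())  # 2 (the rows that came from df1)

# sort=True orders the columns alphabetically when they are not aligned
sorted_cols = pd.concat([df1, df2], sort=True).columns.tolist()
print(sorted_cols)                   # ['Courses', 'Duration', 'Fee']
```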
Append Two DataFrames Ignore Index
By default, the append() function preserves the index labels of the original DataFrames, so the result can contain repeated labels. Specify ignore_index=True to ignore the original indices and generate a new sequential index.
# Using append() with ignore_index
df3 = df1.append(df2, ignore_index=True)
print(df3)
Yields below output.
# Output:
    Courses    Fee Duration
0     Spark  20000      NaN
1   PySpark  25000      NaN
2    Python  22000      NaN
3    pandas  24000      NaN
4    Pandas  25000   30days
5    Hadoop  25200   35days
6  Hyperion  24500   40days
7      Java  24900   45days
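The difference between keeping and ignoring the index shows up most clearly in label lookups. The sketch below, written with pd.concat() and shortened sample data, compares the two results; the names kept and fresh are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
df2 = pd.DataFrame({"Courses": ["Pandas", "Hadoop"], "Fee": [25000, 25200]})

kept = pd.concat([df1, df2])                      # labels 0, 1, 0, 1
fresh = pd.concat([df1, df2], ignore_index=True)  # labels 0, 1, 2, 3

# With repeated labels, a label lookup matches more than one row
print(len(kept.loc[[0]]))   # 2
print(len(fresh.loc[[0]]))  # 1

# Resetting the index afterwards gives the same result as ignore_index=True
print(kept.reset_index(drop=True).equals(fresh))  # True
```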
Append Dict as Row to DataFrame
Sometimes you need to append a dict as a row to a DataFrame. To do this, create a dict whose keys are the column names and pass it to append() on the DataFrame. Note that ignore_index=True is required when appending a dict; without it, append() raises a TypeError.
# Append Dict as row to DataFrame
new_row = {'Courses':'Hyperion', 'Fee':24000}
df3 = df1.append(new_row, ignore_index=True)
print(df3)
Yields below output.
# Output:
    Courses    Fee
0     Spark  20000
1   PySpark  25000
2    Python  22000
3    pandas  24000
4  Hyperion  24000
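On pandas versions without append(), one way to get the same effect is to wrap the dict in a one-row DataFrame and concatenate it. This is a sketch under that assumption, with a shortened DataFrame; row_df and result are names chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
new_row = {"Courses": "Hyperion", "Fee": 24000}

# Wrap the dict in a list so it becomes a one-row DataFrame, then concatenate
row_df = pd.DataFrame([new_row])
result = pd.concat([df, row_df], ignore_index=True)
print(result)
```

Passing the dict inside a list matters: pd.DataFrame(new_row) with scalar values and no index would raise an error, while a list of dicts is read as a list of rows.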
Append Multiple DataFrames
Similarly, to append multiple DataFrames using the append() method, pass them as a list. Setting ignore_index=True resets the index so that it starts from zero.
# Create third DataFrame
df3 = pd.DataFrame({'Courses':['PHP','GO'],
                    'Duration':['30day','40days'],
                    'Fee':[10000,23000]})
# Appending multiple DataFrames
df4 = df1.append([df2, df3], ignore_index=True)
print(df4)
Yields below output.
# Output:
    Courses    Fee Duration
0     Spark  20000      NaN
1   PySpark  25000      NaN
2    Python  22000      NaN
3    pandas  24000      NaN
4    Pandas  25000   30days
5    Hadoop  25200   35days
6  Hyperion  24500   40days
7      Java  24900   45days
8       PHP  10000    30day
9        GO  23000   40days
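The same list-based pattern carries over to pd.concat(), which takes the whole list of DataFrames as its first argument. The following sketch uses trimmed-down versions of the three DataFrames; frames and combined are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
df2 = pd.DataFrame({"Courses": ["Pandas", "Hadoop"],
                    "Fee": [25000, 25200],
                    "Duration": ["30days", "35days"]})
df3 = pd.DataFrame({"Courses": ["PHP", "GO"],
                    "Duration": ["30day", "40days"],
                    "Fee": [10000, 23000]})

# All DataFrames go into one list; columns are matched by name, not position
frames = [df1, df2, df3]
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Even though df3 lists its columns in a different order, its values land under the matching column names in the result.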
Complete Example
# Example of pandas append()
import pandas as pd
df = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
'Fee' : [20000,25000,22000,24000]})
df1 = pd.DataFrame({'Courses': ["Pandas","Hadoop","Hyperion","Java"],
'Fee': [25000,25200,24500,24900],
'Duration': ['30days','35days','40days','45days']})
# Using append() method
df2 = df.append(df1)
print(df2)
# Using append() with ignore_index
df2 = df.append(df1, ignore_index=True)
print(df2)
# Create third DataFrame
df2 = pd.DataFrame({'Courses':['PHP','GO'],
'Duration':['30day','40days'],
'Fee':[10000,23000]})
# Appending multiple DataFrame
df3 = df.append([df1, df2], ignore_index=True)
print(df3)
# Append Dict as row to DataFrame
new_row = {'Courses':'Hyperion', 'Fee':24000}
df2=df.append(new_row, ignore_index=True)
print(df2)
Frequently Asked Questions on append() Function
The append() method in pandas concatenates the rows of one DataFrame to the end of another DataFrame, effectively stacking them vertically.
You can use the append() function by calling it on a DataFrame object and passing another DataFrame or a list of DataFrames that you want to append. Optionally, you can set ignore_index=True to reset the index of the resulting DataFrame to start from zero.
By default, the index of the original DataFrames is preserved. However, if you set ignore_index=True, the index is reset, and a new sequential index starting from zero is assigned to the resulting DataFrame.
You can append multiple DataFrames at once by passing them as a list to the append() function. For example, df.append([df1, df2], ignore_index=True) will append the rows of df1 and df2 to df, and reset the index.
Conclusion
By using the append() method you can append the rows of one DataFrame to another. This method takes other (pass a list for multiple DataFrames), ignore_index, verify_integrity, and sort as parameters, and returns a new DataFrame with the combined result. Note that when one of the DataFrames has an additional column, that column is included in the result and filled with NaN for the rows that come from a DataFrame where it does not exist.