Use pandas.concat() to concatenate two or more pandas DataFrames along rows or columns. When you concat() two DataFrames on rows, it creates a new DataFrame containing all rows of both; in other words, it appends one DataFrame to the other. When you use concat() on columns (axis=1), it aligns the DataFrames on their index, which works like a join.
In this article, I will explain how to concatenate two pandas DataFrames by rows and columns, with examples.
pandas concat() Key Points
- By default, concat() performs an append: it stacks each DataFrame below the previous one and returns a single new DataFrame.
- When you use concat() to join two DataFrames, it supports only inner and outer joins; by default it performs an outer join.
- Using concat() you can join or append any number of pandas DataFrames.
1. Quick Examples of Concatenating Two pandas DataFrames
If you are in a hurry, below are some quick examples of how to concatenate two DataFrames using the concat() method.
# Quick examples (df and df1 are existing DataFrames)
# Using pandas.concat() to concat two DataFrames
data = [df, df1]
df2 = pd.concat(data)
# Use ignore_index=True to reset the index of the result
df2 = pd.concat([df, df1], ignore_index=True, sort=False)
# Same as above, passing the list through a variable
data = [df, df1]
df2 = pd.concat(data, ignore_index=True, sort=False)
# Using pandas.concat() to join two DataFrames on columns
data = pd.concat([df, df1], axis=1, join='inner')
# Using DataFrame.append() (pandas < 2.0 only; removed in pandas 2.0)
data = df.append(df1)
# Use DataFrame.append() and reset the index
df2 = df.append(df1, ignore_index=True)
# Appending multiple DataFrames
data = df.append([df1, df2])
2. pandas concat() Syntax and Usage
Below is the syntax of the pandas.concat() method.
pandas.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=True)
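Two of the parameters in the signature above, keys and verify_integrity, are not used elsewhere in this article. The following is a minimal sketch of what they do, using two small illustrative DataFrames (the labels "first" and "second" and the variable names are assumptions made for this example only).

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
df1 = pd.DataFrame({"Courses": ["Hadoop", "Java"], "Fee": [25200, 24900]})

# keys= labels each input DataFrame; the result gets a two-level MultiIndex
labeled = pd.concat([df, df1], keys=["first", "second"])
print(labeled.loc["second"])  # only the rows that came from df1

# verify_integrity=True raises ValueError if the result would contain
# duplicate index labels (both inputs here use the labels 0 and 1)
raised = False
try:
    pd.concat([df, df1], verify_integrity=True)
except ValueError:
    raised = True
print("duplicate index detected:", raised)
```

The keys option is handy when you need to trace each row back to its source DataFrame after concatenation.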
3. Use pandas.concat() to Concat Two DataFrames
First, let's see how the pandas.concat() method concatenates two DataFrames by rows, that is, appends one to the other. By default, it behaves like a union: it brings all rows from both DataFrames into a single DataFrame. The example below demonstrates append using concat().
import pandas as pd
df = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
'Fee' : [20000,25000,22000,24000]})
df1 = pd.DataFrame({'Courses': ["Pandas","Hadoop","Hyperion","Java"],
'Fee': [25000,25200,24500,24900]})
# Using pandas.concat() to concat two DataFrames
data = [df, df1]
df2 = pd.concat(data)
print(df2)
Yields below output.
Courses Fee
0 Spark 20000
1 PySpark 25000
2 Python 22000
3 pandas 24000
0 Pandas 25000
1 Hadoop 25200
2 Hyperion 24500
3 Java 24900
Notice that the above example keeps the row index of each DataFrame as-is, so the index labels 0 to 3 appear twice. To reset the index, use the ignore_index=True param.
# Use pandas.concat() method to ignore_index
df2 = pd.concat([df, df1], ignore_index=True, sort=False)
print(df2)
Yields below output.
Courses Fee
0 Spark 20000
1 PySpark 25000
2 Python 22000
3 pandas 24000
4 Pandas 25000
5 Hadoop 25200
6 Hyperion 24500
7 Java 24900
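To see why the repeated index matters, here is a small sketch (assuming the same df and df1 as above) showing that a label lookup on the un-reset result returns more than one row. It also shows reset_index(drop=True) as an alternative way to renumber the rows after concatenation.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"],
                   "Fee": [20000, 25000, 22000, 24000]})
df1 = pd.DataFrame({"Courses": ["Pandas", "Hadoop", "Hyperion", "Java"],
                    "Fee": [25000, 25200, 24500, 24900]})

combined = pd.concat([df, df1])

# Label 0 exists in both inputs, so .loc[0] returns two rows
rows = combined.loc[0]
print(rows)

# Renumber the rows after the fact; drop=True discards the old index
reset = combined.reset_index(drop=True)
print(reset.index.tolist())
```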
4. Using pandas.concat() to Join Two DataFrames
As I said above, the pandas.concat() method is also used to join two pandas DataFrames on columns. To do so, use axis=1 and join='inner'. By default, pd.concat() concatenates row-wise (axis=0) and uses an outer join on the other axis.
import pandas as pd
df = pd.DataFrame({'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,24000]})
df1 = pd.DataFrame({'Duration':['30day','40days','35days','60days'],
'Discount':[1000,2300,2500,2000]})
# Using pandas.concat() to join concat two DataFrames
df2 = pd.concat([df, df1], axis=1, join='inner')
print(df2)
Yields below output.
Courses Fee Duration Discount
0 Spark 20000 30day 1000
1 PySpark 25000 40days 2300
2 Python 22000 35days 2500
3 pandas 24000 60days 2000
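In the example above both DataFrames share the index 0 to 3, so inner and outer joins give the same result. The difference shows up when the indexes only partly overlap. The following sketch uses two hypothetical DataFrames, left and right, with deliberately mismatched indexes to illustrate this.

```python
import pandas as pd

# Indexes overlap only on the labels 1 and 2
left = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python"]}, index=[0, 1, 2])
right = pd.DataFrame({"Discount": [1000, 2300, 2500]}, index=[1, 2, 3])

# Inner join keeps only index labels present in both DataFrames
inner = pd.concat([left, right], axis=1, join="inner")
print(inner)

# Outer join (the default) keeps every label and fills the gaps with NaN
outer = pd.concat([left, right], axis=1, join="outer")
print(outer)
```

Note that concat() always aligns on the index when axis=1; to join on the values of a column, DataFrame.merge() is usually the better fit.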
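5. Concatenate Multiple DataFrames Using pandas.concat()
You can also use the pandas.concat() method to concatenate more than two DataFrames. When the DataFrames do not share the same columns, as with df2 in this example, the missing values are filled with NaN.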
import pandas as pd
df = pd.DataFrame({'Courses': ["Spark", "PySpark", "Python", "Pandas"],
'Fee' : ['20000', '25000', '22000', '24000']})
df1 = pd.DataFrame({'Courses': ["Unix", "Hadoop", "Hyperion", "Java"],
'Fee': ['25000', '25200', '24500', '24900']})
df2 = pd.DataFrame({'Duration':['30day','40days','35days','60days','55days'],
'Discount':[1000,2300,2500,2000,3000]})
# Appending multiple DataFrame
df3 = pd.concat([df, df1, df2])
print(df3)
Yields below output.
Courses Fee Duration Discount
0 Spark 20000 NaN NaN
1 PySpark 25000 NaN NaN
2 Python 22000 NaN NaN
3 Pandas 24000 NaN NaN
0 Unix 25000 NaN NaN
1 Hadoop 25200 NaN NaN
2 Hyperion 24500 NaN NaN
3 Java 24900 NaN NaN
0 NaN NaN 30day 1000.0
1 NaN NaN 40days 2300.0
2 NaN NaN 35days 2500.0
3 NaN NaN 60days 2000.0
4 NaN NaN 55days 3000.0
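If the NaN padding is not wanted, join='inner' on a row-wise concat keeps only the columns that every DataFrame has in common. Here is a minimal sketch with three small illustrative DataFrames (a, b and c are names assumed for this example).

```python
import pandas as pd

a = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000]})
b = pd.DataFrame({"Courses": ["Hadoop"], "Fee": [25200]})
c = pd.DataFrame({"Courses": ["Java"], "Fee": [24900], "Duration": ["60days"]})

# Default outer join: union of columns, NaN where a column is missing
outer = pd.concat([a, b, c], ignore_index=True)
print(outer)

# Inner join: only the columns shared by all three DataFrames survive
shared = pd.concat([a, b, c], join="inner", ignore_index=True)
print(shared)
```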
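6. Use DataFrame.append() to Concat Two DataFrames
Alternatively, on pandas versions before 2.0 you can use the pandas.DataFrame.append() method to concatenate DataFrames on rows. For example, df.append(df1) returns a new DataFrame with the rows of df1 appended to df. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the examples in this section run only on older versions.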
import pandas as pd
df = pd.DataFrame({'Courses': ["Spark","PySpark","Python","pandas"],
'Fee' : [20000,25000,22000,24000]})
df1 = pd.DataFrame({'Courses': ["Pandas","Hadoop","Hyperion","Java"],
'Fee': [25000,25200,24500,24900]})
# Using DataFrame.append() to concat two DataFrames (pandas < 2.0 only)
df2 = df.append(df1)
print(df2)
Yields below output.
Courses Fee
0 Spark 20000
1 PySpark 25000
2 Python 22000
3 pandas 24000
0 Pandas 25000
1 Hadoop 25200
2 Hyperion 24500
3 Java 24900
Use the ignore_index=True param to reset the index on the combined DataFrame.
# Use DataFrame.append()
df2 = df.append(df1, ignore_index=True)
print(df2)
Yields below output.
Courses Fee
0 Spark 20000
1 PySpark 25000
2 Python 22000
3 pandas 24000
4 Pandas 25000
5 Hadoop 25200
6 Hyperion 24500
7 Java 24900
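Since DataFrame.append() is not available from pandas 2.0 onward, the same results can be obtained with pandas.concat(). The sketch below reproduces the append example above and also appends a single row given as a dict (the new_row values are made up for illustration).

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"],
                   "Fee": [20000, 25000, 22000, 24000]})
df1 = pd.DataFrame({"Courses": ["Pandas", "Hadoop", "Hyperion", "Java"],
                    "Fee": [25000, 25200, 24500, 24900]})

# Equivalent of df.append(df1, ignore_index=True)
df2 = pd.concat([df, df1], ignore_index=True)
print(df2)

# Equivalent of appending one row: wrap the dict in a one-row DataFrame first
new_row = {"Courses": "Scala", "Fee": 23000}  # illustrative values
df3 = pd.concat([df2, pd.DataFrame([new_row])], ignore_index=True)
print(df3.tail(1))
```

When adding many rows in a loop, collecting the pieces in a list and calling pd.concat() once at the end is generally faster than concatenating on every iteration.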
Conclusion
In this article, I have explained how to concatenate two pandas DataFrames using the pandas.concat() and DataFrame.append() methods, with examples. The concat() method can also concatenate more than two pandas DataFrames.
Happy Learning !!