• Post author:
  • Post category:Pandas
  • Post last modified:September 20, 2024
  • Reading time:14 mins read
You are currently viewing Pandas Left Join Explained By Examples

In Pandas, a left join merges two DataFrames based on a common key from the left DataFrame and includes all the rows from the left DataFrame while matching rows from the right DataFrame. If there’s no match in the right DataFrame, the resulting DataFrame will have NaN values in the columns from the right DataFrame.

Advertisements

Left join is also called Left Outer Join that returns all rows from the left DataFrame regardless of match found on the right DataFrame. When the join expression doesn’t match, it assigns null for that record for left records and drops records from right where match not found.

Key Points –

  • In a left join, all rows from the left DataFrame are retained, even if there are no matches in the right DataFrame.
  • If no matching row is found in the right DataFrame, the columns from the right DataFrame will contain NaN values.
  • You can use either the pd.merge() or DataFrame.join() method, depending on whether you want to merge by columns or index.
  • You can specify columns to join on using the on, left_on, and right_on parameters, especially useful if the column names differ between DataFrames.
  • If the DataFrames have columns with the same name, you can add suffixes to differentiate them using the suffixes parameter.

Quick Examples of Left Join

Below are some quick examples of pandas left join DataFrames.


# Quick examples of pandas left join of dataframes

# Example 1: Pandas left join two DataFrames by index
df3=df1.join(df2, lsuffix="_left", rsuffix="_right", how='left')

# Example 2: Pandas.merge() by column
df3=pd.merge(df1,df2, how='left')

# Example 3: DataFrame.merge() by Column
df3=df1.merge(df2, how='left')

# Example 4: Merge DataFrames by Column
df3=pd.merge(df1,df2, on='Courses', how='left')

# Example 5: When column names are different
df3=pd.merge(df1,df2, left_on='Courses', right_on='Courses', how='left')

First, let’s create a DataFrames that I can use to demonstrate Left Join with examples.


# Create DataFrames
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
              }
index_labels=['r1','r2','r3','r4']
df1 = pd.DataFrame(technologies,index=index_labels)

technologies2 = {
    'Courses':["Spark","Java","Python","Go"],
    'Discount':[2000,2300,1200,2000]
              }
index_labels2=['r1','r6','r3','r5']
df2 = pd.DataFrame(technologies2,index=index_labels2)

print("Create DataFrame:\n",df1)
print("Create DataFrame:\n",df2)

Yields below output

pandas left join

Pandas Left Join using join()

panads.DataFrame.join() method by default does the left Join on row indices and provides a way to do join on other join types. It also supports different params, refer to pandas join() for syntax, usage, and more examples.


# Pandas join two DataFrames
df3=df1.join(df2, lsuffix="_left", rsuffix="_right")
print("Joining two DataFrames:\n",df3)

Yields below output.

pandas left join

If you check the results, indexes r2 and r4 are from the right DataFrame and it contains NaN for some columns for records that don’t match.

Left Join Using merge()

Using merge() you can do merging by columns, merging by index, merging on multiple columns, and different join types. By default, it joins on all common columns that exist on both DataFrames and performs an inner join. Use param how to specify the left join.


# Pandas.merge()
df3=pd.merge(df1,df2, how='left')
print(df3)

# DataFrame.merge()
df3=df1.merge(df2, how='left')
print(df3)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
0    Spark  20000   30days    2000.0
1  PySpark  25000   40days       NaN
2   Python  22000   35days    1200.0
3   pandas  30000   50days       NaN

You can also specify the column names explicitly.


# Merge DataFrames by Columns
df3=pd.merge(df1,df2, on='Courses', how='left')

In case if you wanted to combine column names that are different on two pandas DataFrames.


# When column names are different
df3=pd.merge(df1,df2, left_on='Courses', right_on='Courses', how='left')
print(df3)

merge() also supports different params, refer to pandas merge() to learn syntax, usage with examples.

Complete Examples


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
              }
index_labels=['r1','r2','r3','r4']
df1 = pd.DataFrame(technologies,index=index_labels)

technologies2 = {
    'Courses':["Spark","Java","Python","Go"],
    'Discount':[2000,2300,1200,2000]
              }
index_labels2=['r1','r6','r3','r5']
df2 = pd.DataFrame(technologies2,index=index_labels2)

print(df1)
print(df2)

# Pandas join two DataFrames
df3=df1.join(df2, lsuffix="_left", rsuffix="_right")
print(df3)

# Pandas.merge()
df3=pd.merge(df1,df2, how='left')
print(df3)

# DataFrame.merge()
df3=df1.merge(df2, how='left')
print(df3)

# Merge DataFrames by Column
df3=pd.merge(df1,df2, on='Courses', how='left')

# When column names are different
df3=pd.merge(df1,df2, left_on='Courses', right_on='Courses', how='left')

Frequently Asked Questions on Pandas Left Join

What is a left join in Pandas?

In Pandas, a left join combines two DataFrames based on a common column, including all rows from the left DataFrame and matching rows from the right DataFrame. If there are no matches in the right DataFrame, the result will contain NaN values for the columns from the right DataFrame.

What happens if there are no matching values in the right DataFrame?

If there are no matching values in the right DataFrame during a left join, the resulting DataFrame will still include all rows from the left DataFrame, and the columns from the right DataFrame will have NaN (Not a Number) values for those rows. In other words, the left DataFrame is preserved entirely, and missing values are filled with NaN for the columns coming from the right DataFrame.

Can I perform a left join using the merge() method on a specific index?

You can perform a left join using the merge() method on a specific index. You can use the left_on parameter to specify the column in the left DataFrame and the right_index parameter to indicate that the right DataFrame’s index should be used for the join.

How can I handle column name conflicts during a left join?

If there are column name conflicts during a left join, you can handle them by using the suffixes parameter in the merge() function. This parameter allows you to specify suffixes to be appended to the overlapping column names from the left and right DataFrames. This helps in disambiguating the column names and avoiding conflicts.

Conclusion

In this article, you have learned to perform a left join on DataFrams by using join() and merge() methods with explanations and examples. A left join is also called Left Outer Join which returns all rows from the left DataFrame regardless of match found on the right DataFrame. When the join expression doesn’t match, it assigns null for that record for left records and drops records from right where match not found.

References