
Pandas loc[] Multiple Conditions

To select rows from a pandas DataFrame based on multiple conditions, use the loc[] attribute with a boolean expression. Combine the conditions with the element-wise logical operators & (and) and | (or), and wrap each condition in parentheses for grouping. The loc[] property selects rows and columns by label or by boolean mask. A pandas DataFrame is a two-dimensional tabular data structure with labeled axes, i.e. rows and columns. Selecting with loc[] returns a new DataFrame containing only the selected rows and columns from the original DataFrame.

In this article, I will explain how to select rows using pandas loc[] with multiple conditions.

1. Quick Examples of pandas loc[] with Multiple Conditions

Below are some quick examples of using DataFrame.loc[] to select rows by checking multiple conditions.


# Below are the quick examples 

# Example 1 Using loc[] with multiple conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]

# Example 2
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000 )]
print(df2)

Let’s create a DataFrame and explore how to use pandas loc[].


# Create DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
    'Fee' :[20000,25000,26000,22000,24000],
    'Duration':['30day','40days','35days','40days','60days'],
    'Discount':[1000,2300,1200,2500,2000]
              }
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print("Create DataFrame:\n", df)

# Output:
# Create DataFrame:
#     Courses    Fee Duration  Discount
# r1    Spark  20000    30day      1000
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r4   Python  22000   40days      2500
# r5   pandas  24000   60days      2000


2. Using loc[] by Multiple Conditions

Using the loc[] attribute, you can filter DataFrame rows based on multiple conditions. Here, the conditions are combined with the & operator, so a row is selected only when both conditions are true. Make sure you surround each condition with parentheses. In Python, & and | take precedence over comparison operators such as >= and <=, so omitting the parentheses raises an error or produces incorrect results.


# Using loc[] by Multiple Conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
print("Get selected rows after applying multiple conditions:\n", df2)

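Yields below output.


# Output:
# Get selected rows after applying multiple conditions:
#     Courses    Fee Duration  Discount
# r1    Spark  20000    30day      1000
# r3   Hadoop  26000   35days      1200
# r5   pandas  24000   60days      2000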
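The advice about parentheses can be checked with a small sketch. It rebuilds a reduced version of the example DataFrame (only the Fee and Discount columns) and compares the parenthesized filter with the unparenthesized one. The names ok and raised are illustrative and do not appear in the original examples.

```python
import pandas as pd

# Reduced copy of the article's DataFrame (assumption: only two columns are needed here).
df = pd.DataFrame(
    {
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# With parentheses: each comparison is evaluated first, then combined with &.
ok = df.loc[(df["Discount"] >= 1000) & (df["Discount"] <= 2000)]
print(ok)

# Without parentheses, & binds more tightly than >= and <=, so Python reads
#   df["Discount"] >= (1000 & df["Discount"]) <= 2000
# which is a chained comparison. Evaluating it needs the truth value of a
# Series, and pandas raises ValueError for that.
try:
    df.loc[df["Discount"] >= 1000 & df["Discount"] <= 2000]
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

In this case the missing parentheses fail loudly. With other combinations of operators and data types the expression may run and silently return the wrong rows, which is why wrapping every condition is the safer habit.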

Let’s look at another example. Here, the conditions are combined with the | (or) operator, so a row is selected when at least one of the conditions is true.


# Using loc[] by multiple conditions with the | (or) operator
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000 )]
print("Get selected rows after applying multiple conditions:\n", df2)

Yields below output.


# Output:
# Get selected rows after applying multiple conditions:
#     Courses    Fee Duration  Discount
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r4   Python  22000   40days      2500
# r5   pandas  24000   60days      2000
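The & and | operators can also be mixed in one expression, and this is where parentheses for grouping matter most. The sketch below is an illustration rather than part of the original examples: it stores each condition in a named mask (high_discount, high_fee and forty_days are hypothetical names) and passes a list of column labels as the second argument to loc[] to select columns at the same time.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Duration": ["30day", "40days", "35days", "40days", "60days"],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Named boolean masks keep long filters readable.
high_discount = df["Discount"] >= 1200
high_fee = df["Fee"] >= 23000
forty_days = df["Duration"] == "40days"

# Grouped: (high_discount OR high_fee) AND forty_days.
# The second argument to loc[] restricts the result to two columns.
grouped = df.loc[(high_discount | high_fee) & forty_days, ["Courses", "Fee"]]
print(grouped)

# Ungrouped: & is applied before |, so this means
# high_discount OR (high_fee AND forty_days), which selects more rows.
ungrouped = df.loc[high_discount | high_fee & forty_days, ["Courses", "Fee"]]
print(ungrouped)
```

With this data the grouped filter keeps rows r2 and r4, while the ungrouped one keeps r2, r3, r4 and r5, so the placement of the parentheses changes the result.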

3. Complete Examples of pandas loc[] With Multiple Conditions


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
    'Fee' :[20000,25000,26000,22000,24000],
    'Duration':['30day','40days','35days','40days','60days'],
    'Discount':[1000,2300,1200,2500,2000]
              }
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# Example 1 - Using loc[] with multiple conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
print(df2)

# Example 2
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000 )]
print(df2)
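As a possible variation on Example 1, a range check on a single column can be written with Series.between(), which includes both endpoints by default, and a condition can be inverted with the ~ (not) operator. This is a sketch of an alternative spelling, not something the examples above rely on; the names in_range, via_between and outside are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Duration": ["30day", "40days", "35days", "40days", "60days"],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Example 1 as written in the article: two conditions combined with &.
in_range = df.loc[(df["Discount"] >= 1000) & (df["Discount"] <= 2000)]

# The same range check using Series.between(), inclusive on both ends by default.
via_between = df.loc[df["Discount"].between(1000, 2000)]
print(in_range.equals(via_between))

# ~ negates a boolean mask: rows whose Discount falls outside the range.
outside = df.loc[~df["Discount"].between(1000, 2000)]
print(outside)
```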

Conclusion

In this article, you have learned how to use the pandas loc[] property to filter or select DataFrame rows based on multiple conditions, and how to combine those conditions using the logical operators & (and) and | (or).

Happy Learning !!
