• Post author:
  • Post category:Pandas
  • Post last modified:December 9, 2024
  • Reading time:12 mins read
You are currently viewing Pandas loc[] Multiple Conditions

To select rows based on multiple conditions, use the Pandas loc[] attribute. The loc[] function in pandas allows you to select data based on labels or a boolean array. When dealing with multiple conditions, you can combine them using logical operators like & (and), | (or), and ~ (not).

Advertisements

A Pandas DataFrame, a structured data format with labeled rows and columns, represents two-dimensional data. When you select specific columns from a DataFrame yields a fresh DataFrame containing solely the chosen columns from the original. In this article, I will explain pandas loc[] with multiple conditions.

Key Points –

  • loc[] is primarily used to access rows in a DataFrame based on label-based indexing, which is highly efficient for filtering.
  • Combine multiple conditions using logical operators (& for ‘and’, | for ‘or’) within loc[] to create complex filters.
  • Each condition must be enclosed within parentheses to ensure correct evaluation when using multiple conditions.
  • Use & and | inside loc[] for multiple conditions instead of and and or, as the latter are Python-specific and incompatible with Pandas DataFrames.
  • When using loc[] with multiple conditions, the original DataFrame’s index is preserved, making it easy to reference filtered rows back to the original data.
  • loc[] with conditions can also be used to update specific values within a DataFrame based on criteria.

Quick Examples of loc[] Multiple Conditions

Below are some quick examples of Pandas loc[] multiple conditions.


# Quick examples of loc[] multiple conditions

# Example 1 Using loc[] with multiple conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]

# Example 2
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000 )]
print(df2)

First, let’s create a pandas DataFrame.


# Create DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
    'Fee' :[20000,25000,26000,22000,24000],
    'Duration':['30day','40days','35days','40days','60days'],
    'Discount':[1000,2300,1200,2500,2000]
              }
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print("Create DataFrame:\n", df)

# Outputs:
# r1    Spark  20000    30day      1000
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r4   Python  22000   40days      2500
# r5   pandas  24000   60days      2000

Yields below output.

Pandas loc multiple conditions

Using loc[] by Multiple Conditions

By using the loc[] attribute you can get selected or filtered rows from DataFrame based on multiple conditions. Here, you can specify the multiple conditions using the & operator. Make sure you surround each condition with the brace. Not using this will get you incorrect results.


# Using loc[] by Multiple Conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
print("Get selected rows after applying multiple conditions:\n", df2)

Yields below output.

Pandas loc multiple conditions

let’s look at another example, here, we can specify the multiple conditions using | (or) operator.


Using loc[] by multiple conditions
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000)]
print("Get selected rows after applying multiple conditions:\n", df2)

Yields below output.


# Output:
# Get selected rows after applying multiple conditions:
    Courses    Fee Duration  Discount
r2  PySpark  25000   40days      2300
r3   Hadoop  26000   35days      1200
r4   Python  22000   40days      2500
r5   pandas  24000   60days      2000

Complete Examples


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
    'Fee' :[20000,25000,26000,22000,24000],
    'Duration':['30day','40days','35days','40days','60days'],
    'Discount':[1000,2300,1200,2500,2000]
              }
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# Example 1 - Using loc[] with multiple conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
print(df2)

# Example 2
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000 )]
print(df2)

FAQ on Pandas loc[] Multiple Conditions

How do I apply multiple conditions using loc[]?

To apply multiple conditions with loc[], you can use logical operators like & (AND), | (OR), and ~ (NOT). Each condition should be enclosed in parentheses.

What is the importance of parentheses around conditions?

Parentheses are necessary to ensure that each condition is evaluated properly before applying the logical operators. Without them, Python will throw an error or not evaluate the conditions correctly.

Can I use more than two conditions with loc[]?

You can use more than two conditions with loc[] in Pandas. You can combine as many conditions as needed using logical operators (& for AND, | for OR, ~ for NOT). Each condition must be enclosed in parentheses.

How do I negate a condition with loc[]?

To negate a condition with loc[], you can use the ~ (tilde) operator. The tilde negates the boolean condition, selecting rows where the condition is not true.

Can I assign values to rows filtered by multiple conditions?

You can assign values to rows filtered by multiple conditions using loc[] in Pandas. You can use loc[] to select the rows that meet specific conditions, and then assign new values to those rows or specific columns.

Conclusion

In this article, you have learned about Pandas loc[] property to filter or select DataFrame rows based on multiple conditions. Also explained how we can specify the multiple conditions using logical operators like, and(&) and or(|).

Happy Learning !!

Related Articles

References