To select rows based on multiple conditions, use the Pandas loc[]
attribute. The loc[]
function in pandas allows you to select data based on labels or a boolean array. When dealing with multiple conditions, you can combine them using logical operators like &
(and), |
(or), and ~
(not).
A Pandas DataFrame, a structured data format with labeled rows and columns, represents two-dimensional data. When you select specific columns from a DataFrame yields a fresh DataFrame containing solely the chosen columns from the original. In this article, I will explain pandas loc[]
with multiple conditions.
Key Points –
loc[]
is primarily used to access rows in a DataFrame based on label-based indexing, which is highly efficient for filtering.- Combine multiple conditions using logical operators (
&
for ‘and’,|
for ‘or’) withinloc[]
to create complex filters. - Each condition must be enclosed within parentheses to ensure correct evaluation when using multiple conditions.
- Use
&
and|
insideloc[]
for multiple conditions instead ofand
andor
, as the latter are Python-specific and incompatible with Pandas DataFrames. - When using
loc[]
with multiple conditions, the original DataFrame’s index is preserved, making it easy to reference filtered rows back to the original data. loc[]
with conditions can also be used to update specific values within a DataFrame based on criteria.
Quick Examples of loc[] Multiple Conditions
Below are some quick examples of Pandas loc[] multiple conditions.
# Quick examples of loc[] multiple conditions
# Example 1 Using loc[] with multiple conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
# Example 2
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000 )]
print(df2)
First, let’s create a pandas DataFrame.
# Create DataFrame
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
'Fee' :[20000,25000,26000,22000,24000],
'Duration':['30day','40days','35days','40days','60days'],
'Discount':[1000,2300,1200,2500,2000]
}
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print("Create DataFrame:\n", df)
# Outputs:
# r1 Spark 20000 30day 1000
# r2 PySpark 25000 40days 2300
# r3 Hadoop 26000 35days 1200
# r4 Python 22000 40days 2500
# r5 pandas 24000 60days 2000
Yields below output.
Using loc[] by Multiple Conditions
By using the loc[]
attribute you can get selected or filtered rows from DataFrame based on multiple conditions. Here, you can specify the multiple conditions using the & operator. Make sure you surround each condition with the brace. Not using this will get you incorrect results.
# Using loc[] by Multiple Conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
print("Get selected rows after applying multiple conditions:\n", df2)
Yields below output.
let’s look at another example, here, we can specify the multiple conditions using |
(or) operator.
Using loc[] by multiple conditions
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000)]
print("Get selected rows after applying multiple conditions:\n", df2)
Yields below output.
# Output:
# Get selected rows after applying multiple conditions:
Courses Fee Duration Discount
r2 PySpark 25000 40days 2300
r3 Hadoop 26000 35days 1200
r4 Python 22000 40days 2500
r5 pandas 24000 60days 2000
Complete Examples
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
'Fee' :[20000,25000,26000,22000,24000],
'Duration':['30day','40days','35days','40days','60days'],
'Discount':[1000,2300,1200,2500,2000]
}
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# Example 1 - Using loc[] with multiple conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
print(df2)
# Example 2
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000 )]
print(df2)
FAQ on Pandas loc[] Multiple Conditions
To apply multiple conditions with loc[]
, you can use logical operators like &
(AND), |
(OR), and ~
(NOT). Each condition should be enclosed in parentheses.
Parentheses are necessary to ensure that each condition is evaluated properly before applying the logical operators. Without them, Python will throw an error or not evaluate the conditions correctly.
You can use more than two conditions with loc[]
in Pandas. You can combine as many conditions as needed using logical operators (&
for AND, |
for OR, ~
for NOT). Each condition must be enclosed in parentheses.
To negate a condition with loc[]
, you can use the ~
(tilde) operator. The tilde negates the boolean condition, selecting rows where the condition is not true.
You can assign values to rows filtered by multiple conditions using loc[]
in Pandas. You can use loc[]
to select the rows that meet specific conditions, and then assign new values to those rows or specific columns.
Conclusion
In this article, you have learned about Pandas loc[]
property to filter or select DataFrame rows based on multiple conditions. Also explained how we can specify the multiple conditions using logical operators like, and(&
) and or(|
).
Happy Learning !!
Related Articles
- Pandas Series loc[] Function
- Drop multiple columns by index
- Pandas iloc[] Usage with Examples
- Pandas Add or Insert Row to DataFrame
- Pandas Difference Between loc[] vs iloc[]
- How to drop the Pandas column by index?
- Drop the last column from the DataFrame
- How to Slice Columns in Pandas DataFrame
- Pandas Drop Columns with NaN or None Values
- Pandas Extract Column Value Based on Another Column