To select rows based on multiple conditions, use the pandas loc[] indexer. loc[] selects data by label or by a boolean array. To apply multiple conditions, combine them with the element-wise logical operators & (and), | (or), and ~ (not).
A pandas DataFrame is a two-dimensional data structure with labeled rows and columns. Selecting specific columns from a DataFrame yields a new DataFrame containing only the chosen columns from the original. In this article, I will explain how to use pandas loc[] with multiple conditions.
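As a minimal sketch of the two selection modes mentioned above (labels and boolean arrays), the snippet below uses a small hypothetical frame named `demo` that is not part of the article's data. It is only meant to illustrate the idea, not to replace the examples that follow.

```python
import pandas as pd

# Hypothetical three-row frame used only for this illustration
demo = pd.DataFrame(
    {"Fee": [20000, 25000, 26000], "Discount": [1000, 2300, 1200]},
    index=["r1", "r2", "r3"],
)

# Label-based selection: pick rows by their index labels
by_label = demo.loc[["r1", "r3"]]

# Boolean selection: one True/False value per row; True rows are kept
mask = demo["Fee"] > 22000
by_mask = demo.loc[mask]

# Two conditions combined element-wise with &
combined = demo.loc[(demo["Fee"] > 22000) & (demo["Discount"] < 2000)]

print(by_label.index.tolist())
print(by_mask.index.tolist())
print(combined.index.tolist())
```

Storing a condition in a variable such as `mask` can make longer filters easier to read.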
Quick Examples of loc[] Multiple Conditions
Below are some quick examples of using pandas loc[] with multiple conditions. They assume the DataFrame df created in the next section.
# Quick examples of loc[] multiple conditions
# Example 1: rows where Discount is between 1000 and 2000 (inclusive)
df2 = df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
# Example 2: rows where Discount >= 1200 or Fee >= 23000
df2 = df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000)]
print(df2)
First, let’s create a pandas DataFrame.
# Create DataFrame
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
    'Fee': [20000, 25000, 26000, 22000, 24000],
    'Duration': ['30day', '40days', '35days', '40days', '60days'],
    'Discount': [1000, 2300, 1200, 2500, 2000]
}
index_labels = ['r1', 'r2', 'r3', 'r4', 'r5']
df = pd.DataFrame(technologies, index=index_labels)
print("Create DataFrame:\n", df)
# Output:
# Create DataFrame:
#     Courses    Fee Duration  Discount
# r1    Spark  20000    30day      1000
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r4   Python  22000   40days      2500
# r5   pandas  24000   60days      2000
This yields the output below.
![Pandas loc multiple conditions](https://i0.wp.com/sparkbyexamples.com/wp-content/uploads/2023/11/image-63.png?resize=352%2C146&ssl=1)
Using loc[] by Multiple Conditions
Using the loc[] indexer, you can filter DataFrame rows on multiple conditions. The example below combines two conditions with the & operator, so a row is kept only when both conditions are true. Make sure you wrap each condition in parentheses: & and | bind more tightly than comparison operators such as >= and <=, so omitting the parentheses typically raises an error or gives incorrect results.
# Using loc[] by Multiple Conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
print("Get selected rows after applying multiple conditions:\n", df2)
This yields the output below.
![Pandas loc multiple conditions](https://i0.wp.com/sparkbyexamples.com/wp-content/uploads/2023/11/image-65.png?resize=493%2C102&ssl=1)
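To illustrate why the parentheses matter, here is a small sketch that runs the same filter with and without them. It rebuilds the article's DataFrame so it can run on its own. The `failed` flag and the `try`/`except` wrapper are added only for demonstration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Duration": ["30day", "40days", "35days", "40days", "60days"],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Correct: each comparison is wrapped in parentheses
in_range = df.loc[(df["Discount"] >= 1000) & (df["Discount"] <= 2000)]
print(in_range.index.tolist())

# Without parentheses, & binds more tightly than >= and <=, so Python
# evaluates 1000 & df["Discount"] first and then chains the comparisons.
# A chained comparison needs a single True/False for a whole Series,
# which pandas refuses to provide, so a ValueError is raised.
try:
    df.loc[df["Discount"] >= 1000 & df["Discount"] <= 2000]
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)
```

In this case the mistake surfaces as an exception. With other expressions it could silently return the wrong rows, so parenthesizing every condition is the safer habit.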
Let's look at another example. Here, the conditions are combined with the | (or) operator, so a row is kept when at least one condition is true.
# Using loc[] with the | (or) operator
df2=df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000)]
print("Get selected rows after applying multiple conditions:\n", df2)
This yields the output below.
# Output:
# Get selected rows after applying multiple conditions:
#     Courses    Fee Duration  Discount
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r4   Python  22000   40days      2500
# r5   pandas  24000   60days      2000
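The ~ (not) operator listed in the introduction is not demonstrated above, so here is a minimal sketch of how it might be used with the same data. The variable names `outside` and `low_fee_high_discount` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Duration": ["30day", "40days", "35days", "40days", "60days"],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Negate a whole combined condition: rows whose Discount is NOT
# between 1000 and 2000 (inclusive)
outside = df.loc[~((df["Discount"] >= 1000) & (df["Discount"] <= 2000))]
print(outside.index.tolist())

# Negate a single condition inside a larger expression:
# Fee below 25000 AND Discount not below 2000
low_fee_high_discount = df.loc[(df["Fee"] < 25000) & ~(df["Discount"] < 2000)]
print(low_fee_high_discount.index.tolist())
```

As with & and |, the expression being negated needs its own parentheses so that ~ applies to the full condition rather than to a single column.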
Complete Examples
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
    'Fee': [20000, 25000, 26000, 22000, 24000],
    'Duration': ['30day', '40days', '35days', '40days', '60days'],
    'Discount': [1000, 2300, 1200, 2500, 2000]
}
index_labels = ['r1', 'r2', 'r3', 'r4', 'r5']
df = pd.DataFrame(technologies, index=index_labels)
print(df)
# Example 1 - Using loc[] with multiple conditions
df2=df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
print(df2)
# Example 2 - Using loc[] with the | (or) operator
df2 = df.loc[(df['Discount'] >= 1200) | (df['Fee'] >= 23000)]
print(df2)
Conclusion
In this article, you have learned how to use the pandas loc[] property to filter or select DataFrame rows based on multiple conditions. You have also seen how to combine those conditions with the logical operators & (and) and | (or).
Happy Learning !!