R Select Rows by Condition with Examples

How do you select rows by condition in an R data frame? You can select rows by a single condition or by multiple conditions using bracket notation. In this article, I will explain how to select rows by single and multiple conditions, by not-equal conditions, based on a list of values, and by using the subset() and filter() functions.

1. Quick Examples of Select Rows by Condition

The following are quick examples of how to select rows by condition from an R DataFrame.


# Quick Examples

# Example 1: Select Rows by equal condition
df[df$gender == 'M',]

# Example 2: Select Rows by not equal condition
df[df$gender != 'M',]

# Example 3: Select Rows by Multiple Conditions
df[df$gender == 'M' & df$id > 15,]

# Example 4: Select rows based on list
df[df$id %in% c(13,14,15),]

# Example 5: Using subset()
subset(df, gender == 'M')

# Example 6: Using filter()
library("dplyr")  
filter(df, gender == 'M')

Let’s create an R DataFrame, run these examples, and explore the output.


# Create DataFrame
df <- data.frame(
  id = c(10,11,12,13,14,15,16,17),
  name = c('sai','ram','deepika','sahithi','kumar','scott','Don','Lin'),
  gender = c('M','M','F','F','M','M','M','F'),
  dob = as.Date(c('1990-10-02','1981-3-24','1987-6-14','1985-8-16',
                  '1990-10-02','1981-3-24','1987-6-14','1985-8-16')),
  state = c('CA','NY',NA,NA,'DC','DW','AZ','PH')
)
df

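Yields the output below.

# Output
#  id    name gender        dob state
#1 10     sai      M 1990-10-02    CA
#2 11     ram      M 1981-03-24    NY
#3 12 deepika      F 1987-06-14  <NA>
#4 13 sahithi      F 1985-08-16  <NA>
#5 14   kumar      M 1990-10-02    DC
#6 15   scott      M 1981-03-24    DW
#7 16     Don      M 1987-06-14    AZ
#8 17     Lin      F 1985-08-16    PH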

2. Select Rows Based on Condition

By using bracket notation, we can select rows by condition in R. In the following example, I am selecting all rows from the DataFrame where gender is equal to ‘M’. The comma after the condition is required: it leaves the column position empty, so all columns are returned. For more examples, refer to selecting rows from the data frame.


# Select Rows by equal condition
df[df$gender == 'M',]

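Yields the output below.

# Output
#  id  name gender        dob state
#1 10   sai      M 1990-10-02    CA
#2 11   ram      M 1981-03-24    NY
#5 14 kumar      M 1990-10-02    DC
#6 15 scott      M 1981-03-24    DW
#7 16   Don      M 1987-06-14    AZ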
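To see why bracket notation works, it can help to look at the condition on its own. The following is a minimal sketch that rebuilds the same df so it runs by itself; the variable names is_male, males, and male_names are only illustrative.

```r
# Rebuild the sample data frame so this snippet runs on its own
df <- data.frame(
  id = c(10,11,12,13,14,15,16,17),
  name = c('sai','ram','deepika','sahithi','kumar','scott','Don','Lin'),
  gender = c('M','M','F','F','M','M','M','F'),
  dob = as.Date(c('1990-10-02','1981-3-24','1987-6-14','1985-8-16',
                  '1990-10-02','1981-3-24','1987-6-14','1985-8-16')),
  state = c('CA','NY',NA,NA,'DC','DW','AZ','PH')
)

# The condition is a logical vector with one TRUE/FALSE value per row
is_male <- df$gender == 'M'
is_male

# Rows where the vector is TRUE are kept;
# the empty position after the comma keeps all columns
males <- df[is_male, ]

# Put column names after the comma to keep only those columns
male_names <- df[is_male, c('id', 'name')]
male_names
```

Storing the condition in a variable first is optional, but it makes longer filters easier to read and reuse.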

3. Select Rows Based on Negative Condition

Similarly, the example below performs negation. Here I am selecting all rows where gender is not equal to ‘M’.


# Select Rows by not equal condition
df[df$gender != 'M',]

# Output
#  id    name gender        dob state
#3 12 deepika      F 1987-06-14  <NA>
#4 13 sahithi      F 1985-08-16  <NA>
#8 17     Lin      F 1985-08-16    PH

In this case, rows with ‘gender’ not equal to ‘M’ are selected, showcasing the application of the not-equal operator (!=).
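One thing to watch for with comparison operators is missing values. The gender column has none, but the state column does, so the sketch below applies != to state instead. It is only an illustration under the assumption that you want to drop rows where the condition cannot be evaluated; the variable names are hypothetical.

```r
# Rebuild the sample data frame so this snippet runs on its own
df <- data.frame(
  id = c(10,11,12,13,14,15,16,17),
  name = c('sai','ram','deepika','sahithi','kumar','scott','Don','Lin'),
  gender = c('M','M','F','F','M','M','M','F'),
  dob = as.Date(c('1990-10-02','1981-3-24','1987-6-14','1985-8-16',
                  '1990-10-02','1981-3-24','1987-6-14','1985-8-16')),
  state = c('CA','NY',NA,NA,'DC','DW','AZ','PH')
)

# state is NA for two rows, so the comparison is NA for those rows
cond <- df$state != 'CA'

# Plain bracket notation returns those rows as rows filled with NA
with_na_rows <- df[cond, ]
nrow(with_na_rows)      # 7 (two of them are all-NA rows)

# which() keeps only the positions where the condition is TRUE
without_na_rows <- df[which(cond), ]
nrow(without_na_rows)   # 5

# Alternative: exclude NA values explicitly in the condition
explicit <- df[!is.na(df$state) & df$state != 'CA', ]
```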

4. Select Rows Based on Multiple Conditions

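Let’s see how to select rows based on multiple conditions in R. In the example below, I am selecting rows where gender is equal to ‘M’ and id is greater than 15. Here I am using the & (AND) logical operator, which returns TRUE for a row only when both conditions are TRUE for that row.

Similarly, you can use the | (OR) and ! (NOT) operators. Avoid && and || for row selection: they work on single logical values rather than whole columns, and since R 4.3.0 they raise an error when given a vector longer than one.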


# Select Rows by Multiple Conditions
df[df$gender == 'M' & df$id > 15,]

# Output
#  id name gender        dob state
#7 16  Don      M 1987-06-14    AZ
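The other element-wise operators follow the same pattern as &. The following is a small sketch using | and ! on the same df (rebuilt here so the snippet is self-contained); the result variable names are just for illustration.

```r
# Rebuild the sample data frame so this snippet runs on its own
df <- data.frame(
  id = c(10,11,12,13,14,15,16,17),
  name = c('sai','ram','deepika','sahithi','kumar','scott','Don','Lin'),
  gender = c('M','M','F','F','M','M','M','F'),
  dob = as.Date(c('1990-10-02','1981-3-24','1987-6-14','1985-8-16',
                  '1990-10-02','1981-3-24','1987-6-14','1985-8-16')),
  state = c('CA','NY',NA,NA,'DC','DW','AZ','PH')
)

# OR: gender is 'F' or id is greater than 15
f_or_high <- df[df$gender == 'F' | df$id > 15, ]
f_or_high$id   # 12 13 16 17

# NOT: negate a whole condition by wrapping it in parentheses
not_m <- df[!(df$gender == 'M'), ]
not_m$id       # 12 13 17

# Use parentheses to group conditions when mixing & and |
combo <- df[df$gender == 'M' & (df$id < 12 | df$id > 15), ]
combo$id       # 10 11 16
```

Because & binds more tightly than | in R, adding parentheses around each group is a simple way to make the intended logic explicit.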

5. Select Rows Based on a List of Values

If you want to select rows whose column value matches any value in a vector (a list of values), use the %in% operator. The example below returns rows that have id values 13, 14, and 15.


# Select rows based on list
df[df$id %in% c(13,14,15),]

# Output
#  id    name gender        dob state
#4 13 sahithi      F 1985-08-16  <NA>
#5 14   kumar      M 1990-10-02    DC
#6 15   scott      M 1981-03-24    DW
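The %in% operator also works on character columns and can be negated with !. The sketch below is one way to do this on the same df (rebuilt here so it runs alone); wanted_states and the result names are hypothetical.

```r
# Rebuild the sample data frame so this snippet runs on its own
df <- data.frame(
  id = c(10,11,12,13,14,15,16,17),
  name = c('sai','ram','deepika','sahithi','kumar','scott','Don','Lin'),
  gender = c('M','M','F','F','M','M','M','F'),
  dob = as.Date(c('1990-10-02','1981-3-24','1987-6-14','1985-8-16',
                  '1990-10-02','1981-3-24','1987-6-14','1985-8-16')),
  state = c('CA','NY',NA,NA,'DC','DW','AZ','PH')
)

# Match a character column against a vector of values.
# %in% returns FALSE (not NA) for the rows where state is NA.
wanted_states <- c('CA', 'NY', 'AZ')
in_states <- df[df$state %in% wanted_states, ]
in_states$id    # 10 11 16

# Negate %in% to select rows whose id is NOT in the list
not_in_ids <- df[!(df$id %in% c(13, 14, 15)), ]
not_in_ids$id   # 10 11 12 16 17
```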

6. By Using subset()

Base R also provides the subset() function, which selects rows based on a logical condition on one or more columns.


# Using subset()
subset(df, gender == 'M')
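subset() accepts combined conditions and has a select argument for choosing columns. The following sketch shows both on the same df (rebuilt here so the snippet is self-contained); the result variable names are only illustrative.

```r
# Rebuild the sample data frame so this snippet runs on its own
df <- data.frame(
  id = c(10,11,12,13,14,15,16,17),
  name = c('sai','ram','deepika','sahithi','kumar','scott','Don','Lin'),
  gender = c('M','M','F','F','M','M','M','F'),
  dob = as.Date(c('1990-10-02','1981-3-24','1987-6-14','1985-8-16',
                  '1990-10-02','1981-3-24','1987-6-14','1985-8-16')),
  state = c('CA','NY',NA,NA,'DC','DW','AZ','PH')
)

# Multiple conditions: column names are used directly, without df$
m_high <- subset(df, gender == 'M' & id > 15)
m_high$id      # 16

# select keeps only the listed columns
m_names <- subset(df, gender == 'M', select = c(id, name))
names(m_names) # "id" "name"

# Rows where the condition evaluates to NA are dropped by subset()
not_ca <- subset(df, state != 'CA')
nrow(not_ca)   # 5
```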

7. By Using filter()

Finally, you can select rows from the data frame by using the filter() function from the dplyr package. To use this package, first, you need to install it by using install.packages("dplyr") and load it using library("dplyr").


# Load dplyr package
library("dplyr")
  
# Using filter()
filter(df, gender == 'M')
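filter() can take several conditions at once. The sketch below assumes the dplyr package is already installed and rebuilds the same df so it runs alone. I call the function as dplyr::filter() to avoid any confusion with the unrelated filter() in the stats package; the result variable names are hypothetical.

```r
# Assumes dplyr is installed: install.packages("dplyr")
library(dplyr)

# Rebuild the sample data frame so this snippet runs on its own
df <- data.frame(
  id = c(10,11,12,13,14,15,16,17),
  name = c('sai','ram','deepika','sahithi','kumar','scott','Don','Lin'),
  gender = c('M','M','F','F','M','M','M','F'),
  dob = as.Date(c('1990-10-02','1981-3-24','1987-6-14','1985-8-16',
                  '1990-10-02','1981-3-24','1987-6-14','1985-8-16')),
  state = c('CA','NY',NA,NA,'DC','DW','AZ','PH')
)

# Conditions separated by commas are combined with AND
m_high <- dplyr::filter(df, gender == 'M', id > 15)
m_high$id     # 16

# Select rows based on a list of values
in_list <- dplyr::filter(df, id %in% c(13, 14, 15))
in_list$id    # 13 14 15

# Rows where the condition evaluates to NA are dropped
not_ca <- dplyr::filter(df, state != 'CA')
nrow(not_ca)  # 5
```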

8. Conclusion

In this article, you have learned how to select rows from an R DataFrame by a single condition, by multiple conditions, by a not-equal condition, and based on a list of values, using bracket notation as well as the subset() and filter() functions.

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive, and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.