R Delete Multiple Columns from DataFrame

To delete multiple columns from a data frame in R, you can use the df[] notation, subset() function, and select() function from the dplyr package. Below are quick examples.

In this article, I will explain how to delete (remove) multiple columns from a data frame by index range, by column name, and by a vector of column names.

1. Quick Examples

Following are quick examples of how to delete multiple columns from a data frame.


# Remove Columns by Range
df[,-2:-4]

# Remove multiple Columns from List
df[,!names(df) %in% c("id", "name", "chapters")]

# Remove using subset
subset(df, select = -c(id, name, chapters))

# Remove columns using select() from dplyr
df %>% select(-c(id, name, chapters))

Let's create an R data frame from vectors to use in the examples below.


# Create data frame
df <- data.frame(id=c(11,22),
              pages=c(32,45),
              name=c("spark","python"),
              chapters=c(76,86),
              price=c(144,553))

# Display the data frame
print(df)

This yields the output below.

# Output
#  id pages   name chapters price
#1 11    32  spark       76   144
#2 22    45 python       86   553

2. R df[] to Delete Multiple Columns

First, let's use base R bracket notation, df[], to remove multiple columns. The syntax df[, columns] selects columns, and prefixing the column indexes with the - (minus) operator removes them instead.

This notation also accepts a range of column indexes, which can be negated to remove several adjacent columns at once. The following example removes the columns at indexes 2 through 4, that is, pages, name, and chapters.


# Remove Columns by Range
df2 <- df[,-2:-4]
df2

# Output
#  id price
#1 11   144
#2 22   553
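
One detail worth knowing about bracket notation: when only a single column is left after removal, df[, ...] simplifies the result to a plain vector unless drop = FALSE is passed. The following is a minimal sketch using the same example data frame; the variable names by_range, as_vector, and as_frame are made up for illustration.

```r
# Same example data frame as in this article
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# -(2:4) is an equivalent spelling of -2:-4
by_range <- df[, -(2:4)]

# Removing columns 2 through 5 leaves only "id"; by default the
# result is simplified to a plain numeric vector
as_vector <- df[, -(2:5)]

# drop = FALSE keeps the one-column result as a data frame
as_frame <- df[, -(2:5), drop = FALSE]

print(names(by_range))
print(is.data.frame(as_vector))
print(is.data.frame(as_frame))
```

If later code expects a data frame (for example, it calls names() or ncol()), passing drop = FALSE is the safer choice.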

3. R Delete Multiple Columns by Name

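The above example deletes multiple columns by index. Now let's see how to remove multiple columns by name using the same df[] notation. Here, names(df) %in% c(...) marks the columns to drop, and ! inverts that selection so that only the remaining columns are kept.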


# Remove columns whose names are in the vector
df2 <- df[,!names(df) %in% c("id", "name", "chapters")]
df2

# Output
#  pages price
#1    32   144
#2    45   553
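
When the names to remove are already stored in a character vector, setdiff() is another way to express the same idea. Note that a character vector cannot be negated with the minus operator, so df[, -c("id", "name")] raises an error. This is a minimal sketch; drop_cols and keep_cols are hypothetical variable names chosen for illustration.

```r
# Same example data frame as in this article
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Hypothetical vector holding the names of the columns to remove
drop_cols <- c("id", "name", "chapters")

# Keep every column whose name is not in drop_cols
keep_cols <- setdiff(names(df), drop_cols)

# drop = FALSE keeps a data frame even if only one column remains
df2 <- df[, keep_cols, drop = FALSE]
df2
```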

4. Using subset()

Alternatively, you can use the subset() function from base R to delete multiple columns. It takes the data frame as its first argument, and its select argument takes the columns to keep or, when negated with -, the columns to remove. Column names are written unquoted.


# Remove using subset
df2 <- subset(df, select = -c(id, name, chapters))

Like the previous example, this deletes the columns "id", "name", and "chapters" from the data frame and keeps the columns "pages" and "price".
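
Inside the select argument, subset() treats column names as stand-ins for column positions, so a range of names can be negated as well. The sketch below assumes the same example data frame; df3 is a name introduced here only for illustration.

```r
# Same example data frame as in this article
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Remove individually named columns (names are unquoted)
df2 <- subset(df, select = -c(id, name, chapters))

# Remove every column from pages through chapters, by name range
df3 <- subset(df, select = -(pages:chapters))

print(names(df2))
print(names(df3))
```

Because subset() relies on non-standard evaluation, it is most convenient for interactive use; in functions, the bracket notation shown earlier is usually easier to reason about.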

5. select() to Delete Multiple Columns

The select() function from the dplyr package can also delete multiple columns from a data frame. Placing a minus sign (-) before a column name, or before a c() of column names, tells select() to remove those columns. You can list as many column names as you need.


# Load the dplyr package
library("dplyr")

# Remove columns using select()
df2 <- df %>% select(-c(id, name, chapters))

This also yields the same output as above.
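
If the column names are stored in a character vector rather than typed inline, the tidyselect helpers all_of() and any_of() can be combined with the minus sign. This is a minimal sketch that assumes dplyr 1.0.0 or later is installed; drop_cols and "missing_column" are hypothetical names used only for illustration.

```r
library(dplyr)

# Same example data frame as in this article
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Hypothetical vector holding the names of the columns to remove
drop_cols <- c("id", "name", "chapters")

# all_of() raises an error if any listed column does not exist
df2 <- df %>% select(-all_of(drop_cols))

# any_of() silently ignores names that are not in the data frame
df3 <- df %>% select(-any_of(c(drop_cols, "missing_column")))

print(names(df2))
print(names(df3))
```

all_of() is the stricter choice when a missing column should be treated as a mistake, while any_of() suits cases where some columns may legitimately be absent.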

Conclusion

In this article, you have learned how to delete multiple columns by index range, by name, and by a vector of names using the df[] notation, the subset() function, and the select() function from the dplyr package.

Naveen (NNK)

I am Naveen (NNK), working as a Principal Engineer. I am a seasoned Apache Spark Engineer with a passion for harnessing the power of big data and distributed computing to drive innovation and deliver data-driven insights. I love to design, optimize, and manage Apache Spark-based solutions that transform raw data into actionable intelligence. I am also passionate about sharing my knowledge of Apache Spark, Hive, PySpark, R, etc.