You are currently viewing R Delete Multiple Columns from DataFrame

To delete multiple columns from a data frame in R, you can use the df[] notation, subset() function, and select() function from the dplyr package. Below are quick examples.

Advertisements

In this article, I will explain deleting or removing multiple columns by column names, by the list from the data frame.

1. Quick Examples

Following are quick examples of how to delete multiple columns from a data frame.


# Remove Columns by Range
df[,-2:-4]

# Remove multiple Columns from List
df[,!names(df) %in% c("id", "name", "chapters")]

# Remove using subset
subset(df, select = -c(id, name, chapters))

# Remove columns using select() from dplyr
df %>% select(-c(id, name, chapters))

Let’s create the R DataFrame from Vectors.


# Create data frame
df=data.frame(id=c(11,22),
              pages=c(32,45),
              name=c("spark","python"),
              chapters=c(76,86),
              price=c(144,553))

# Display the data frame
print(df)

Yields below output.

r delete multiple columns

2. R df[] to Delete Multiple Columns

First, let’s use the R base bracket notation df[] to remove multiple columns. This notation takes syntax df[, columns] to select columns in R, And to remove columns you have to use the – (negative) operator.

This notation also supports selecting columns by the range and using the negative operator to remove multiple columns by range. In the following example, it removes all columns between 2 and 4 indexes, which ideally deletes columns pages, names, and chapters.


# Remove Columns by Range
df2 <- df[,-2:-4]
df2

# Output
#  id price
#1 11   144
#2 22   553

3. R Delete Multiple Columns by Name

The above example explains how to delete multiple columns by index, now let’s see how to remove multiple columns by name in R by using the same df[] notation.


# Remove  Columns in List
df2 <- df[,!names(df) %in% c("id", "name", "chapters")]

# Output
#  pages price
#1    32   144
#2    45   553

3. Using subset()

Alternatively, you can also use the subset() function from the base package to delete multiple columns by specifying a list of column names to be removed. This function takes the data frame object as an argument and the columns you wanted to remove.


# Remove using subset
df2 <- subset(df, select = -c(id, name, chapters))

Similar to the above example, this will delete the columns named “id“, “name” and “chapters” from the data frame and leave the columns “pages” and “price“.

4. select() to Delete Multiple Columns

The select() function from the dplyr package can be used to delete multiple columns from a data frame in R. The select() function takes a minus sign (-) before the column name to specify that the column should be removed. You can specify as many column names as you want in this way to delete them.


# Load the dplyr package
library("dplyr")

# Remove columns using select()
df2 <- df %>% select(-c(id, name, chapters))

This also yields the same output as above.

Conclusion

In this article, you have learned how to delete multiple columns by name, index, and names from a list by using df[] notation, a subset(), and select() from the dplyr package.

Related Articles

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, He has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen journey in the field of data engineering has been a continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with the data as he come across. Follow Naveen @ LinkedIn and Medium