Spark By {Examples}

R Delete Multiple Columns from DataFrame

To delete multiple columns from a data frame in R, you can use the df[] notation, the subset() function, or the select() function from the dplyr package. In this article, I will explain how to delete multiple columns from a data frame by index range, by column name, and from a list of names.

Quick Examples

Following are quick examples of how to delete multiple columns from a data frame.


# Remove Columns by Range
df[,-2:-4]

# Remove multiple Columns from List
df[,!names(df) %in% c("id", "name", "chapters")]

# Remove using subset
subset(df, select = -c(id, name, chapters))

# Remove columns using select() from dplyr
df %>% select(-c(id, name, chapters))

Let’s create an R data frame from vectors.


# Create data frame
df <- data.frame(id=c(11,22),
                 pages=c(32,45),
                 name=c("spark","python"),
                 chapters=c(76,86),
                 price=c(144,553))

# Display the data frame
print(df)

This yields the output below.

# Output
#   id pages   name chapters price
# 1 11    32  spark       76   144
# 2 22    45 python       86   553

R df[] to Delete Multiple Columns

To remove multiple columns in R, you can use the square bracket notation df[]. The typical syntax for selecting columns is df[, columns]; to remove columns instead, put a minus sign (-) before the column positions.

A range of positions can be excluded the same way: negate both ends of the range (-2:-4), which expands to -2, -3, -4.


# Remove Columns by Range
df2 <- df[,-2:-4]
df2

# Output
#   id price
# 1 11   144
# 2 22   553

The above example removes the columns at positions 2 through 4, that is, the pages, name, and chapters columns.
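The same negative-index idea also works for columns that are not next to each other. The following is a minimal sketch, assuming the df created earlier (it is recreated here so the snippet runs on its own; the names df_noncontig and df_single are illustrative only). It drops two non-adjacent columns with -c(), and uses drop = FALSE so the result stays a data frame even when a single column is left.

```r
# Recreate the example data frame so this snippet is self-contained
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Drop non-adjacent columns 1 and 3 (id and name) by position
df_noncontig <- df[, -c(1, 3)]
names(df_noncontig)
# [1] "pages"    "chapters" "price"

# Dropping all but one column returns a plain vector by default;
# drop = FALSE keeps the result as a data frame
df_single <- df[, -(1:4), drop = FALSE]
class(df_single)
# [1] "data.frame"
```

Note that positive and negative positions cannot be mixed in the same index, so list only the columns to remove.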

R Delete Multiple Columns by Name

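The above example deletes multiple columns by index. Now let’s see how to remove multiple columns by name using the same df[] notation. The expression names(df) %in% c(...) marks the columns to drop, and the ! operator inverts it so that only the remaining columns are kept.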


# Remove columns whose names are in the list
df2 <- df[,!names(df) %in% c("id", "name", "chapters")]
df2

# Output
#   pages price
# 1    32   144
# 2    45   553
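When the names to remove are already stored in a variable, an equivalent way to write this is with setdiff(). The sketch below assumes the same df (recreated so it runs standalone); cols_to_drop, keep, and df_kept are hypothetical names used only for illustration.

```r
# Recreate the example data frame so this snippet is self-contained
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Hypothetical vector holding the names of the columns to remove
cols_to_drop <- c("id", "name", "chapters")

# setdiff() returns the column names that are NOT in cols_to_drop
keep <- setdiff(names(df), cols_to_drop)

# Indexing without a comma always returns a data frame
df_kept <- df[keep]
names(df_kept)
# [1] "pages" "price"
```

With both this approach and the %in% form, a name in the list that does not exist in the data frame is simply ignored rather than raising an error.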

Using subset()

Alternatively, you can use the subset() function from base R to delete multiple columns. Pass the data frame as the first argument and, in the select argument, a vector of unquoted column names wrapped in -c() to mark them for removal.


# Remove using subset
df2 <- subset(df, select = -c(id, name, chapters))

As in the previous example, this deletes the columns named “id”, “name”, and “chapters” from the data frame and leaves the columns “pages” and “price”.
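Inside select, subset() treats column names as their positions, so a range of adjacent columns can also be written by name. The following is a small sketch under that assumption, again using the example df (recreated here; df_range is an illustrative name).

```r
# Recreate the example data frame so this snippet is self-contained
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Drop every column from pages through chapters, referring to them by name
df_range <- subset(df, select = -(pages:chapters))
names(df_range)
# [1] "id"    "price"
```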

select() to Delete Multiple Columns

The select() function from the dplyr package can also delete multiple columns from a data frame in R. Put a minus sign (-) before a column name, or before a c() of several names, to mark those columns for removal; you can list as many columns as you need.


# Load the dplyr package
library("dplyr")

# Remove columns using select()
df2 <- df %>% select(-c(id, name, chapters))

This also yields the same output as above.
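If the column names are held in a character vector rather than typed directly, select() can be combined with the all_of() or any_of() helpers. This is a hedged sketch that assumes dplyr 1.0.0 or later is installed; cols_to_drop, df_strict, df_lenient, and the non-existent column "isbn" are made up for illustration.

```r
# Load the dplyr package (assumes dplyr >= 1.0.0)
library(dplyr)

# Recreate the example data frame so this snippet is self-contained
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Hypothetical vector holding the names of the columns to remove
cols_to_drop <- c("id", "name", "chapters")

# all_of() raises an error if any listed column is missing from df
df_strict <- df %>% select(-all_of(cols_to_drop))

# any_of() skips names that do not exist (here the made-up "isbn")
df_lenient <- df %>% select(-any_of(c(cols_to_drop, "isbn")))
names(df_lenient)
# [1] "pages" "price"
```

Choosing between the two depends on whether a missing column should stop the script (all_of) or be ignored (any_of).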

Conclusion

In this article, you have learned how to delete multiple columns by index, by name, and from a list of names using the df[] notation, the subset() function, and select() from the dplyr package.
