How to Drop Columns by Name in R?

Let’s see how to drop (delete) a single column or multiple columns by name from an R DataFrame (data.frame). Because the columns of a data frame are also called variables, this is the same as dropping variables by name.


First, I will cover the %in% operator. Second, I will use subset() with its select argument to drop selected columns, and third, the select() function available in the dplyr package. Finally, we use within() together with rm() to drop the selected columns.

Related: How to Delete Rows in R Data Frame?

1. Quick Examples of Drop Columns by Name

If you are in a hurry, have a look at the complete code below, which drops a single column/variable or multiple columns/variables by name from an R data frame.


# Create dataframe
df=data.frame(id=c(11,22,33,44,55),
pages=c(32,45,33,22,56),name=c("spark","python","R","java","jsp"),
chapters=c(76,86,11,15,7),price=c(144,553,321,567,890))

# Display the dataframe
print(df)

#Drop id,name and chapters using %in% operator with '!'
print(df[,!names(df) %in% 
      c("id", "name", "chapters")])

#Drop id, name and chapters using subset() with the select argument
print(subset(df, select = -c(id, name, chapters)))

#Load the dplyr package
library("dplyr")

#Drop id, name and chapters using select()
print(select(df, -c(id, name, chapters)))

#Drop id, name and chapters using within()
print(within(df, rm(id, name, chapters)))

Read through the rest of the article to learn how to delete or drop columns by name from an R data frame; each example is explained in detail below.

A data frame in R stores data in rows and columns, similar to an RDBMS table. It is a two-dimensional data structure: one dimension refers to the rows and the other to the columns.

Let’s create an R DataFrame, run these examples and explore the output. If you already have data in CSV you can easily import CSV files to R DataFrame. Also, refer to Import Excel File into R.


#Create dataframe
df=data.frame(
   "id"=c(11,22,33,44,55),
   "pages"=c(32,45,33,22,56),
   "name"=c("spark","python","R","java","jsp"),
   "chapters"=c(76,86,11,15,7),
   "price"=c(144,553,321,567,890)
  )

#Display the dataframe
print(df)

Let’s see the data present in the data frame:


#Output
  id pages   name chapters price
1 11    32  spark       76   144
2 22    45 python       86   553
3 33    33      R       11   321
4 44    22   java       15   567
5 55    56    jsp        7   890

2. Drop Columns by Name Using %in% Operator

We use the %in% operator to drop or delete columns by name from the R data frame. This operator tests each column name against a vector of names and returns a logical vector: TRUE where the column name appears in the vector and FALSE otherwise.

To drop the listed columns instead of keeping them, we negate that result with the ! (not) operator, so the listed columns become FALSE and all other columns become TRUE. Passing this logical vector to the [] index operator returns only the remaining columns.

Syntax:


#Syntax to drop columns using %in% operator
df[ , !names(df) %in% 
    c("column_name1","column_name2",..... )]

Here,

  1. df is the input data frame.
  2. column_name1,……… represent the column names to be dropped.
  3. names() is a function that takes the data frame as a parameter and returns its column names.

Example:

In this example, we will drop the id, name, and chapters columns.


#Drop id,name and chapters using %in% operator with '!'
print(df[ ,  !names(df) %in% 
    c("id", "name", "chapters")])

Output:


#Output
  pages price
1    32   144
2    45   553
3    33   321
4    22   567
5    56   890

The output shows that the three named columns were dropped and the remaining two columns of the R data frame were returned.
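To make the role of the logical vector concrete, here is a minimal sketch that prints the mask built by ! and %in%, and then shows a case the example above does not hit: when only one column survives, [ , ] simplifies the result to a plain vector unless drop = FALSE is passed. The variable names keep, only_price and price_df are illustrative and not part of the original examples.

```r
# Same data frame as in the article
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# TRUE marks a column to keep, FALSE a column to drop
keep <- !(names(df) %in% c("id", "name", "chapters"))
print(keep)
# [1] FALSE  TRUE FALSE FALSE  TRUE

# Dropping four of the five columns leaves one column,
# and [ , ] then returns a vector instead of a data frame
only_price <- df[, !(names(df) %in% c("id", "pages", "name", "chapters"))]
print(class(only_price))
# [1] "numeric"

# drop = FALSE keeps the result a one-column data frame
price_df <- df[, !(names(df) %in% c("id", "pages", "name", "chapters")),
               drop = FALSE]
print(class(price_df))
# [1] "data.frame"
```

If the code that follows expects a data frame, adding drop = FALSE is a cheap safeguard whenever the set of dropped columns is not fixed in advance.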

3. Drop R Dataframe Columns by Name Using subset() With select()

In this method, we use the subset() function with its select argument. select is an argument of subset(), not a separate function: it specifies which columns to keep, and a minus sign in front of the column names excludes them instead.

subset() takes the data frame as its first parameter and the select argument as the second. The column names are listed, unquoted, inside the c() function, which combines one or more names into a vector. Finally, we place ‘-‘ in front of c() so that those columns are dropped rather than kept.

Syntax:


#Syntax to drop columns by name using subset() with select
#(column names are unquoted; quoted names fail with unary minus)
subset(df, select = -c(column_name1, column_name2, ..... ))

Parameters:

  1. df is the input DataFrame
  2. column_name1,……… represent the column names to be dropped

Example:

In this example, we will drop the id, name, and chapters columns.


#Drop id, name and  chapters using subset() method  with select()
print(subset(df, select = -c(id, name, chapters)))

Output:


# Output
  pages price
1    32   144
2    45   553
3    33   321
4    22   567
5    56   890

The output shows that the three named columns were deleted and the remaining two columns of the R data frame were returned.
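The select argument also accepts forms other than -c(...). The sketch below, using the same data frame, drops a single column without c() and drops a contiguous run of columns with the : operator. The result names no_price and tail_cols are illustrative only.

```r
# Same data frame as in the article
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# Drop a single column: c() is not needed
no_price <- subset(df, select = -price)
print(names(no_price))
# [1] "id"       "pages"    "name"     "chapters"

# Drop a contiguous run of columns, from id through name
tail_cols <- subset(df, select = -(id:name))
print(names(tail_cols))
# [1] "chapters" "price"
```

The range form depends on column order, so it is best suited to data frames whose layout is known and stable.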

4. Drop Columns by Name Using select()

We can also use the select() function directly to drop the selected columns. select() is part of the dplyr package, so we first have to load that package using the library() function.

Syntax:


#Syntax to load a library
library("library_name")

#Syntax to load the dplyr library
library("dplyr")

select() takes the data frame as its first parameter, followed by the columns to select. Here we list the column names to be dropped inside c() and place ‘-‘ in front of it, so that those columns are dropped and all the others are kept.

Syntax:


#Syntax to drop columns using select()
select(df, -c(column_name1, column_name2, ..... ))

Parameters:

  1. df is the input DataFrame
  2. column_name1,……… represent the column names to be dropped

Example:

In this example, we will drop the id, name, and chapters columns.


#Load the dplyr package
library("dplyr")

#Drop id, name and  chapters using   select()
print(select(df, -c(id, name, chapters)) )

Output:


# Output
  pages price
1    32   144
2    45   553
3    33   321
4    22   567
5    56   890

The output shows that the three named columns were deleted and the remaining two columns of the data frame were returned.
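When the names to drop are stored in a character vector rather than typed inline, dplyr's selection helpers all_of() and any_of() can be used inside select(). The following is a sketch that assumes dplyr 1.0.0 or later is installed; drop_cols is an illustrative variable name, and "isbn" is a hypothetical column that does not exist in df.

```r
library(dplyr)

# Same data frame as in the article
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# Column names held in a variable (illustrative name)
drop_cols <- c("id", "name", "chapters")

# all_of() is strict: it raises an error if a listed column is missing
strict_result <- select(df, -all_of(drop_cols))
print(names(strict_result))
# [1] "pages" "price"

# any_of() is lenient: names not present in df ("isbn" here) are ignored
lenient_result <- select(df, -any_of(c(drop_cols, "isbn")))
print(names(lenient_result))
# [1] "pages" "price"
```

all_of() is the safer choice when a missing column should be treated as a mistake, while any_of() suits scripts that run against data frames with varying columns.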

5. Delete Columns by Name Using within()

within() takes two parameters: the first is the data frame and the second is an expression that is evaluated with the data frame's columns available as variables. Calling rm() in that expression removes the named columns, and within() returns a modified copy of the data frame.

Syntax:


#Syntax to drop columns using within()
within(df, rm(column_name1, column_name2, ..... ))

Parameters:

  1. df is the input DataFrame.
  2. rm() is short for remove. It takes column_name1,………, which represent the column names to be deleted.

Example:

In this example, we will drop the id, name, and chapters columns.


#Drop id, name and chapters columns using  within()
print(within(df, rm(id, name, chapters)))

Output:


# Output
  pages price
1    32   144
2    45   553
3    33   321
4    22   567
5    56   890

The output shows that the three named columns were dropped and the remaining two columns of the R data frame were returned.
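Like the other approaches in this article, within() returns a new data frame and leaves the original untouched, so the result has to be assigned to a variable to keep it. A minimal sketch, where trimmed is an illustrative variable name:

```r
# Same data frame as in the article
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# Assign the result to keep the trimmed data frame
trimmed <- within(df, rm(id, name, chapters))

# The original still has all five columns
print(names(df))
# [1] "id"       "pages"    "name"     "chapters" "price"

# The copy has only the two remaining columns
print(names(trimmed))
# [1] "pages" "price"
```

To overwrite the original instead, assign back to the same name, for example df <- within(df, rm(id, name, chapters)).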

6. Conclusion

In this article, we have seen 4 ways to drop or delete a single column/variable or multiple columns/variables by name from an R DataFrame (data.frame): the %in% operator, subset() with the select argument, select() from dplyr, and within() with rm(). Use whichever fits the requirements of your application. When using select(), make sure that you have loaded the dplyr library.
