Let’s see how to drop (delete) one or more columns by name from an R data frame (data.frame). Since columns are also called variables, this is the same as dropping variables by name from the data frame.
First, I will cover the %in% operator. Second, I will use subset() with its select argument. Third, I will use the select() function available in the dplyr package. Finally, I will use within() together with rm() to drop the selected columns.
Related: How to Delete Rows in R Data Frame?
1. Quick Examples of Drop Columns by Name
If you are in a hurry, here is the complete code that drops a single column or multiple columns (variables) by name from an R data frame.
# Create dataframe
df <- data.frame(id = c(11, 22, 33, 44, 55),
                 pages = c(32, 45, 33, 22, 56),
                 name = c("spark", "python", "R", "java", "jsp"),
                 chapters = c(76, 86, 11, 15, 7),
                 price = c(144, 553, 321, 567, 890))
# Display the dataframe
print(df)
#Drop id,name and chapters using %in% operator with '!'
print(df[ , !names(df) %in% c("id", "name", "chapters")])
#Drop id, name and chapters using subset() method with select()
print(subset(df, select = -c(id, name, chapters)))
#load the dplyr package
library("dplyr")
#Drop id, name and chapters using select()
print(select(df, -c(id, name, chapters)) )
#Drop id, name and chapters using within() with rm()
print(within(df, rm(id, name, chapters)) )
Read through the rest of the article, where each of these examples is explained in detail.
A data frame in R stores data in rows and columns, similar to a table in an RDBMS. It is a two-dimensional data structure: one dimension refers to the rows and the other to the columns.
Let’s create an R data frame, run these examples, and explore the output. If you already have data in a CSV file, you can easily import it into an R data frame. Also, refer to Import Excel File into R.
#Create dataframe
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)
#Display the dataframe
print(df)
Let’s see the data present in the data frame:
#Output
  id pages   name chapters price
1 11    32  spark       76   144
2 22    45 python       86   553
3 33    33      R       11   321
4 44    22   java       15   567
5 55    56    jsp        7   890
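To make the "two dimensions" concrete, here is a minimal sketch that inspects the same data frame with the base R functions dim() and names(). The variable names dims and col_names are only illustrative.

```r
# Recreate the example data frame used throughout this article
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# dim() returns the number of rows, then the number of columns
dims <- dim(df)
print(dims)

# names() returns the column (variable) names
col_names <- names(df)
print(col_names)
```

Checking names(df) before dropping columns is a cheap way to confirm the exact spelling of the names you plan to remove.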
2. Drop Columns by Name Using %in% Operator
We use the %in% operator to drop columns by name from the R data frame. For each column name returned by names(df), %in% tests whether that name appears in a given vector of names and returns a logical vector.
To drop the matching columns rather than keep them, we negate that vector with the ! (not) operator. Passing the negated vector to the index operator [] then returns only the remaining columns.
Syntax:
#Syntax to drop columns using %in% operator
df[ , !names(df) %in% c("column_name1","column_name2",..... )]
Here,
- df is the input data frame.
- column_name1, ... represent the names of the columns to be dropped.
- names() is a function that takes the input data frame as a parameter and returns its column names.
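It may help to look at the intermediate values before indexing. The sketch below shows the logical vector that %in% produces and its negation. The names drop_cols, is_dropped, and keep are illustrative, not part of any API.

```r
# Recreate the example data frame
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# Names of the columns we want to drop
drop_cols <- c("id", "name", "chapters")

# One logical value per column: TRUE where the name is in drop_cols
is_dropped <- names(df) %in% drop_cols
print(is_dropped)  # TRUE FALSE TRUE TRUE FALSE

# Negating it marks the columns to keep
keep <- !is_dropped
print(keep)        # FALSE TRUE FALSE FALSE TRUE

# Indexing with the logical vector keeps only the TRUE columns
result <- df[ , keep]
print(names(result))
```

Note that %in% is evaluated before !, so !names(df) %in% drop_cols means !(names(df) %in% drop_cols).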
Example:
In this example, we will drop the id, name, and chapters columns.
#Drop id,name and chapters using %in% operator with '!'
print(df[ , !names(df) %in% c("id", "name", "chapters")])
Output:
#Output
  pages price
1    32   144
2    45   553
3    33   321
4    22   567
5    56   890
The output shows that the three named columns were dropped and the remaining two columns of the R data frame were returned.
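One detail worth knowing with this approach: when only a single column remains, the [] operator simplifies the result to a plain vector by default. The sketch below, using illustrative variable names, shows how the standard drop = FALSE argument of [] keeps the result a data frame.

```r
# Recreate the example data frame
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# Dropping four of the five columns leaves only 'price'
drop_cols <- c("id", "pages", "name", "chapters")

# Default behaviour: the single remaining column becomes a numeric vector
as_vector <- df[ , !names(df) %in% drop_cols]
print(class(as_vector))

# With drop = FALSE the result stays a one-column data frame
as_frame <- df[ , !names(df) %in% drop_cols, drop = FALSE]
print(class(as_frame))
print(names(as_frame))
```

If the code that follows expects a data frame, adding drop = FALSE avoids surprises when the list of dropped columns grows.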
3. Drop R Dataframe Columns by Name Using subset() With select()
In this method, we use the subset() function to drop columns. subset() takes the data frame as its first argument, and its select argument specifies which columns to keep or drop.
The column names are given unquoted inside the c() function, which combines one or more column names into a vector. Finally, placing a minus sign (-) in front of c() drops the listed columns and keeps the rest.
Syntax:
#Syntax to drop columns by name using subset() with select
subset(df, select = -c(column_name1, column_name2, ..... ))
Parameters:
- df is the input data frame.
- column_name1, ... represent the names of the columns to be dropped (written without quotes).
Example:
In this example, we will drop the id, name, and chapters columns.
#Drop id, name and chapters using subset() method with select()
print(subset(df, select = -c(id, name, chapters)))
Output:
# Output
  pages price
1    32   144
2    45   553
3    33   321
4    22   567
5    56   890
The output shows that the three named columns were deleted and the remaining two columns of the R data frame were returned.
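A few variations on select may be useful. Inside select, column names stand for their positions, so a single name or a range such as id:name can be negated too, while quoted names with a minus sign fail. This is a small sketch with illustrative variable names; tryCatch() is used only to capture the error without stopping the script.

```r
# Recreate the example data frame
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# Drop a single column: no c() needed
no_id <- subset(df, select = -id)
print(names(no_id))

# Drop a contiguous range of columns (id, pages, name)
no_range <- subset(df, select = -(id:name))
print(names(no_range))

# Quoted names cannot be negated with '-': this raises an error,
# which we capture here instead of letting the script stop
quoted_result <- tryCatch(
  subset(df, select = -c("id", "name")),
  error = function(e) "error"
)
print(quoted_result)
```

The range form depends on the current column order, so dropping by explicit names is usually the safer choice.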
4. Drop Columns by Name Using select()
It is also possible to use the select() function directly to drop the selected columns. select() is provided by the dplyr package, so we have to load that package first using the library() function.
Syntax:
#Syntax to load a library
library("library_name")
#Syntax to load the dplyr library
library("dplyr")
select() takes the data frame as its first argument, followed by the columns to select. Listing the column names inside c() and placing a minus sign (-) in front of it drops those columns and keeps the rest.
Syntax:
#Syntax to drop columns using select()
select(df, -c("column_name1","column_name2",..... ))
Parameters:
- df is the input data frame.
- column_name1, ... represent the names of the columns to be dropped.
Example:
In this example, we will drop the id, name, and chapters columns.
#Load the dplyr package
library("dplyr")
#Drop id, name and chapters using select()
print(select(df, -c(id, name, chapters)) )
Output:
# Output
  pages price
1    32   144
2    45   553
3    33   321
4    22   567
5    56   890
The output shows that the three named columns were deleted and the remaining two columns of the data frame were returned.
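When the column names are stored in a character vector, dplyr's selection helpers all_of() and any_of() can be combined with the minus sign. This sketch assumes dplyr 1.0.0 or later is installed; drop_cols and the other variable names are illustrative.

```r
# Load dplyr quietly (assumes dplyr >= 1.0.0 is installed)
suppressPackageStartupMessages(library(dplyr))

# Recreate the example data frame
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# Column names held in a character vector
drop_cols <- c("id", "name", "chapters")

# all_of() is strict: it errors if any listed name is missing from df
strict_result <- select(df, -all_of(drop_cols))
print(names(strict_result))

# any_of() is lenient: names that do not exist are silently ignored
lenient_result <- select(df, -any_of(c("id", "not_a_column")))
print(names(lenient_result))
```

any_of() is convenient when the same clean-up code runs on data frames that may or may not contain every column in the list.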
5. Delete Columns by Name Using within()
within() takes two arguments: the first is the data frame, and the second is an expression that is evaluated with the data frame’s columns available as variables. Calling rm() in that expression removes the named columns, and within() returns the modified copy of the data frame.
Syntax:
#Syntax to drop columns using within()
within(df, rm("column_name1","column_name2",.....))
Parameters:
- df is the input data frame.
- rm() stands for remove. It takes column_name1, ... which represent the names of the columns to be deleted.
Example:
In this example, we will drop the id, name, and chapters columns.
#Drop id, name and chapters columns using within()
print(within(df, rm(id, name, chapters)))
Output:
# Output
  pages price
1    32   144
2    45   553
3    33   321
4    22   567
5    56   890
The output shows that the three named columns were dropped and the remaining two columns of the R data frame were returned.
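Like the other approaches in this article, within() does not change the original data frame; it returns a modified copy. The following sketch, with the illustrative name trimmed, shows that the result has to be assigned if you want to keep it.

```r
# Recreate the example data frame
df <- data.frame(
  id = c(11, 22, 33, 44, 55),
  pages = c(32, 45, 33, 22, 56),
  name = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price = c(144, 553, 321, 567, 890)
)

# within() returns a new data frame without the removed columns
trimmed <- within(df, rm(id, name, chapters))
print(names(trimmed))

# The original data frame still has all five columns
original_names <- names(df)
print(original_names)

# To make the change permanent, assign the result back to df
df <- within(df, rm(id, name, chapters))
print(names(df))
```

The same assignment pattern applies to the %in%, subset(), and select() examples above.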
6. Conclusion
In this article, we have seen four ways to drop a single column or multiple columns (variables) by name from an R data frame (data.frame): the %in% operator, subset() with select, select() from dplyr, and within() with rm(). You can use whichever fits the requirements of your application. When using select(), make sure that you have loaded the dplyr library.