To remove one or more columns from an R data frame, use the base R square bracket notation [] or functions from third-party packages such as dplyr. This article walks through several ways to remove columns (variables) from an R data frame (data.frame).
1. Prepare the Data
Let’s create an R data frame to run these examples against and explore the output. If your data is already in a CSV file, you can import it into an R data frame instead. Also, refer to Import Excel File into R.
# Create dataframe
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))
# Display the dataframe
print(df)
# Output
# id pages name chapters price
#1 11 32 spark 76 144
#2 22 45 python 86 553
2. Remove Column using R Base Functions
By using R base function subset() or square bracket notation you can remove single or multiple columns by index/name from the R DataFrame.
2.1 Remove Column by Index
First, let’s use the base R bracket notation df[] to remove a column by index. This notation uses the syntax df[, columns] to select columns; to remove columns instead, prefix the index with the - (minus) operator.
The following example removes the second column by Index from the R DataFrame.
# Remove Columns by Index
df2 <- df[,-2]
df2
# Output
# id name chapters price
#1 11 spark 76 144
#2 22 python 86 553
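One detail worth knowing with negative indexes: when only a single column is left, base R simplifies the result to a plain vector unless you pass drop = FALSE. The following is a minimal sketch of that behavior, assuming the same sample data frame as in section 1.

```r
# Same sample data frame as in section 1
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Removing four of the five columns leaves a single column,
# which base R simplifies to a plain vector by default
v <- df[, -c(2, 3, 4, 5)]
is.data.frame(v)   # FALSE

# drop = FALSE keeps the result as a one-column data frame
d <- df[, -c(2, 3, 4, 5), drop = FALSE]
is.data.frame(d)   # TRUE
```

Adding drop = FALSE is a safe habit whenever the number of remaining columns is not known in advance.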
2.2 Remove Columns by Range
This notation also supports selecting columns by range, and with the negative operator you can remove columns by range. The following example removes all columns between indexes 2 and 4, that is, the columns pages, name, and chapters.
# Remove Columns by Range
df2 <- df[,-2:-4]
df2
# Output
# id price
#1 11 144
#2 22 553
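The range -2:-4 works because the minus sign binds more tightly than the colon, so it expands to -2, -3, -4. The sketch below, again assuming the sample data frame from section 1, shows the equivalent and arguably more readable form -(2:4), and why -2:4 is not a substitute.

```r
# Same sample data frame as in section 1
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Both expressions expand to the indexes -2, -3, -4
same_range <- identical(-2:-4, -(2:4))   # TRUE

# Remove columns 2 to 4 with the parenthesised form
by_range <- df[, -(2:4), drop = FALSE]
names(by_range)   # "id" "price"

# -2:4 expands to -2, -1, 0, 1, 2, 3, 4; mixing negative and
# positive indexes is an error, captured here with try()
mixed <- try(df[, -2:4], silent = TRUE)
inherits(mixed, "try-error")   # TRUE
```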
2.3 Remove Multiple Columns
Use a vector to specify the indexes of the columns you want to remove from the R data frame. The following example removes the columns at indexes 2 and 3.
# Remove Multiple columns
df2 <- df[,-c(2,3)]
df2
# Output
# id chapters price
#1 11 76 144
#2 22 86 553
2.4 Remove Columns From List
You can also remove columns by listing their names in a character vector. Here I am using the names() function, which returns all column names, together with the %in% operator, which checks whether each name is present in the vector. The ! operator negates the result so that only the columns not in the vector are kept.
# Remove Columns in List
df2 <- df[,!names(df) %in% c("id", "name", "chapters")]
# Output
# pages price
#1 32 144
#2 45 553
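To see what the expression above does step by step, the following sketch splits it into the intermediate logical vector and the final subset. The variable names drop_cols and in_list are only illustrative, and the data frame is the sample from section 1.

```r
# Same sample data frame as in section 1
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Names of the columns to remove (illustrative variable name)
drop_cols <- c("id", "name", "chapters")

# One logical value per column: TRUE when the column is in drop_cols
in_list <- names(df) %in% drop_cols
in_list   # TRUE FALSE TRUE TRUE FALSE

# Negate the vector to keep only the columns that are NOT in drop_cols
df2 <- df[, !in_list, drop = FALSE]
names(df2)   # "pages" "price"
```

A name in drop_cols that does not exist in the data frame simply never matches, so this form does not raise an error for unknown columns.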
2.5 By using subset() Function
By using the base R function subset() you can remove columns by name from the data frame. This function takes the data frame as its first argument and, in the select argument, the columns you want to remove, prefixed with a minus sign.
# Remove using subset
df2 <- subset(df, select = -c(id, name, chapters))
Yields the same output as above.
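As a small sketch of two further uses of the same select argument, assuming the sample data frame from section 1: a single unquoted name removes one column, and a name range removes a run of adjacent columns.

```r
# Same sample data frame as in section 1
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Remove a single column by name
one_removed <- subset(df, select = -pages)
names(one_removed)     # "id" "name" "chapters" "price"

# Remove a range of adjacent columns by name
range_removed <- subset(df, select = -(pages:chapters))
names(range_removed)   # "id" "price"
```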
3. Remove Columns by using dplyr Functions
In this section, I will use functions from the dplyr package to remove columns from an R data frame. dplyr is an R package that provides a grammar of data manipulation: a consistent set of verbs that help you solve the most common data manipulation tasks. To use it, first install it with install.packages('dplyr') and then load it with library(dplyr).
3.1 Remove Column by Matching
The dplyr select() function selects columns, and negating the selection with - removes them instead. All verbs in the dplyr package take a data.frame as their first argument. With dplyr, we mostly use the infix operator %>% from magrittr, which passes the left-hand side of the operator as the first argument of the function on the right-hand side.
For example, x %>% f(y) is converted into f(x, y), so the result from the left-hand side is “piped” into the right-hand side. This pipe lets you chain multiple operations that read left to right.
# Load the dplyr package
library("dplyr")
# Remove columns using select()
df2 <- df %>% select(-c(id, name, chapters))
# Output
# pages price
#1 32 144
#2 45 553
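To make the pipe equivalence concrete, the sketch below runs the same select() call once with %>% and once as a plain function call, and checks that both give the same result. It assumes dplyr is installed and uses the sample data frame from section 1.

```r
library(dplyr)

# Same sample data frame as in section 1
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Piped form: df becomes the first argument of select()
piped <- df %>% select(-c(id, name, chapters))

# Equivalent direct call
direct <- select(df, -c(id, name, chapters))

identical(piped, direct)   # TRUE
```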
3.2 Remove Variables By Name Range
The same function can also be used to remove variables by name range.
# Remove columns by Range
df2 <-df %>% select(-(id:chapters))
df2
# Output
# price
#1 144
#2 553
3.3 Remove Variables using contains
Use -contains() to ignore columns whose names contain a given text. The following example removes the column chapters because its name contains the text apt. This function also accepts a character vector of several values to match.
# Remove columns contains character
df2 <-df %>% select(-contains('apt'))
df2
# Output
# id pages name price
#1 11 32 spark 144
#2 22 45 python 553
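As a brief sketch of passing several values to contains(), the example below removes every column whose name contains either apt or ric. It assumes dplyr is installed and uses the sample data frame from section 1; the second pattern ric is chosen only for illustration.

```r
library(dplyr)

# Same sample data frame as in section 1
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# A character vector matches columns containing ANY of the values:
# "chapters" contains "apt" and "price" contains "ric"
df2 <- df %>% select(-contains(c("apt", "ric")))
names(df2)   # "id" "pages" "name"
```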
3.4 Remove Column starts with
Use -starts_with() to ignore columns that start with a given text. The following example removes the column chapters as it starts with the character c.
# Remove columns starts with
df2 <-df %>% select(-starts_with('c'))
df2
# Output
# id pages name price
#1 11 32 spark 144
#2 22 45 python 553
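Note that starts_with() ignores case by default, which can be surprising when column names differ only in capitalisation. The sketch below illustrates this with the ignore.case argument; it assumes a reasonably recent dplyr (1.0 or later) and the sample data frame from section 1.

```r
library(dplyr)

# Same sample data frame as in section 1
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# Default is ignore.case = TRUE, so an uppercase "C" still matches "chapters"
loose <- df %>% select(-starts_with("C"))
names(loose)    # "id" "pages" "name" "price"

# With ignore.case = FALSE nothing matches, so no column is removed
strict <- df %>% select(-starts_with("C", ignore.case = FALSE))
names(strict)   # "id" "pages" "name" "chapters" "price"
```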
3.5 Remove Column ends with
Similarly, use -ends_with() to remove variables that end with a given text. The following example removes the name and price columns as they end with the letter e.
# Remove columns ends with
df2 <-df %>% select(-ends_with('e'))
df2
# Output
# id pages chapters
#1 11 32 76
#2 22 45 86
3.6 Remove Columns if it exists
Finally, use the one_of() function to remove columns only when they exist in the data frame. If a listed column is not found, select() issues a warning and still removes the columns that do exist.
df2 <- df %>%
select(-one_of("name", "marks"))
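In current versions of dplyr (1.0 and later), one_of() is superseded by any_of() and all_of(). The sketch below shows the any_of() form, which silently skips names that are missing; it assumes dplyr 1.0 or later and the sample data frame from section 1, in which the column marks does not exist.

```r
library(dplyr)

# Same sample data frame as in section 1
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# "name" exists and is removed; "marks" does not exist and is skipped
df2 <- df %>% select(-any_of(c("name", "marks")))
names(df2)   # "id" "pages" "chapters" "price"
```

Use all_of() instead when a missing column should be treated as an error rather than skipped.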
4. Complete Example of Removing Columns in R
The following is a complete example of how to remove a single column/variable or several columns/variables from the R data frame (data.frame).
# Create dataframe
df <- data.frame(id = c(11, 22, 33, 44, 55),
                 pages = c(32, 45, 33, 22, 56),
                 name = c("spark", "python", "R", "java", "jsp"),
                 chapters = c(76, 86, 11, 15, 7),
                 price = c(144, 553, 321, 567, 890))
# Display the dataframe
print(df)
# Remove Columns by Index
df2 <- df[,-2]
# Remove Columns by Range
df2 <- df[,-2:-4]
# Remove Multiple columns
df2 <- df[,-c(2,3)]
# Remove Columns in List
df2 <- df[,!names(df) %in% c("id", "name", "chapters")]
# Remove using subset
df2 <- subset(df, select = -c(id, name, chapters))
# Load the dplyr package
library("dplyr")
# Remove columns using select()
df2 <- df %>% select(-c(id, name, chapters))
# Remove columns by Range
df2 <- df %>% select(-(id:chapters))
# Remove columns contains character
df2 <- df %>% select(-contains('apt'))
# Remove columns starts with
df2 <- df %>% select(-starts_with('c'))
# Remove columns ends with
df2 <- df %>% select(-ends_with('e'))
# Remove columns using within()
df2 <- within(df, rm(id, name, chapters))
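The within() call at the end of the listing is not covered in the earlier sections, so here is a short sketch of how it behaves. within() evaluates the expression in an environment built from the data frame's columns, and rm() deletes the named variables there, so they are absent from the returned data frame. The example uses the two-row sample data frame from section 1.

```r
# Same sample data frame as in section 1
df <- data.frame(id = c(11, 22),
                 pages = c(32, 45),
                 name = c("spark", "python"),
                 chapters = c(76, 86),
                 price = c(144, 553))

# rm() removes the columns inside within(); df itself is left unchanged
df2 <- within(df, rm(id, name, chapters))
names(df2)   # "pages" "price"
ncol(df)     # 5
```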
Conclusion
In this article, you have learned different ways to remove a single column/variable or several columns/variables from an R data frame. The examples include removing columns by name, by index, by range, from a list of names, and based on conditions. You have also learned how to use the select() function from the dplyr package.