  • Post category: R Programming
  • Post last modified: August 18, 2024
How to Convert Factor to Numeric in R?

How do you convert a factor to numeric in R? Use the as.numeric() function together with as.character(): calling as.numeric() on a factor directly returns its internal integer codes, so the factor is first converted to character and then to numeric. The same as.numeric() function also converts character data to numeric format. Converting factor vectors to numeric vectors is a common task, especially when working with data imported from external sources, where numeric data may be read as factors. The conversion is needed before you can perform numerical operations and analyses on the values.


In this article, I will explore several ways to convert factor vectors to numeric values in R.

Key points:

  • In R, factors are used to represent categorical data and store the data as integer codes with corresponding levels.
  • The as.numeric() function is commonly used to convert factors to numeric values in R.
  • Before converting factors to numeric, it’s often necessary to first convert them to character vectors using as.character().
  • Directly applying as.numeric() to a factor yields internal integer codes, not the actual numeric values represented by the factor levels.
  • To obtain the actual numeric values from factor levels, you can use as.numeric(levels(f))[f], where f is the factor vector.
  • Converting factors with missing values (NA) to numeric retains the NA while converting valid values.
  • Factors containing non-numeric levels when converted to numeric result in NA values for non-numeric entries.
  • The dplyr package, which is part of the tidyverse, provides mutate() for converting factor columns within data frames to numeric.
  • When converting a factor column within a data frame to numeric, it’s essential to ensure proper handling of factors with non-numeric levels or missing values.
  • The choice of conversion method (direct conversion, using levels, or package functions) depends on the specific data structure and the desired outcome, whether it’s handling missing data, ensuring numeric integrity, or maintaining efficiency.
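
The difference between the integer codes and the level labels is easiest to see on a small vector. The sketch below uses a hypothetical factor, f, whose values are deliberately out of sorted order so that the codes and the labels differ.

```r
# Hypothetical factor whose values are not in sorted order
f <- factor(c("30", "10", "20"))

# Levels are the sorted unique labels
lv <- levels(f)

# Direct conversion returns the position of each value within the levels
codes <- as.numeric(f)

# Converting through character returns the numbers the labels represent
values <- as.numeric(as.character(f))

print(lv)      # "10" "20" "30"
print(codes)   # 3 1 2
print(values)  # 30 10 20
```

Here the codes 3 1 2 are valid numbers, which is why the mistake is easy to miss: no error or warning is raised.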

Convert Factor to Numeric in R using as.numeric()

Let’s create the factor vector and use a two-step conversion process involving as.character() and as.numeric() to get the exact numeric values. If you try to convert the factor directly to numeric using as.numeric(), you will get the internal integer codes instead of the actual numeric values.


# Convert factor to numeric using as.numeric()
fact_vec <- factor(c("10", "20", "30"))
print("Given factor vector:")
print(fact_vec)
print("Get the type of the vector:")
print(class(fact_vec))
num_vec <- as.numeric(as.character(fact_vec))
print("After converting factor to numeric:")
print(num_vec)
print("Get the type of the vector:")
print(class(num_vec))

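# Output:
# [1] "Given factor vector:"
# [1] 10 20 30
# Levels: 10 20 30
# [1] "Get the type of the vector:"
# [1] "factor"
# [1] "After converting factor to numeric:"
# [1] 10 20 30
# [1] "Get the type of the vector:"
# [1] "numeric"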

Convert Factor to Numeric Using Factor Levels

Alternatively, you can work with the levels of the factor. Applying as.numeric() directly to the factor returns only the internal integer codes of its levels. Instead, convert the levels themselves to numeric with as.numeric(levels(fact_vec)) and index the result by the factor, which maps each element to the numeric value of its level. Because only the unique levels are converted, this approach can be more efficient than converting every element through as.character().


# Convert factor to numeric using its levels
fact_vec <- factor(c("10", "20", "30"))
print("Given factor vector:")
print(fact_vec)
print("Get the type of the vector:")
print(class(fact_vec))
num_vec <- as.numeric(levels(fact_vec))[fact_vec]
print("After converting factor to numeric:")
print(num_vec)
print("Get the type of the vector:")
print(class(num_vec))

Yields the same output as above.
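
To see why the levels-based form works, it can help to break it into steps. The following sketch uses an illustrative factor, f, with repeated values (the variable names are my own): it converts the levels once, indexes that short numeric vector by the factor's integer codes, and then checks that the result agrees with the as.character() approach.

```r
# Illustrative factor with repeated values
f <- factor(c("20", "10", "20", "30", "10"))

# Step 1: convert only the unique levels to numeric
level_values <- as.numeric(levels(f))

# Step 2: the factor's integer codes say which level each element uses
codes <- as.integer(f)

# Step 3: look up each element's numeric value by its code;
# this is what as.numeric(levels(f))[f] does in a single expression
via_levels <- level_values[codes]

# The two-step character approach gives the same result
via_character <- as.numeric(as.character(f))

print(via_levels)
print(identical(via_levels, via_character))
```

Only three strings are parsed here instead of five, which is where the efficiency gain on long vectors with few levels comes from.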

Convert Data Frame Column from Factor to Numeric

Similarly, you can use a combination of the as.character() and as.numeric() functions to convert a specified data frame column from factor to numeric. Create a data frame with a factor column and use these functions to convert the factor column to numeric directly within the data frame. This straightforward approach allows for single-column conversion without the need for additional packages.


# Convert data frame column to numeric using as.numeric()
df <- data.frame(fact_vec = factor(c("10", "20", "30")))
print("Given data frame:")
print(df)
df$num_vec <- as.numeric(as.character(df$fact_vec))
print(df$num_vec)
print("Get the type of vector")
print(class(df$num_vec))

# Output:
# [1] "Given data frame:"
#   fact_vec
# 1       10
# 2       20
# 3       30

# [1] 10 20 30
# [1] "Get the type of vector"
# [1] "numeric"
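
When a data frame has several factor columns, the same base R conversion can be applied to all of them at once. The sketch below is one possible way to do it; the data frame df_multi and its column names are hypothetical, and it assumes every factor column holds numeric labels (any other label would become NA).

```r
# Hypothetical data frame with two factor columns and one character column
df_multi <- data.frame(
  price = factor(c("1.5", "2.5", "3.5")),
  qty   = factor(c("10", "20", "30")),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Find the factor columns
is_fct <- vapply(df_multi, is.factor, logical(1))

# Convert only those columns, leaving the others untouched
df_multi[is_fct] <- lapply(
  df_multi[is_fct],
  function(col) as.numeric(as.character(col))
)

print(sapply(df_multi, class))
```

Selecting columns with is.factor keeps the character column label out of the conversion, so it is not turned into NA values.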

Convert Factor to Numeric With dplyr Package

To convert data in R from factor to numeric, you can use the mutate() function from the dplyr package. First, create a data frame with a factor column, then use the mutate() function to add a new numeric column converted from the factor. Finally, use the sapply() function to get the type of each column in the data frame.


# Convert data frame column to numeric using dplyr
library(dplyr)
df <- data.frame(fact_vec = factor(c("100", "200", "300")))
print("Given data frame:")
print(df)
df <- df %>% mutate(num_vec = as.numeric(as.character(fact_vec)))
print("After converting column of data frame to numeric:")
print(df)
print("Get the type of each column:")
print(sapply(df, class))

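# Output:
# [1] "Given data frame:"
#   fact_vec
# 1      100
# 2      200
# 3      300
# [1] "After converting column of data frame to numeric:"
#   fact_vec num_vec
# 1      100     100
# 2      200     200
# 3      300     300
# [1] "Get the type of each column:"
#  fact_vec   num_vec 
#  "factor" "numeric" 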
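
If you would rather overwrite every factor column in place than add new columns, mutate() can be combined with across(). This is a minimal sketch that assumes dplyr 1.0.0 or later is installed; the data frame df_sales and its columns are made up for illustration.

```r
library(dplyr)

# Hypothetical data frame where both columns were read as factors
df_sales <- data.frame(
  units = factor(c("100", "200", "300")),
  price = factor(c("1.5", "2.5", "3.5"))
)

# Replace each factor column with its numeric equivalent
df_sales <- df_sales %>%
  mutate(across(where(is.factor), ~ as.numeric(as.character(.x))))

print(sapply(df_sales, class))
```

As with the base R version, this only makes sense when every factor column is known to contain numeric labels.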

Convert Factor to Numeric Using tidyverse

The tidyverse is a collection of packages that includes dplyr, so loading it with library(tidyverse) makes mutate() and the pipe available, along with tibble() for creating tibbles. As in the dplyr example, mutate() adds a new numeric column by first converting the factor to a character vector and then to a numeric vector.


# Convert tibble column to numeric using tidyverse
library(tidyverse)
df <- tibble(fact_vec = factor(c("10", "20", "30")))
print("Given data frame:")
print(df)
print(class(df))
df <- df %>% mutate(num_vec = as.numeric(as.character(fact_vec)))
print(df$num_vec)
print("Get the type of vector")
print(class(df$num_vec))

# Output:
# [1] "Given data frame:"
# A tibble: 3 × 1
#   fact_vec
#   <fct>   
# 1 10      
# 2 20      
# 3 30  
# [1] "tbl_df"     "tbl"        "data.frame"    
# [1] 10 20 30
# [1] "Get the type of vector"
# [1] "numeric"

Handling Missing Values

If the factor contains missing values (NA), you can still convert it to numeric with the as.character() and as.numeric() functions. The valid values are converted to numeric and the NA values are preserved unchanged.


# Handling missing values
fact_vec <- factor(c("10", "20", NA, "30"))
print("Given vector:")
print(fact_vec)
num_vec <- as.numeric(as.character(fact_vec))
print("After converting factor to numeric:")
print(num_vec)
print("Get the type of vector:")
print(class(num_vec))

# Output:
# [1] "Given vector:"
# [1] 10   20   <NA> 30  
# Levels: 10 20 30
# [1] "After converting factor to numeric:"
# [1] 10 20 NA 30
# [1] "Get the type of vector:"
# [1] "numeric"
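
Once the NA values have been carried over, later calculations need to account for them. The short sketch below, with variable names of my own choosing, counts the missing entries and shows the effect of na.rm = TRUE on mean().

```r
fact_vec <- factor(c("10", "20", NA, "30"))
num_vec <- as.numeric(as.character(fact_vec))

# Count how many values are missing after conversion
n_missing <- sum(is.na(num_vec))

# Without na.rm, a single NA makes the result NA
avg_with_na <- mean(num_vec)

# With na.rm = TRUE, the NA is dropped before averaging
avg <- mean(num_vec, na.rm = TRUE)

print(n_missing)    # 1
print(avg_with_na)  # NA
print(avg)          # 20
```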

Handling Non-Numeric Levels

Finally, if a factor contains non-numeric levels, converting it with as.character() and as.numeric() returns NA for the non-numeric entries, along with a warning such as In eval(ei, envir) : NAs introduced by coercion. Note that applying as.numeric() directly to the factor would not produce NA here; it would silently return the integer codes.


# Convert a factor with non-numeric characters
# to numeric using as.numeric()
fact_vec <- factor(c("10", "spark", "30"))
print("Given factor vector:")
print(fact_vec)
num_vec <- as.numeric(as.character(fact_vec))
print("After converting factor to numeric:")
print(num_vec)
print("Get the type of the vector:")
print(class(num_vec))

# Output:
# [1] "Given factor vector:"
# [1] 10    spark 30   
# Levels: 10 30 spark
# [1] "After converting factor to numeric:"
# [1] 10 NA 30
# [1] "Get the type of the vector:"
# [1] "numeric"
# Warning message:
# In eval(ei, envir) : NAs introduced by coercion
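
Before converting, it can be useful to find out which levels would turn into NA. The sketch below is one way to check; it tests only the levels, and suppressWarnings() is used so the check itself stays quiet. The name bad_levels is my own.

```r
fact_vec <- factor(c("10", "spark", "30"))

# Try to convert each level and keep the ones that fail
lv <- levels(fact_vec)
bad_levels <- lv[is.na(suppressWarnings(as.numeric(lv)))]

print(bad_levels)  # "spark"
```

If bad_levels is empty, the conversion will not introduce any new NA values.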

Conclusion

In this article, I have discussed several methods to convert factor vectors to numeric in R. These methods include using base R functions like as.numeric() and levels(), as well as leveraging packages such as dplyr and tidyverse. Additionally, I’ve highlighted the importance of handling factors with non-numeric levels or missing values carefully to ensure precise and meaningful conversion.

Happy learning!!