  • Post category:R Programming
  • Post last modified:June 18, 2024

To extract the last N characters from a string in R, you can use the base R functions substring() and substr(), or the str_sub() function from the stringr package. This is a common task in data manipulation and text processing. The base R functions need no extra packages but require you to compute the starting position yourself, while str_sub() accepts negative positions that count from the end of the string. In this article, I will explain each of these methods with examples.


Key points:

  • substring() and substr() are native functions in R that can extract parts of a string based on specified positions.
  • The str_sub() function from the stringr package offers more flexibility and ease of use, especially for operations relative to the end of the string.
  • Use the nchar() function to calculate the length of the string, which is crucial for determining the starting position for extraction.
  • To find the starting position for extracting the last N characters, use the formula nchar(string) - n + 1.
  • The substring() function can be used to extract the last N characters by specifying the starting position derived from the above formula.
  • The substr() function requires specifying both the start and end positions. For the last N characters, the end position is the length of the string.
  • The str_sub() function simplifies extracting the last N characters by allowing the use of negative indices to count from the end of the string.
  • The function call str_sub(string, -n) directly extracts the last N characters by starting from the nth character from the end.
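
The start-position formula from the key points can be wrapped in a small helper. This is only a minimal sketch: the function name last_n_chars is hypothetical and is not part of base R or stringr.

```r
# Hypothetical helper (not part of base R): returns the last n characters of x.
last_n_chars <- function(x, n) {
  len <- nchar(x)          # length of each string
  start <- len - n + 1     # starting position of the last n characters
  substr(x, start, len)    # extract from start through the end of the string
}

last_n_chars("SparkByExamples", 8)
# [1] "Examples"
```

Because nchar() and substr() are vectorized, the same helper should also accept a character vector.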

Using substring() to Extract the Last N Characters

The substring() function in base R extracts part of a string from a given starting position and, by default, continues to the end of the string. This makes it a natural fit for extracting the last n characters.

First, define a string and specify the number of characters to extract, n. Use the nchar() function to compute the length of the string, then subtract n from this length (i.e., nchar(string) - n). Finally, add 1 to get the starting position of the last n characters.


# Extract the last n characters of a string using substring()
string <- "SparkByExamples"
n <- 8
print("Extract the last n characters from the given string:")
substring(string, nchar(string) - n + 1)

  • In the example above, nchar(string) computes the number of characters in the string SparkByExamples, which is 15.
  • nchar(string) - n + 1 calculates the starting position for extracting the last n characters. Here, 15 - 8 + 1 equals 8.
  • substring(string, 8) extracts the substring from the 8th character to the end of the string. In this case, it extracts Examples.

The code above yields the following output.

# Output:
# [1] "Extract the last n characters from the given string:"
# [1] "Examples"
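
Since substring() is vectorized, the same expression can be applied to a whole character vector. The sketch below uses made-up example values. It also illustrates what happens when a string is shorter than n: the computed start falls below 1, which base R treats as position 1, so the whole string comes back.

```r
# Example values are made up for illustration.
words <- c("SparkByExamples", "R", "stringr")
n <- 3

# nchar(words) is c(15, 1, 7), so the start positions are c(13, -1, 5)
substring(words, nchar(words) - n + 1)
# [1] "les" "R"   "ngr"
```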

Using substr() to Extract the Last N Characters in R

Alternatively, you can use substr(), another base R function, to extract the last N characters from a string. It takes three arguments: the string, the starting position, and the ending position, and it returns the substring between those positions (inclusive). For the last N characters, the start is nchar(string) - n + 1 and the end is nchar(string).


string <- "SparkByExamples"
n <- 8
print("Given string")
print(string)

print("Extract the last n characters from the given string:")
substr(string, nchar(string) - n + 1, nchar(string))

# Output:
# [1] "Given string"
# [1] "SparkByExamples"
# [1] "Extract the last n characters from the given string:"
# [1] "Examples"

  • nchar(string) returns the number of characters in the string SparkByExamples, which is 15.
  • nchar(string) - n + 1 calculates the starting position for the substring extraction. Given that nchar(string) is 15 and n is 8, the calculation is 15 - 8 + 1, which equals 8. So, the starting position is the 8th character of the string.
  • substr(string, 8, 15) extracts the substring of string that starts at position 8 and ends at position 15, which is Examples.
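
In data manipulation, the same substr() call is often applied to a data frame column. The following is a minimal sketch under assumed data: the orders data frame, its order_id column, and the suffix column are all made up for illustration.

```r
# Made-up example data: column names and values are assumptions.
orders <- data.frame(
  order_id = c("ORD-2024-0001", "ORD-2024-0042"),
  stringsAsFactors = FALSE
)
n <- 4

len <- nchar(orders$order_id)                                # length of each ID
orders$suffix <- substr(orders$order_id, len - n + 1, len)   # last n characters
orders$suffix
# [1] "0001" "0042"
```

Storing the lengths in len avoids calling nchar() twice on the same column.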

Using stringr::str_sub() in R

The str_sub() function from the stringr package offers a straightforward way to extract substrings. Given only a start position, it extracts from that position to the end of the string. A negative start position counts from the end of the string, so str_sub(string, -n) begins at the nth character from the end.


# Install the stringr package (only needed once; uncomment if it is not installed)
# install.packages("stringr")
library(stringr)
string <- "SparkByExamples"
n <- 8
print("Given string")
print(string)
print("Extract the last n characters from the given string:")
str_sub(string, -n)

# Output:
# [1] "Given string"
# [1] "SparkByExamples"
# [1] "Extract the last n characters from the given string:"
# [1] "Examples"

From the above code, str_sub(string, -n) extracts the substring starting from the nth character from the end of the string. Given that n is 8, it retrieves the last 8 characters of SparkByExamples, resulting in Examples.
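
Negative positions can be used for the end of the substring as well as the start. The sketch below assumes stringr is already installed. It shows the explicit form str_sub(string, -n, -1) and a slice taken from the middle of the last n characters.

```r
library(stringr)  # assumes the stringr package is already installed

string <- "SparkByExamples"

# -1 refers to the last character, so this is the same as str_sub(string, -8)
str_sub(string, -8, -1)
# [1] "Examples"

# From the 8th-from-last through the 5th-from-last character
str_sub(string, -8, -5)
# [1] "Exam"
```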

Conclusion

In this article, I have explained how to extract the last N characters from a string in R using the base R functions substring() and substr(), as well as the str_sub() function from the stringr package.

Happy learning!!