To extract the last N characters from a string in R, you can use the base R functions substring() and substr(), or the str_sub() function from the stringr package. Extracting the last N characters is a common task in data manipulation and text processing. The base R functions need no extra packages, while str_sub() accepts negative indices that count from the end of the string. In this article, I will explain multiple methods for extracting the last N characters from a string using both base R functions and the stringr package.
Key points
- substring() and substr() are native functions in R that can extract parts of a string based on specified positions.
- The str_sub() function from the stringr package offers more flexibility and ease of use, especially for operations relative to the end of the string.
- Use the nchar() function to calculate the length of the string, which is crucial for determining the starting position for extraction.
- To find the starting position for extracting the last N characters, use the formula nchar(string) - n + 1.
- The substring() function can be used to extract the last N characters by specifying the starting position derived from the above formula.
- The substr() function requires specifying both the start and end positions. For the last N characters, the end position is the length of the string.
- The str_sub() function simplifies extracting the last N characters by allowing the use of negative indices to count from the end of the string.
- The function call str_sub(string, -n) directly extracts the last N characters by starting from the nth character from the end.
Using substring() to Extract the Last N Characters
The substring() function in base R extracts part of a string from a given starting position. When no end position is given, it extracts through the end of the string, so it can return the last n characters.
First, define a string and the number of characters to extract, n. Use the nchar() function to compute the length of the string, then subtract n from this length and add 1 (i.e., nchar(string) - n + 1). The result is the starting position of the last n characters.
# Extract the last n characters of a string using substring()
string <- "SparkByExamples"
n <- 8
print("Extract the last n characters from the given string:")
substring(string, nchar(string) - n + 1)
From the above example:
- nchar(string) computes the number of characters in the string SparkByExamples, which is 15.
- nchar(string) - n + 1 calculates the starting position for extracting the last n characters. Here, 15 - 8 + 1 equals 8.
- substring(string, 8) extracts the substring from the 8th character to the end of the string. In this case, it extracts Examples.
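The code above yields the following output.
# Output:
# [1] "Extract the last n characters from the given string:"
# [1] "Examples"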
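Because nchar() and substring() are both vectorized, the same formula can be applied to a whole character vector at once. The following is a minimal sketch; the vector name words and its values are hypothetical and are not part of the example above.

```r
# Hypothetical input vector, used only for illustration
words <- c("SparkByExamples", "substring", "stringr")
n <- 3

# nchar() returns one length per element, so the starting
# position is computed separately for each string
last_n <- substring(words, nchar(words) - n + 1)
print(last_n)
```

Each element is handled independently, so strings of different lengths can share one call.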
Using substr() to Extract the Last N Characters in R
Alternatively, you can use the substr() function, another base R function, to extract the last N characters from a string. It takes three parameters: the string, the starting position, and the ending position, and it returns the substring between those two positions. For the last N characters, the starting position is nchar(string) - n + 1 and the ending position is nchar(string).
string <- "SparkByExamples"
n <- 8
print("Given string")
print(string)
print("Extract the last n characters from the given string:")
substr(string, nchar(string) - n + 1, nchar(string))
# Output:
# [1] "Given string"
# [1] "SparkByExamples"
# [1] "Extract the last n characters from the given string:"
# [1] "Examples"
From the above example:
- nchar(string) returns the number of characters in the string SparkByExamples, which is 15.
- nchar(string) - n + 1 calculates the starting position for the substring extraction. Given that nchar(string) is 15 and n is 8, the calculation is 15 - 8 + 1, which equals 8. So, the starting position is the 8th character of the string.
- substr(string, 8, 15) extracts a substring from string starting at position 8 and ending at position 15, which is Examples.
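If this extraction is needed in several places, the start and end calculation can be wrapped in a small function. The sketch below is one possible way to do that; last_n_chars() is a hypothetical helper name, not a base R function, and clamping the start position at 1 is an added assumption so that a request for more characters than the string contains returns the whole string.

```r
# last_n_chars() is a hypothetical helper, not part of base R
last_n_chars <- function(x, n) {
  len <- nchar(x)
  # Clamp the start at 1 so that an n larger than the string
  # length returns the whole string
  start <- pmax(len - n + 1, 1)
  substr(x, start, len)
}

last_n_chars("SparkByExamples", 8)  # "Examples"
last_n_chars("R", 8)                # "R"
```

Using pmax() rather than max() keeps the helper vectorized, so it also accepts a character vector for x.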
Using stringr::str_sub() in R
The str_sub() function from the stringr package offers a more direct way to do this. By default, it extracts a substring from a given starting position to the end of the string. To count the starting position from the end of the string, pass a negative value: -n refers to the nth character from the end.
# Install the stringr package (only needed once)
install.packages("stringr")
# Load the package
library(stringr)
string <- "SparkByExamples"
n <- 8
print("Given string")
print(string)
print("Extract the last n characters from the given string:")
str_sub(string, -n)
# Output:
# [1] "Given string"
# [1] "SparkByExamples"
# [1] "Extract the last n characters from the given string:"
# [1] "Examples"
From the above code, str_sub(string, -n) extracts the substring starting from the nth character from the end of the string. Given that n is 8, it retrieves the last 8 characters of SparkByExamples, resulting in Examples.
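The negative index also works on character vectors, and the end position can be written out explicitly. The sketch below assumes stringr is already installed and skips the call otherwise; the vector words is a hypothetical input used only for illustration.

```r
# Assumes the stringr package is already installed;
# the block is skipped if it is not available
if (requireNamespace("stringr", quietly = TRUE)) {
  # Hypothetical input vector, used only for illustration
  words <- c("SparkByExamples", "substring", "stringr")

  # start = -3 counts from the end of each string;
  # end = -1 is the last character, which is also the default
  last_three <- stringr::str_sub(words, start = -3, end = -1)
  print(last_three)
}
```

Because the index is relative to the end of each string, no nchar() calculation is needed even when the strings differ in length.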
Conclusion
In this article, I have explained how to extract the last N characters from a string in R using the base R functions substring() and substr(), as well as the str_sub() function from the stringr package. The base R functions rely on nchar() to compute the starting position, while str_sub() accepts a negative index that counts from the end of the string.
Happy learning!!