The set.seed() function in R is used to make random number generation reproducible. By setting a specific seed value with set.seed(), you initialize the random number generator to a particular state, producing the same sequence of random numbers each time you run your code. Different seeds will produce different results, but using the same seed consistently will produce the same result each time. This is especially valuable for simulations, testing, or any scenario where consistent outcomes are required.
In this article, I will explain the set.seed() function and demonstrate how it can be used to achieve reproducible results, even in the presence of (pseudo)randomness.
Key Points
- The primary purpose of set.seed() is to make the sequence of random numbers reproducible. By setting a seed, you make sure that anyone running the same code will obtain identical results.
- Setting a seed is crucial for reproducibility in research and analysis. Without it, results involving randomness could vary with each run, making it difficult to verify findings.
- Applying set.seed() before using functions like sample() ensures that the same samples are drawn every time the code is executed.
- Setting a seed at the beginning of a script controls the randomness across the entire script, providing consistent results in all random operations.
- Different seed values generate different sequences of random numbers, providing flexibility when you need varied outcomes.
- Placing set.seed() inside a function makes the random operations within that function reproducible, regardless of the generator's state when the function is called. Note that the call also resets the global random number generator.
- Depending on where you set the seed (inside or outside a loop), you can control whether each iteration produces the same or different random results.
- Seeds like 123 are often used as standard examples in tutorials and documentation, making them familiar and easy to use for testing and learning.
R set.seed() Function
The set.seed() function in R controls the randomness of any subsequent random operations. By setting a seed, you can make random processes, such as sampling or random number generation from a normal distribution, produce the same results each time the code is run. This is crucial for reproducibility, especially in research, simulations, or any analysis that relies on random numbers.
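As a quick check of this behavior, the sketch below draws five uniform numbers three times: twice with the same seed and once with a different one. The seed values 1 and 2 and the variable names are arbitrary choices made for illustration.

```r
# Draw with seed 1, again with seed 1, and then with seed 2
set.seed(1)
first_run <- runif(5)

set.seed(1)
second_run <- runif(5)

set.seed(2)
other_seed <- runif(5)

# Same seed: the two vectors match exactly
print(identical(first_run, second_run))   # TRUE

# Different seed: the vectors differ
print(identical(first_run, other_seed))   # FALSE
```

Comparing with identical() is stricter than eyeballing printed values, because printing rounds the numbers.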
Syntax of set.seed() Function
Following is the syntax of the set.seed() function.
# Syntax of set.seed() function
set.seed(seed)
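Parameters
Only the seed argument is required.
seed: An integer value. This value initializes the random number generator, determining the sequence of random numbers.
The optional kind, normal.kind, and sample.kind arguments select the underlying generators and are rarely needed.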
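Besides seed, set.seed() also accepts an optional kind argument that names the uniform generator. The sketch below is only an illustration: it inspects the generators in use with RNGkind() and then seeds with the generator named explicitly. "Mersenne-Twister" is R's default, and the seed 42 is an arbitrary choice.

```r
# Inspect the generators currently in use
print(RNGkind())

# Seed while naming the uniform generator explicitly
set.seed(42, kind = "Mersenne-Twister")
explicit_kind <- runif(3)

# The generator chosen above stays in effect, so a plain call
# with the same seed reproduces the same draws
set.seed(42)
plain_call <- runif(3)

print(identical(explicit_kind, plain_call))   # TRUE
```

In most scripts the plain set.seed(seed) form is all that is needed; the extra arguments matter mainly when results must match a specific generator.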
Return Value
The set.seed() function does not return a meaningful value; it returns NULL invisibly. Instead, it sets the seed for R’s random number generator, affecting the outcome of any subsequent random operations (e.g., sample(), runif(), rnorm()).
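To see the return value directly, the short sketch below captures the result of set.seed() and checks it. In R the result is NULL, returned invisibly, which is why nothing is printed when the function is called at the console. The variable names are illustrative.

```r
# Capture what set.seed() returns
result <- set.seed(10)
print(is.null(result))       # TRUE

# withVisible() reports whether a value would be auto-printed
visibility <- withVisible(set.seed(10))
print(visibility$visible)    # FALSE
```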
Create Random Numbers without R set.seed() Function
You can use the sample() function in R to randomly select elements from a vector. Passing just a vector to sample() returns its elements in random order, so you receive a different result with each execution.
# Get random samples without set.seed() function
# Create a vector
vec <- c(1:10)
print("Given vector:")
print(vec)
samp <- sample(vec)
print("Get random samples from a vector:")
print(samp)
R Set Seed from Sample
The code above generates different random numbers with each execution. To manage this randomness, call set.seed() with a seed value before sample(); the same seed then produces consistent results each time the code is run.
Let’s use set.seed() with sample() to make the selection reproducible:
# Get reproducible random numbers using set.seed() with sample()
# Create vector
vec <- c(1:10)
print("Given vector:")
print(vec)
# Set the seed
set.seed(5)
# Generate random samples
samp <- sample(vec)
print("Get random samples from a Vector:")
print(samp)
Running this code multiple times with the same seed will always produce the same output. If you change the seed, the output will differ.
R Set Seed from rnorm
You can generate reproducible random numbers from a normal distribution in R by setting a specific seed value with set.seed() and then calling the rnorm() function with a defined mean and standard deviation.
# Get reproducible random numbers using set.seed() with rnorm()
set.seed(123)
print("Generate reproducible random numbers:")
rnorm(5, mean = 2, sd = 0.5)
# Output:
# [1] "Generate reproducible random numbers:"
# [1] 1.719762 1.884911 2.779354 2.035254 2.064644
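The values above can be read as standard normal draws that are scaled by sd and shifted by mean. The sketch below checks this relationship under the same seed; the variable names are illustrative, and all.equal() is used so that the comparison tolerates tiny floating-point differences.

```r
# Draw directly from N(mean = 2, sd = 0.5)
set.seed(123)
direct <- rnorm(5, mean = 2, sd = 0.5)

# Draw standard normal values with the same seed, then rescale by hand
set.seed(123)
rescaled <- 2 + 0.5 * rnorm(5)

print(all.equal(direct, rescaled))   # TRUE
```

This shows that the seed fixes the underlying standard normal draws, while mean and sd only transform them.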
Set Seed for Entire Script
Alternatively, you can set a seed at the beginning of your script to provide reproducibility across the entire script. Let’s call the set.seed() function once, before multiple random operations, so that its effect applies consistently to all operations in the script.
# Set seed for entire script
# Set the seed
set.seed(5)
# Generate first random samples
samp1 <- sample(1:50, 5)
print("Get random samples from first range:")
print(samp1)
# Generate second random samples
samp2 <- sample(51:100, 5)
print("Get random samples from second range:")
print(samp2)
# Output:
# [1] "Get random samples from first range:"
# [1] 2 43 15 11 41
# [1] "Get random samples from second range:"
# [1] 71 80 57 69 53
# Running the script a second time prints the same output:
# [1] "Get random samples from first range:"
# [1] 2 43 15 11 41
# [1] "Get random samples from second range:"
# [1] 71 80 57 69 53
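A single seed at the top of a script fixes the whole sequence of draws, so the order and number of random calls matter. The sketch below is an illustration under that assumption: it repeats the two samples, but the second time an extra runif() call is inserted between them. The variable names are hypothetical.

```r
# Original order of operations
set.seed(5)
a1 <- sample(1:50, 5)
a2 <- sample(51:100, 5)

# Same seed, but with one extra random draw in between
set.seed(5)
b1 <- sample(1:50, 5)
extra <- runif(1)          # advances the generator state
b2 <- sample(51:100, 5)

# The first samples match because nothing changed before them
print(identical(a1, b1))   # TRUE

# The second samples start from a different generator state,
# so they will almost certainly differ
print(identical(a2, b2))
```

If you add or remove a random operation in a seeded script, expect every later random result to change even though the seed is the same.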
Common set.seed(123) Initializer in R
The call set.seed(123) is a commonly used initializer to make random number generation reproducible. The value 123 has no special meaning; it is simply a popular choice. Let’s look at the example below to see how it makes the code reproducible.
# Generate random numbers with set.seed(123)
set.seed(123)
samp <- sample(1:20, 5)
print("Get reproducible random samples:")
print(samp)
# Output:
# [1] "Get reproducible random samples:"
# [1] 3 13 19 15 10
R Set Seed Inside Function
You can generate reproducible random numbers within a function using the set.seed() function. To do this, first define a function that takes a single argument n, which specifies how many random numbers you want to generate. Inside the function, set.seed() initializes the random number generator with a specific seed value, so the sequence of random numbers generated is identical each time you call the function.
# Set Seed Inside Function
samp <- function(n) {
  set.seed(123)
  sample(1:100, n)
}
print("Generate random samples:")
print(samp(5))
# Output:
# [1] "Generate random samples:"
# [1] 31 79 51 14 67
print("Generate random samples:")
print(samp(5))
# Output:
# [1] "Generate random samples:"
# [1] 31 79 51 14 67
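One thing to keep in mind is that calling set.seed() inside a function also resets the global random number generator, which affects any random code that runs after the function returns. The sketch below shows one possible way to avoid that side effect in base R by saving and restoring the global .Random.seed. The function name samp_local and its seed parameter are hypothetical and not part of the original example.

```r
# Hypothetical variant that leaves the caller's RNG state untouched
samp_local <- function(n, seed = 123) {
  has_state <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  old_state <- if (has_state) get(".Random.seed", envir = globalenv()) else NULL

  # Restore (or remove) the global RNG state when the function exits
  on.exit({
    if (has_state) {
      assign(".Random.seed", old_state, envir = globalenv())
    } else {
      rm(".Random.seed", envir = globalenv())
    }
  })

  set.seed(seed)
  sample(1:100, n)
}

# What the next draw would be without calling the function
set.seed(1)
expected_next <- runif(1)

# Call the function, then take the next draw
set.seed(1)
inside <- samp_local(5)
actual_next <- runif(1)

print(identical(expected_next, actual_next))   # TRUE
```

The function still returns the same sample on every call, but the surrounding script continues its own random sequence as if the function had never touched the generator.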
R Set Seed For Loop
You can use set.seed() with an R for loop to generate reproducible random numbers. This can be implemented in two ways: by setting the seed outside the loop or by setting it inside the loop. Setting the seed once outside the loop lets the generator advance, so each iteration draws different random numbers, while setting the same seed inside the loop resets the generator and produces the same random numbers in every iteration. Let’s look at the example below:
# Seed outside the loop
set.seed(123)
for (i in 1:3) {
  print(sample(1:10, 3))
}
# Seed inside the loop
for (i in 1:3) {
  set.seed(123)
  print(sample(1:10, 3))
}
# Output:
# Seed outside the loop
# [1] 3 10 2
# [1] 2 6 3
# [1] 5 4 6
# Seed inside the loop
# [1] 3 10 2
# [1] 3 10 2
# [1] 3 10 2
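A middle ground between the two approaches above is to give each iteration its own known seed, so that iterations generally draw different values while any single iteration can still be reproduced on its own. The sketch below is one way to do this; the helper run_loop and the base_seed + i scheme are assumptions made for illustration.

```r
# Hypothetical helper: seed each iteration with base_seed + i
run_loop <- function(base_seed) {
  draws <- vector("list", 3)
  for (i in 1:3) {
    set.seed(base_seed + i)        # a distinct, known seed per iteration
    draws[[i]] <- sample(1:10, 3)
  }
  draws
}

first_pass <- run_loop(100)
second_pass <- run_loop(100)

# The whole loop is reproducible
print(identical(first_pass, second_pass))   # TRUE

# A single iteration can be reproduced without rerunning the others
set.seed(102)
second_iteration <- sample(1:10, 3)
print(identical(second_iteration, first_pass[[2]]))   # TRUE
```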
Conclusion
In this article, I have explained the set.seed() function in R and how it makes the sequence of random numbers reproducible. I also explained how different seed values generate different sequences of random numbers. Additionally, I demonstrated how to use the set.seed() function within a function, as well as how it behaves both inside and outside of loops.
Happy Learning!!