Spark By {Examples}

Calculate the Median in R

How do you calculate the median of a data frame column or a vector in R? You can use the base R median() function, which takes a vector (for example, a single data frame column) as its argument and returns the median as a numeric value. The median of a dataset is the middle value when the values are arranged in ascending order. If the dataset has an even number of values, the median is the average of the two middle values.
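To make the definition concrete, here is a minimal sketch that computes the median by hand. The helper name manual_median is hypothetical and used only for illustration; in practice you would call the built-in median().

```r
# Hypothetical helper (not part of base R): computes the median by hand
manual_median <- function(x) {
  # sort() drops NA values by default, so this sketch behaves like na.rm = TRUE
  x <- sort(x)
  n <- length(x)
  if (n == 0) return(NA_real_)
  mid <- n %/% 2
  if (n %% 2 == 1) {
    x[mid + 1]                 # odd count: the single middle value
  } else {
    mean(x[c(mid, mid + 1)])   # even count: average of the two middle values
  }
}

manual_median(c(7, 6, 8))      # 7
manual_median(c(9, 7, 6, 8))   # 7.5
```

Both results match what median() returns for the same vectors.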

1. Syntax of median()

The following is the syntax of the median() function that calculates the median value.


# Syntax of median
median(x, na.rm = FALSE, ...)

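Parameters:

x: A numeric vector (or another object for which a method is defined) whose median is to be computed.
na.rm: A logical value. If TRUE, NA values are removed before the median is computed. The default is FALSE.
...: Further arguments passed to or from methods.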

2. Calculate the Median in R

To calculate the median of a data frame column, pass the column to the median() function, for example as df$id. The following example creates a data frame with two columns, id (no missing values) and price (one NA value), and computes the median of the id column.


# Create Data Frame
df <- data.frame(id=c(11,25,50,42,55),
              price=c(144,NA,321,567,567))
print("Create a data frame:")
df

# Calculate median of DataFrame column
res <- median(df$id)
print("Get the median of a data frame column:")
res

# Output:
# "Get the median of a data frame column:"
# [1] 42

Calculate the Median with NA Values

If a data frame column contains NA values, median() returns NA by default. To get the median of the remaining values instead, use the na.rm parameter of the median() function. Let’s pass na.rm = TRUE together with the price column, which contains an NA value, to get the median as a numeric value.


# Calculate the median of data frame column with na.rm param
res <- median(df$price, na.rm=TRUE)
print("Get the median of a data frame column:")
res

# Output:
# "Get the median of a data frame column:"
# [1] 444
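If you need the median of every column at once, one common approach is to apply median() to each column with sapply(). This is a sketch that assumes all columns of the data frame are numeric; the variable name col_medians is chosen only for illustration.

```r
# Same data frame as in the example above
df <- data.frame(id = c(11, 25, 50, 42, 55),
                 price = c(144, NA, 321, 567, 567))

# Apply median() to each column; na.rm = TRUE is passed through to median()
col_medians <- sapply(df, median, na.rm = TRUE)
col_medians
# id = 42, price = 444
```

The result is a named numeric vector, so an individual value can be read with col_medians[["price"]].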

3. R Median of Vector

Alternatively, you can apply this function directly to a vector. The following examples calculate the median of a vector with an odd number of values and of a vector with an even number of values.


# Calculate median of Vector 
vec = c(7, 6, 8)
median(vec)

# Output:
# [1] 7

# Calculate median of Vector with an even number of values
vec = c(9, 7, 6, 8)
median(vec)

# Output:
# [1] 7.5

Median of Vector with NA Values

Finally, you can call median() on a vector that contains NA values. Without the na.rm parameter the result is NA; with na.rm = TRUE the NA values are ignored.


# Calculate median of Vector having NA value
# Using median() without na.rm param
vec = c(6, 7, 8, 9, NA)
median(vec)

# Output:
# [1] NA

# Calculate median of Vector ignoring NA value
# Using median() with na.rm param
vec = c(6, 7, 8, 9, NA)
median(vec, na.rm=TRUE)

# Output:
# [1] 7.5

As you can observe, when a vector contains NA values, median() returns NA by default. If you set the na.rm parameter to TRUE, the function ignores the NA values and returns the median of the remaining values.
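To see what na.rm = TRUE does, it can help to count the missing values and drop them by hand. The following is a small sketch using is.na(), which is part of base R.

```r
vec <- c(6, 7, 8, 9, NA)

# Count the missing values in the vector
sum(is.na(vec))             # 1

# Drop NA values manually, then compute the median
median(vec[!is.na(vec)])    # 7.5, same as median(vec, na.rm = TRUE)
```

Filtering by hand is rarely necessary, but checking the NA count first shows how many values the median is actually based on.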

4. Conclusion

In this article, I have explained what the median is and how to obtain it using the R median() function on a data frame column and on a vector. I have also shown how to calculate the median of a data frame column or vector that contains NA values.
