Calculate the Median in R

How do you calculate the median of a DataFrame column or a vector in R? median() is a function available in every standard R session (it ships with the stats package, which is loaded by default) and calculates the median of a vector. The median of a dataset is the value that falls in the middle once the values are ordered from smallest to largest. If the dataset has an even number of values, the median is the mean of the two middle values.

This function accepts a vector as input and returns the median as a numeric value.
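
To make the definition concrete, here is a minimal sketch that computes the median by hand and can be compared against median(). The helper name manual_median is hypothetical (it is not part of R), and the sketch assumes the input has no NA values.

```r
# Hypothetical helper, not part of base R; assumes x contains no NA values
manual_median <- function(x) {
  x <- sort(x)                      # order values from smallest to largest
  n <- length(x)
  if (n == 0) return(NA_real_)      # no values, no median
  if (n %% 2 == 1) {
    x[(n + 1) / 2]                  # odd count: the single middle value
  } else {
    mean(x[c(n / 2, n / 2 + 1)])    # even count: mean of the two middle values
  }
}

odd_vec  <- c(8, 6, 7)
even_vec <- c(9, 6, 8, 7)

manual_median(odd_vec)    # 7
manual_median(even_vec)   # 7.5
median(even_vec)          # 7.5, same as the manual result
```

In practice you would call median() directly; the helper only illustrates what happens for odd and even counts.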

1. Syntax of median()

The following is the syntax of the median() function that calculates the median value.


# Syntax of median
median(x, na.rm = FALSE, ...)

Parameters:

  • x – The input vector, typically numeric.
  • na.rm – Defaults to FALSE. When TRUE, NA values are removed before the median is computed.
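
As a quick illustration of the na.rm parameter, the sketch below calls median() on the same vector with and without it. The variable names are made up for this example.

```r
# A small vector with one missing value (illustrative data)
vals <- c(3, NA, 1, 2)

with_na    <- median(vals)                 # NA, because na.rm defaults to FALSE
without_na <- median(vals, na.rm = TRUE)   # 2, the median of 1, 2, 3

with_na
without_na
```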

2. R Median of DataFrame Column

Let's use the median() function to calculate the median of a DataFrame column. The following examples show how to get the median of a column without NA values and of a column with NA values.


# Create Data Frame
df <- data.frame(id=c(11,25,50,42,55),
              price=c(144,NA,321,567,567))
df

# Calculate median of DataFrame column
res <- median(df$id)
res

The median of the id column is 42, the middle value once the ids are sorted (11, 25, 42, 50, 55).

Calculating the median of a column that has NA values returns NA, so you need to ignore the NA values to get a meaningful result. In our DataFrame, the price column has an NA value. Let's calculate its median by setting the na.rm parameter to TRUE.


# Median of a column with NA values, ignoring the NAs
res <- median(df$price, na.rm=TRUE)
res

# Output
# [1] 444
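
If you need the median of every column at once, one possible approach (not shown in the examples above) is to apply median() to each column with sapply(). This sketch rebuilds the same DataFrame so it can run on its own; the name col_medians is illustrative.

```r
# Same data as the DataFrame example above
df <- data.frame(id = c(11, 25, 50, 42, 55),
                 price = c(144, NA, 321, 567, 567))

# Apply median() to each column; na.rm = TRUE is passed through to median()
col_medians <- sapply(df, median, na.rm = TRUE)
col_medians   # named vector: id = 42, price = 444
```

This assumes every column is numeric; median() is not meaningful for character columns, so select the numeric columns first on a mixed DataFrame.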

3. R Median of Vector

Similarly, let's calculate the median of the values in a vector. The following examples show the median of a vector with an odd number of values, with an even number of values, and with NA values.


# Calculate median of Vector
vec = c(6,7,8)
median(vec)

# Output
# [1] 7

# Calculate median of Vector with an even count
vec = c(6,7,8, 9)
median(vec, na.rm=TRUE)

# Output
# [1] 7.5

# Calculate median of Vector with NA values
vec = c(10,11,6,7,8,9, NA)
median(vec, na.rm=TRUE)

# Output
# [1] 8.5

4. Conclusion

In this article, you learned what the median is and how to calculate it in R for both a DataFrame column and a vector.

Naveen (NNK)

Naveen (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.
