You are currently viewing Import Excel in R with Examples

R provides several packages like readxl, xlsx, and openxlsx to read or import Excel files into R DataFrame. These packages provide several methods with different arguments which help us read Excel files effectively.

We have also provided quick articles for reading CSV files and writing CSV files using R base functions as well as using readr package, which is 10 times faster than R base functions.

In this article, you will learn how to read data from an Excel file with the extension xls or xlsx into R DataFrame. This can be done by using readxl::read_excel() and xlsx:read.xlsx()

1. Quick Examples

Following are quick examples of how to read or import Excel files into R DataFrame.



# Load readxl package
library("readxl")

# Read xlsx files
df = read_excel("/Users/admin/new_file.xlsx")
df = read_xlsx("/Users/admin/new_file.xlsx")
df = read_xsl("/Users/admin/new_file.xsl")

# Read sheet names 'sheet_name' from excel file
df = read_excel("/Users/admin/new_file.xlsx", sheet='sheet_name')

# Read sheet 3 from excel file
df = read_excel("/Users/admin/new_file.xlsx", sheet=3)

#Load xlsx package
library("xlsx")

#Read first sheet from new_file.xlsx file
df = read.xlsx(file='/Users/admin/new_file.xlsx', sheetIndex=1) 

# Load openxlsx
library(openxlsx)

# Read excel file
read.xlsx('/Users/admin/new_file.xlsx')

2. Import Excel files into R using readxl package

Use read_excel() function from readxl package to read or import an Excel file (xlsx or xls) as R DataFrame. To use readxl library, you need to first install it by using install.packages('readxl'). Once installation completes, load the readxl library to use this read_excel() method. To load a library in the R use library("readxl").

read_excel() calls excel_format() to check if path is xls or xlsx. If you know the file extension, use read_xls() and read_xlsx() directly.

2.1 Install readxl Package

To use readxl package in R to work with Excel files, first, you need to install it using install.packages(). Run the below command in R or RStudio.


# Install readxl package
install.packages("readxl")

2.2 Syntax of read_excel() Function

Following is the syntax of read_excel() function.


# Syntax of read_excel()
read_excel(path,sheet = NULL,range = NULL,col_names = TRUE,
  col_types = NULL, na = "", trim_ws = TRUE, skip = 0,
  n_max = Inf, guess_max = min(1000, n_max), progress = readxl_progress(),
  .name_repair = "unique"
)

2.3 Import Excel File using readxl Package

Let’s use the above syntax of read_excel() to import an Excel file into an R DataFrame. If you want to read the data from a specific sheet use the sheet argument.


# Load readxl package
library("readxl")

# Read xlsx files
df = read_excel("/Users/admin/new_file.xlsx")
(or)
df = read_xlsx("/Users/admin/new_file.xlsx")

If you have an xls file then use the below methods to read or import it as DataFrame.


# Read xsl files
df = read_excel("/Users/admin/new_file.xsl")
(or)
df = read_xsl("/Users/admin/new_file.xsl")

2.4 Read Data from Specific Excel Sheet

If you have multiple sheets in an Excel file (xlsx, xsl) use the sheet argument to read or import the data from a specific sheet. Pass either sheet name or sheet index to this argument.


# Read sheet names 'sheet_name' from excel file
df = read_excel("/Users/admin/new_file.xlsx", sheet='sheet_name')
#or
# Read sheet 3 from excel file
df = read_excel("/Users/admin/new_file.xlsx", sheet=3)

3. Import Excel files using the xlsx package

Package xlsx is another package R to read or import Excel files into DataFrame. It is a Java-based solution, for reading, writing, and formatting Excel files in R.

Use read.xlsx() function from xlsx package to read or import an excel file (xlsx or xls) as R DataFrame. In order to use the xlsx library, you need to first install it by using install.packages('xlsx'). Once installation completes, load the xlsx library to use this read_xlsx() method. To load a library in R use library("xlsx").

3.1 Install xlsx Package

Run the below command in R or RStudio to install xlsx package.


# Install readxl package
install.packages("xlsx")

3.2 Syntax

The xslx package provides 2 function read.xlsx() and read.xlsx2(), below are their syntaxes. Use read.xlsx2() if you have big files as it performs better than read.xlsx().


# Syntax of read.xlsx() & read.xlsx2()
read.xlsx(file, sheetIndex, header=TRUE)
read.xlsx2(file, sheetIndex, header=TRUE)

3.3 Using read.xlsx() Example

Let’s use the above syntax of the read.xslx() to read an Excel file into an R DataFrame. If you want to read the data from a specific sheet use the sheetIndex argument.


#Load xlsx package
library("xlsx")

#Read first sheet from new_file.xlsx file
df = read.xlsx(file='/Users/admin/new_file.xlsx', sheetIndex=1) 

4. Import Excel files using openxlsx Package

The openxlsx is another package that can be used to perform operations with excel files. Use read.xlsx() function from this package to read the worksheet.



#Install openxlsx package
install.packages("openxlsx")

# Load openxlsx
library(openxlsx)

# Read excel file
read.xlsx('/Users/admin/new_file.xlsx')

Frequently Asked Questions on Import Excel in R with Examples

How can I import an Excel file into R?

You can use the readxl package or the openxlsx package in R to import Excel files. The readxl::read_excel() function is commonly used for this purpose.

What formats of Excel files are supported in R?

R supports both .xls and .xlsx formats. The readxl and openxlsx packages can handle both Excel formats.

How do I install the required packages for Excel file import in R?

You can install the necessary packages using the install.packages() function. For example, install.packages("readxl") or install.packages("openxlsx").

How can I import specific sheets from an Excel file?

You can import specific sheets using the sheet parameter in functions like readxl::read_excel() or openxlsx::read.xlsx().

What if my Excel file has missing values (NA) or blank cells?

R will typically represent missing values as NA (Not Available). You can handle them using functions like complete.cases() or use the na.strings parameter while importing to specify other representations of missing values.

What if my Excel file has multiple headers or starts from a specific row?

You can use the skip parameter to skip a specific number of rows at the beginning of the Excel file. The col_names parameter in functions like readxl::read_excel() allows you to specify whether to treat the first row as a column name.

5. Conclusion

In this article, you have learned how to read or import an Excel file into R DataFrame by using methods from the package openxlsx, xlsx, and readxl packages.

References

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, He has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen journey in the field of data engineering has been a continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with the data as he come across. Follow Naveen @ LinkedIn and Medium