How to Read Multiple CSV Files in R

Calling read.csv() repeatedly is not a good way to import multiple large CSV files into an R DataFrame. However, several R packages provide faster ways to read multiple large CSV files into a single DataFrame.

In my previous article, I explained how to read a CSV file. In this article, I will explain how to read multiple CSV files from a folder into a single DataFrame in R by using different packages.

1. Quick Examples of Reading Multiple CSV Files in R

The following are quick examples of how to read or import multiple CSV files into a DataFrame in R by using different packages.


# Quick examples
# Note: full.names = TRUE makes list.files() return complete paths,
# so the files can be read from any working directory.

# Example 1 - Use data.table package
library(data.table)
csv_files <- list.files(path = "/Users/admin/apps/csv-courses/", pattern = "\\.csv$", full.names = TRUE)
df <- rbindlist(lapply(csv_files, fread))
df

# Example 2 - Using tidyverse
library(tidyverse)
df <-
  list.files(path = "/Users/admin/apps/csv-courses/", pattern = "\\.csv$", full.names = TRUE) %>% 
  map_df(~read_csv(.))
df

# Example 3 - Using readr package
library(readr)
list_csv_files <- list.files(path = "/Users/admin/apps/csv-courses/", pattern = "\\.csv$", full.names = TRUE)
df2 <- readr::read_csv(list_csv_files, id = "file_name")
df2

# Example 4 - Using read.csv()
list_csv_files <- list.files(path = "/Users/admin/apps/csv-courses/", pattern = "\\.csv$", full.names = TRUE)
df2 <- do.call(rbind, lapply(list_csv_files, function(x) read.csv(x, stringsAsFactors = FALSE)))
df2

2. Read Multiple CSV Files in R (The best approach)

To read multiple CSV files, or all files from a folder, in R, use the data.table package. data.table is a third-party package, so you need to install it first by using install.packages('data.table'). Once installation completes, load it by using library("data.table").

I am using the fread() function from the data.table package because it is an efficient way to import multiple large CSV files in R and generally performs better than the other options shown in this article.


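# Use data.table package
library(data.table)
# full.names = TRUE returns complete paths, so fread() works from any working directory
csv_files <- list.files(path = "/Users/admin/apps/csv-courses/", pattern = "\\.csv$", full.names = TRUE)
# Read every file with fread() and stack the results into one data.table
df <- rbindlist(lapply(csv_files, fread))
df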

This yields the output below. fread() uses stringsAsFactors = FALSE by default, so text columns are read as character. Here list.files() returns the names of the CSV files in the specified path.


# Output
   id name        dob gender
1: 10  sai 1990-10-02      M
2: NA  ram 1981-03-24       
3: -1 <NA> 1987-06-14      F
4: 13      1985-08-16   <NA>
5: 10  sai 1990-10-02      M
6: NA  ram 1981-03-24       
7: -1 <NA> 1987-06-14      F
8: 13      1985-08-16   <NA>
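To try this without the folder used above, here is a minimal sketch that writes two small CSV files to a temporary folder and reads them back. The folder name, file names, and columns are made up for illustration, and it assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical sample data written to a temporary folder
csv_dir <- file.path(tempdir(), "csv-demo-dt")
dir.create(csv_dir, showWarnings = FALSE)
sample_rows <- data.frame(id = c(10, 13), name = c("sai", "ram"))
write.csv(sample_rows, file.path(csv_dir, "part1.csv"), row.names = FALSE)
write.csv(sample_rows, file.path(csv_dir, "part2.csv"), row.names = FALSE)

# "\\.csv$" is a regular expression matching names that end in .csv;
# full.names = TRUE returns complete paths instead of bare file names
csv_files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file with fread() and stack the results into one data.table
combined <- rbindlist(lapply(csv_files, fread))
nrow(combined)  # 4 rows: 2 from each file
```

Without full.names = TRUE, list.files() returns only the file names, and fread() would then look for them in the current working directory.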

3. Using tidyverse to Read Multiple CSV Files From Folder

Using tidyverse to read multiple CSV files into a single DataFrame in R is the second-best approach. Loading tidyverse attaches readr, which provides read_csv(), and purrr, which provides map_df().


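# Using tidyverse
library(tidyverse)
# full.names = TRUE returns complete paths, so read_csv() works from any working directory
df <-
  list.files(path = "/Users/admin/apps/csv-courses/", pattern = "\\.csv$", full.names = TRUE) %>% 
  map_df(~read_csv(.))
df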

This yields the output below.


# Output
# A tibble: 8 × 4
     id name  dob        gender
  <dbl> <chr> <date>     <chr> 
1    10 sai   1990-10-02 M     
2    NA ram   1981-03-24 NA    
3    -1 <NA>  1987-06-14 F     
4    13 NA    1985-08-16 <NA>  
5    10 sai   1990-10-02 M     
6    NA ram   1981-03-24 NA    
7    -1 <NA>  1987-06-14 F     
8    13 NA    1985-08-16 <NA>
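One practical concern with this pattern is that read_csv() guesses column types separately for each file, and rows cannot be combined if the same column is guessed as, say, a number in one file and text in another. The sketch below pins the types with col_types. The folder, file names, and columns are hypothetical, and it assumes the tidyverse is installed.

```r
library(tidyverse)

# Hypothetical sample files in a temporary folder
csv_dir <- file.path(tempdir(), "csv-demo-tidy")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, name = c("a", "b")),
          file.path(csv_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, name = c("c", "d")),
          file.path(csv_dir, "second.csv"), row.names = FALSE)

csv_files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Declare the column types once so every file is parsed the same way
column_spec <- cols(id = col_double(), name = col_character())

combined <- map_df(csv_files, ~ read_csv(.x, col_types = column_spec))
combined
```

Supplying col_types also stops read_csv() from printing the guessed column specification for every file.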

4. Using readr Package

You can consider this a third option for loading multiple CSV files into an R DataFrame. This method uses the read_csv() function from the readr package, which (in readr 2.0 or later) accepts a vector of file paths. readr is a third-party package, so you need to install it first by using install.packages('readr'). Once installation completes, load it by using library("readr") in order to use read_csv().


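# Using readr package
library(readr)
# Keep only .csv files and return complete paths
list_csv_files <- list.files(path = "/Users/admin/apps/csv-courses/", pattern = "\\.csv$", full.names = TRUE)
# id = "file_name" adds a column holding the path of the file each row came from
df <- readr::read_csv(list_csv_files, id = "file_name")
df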

This yields the same rows as above, with an additional file_name column that records the source file of each row.
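As a rough illustration of the id argument, the following sketch reads two hypothetical files from a temporary folder in a single read_csv() call. It assumes readr 2.0 or later is installed, since earlier versions do not accept a vector of files.

```r
library(readr)

# Hypothetical sample files in a temporary folder
csv_dir <- file.path(tempdir(), "csv-demo-readr")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, name = c("a", "b")),
          file.path(csv_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, name = c("c", "d")),
          file.path(csv_dir, "second.csv"), row.names = FALSE)

csv_files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# One call reads all files; file_name stores the path each row was read from
combined <- read_csv(csv_files, id = "file_name", show_col_types = FALSE)

# basename() strips the folder so only the file names remain
unique(basename(combined$file_name))
```

All files passed in one call need to share the same columns, so this approach suits folders of identically structured exports.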

5. Using R Base read.csv()

Base R provides the read.csv() function to import a CSV file into a DataFrame. You can also use it to import multiple CSV files at a time in R.

This is the slowest method of all, so it's not recommended for large files. If you have small files and you don't have the above packages installed, you could use this option.


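# Using read.csv()
# Keep only .csv files and return complete paths
list_csv_files <- list.files(path = "/Users/admin/apps/csv-courses/", pattern = "\\.csv$", full.names = TRUE)
# Read each file, then stack the resulting data frames row-wise
df2 <- do.call(rbind, lapply(list_csv_files, function(x) read.csv(x, stringsAsFactors = FALSE)))
df2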

This yields the output below.


# Output
  id name        dob gender
1 10  sai 1990-10-02      M
2 NA  ram 1981-03-24       
3 -1 <NA> 1987-06-14      F
4 13      1985-08-16   <NA>
5 10  sai 1990-10-02      M
6 NA  ram 1981-03-24       
7 -1 <NA> 1987-06-14      F
8 13      1985-08-16   <NA>
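The base R approach can also record which file each row came from. The sketch below uses only base R; the folder, file names, columns, and the read_one() helper are hypothetical and exist only for this illustration.

```r
# Hypothetical sample files in a temporary folder
csv_dir <- file.path(tempdir(), "csv-demo-base")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, name = c("a", "b")),
          file.path(csv_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, name = c("c", "d")),
          file.path(csv_dir, "second.csv"), row.names = FALSE)

csv_files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Helper (made up for this sketch): read one file and tag its rows
read_one <- function(path) {
  rows <- read.csv(path, stringsAsFactors = FALSE)
  rows$source_file <- basename(path)  # remember where each row came from
  rows
}

# rbind() requires every data frame to have the same column names
combined <- do.call(rbind, lapply(csv_files, read_one))
table(combined$source_file)
```

Because rbind() matches columns by name, a file with a missing or extra column makes the call fail, which is worth checking before combining a large folder.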

Conclusion

In this article, you have learned how to read/import multiple CSV files from a folder into a single R DataFrame.

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across it.

Comments

  1. Adam_S

    This is helpful (and the first thing that came up for me in a search 😉 ), but you might want to add the fact that read_csv defaults to the working directory, so the value of path used in the list.files line must also be the working directory. The path is specified above in list.files but not in read_csv which could cause confusion.
