How to Read Multiple CSV Files in R

Using read.csv() is not a good option for importing multiple large CSV files into an R DataFrame. However, several R packages provide functions that read multiple large CSV files into a single DataFrame.

In my previous article, I explained how to read a CSV file. In this article, I will explain how to read multiple CSV files from a folder into a single DataFrame in R by using different packages.

1. Quick Examples of R Read Multiple CSV Files

The following are quick examples of how to read or import multiple CSV files into a DataFrame in R by using different packages.


# Quick examples

# Example 1 - Use data.table package
library(data.table)
# pattern is a regular expression; full.names = TRUE returns full paths
csv_files <- list.files(path = "/Users/admin/apps/csv-courses/",
                        pattern = "\\.csv$", full.names = TRUE)
# Read each file with fread() and stack the results
df <- rbindlist(lapply(csv_files, fread))
df

# Example 2 - Using tidyverse
library(tidyverse)
df <-
  list.files(path = "/Users/admin/apps/csv-courses/",
             pattern = "\\.csv$", full.names = TRUE) %>% 
  map_df(~read_csv(.))
df

# Example 3 - Using readr package
library(readr)
list_csv_files <- list.files(path = "/Users/admin/apps/csv-courses/",
                             pattern = "\\.csv$", full.names = TRUE)
# id adds a column holding the source file of each row
df2 <- readr::read_csv(list_csv_files, id = "file_name")
df2

# Example 4 - Using read.csv()
list_csv_files <- list.files(path = "/Users/admin/apps/csv-courses/",
                             pattern = "\\.csv$", full.names = TRUE)
df2 <- do.call(rbind, lapply(list_csv_files, function(x) read.csv(x, stringsAsFactors = FALSE)))
df2
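
One detail worth noting about list.files(): its pattern argument is interpreted as a regular expression, not as a shell-style glob. The following is a minimal sketch of how a glob such as "*.csv" can be turned into an equivalent regular expression with glob2rx() from the utils package. The file names in candidates are hypothetical and exist only for illustration.

```r
# glob2rx() (utils package, attached by default) converts a glob to a regular expression
csv_regex <- glob2rx("*.csv")
print(csv_regex)

# Hypothetical file names, used only to show what the regular expression accepts
candidates <- c("a.csv", "a.csv.bak", "acsv")
matches <- grepl(csv_regex, candidates)
print(matches)
```

Only names that end in ".csv" match, so backup files such as "a.csv.bak" are left out of the list passed to the reader function.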

2. Read Multiple CSV Files in R (The best approach)

To read multiple CSV files or all files from a folder in R, use the data.table package. data.table is a third-party library, so you need to install it first by using install.packages('data.table'). Once installation completes, load it by using library("data.table").

I am using the fread() function of the data.table package, as it is an efficient option in R for importing multiple large CSV files and generally gives better performance than the other approaches shown in this article.


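# Use data.table package
library(data.table)
# pattern is a regular expression; full.names = TRUE returns full paths
csv_files <- list.files(path = "/Users/admin/apps/csv-courses/",
                        pattern = "\\.csv$", full.names = TRUE)
# Read each file with fread() and stack the results
df <- rbindlist(lapply(csv_files, fread))
df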

Yields the output below. fread() uses stringsAsFactors = FALSE by default. Here, list.files() returns the CSV files found in the specified path.


# Output
   id name        dob gender
1: 10  sai 1990-10-02      M
2: NA  ram 1981-03-24       
3: -1 <NA> 1987-06-14      F
4: 13      1985-08-16   <NA>
5: 10  sai 1990-10-02      M
6: NA  ram 1981-03-24       
7: -1 <NA> 1987-06-14      F
8: 13      1985-08-16   <NA>
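
When files are stacked like this, it is often useful to know which file each row came from. The following is a minimal sketch, assuming data.table is installed, that uses the idcol argument of rbindlist() for that purpose. The folder demo_dir, the two small files, and the column name source_file are hypothetical and are not the article's data.

```r
library(data.table)

# Hypothetical demo folder with two small CSV files (not the article's data)
demo_dir <- file.path(tempdir(), "csv-demo-dt")
dir.create(demo_dir, showWarnings = FALSE)
fwrite(data.table(id = 1:2, name = c("sai", "ram")), file.path(demo_dir, "a.csv"))
fwrite(data.table(id = 3:4, name = c("raj", "ann")), file.path(demo_dir, "b.csv"))

# full.names = TRUE returns paths that fread() can open from any working directory
csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Name the vector by file so that rbindlist() can store each row's source in idcol
names(csv_files) <- basename(csv_files)
dt_all <- rbindlist(lapply(csv_files, fread), idcol = "source_file")
print(dt_all)
```

If the files do not all share the same columns, rbindlist() also accepts fill = TRUE, which fills the missing columns with NA.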

3. Using tidyverse to Read Multiple CSV Files From Folder

Using tidyverse to read multiple CSV files into a single DataFrame in R is the second-best approach. Loading tidyverse attaches readr, which provides read_csv(), and purrr, which provides map_df().


# Using tidyverse
library(tidyverse)
df <-
  list.files(path = "/Users/admin/apps/csv-courses/",
             pattern = "\\.csv$", full.names = TRUE) %>% 
  map_df(~read_csv(.))
df

Yields the output below.


# Output
# A tibble: 8 × 4
     id name  dob        gender
  <dbl> <chr> <date>     <chr> 
1    10 sai   1990-10-02 M     
2    NA ram   1981-03-24 NA    
3    -1 <NA>  1987-06-14 F     
4    13 NA    1985-08-16 <NA>  
5    10 sai   1990-10-02 M     
6    NA ram   1981-03-24 NA    
7    -1 <NA>  1987-06-14 F     
8    13 NA    1985-08-16 <NA>
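
Since purrr 1.0.0, map_df() is marked as superseded in favor of map() followed by list_rbind(). The following is a minimal sketch of that variant, assuming purrr 1.0.0 or later and readr 2.0.0 or later are installed. The folder demo_dir, the two small files, and the column name file_name are hypothetical.

```r
library(readr)
library(purrr)

# Hypothetical demo folder with two small CSV files (not the article's data)
demo_dir <- file.path(tempdir(), "csv-demo-purrr")
dir.create(demo_dir, showWarnings = FALSE)
write_csv(data.frame(id = 1:2, name = c("sai", "ram")), file.path(demo_dir, "a.csv"))
write_csv(data.frame(id = 3:4, name = c("raj", "ann")), file.path(demo_dir, "b.csv"))

csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Name each path by its file name so that list_rbind() can record the source
names(csv_files) <- basename(csv_files)

# Read every file into a list of tibbles, then bind them row-wise
df_list <- map(csv_files, function(f) read_csv(f, show_col_types = FALSE))
df_all <- list_rbind(df_list, names_to = "file_name")
print(df_all)
```

Keeping the reading step and the binding step separate also makes it easier to inspect a single file when one of them fails to parse.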

4. Using readr Package

You can consider this a third option to load multiple CSV files into an R DataFrame. This method uses the read_csv() function of the readr package. readr is a third-party library, so you need to install it first by using install.packages('readr'). Once installation completes, load it by using library("readr"). Passing a vector of file paths to read_csv() together with the id argument requires readr 2.0.0 or later.


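# Using readr package
library(readr)
list_csv_files <- list.files(path = "/Users/admin/apps/csv-courses/",
                             pattern = "\\.csv$", full.names = TRUE)
# id adds a column holding the source file of each row
df <- readr::read_csv(list_csv_files, id = "file_name")
df

Yields the same rows as above, with an additional file_name column that records the source file of each row.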
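
To make the effect of the id argument concrete, here is a minimal self-contained sketch, assuming readr 2.0.0 or later is installed. The folder demo_dir and the two small files are hypothetical and stand in for the article's folder.

```r
library(readr)

# Hypothetical demo folder with two small CSV files (not the article's data)
demo_dir <- file.path(tempdir(), "csv-demo-readr")
dir.create(demo_dir, showWarnings = FALSE)
write_csv(data.frame(id = 1:2, name = c("sai", "ram")), file.path(demo_dir, "a.csv"))
write_csv(data.frame(id = 3:4, name = c("raj", "ann")), file.path(demo_dir, "b.csv"))

csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# A vector of paths is read in a single call; id stores each row's source path
df_all <- read_csv(csv_files, id = "file_name", show_col_types = FALSE)
print(names(df_all))
print(unique(basename(df_all$file_name)))
```

Because the files are combined in one call, this form expects them to share the same columns.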

5. Using R Base read.csv()

Base R provides read.csv() to import a CSV file into a DataFrame. You can also use it to import multiple CSV files at a time in R.

This is the slowest method of all, so it is not recommended for large files. If you have small files and you don't have the above packages installed, you could use this option.


# Using read.csv()
list_csv_files <- list.files(path = "/Users/admin/apps/csv-courses/",
                             pattern = "\\.csv$", full.names = TRUE)
df2 <- do.call(rbind, lapply(list_csv_files, function(x) read.csv(x, stringsAsFactors = FALSE)))
df2

Yields the output below.


# Output
  id name        dob gender
1 10  sai 1990-10-02      M
2 NA  ram 1981-03-24       
3 -1 <NA> 1987-06-14      F
4 13      1985-08-16   <NA>
5 10  sai 1990-10-02      M
6 NA  ram 1981-03-24       
7 -1 <NA> 1987-06-14      F
8 13      1985-08-16   <NA>
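
The base R approach can be tried without installing anything. The following is a minimal sketch that builds a throwaway folder, reads every CSV file in it, and tags each row with its source file. The folder demo_dir, the helper read_one(), and the column source_file are hypothetical names introduced only for this illustration.

```r
# Hypothetical demo folder with two small CSV files (not the article's data)
demo_dir <- file.path(tempdir(), "csv-demo-base")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, name = c("sai", "ram")),
          file.path(demo_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, name = c("raj", "ann")),
          file.path(demo_dir, "b.csv"), row.names = FALSE)

# Without full.names = TRUE, list.files() returns bare file names, which
# read.csv() can only open if the working directory is demo_dir
csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file and remember where its rows came from
read_one <- function(path) {
  d <- read.csv(path, stringsAsFactors = FALSE)
  d$source_file <- basename(path)
  d
}

# rbind() requires every file to have the same column names
df_all <- do.call(rbind, lapply(csv_files, read_one))
print(df_all)
```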

Conclusion

In this article, you have learned how to read/import multiple CSV files from a folder into a single R DataFrame.

NNK
