How to Read CSV From URL in R?

How do you read a CSV from a web URL into an R DataFrame? R provides functions in the base library, the readr package, and the data.table package that create a DataFrame by reading CSV content from a URL. CSV is one of the simplest formats for storing scientific, analytical, or any structured data (two-dimensional, with rows and columns). Values in a CSV are separated by a delimiter, most commonly a comma (,), but you can also use other characters such as a pipe or a tab.

In this article, I will explain how to create an R DataFrame by reading CSV content from a URL, and describe the options you can use when reading CSV.

1. Quick Examples of Read CSV From URL

The following are quick examples of how to read a CSV from a URL.


# Quick examples

# Read CSV into DataFrame
df <- read.csv('https://sparkbyexamples.com.com/csv_data.csv')

# Read with custom delimiter
df = read.csv('https://sparkbyexamples.com.com/csv_data.csv',sep=',')

# Read without header
df = read.csv('https://sparkbyexamples.com.com/csv_data.csv',header=FALSE)

# Using encoding
df <- read.csv('https://sparkbyexamples.com.com/csv_data.csv', encoding='UTF-8')

# Using data.table
library(data.table)
df <- fread('https://sparkbyexamples.com.com/csv_data.csv')

# Using readr
library(readr)
data <- read_csv('https://sparkbyexamples.com.com/csv_data.csv')

2. Read CSV from a URL in R

To read CSV content from a URL into a DataFrame, use the R base function read.csv(). The syntax of read.csv() is shown below. The same function also reads a CSV file from disk into a DataFrame.


# Syntax of read.csv()
read.csv(file, header = TRUE, sep = ",", quote = "\"",
         dec = ".", fill = TRUE, comment.char = "", ...)

If the URL serves comma-separated content, use read.csv(); by default this function treats the content as comma-separated.


# Read CSV into DataFrame
df <- read.csv('https://sparkbyexamples.com.com/csv_data.csv')
print(df)

This yields the following output.


# Output
  id name        dob gender
1 10  sai 1990-10-02      M
2 NA  ram 1981-03-24       
3 -1 <NA> 1987-06-14      F
4 13      1985-08-16   <NA>
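
Reading from a URL can fail when the network or the server is unavailable, so it can help to wrap the call. The following is a minimal sketch, not part of the original example: read_csv_url is a hypothetical helper name, and a temporary file addressed through a file:// URL stands in for the web URL so that the snippet runs offline (the path handling assumes a Unix-like system).

```r
# Hypothetical helper: returns a data frame, or NULL if the URL cannot be read
read_csv_url <- function(url, ...) {
  tryCatch(
    read.csv(url, ...),
    error = function(e) {
      message("Could not read ", url, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for a web URL: a temporary CSV file addressed as file://
# (assumes a Unix-like path that starts with "/")
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name,dob,gender",
             "10,sai,1990-10-02,M",
             "13,ram,1985-08-16,F"), tmp)

df <- read_csv_url(paste0("file://", tmp))
print(df)

# A location that does not exist yields NULL instead of stopping the script
missing <- read_csv_url("file:///no/such/file.csv")
print(is.null(missing))
```

Extra arguments such as sep or header are passed through to read.csv() unchanged, so the helper can be used with the options described in the following sections.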

3. Read with Custom Delimiter from URL

By default, read.csv() treats the content as comma-delimited; however, you can specify any other delimiter with the sep argument.


# Usage of sep param
read_csv = read.csv('https://sparkbyexamples.com.com/csv_data.csv',sep=',')
print(read_csv)
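
Because the example above still passes a comma, here is a small sketch of non-comma delimiters. To keep it runnable offline, it uses inline strings through the text argument of read.csv() as a stand-in for URL content; the sample rows are made up for illustration.

```r
# Pipe-delimited content (inline text stands in for the body of a URL)
pipe_text <- "id|name|gender\n10|sai|M\n13|ram|F"
df_pipe <- read.csv(text = pipe_text, sep = "|")
print(df_pipe)

# Tab-delimited content: pass sep = "\t" ...
tab_text <- "id\tname\tgender\n10\tsai\tM\n13\tram\tF"
df_tab <- read.csv(text = tab_text, sep = "\t")

# ... or use read.delim(), whose default separator is a tab
df_tab2 <- read.delim(text = tab_text)
print(identical(df_tab, df_tab2))
```

With a real URL, the first argument would be the URL string instead of text = ...; the sep argument works the same way.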

4. Read without Header from URL

Sometimes the data at a URL doesn’t contain a header row; if so, use header=FALSE. Let’s take another URL whose content has no header row (column names) and load it into a DataFrame.


# Use header=FALSE
read_csv = read.csv('https://sparkbyexamples.com.com/file_noheader.csv', header=FALSE)
print(read_csv)

This yields the following output.


# Output
  V1   V2         V3     V4
1 10  sai 1990-10-02      M
2 NA  ram 1981-03-24       
3 -1 <NA> 1987-06-14      F
4 13      1985-08-16   <NA>

Note that R assigns the default column names V1, V2, V3, and V4. To replace them with your own column names, use colnames().


# Set column names
colnames(read_csv) = c('id','name','dob','gender')
print(read_csv)
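
As an alternative to renaming afterwards, the column names can be supplied while reading through the col.names argument of read.csv(). A minimal sketch, using inline text in place of the header-less URL content (the rows are illustrative):

```r
# Header-less content; inline text stands in for the body of a URL
no_header <- "10,sai,1990-10-02,M\n13,ram,1985-08-16,F"

# Name the columns at read time instead of calling colnames() later
df <- read.csv(text = no_header, header = FALSE,
               col.names = c("id", "name", "dob", "gender"))
print(colnames(df))
```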

5. CSV encoding

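If the CSV you are reading from a URL contains UTF-8 text, pass the encoding='UTF-8' argument. This marks the character strings read into the DataFrame as UTF-8; it does not re-encode the input. For content stored in a different encoding, read.csv() also accepts a fileEncoding argument that converts the input while reading.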


# Use UTF-8 encoding
read_csv = read.csv('https://sparkbyexamples.com.com/file_noheader.csv', encoding='UTF-8')
print(read_csv)
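
To see what the encoding argument does, the sketch below writes a small UTF-8 file and reads it back. The temporary file and its contents are assumptions made for illustration; they stand in for a URL that serves UTF-8 text.

```r
# Write a small UTF-8 file byte by byte so the sketch does not depend on the
# encoding of this script; the bytes 0xC3 0xA9 are "é" in UTF-8
tmp <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("id,name\n1,Jos"),
           as.raw(c(0xC3, 0xA9)),
           charToRaw("\n")), tmp)

# encoding = "UTF-8" declares how the non-ASCII strings should be interpreted
df <- read.csv(tmp, encoding = "UTF-8", stringsAsFactors = FALSE)
print(df)

# Inspect how R has marked the string
print(Encoding(df$name))
```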

6. Use read_csv() from readr Package

If you are working with larger data, consider the read_csv() function from the readr package. readr is a third-party package, so you first need to install it with install.packages('readr'). Once the installation completes, load the package with library("readr") to make read_csv() available.


# Load readr
library("readr")

# Read CSV into DataFrame
df = read_csv('https://sparkbyexamples.com.com/csv_data.csv')
print(df)
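
read_csv() guesses column types by default; you can also declare them with col_types. The following is a hedged sketch rather than part of the original example: it assumes readr 2.0 or later (where I() marks a string as literal data), uses inline text in place of the URL, and skips itself when readr is not installed.

```r
if (requireNamespace("readr", quietly = TRUE)) {
  # Inline text stands in for the body of a URL; the rows are illustrative
  csv_text <- "id,name,dob\n10,sai,1990-10-02\n13,ram,1985-08-16"

  # Declare the type of each column instead of relying on guessing
  df <- readr::read_csv(
    I(csv_text),
    col_types = readr::cols(
      id   = readr::col_integer(),
      name = readr::col_character(),
      dob  = readr::col_date()
    )
  )
  print(df)
} else {
  message("readr is not installed; run install.packages('readr') first")
}
```

With a real URL, pass the URL string as the first argument in place of I(csv_text).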

7. Using data.table

Finally, let’s use the data.table package to read CSV content from a URL in R. Its fread() function is designed for speed, which makes it a good choice when you are reading very large data sets.


# Using data.table
library(data.table)
df <- fread('https://sparkbyexamples.com.com/csv_data.csv')
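
Note that fread() returns a data.table rather than a plain data frame. The sketch below shows two options that are often useful with large inputs: select, which reads only some columns, and data.table = FALSE, which returns a regular data frame. It uses illustrative inline lines through the text argument in place of the URL, and skips itself when data.table is not installed.

```r
if (requireNamespace("data.table", quietly = TRUE)) {
  # Inline lines stand in for the body of a URL; the rows are illustrative
  csv_lines <- c("id,name,gender", "10,sai,M", "13,ram,F")

  # Read only the columns you need
  dt <- data.table::fread(text = csv_lines, select = c("id", "name"))
  print(class(dt))

  # Ask for a plain data.frame instead of a data.table
  df <- data.table::fread(text = csv_lines, data.table = FALSE)
  print(class(df))
} else {
  message("data.table is not installed; run install.packages('data.table') first")
}
```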

Conclusion

In this article, you have learned how to read or load a CSV from a web URL using read.csv() from base R, read_csv() from the readr package, and fread() from the data.table package.

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive, and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.