How to read JSON files in R? You can use the fromJSON() function of the rjson package. JSON (JavaScript Object Notation) is a lightweight data format widely used for data exchange between systems. It is both human-readable and machine-parsable, making it an ideal format for storing and transmitting structured data, particularly in web applications. In R, JSON files can be read, written, and processed efficiently using specialized packages.
In this article, I will explore how to read JSON files into R with practical examples. I will also show how to convert JSON data into data frames, a format more commonly used in R for analysis and manipulation. R provides several powerful packages for handling JSON data, such as:
- rjson: A simple, easy-to-use package for reading and writing JSON.
- jsonlite: A more robust package capable of handling nested JSON structures and large datasets.
- RJSONIO: An older package that provides similar functionality to rjson but with certain performance differences.
What is JSON?
JSON stands for JavaScript Object Notation. It is a lightweight data format that is easy to read and write for humans and machines. JSON files are stored as text and often used for exchanging data between systems. In R, we can read and write JSON files using various packages.
Read JSON Files using the rjson Package
The rjson package is useful for simple tasks like reading or writing JSON files. However, it may not handle large datasets efficiently. Here’s how to install and use it:
Steps:
1. Install and load the rjson package:
# Install rjson package
install.packages("rjson")
# Load the rjson package
library(rjson)
2. Create a JSON data file:
Create a JSON file by copying the sample data into a text editor such as Notepad. Save the file with a .json extension and set the file type to All Files (*.*).
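The sample data itself is not reproduced in this article, so here is a minimal sketch that writes a hypothetical empdata.json from R instead of a text editor. The keys and values follow the output printed in the RJSONIO section; the column-oriented layout and the temporary file path are assumptions.

```r
# Hypothetical sample data: keys and values follow the output shown in the
# RJSONIO section; the column-oriented layout is an assumption.
json_text <- '{
  "EmployeeID": ["1", "2", "3", "4", "5"],
  "Name": ["Vinay", "Vidhya", "Saketh", "Neelam", "Rao"],
  "Age": ["28", "34", "45", "29", "38"],
  "Department": ["HR", "Finance", "IT", "Marketing", "Sales"],
  "Salary": ["50000", "70000", "120000", "65000", "75000"],
  "HireDate": ["2018-03-15", "2016-07-01", "2010-09-23", "2019-05-30", "2015-11-12"]
}'

# Write the text to a temporary location (assumed path; adjust as needed)
json_path <- file.path(tempdir(), "empdata.json")
writeLines(json_text, json_path)

# Confirm that the file was created
print(file.exists(json_path))
```

You can then pass json_path to the reading functions in place of the hard-coded Windows path.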
3. Use the fromJSON() function to read the file:
You can read a JSON file using the fromJSON() function. In rjson, the first argument is treated as a JSON string, so the file location must be passed through the file argument.
# Read JSON data from a file
library(rjson)
# Give the input file name to the function through the file argument.
json_data <- fromJSON(file = "C:/Users/B.VIJETHA/Downloads/empdata.json")
# Print the result
print("After importing JSON file into R:")
print(json_data)
This prints the imported data as a named R list.
Convert JSON File to R Data Frame
If you want to convert the JSON data to a data frame, pass the loaded data to the as.data.frame() function. This works when every element of the list has the same length, so that each element can become one column.
# Convert JSON data into data frame
df <- as.data.frame(json_data)
print("After converting JSON data into data frame:")
print(df)
This prints the same data as a data frame, with one column per JSON key.
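A direct as.data.frame() call fits data stored column by column. When the JSON is instead an array of records, the parsed result is usually a list with one element per record, and each record has to become a row. The following is a minimal base R sketch of that case; the records list is built by hand as a hypothetical stand-in for parsed output.

```r
# Hypothetical parsed result: a list with one named list per record,
# standing in for what a JSON array of objects might look like after parsing
records <- list(
  list(name = "Vinay", age = 28),
  list(name = "Vidhya", age = 34)
)

# Turn each record into a one-row data frame, then stack the rows
rows <- lapply(records, as.data.frame, stringsAsFactors = FALSE)
df_records <- do.call(rbind, rows)

print(df_records)
```

This approach assumes every record has the same keys; records with missing keys would need to be padded first.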
Read JSON Data into R using jsonlite Package
Alternatively, you can use the jsonlite package to load JSON data into R. It is more versatile and efficiently handles nested JSON structures. Here’s how to use it.
Steps:
1. Install and load the jsonlite package:
install.packages("jsonlite")
library(jsonlite)
2. Read the JSON File:
You can load a JSON file using the fromJSON() function by passing the file location as an argument.
# Read JSON data from a file
json_data <- fromJSON("C:/Users/B.VIJETHA/Downloads/empdata.json")
# Print the result
print("After importing JSON file into R:")
print(json_data)
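Note that rjson, jsonlite, and RJSONIO all export a function named fromJSON(), so the package loaded last masks the others in the same session. They also differ in how they take a file path: rjson expects it through the file argument. A minimal sketch of calling each parser explicitly with the :: operator follows; the JSON string is illustrative, and the requireNamespace() guards are an assumption so that the snippet skips any package that is not installed.

```r
json_string <- '{"name": "Vidhya", "age": 34}'

# Each call names its package explicitly, so the load order no longer matters.
# requireNamespace() skips a parser whose package is not installed.
if (requireNamespace("rjson", quietly = TRUE)) {
  from_rjson <- rjson::fromJSON(json_string)
  print(from_rjson[["name"]])
}

if (requireNamespace("jsonlite", quietly = TRUE)) {
  from_jsonlite <- jsonlite::fromJSON(json_string)
  print(from_jsonlite[["name"]])
}

if (requireNamespace("RJSONIO", quietly = TRUE)) {
  from_rjsonio <- RJSONIO::fromJSON(json_string)
  print(from_rjsonio[["name"]])
}
```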
Convert to Data Frame
Once the JSON data is imported into the R environment, you can convert it into a data frame for further analysis. To do this, simply use the as.data.frame() function.
# Convert JSON file into data frame
df <- as.data.frame(json_data)
print("After converting JSON data into data frame:")
print(df)
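With jsonlite, an explicit conversion is not always needed: when the JSON is an array of records, fromJSON() simplifies it to a data frame by default, and flatten = TRUE spreads nested objects into separate columns. The sketch below uses a hypothetical JSON string with a nested address object, and it assumes jsonlite is installed (the guard skips the example otherwise).

```r
# Hypothetical JSON: an array of records with a nested "address" object
json_string <- '[
  {"name": "Vinay",  "age": 28, "address": {"city": "Pune"}},
  {"name": "Vidhya", "age": 34, "address": {"city": "Delhi"}}
]'

if (requireNamespace("jsonlite", quietly = TRUE)) {
  # flatten = TRUE turns the nested object into an "address.city" column
  df_flat <- jsonlite::fromJSON(json_string, flatten = TRUE)
  print(names(df_flat))
  print(df_flat)
}
```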
Read JSON Data into R using RJSONIO Package
The RJSONIO package offers another option for handling JSON data in R. It’s similar to rjson, but it is slower with large data.
Steps:
1. Install and load the RJSONIO package:
# Install the package
install.packages("RJSONIO")
# Load the package
library(RJSONIO)
2. Read JSON Data from a File:
You can import a JSON file using the fromJSON() function by providing the file location as an argument.
# Read JSON data from the file
json_data <- fromJSON("C:/Users/B.VIJETHA/Downloads/empdata.json")
# Print the result
print("After importing JSON file into R:")
print(json_data)
# Output:
# [1] "After importing JSON file into R:"
# $EmployeeID
# [1] "1" "2" "3" "4" "5"
# $Name
# [1] "Vinay" "Vidhya" "Saketh" "Neelam" "Rao"
# $Age
# [1] "28" "34" "45" "29" "38"
# $Department
# [1] "HR" "Finance" "IT" "Marketing" "Sales"
# $Salary
# [1] "50000" "70000" "120000" "65000" "75000"
# $HireDate
# [1] "2018-03-15" "2016-07-01" "2010-09-23" "2019-05-30" "2015-11-12"
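The output above shows every value as a character string, including Age, Salary, and HireDate. Before analysis you will probably want proper numeric and date types. The following base R sketch shows one way to convert them; the emp data frame is a hand-typed stand-in containing the first two employees from the output, not the result of reading the file.

```r
# Stand-in for the imported data: first two employees from the output above
emp <- data.frame(
  Name = c("Vinay", "Vidhya"),
  Age = c("28", "34"),
  Salary = c("50000", "70000"),
  HireDate = c("2018-03-15", "2016-07-01"),
  stringsAsFactors = FALSE
)

# Convert text columns to numeric and date types
emp$Age <- as.integer(emp$Age)
emp$Salary <- as.numeric(emp$Salary)
emp$HireDate <- as.Date(emp$HireDate)

# Inspect the resulting column types
str(emp)
```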
Read JSON Data from a JSON String
You can also read JSON data directly from a JSON string. Simply pass the string to the fromJSON() function to load the data into R as a list.
# Read JSON data from a JSON string
json_string <- '{"name": "Vidhya", "age": 34}'
json_data <- fromJSON(json_string)
# Print the result
print("After importing JSON string into R:")
print(json_data)
# Output:
# [1] "After importing JSON string into R:"
# $name
# [1] "Vidhya"
# $age
# [1] 34
Access Elements of the JSON data:
Once the JSON data is loaded into R, it will usually be stored as a list, and you can access the elements like this.
# Access individual elements
print(json_data$name)
print(json_data$age)
# Output:
# [1] "Vidhya"
# [1] 34
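Real JSON is often nested, so it helps to know a few more ways to reach into the list. The sketch below uses a hand-built list named person as a hypothetical stand-in for parsed JSON; the skills and address fields are illustrative and do not come from the example above.

```r
# Hypothetical parsed result with nested elements, built by hand
person <- list(
  name = "Vidhya",
  age = 34,
  skills = c("R", "SQL"),
  address = list(city = "Delhi")
)

# [[ ]] with a name works like $ and also accepts a variable
field <- "name"
print(person[[field]])

# Chain $ to reach into nested lists
print(person$address$city)

# Index into a vector element
print(person$skills[2])

# List all top-level keys
print(names(person))
```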
Read JSON Data into R using tidyjson Package
For more complex or nested JSON data, the tidyjson package works well by converting the data into the tidy format. Here’s an example:
Steps:
1. Install and load the tidyjson package:
# Install and load tidyjson packages
install.packages("tidyjson")
library(tidyjson)
library(dplyr)
2. Convert JSON String to a tidy object:
Let’s create a JSON string that includes an array of two objects and process it with the tidyjson and dplyr packages. The pipeline converts the JSON string into a tbl_json object with as.tbl_json(), separates each JSON object into its own row with gather_array(), and extracts the key attributes into columns with spread_values().
# Convert JSON String to a tidy object
library(tidyjson) # this package
library(dplyr) # for %>% and other dplyr functions
json_string <- '[{"name": "Vinay", "age": 28}, {"name": "Vidhya", "age": 34}]'
json_string %>%
  as.tbl_json %>%
  gather_array %>%
  spread_values(
    user.name = jstring("name"),
    user.age = jnumber("age")
  )
# Output:
# # A tbl_json: 2 x 5 tibble with a "JSON" attribute
# ..JSON document.id array.index user.name user.age
# <chr> <int> <int> <chr> <dbl>
# 1 "{\"name\":\"Vinay\"..." 1 1 Vinay 28
# 2 "{\"name\":\"Vidhya..." 1 2 Vidhya 34
Explanation:
- document.id: Identifies the JSON document (in this case, only 1 document).
- array.index: Indicates the position of each object in the array (1 for the first object, 2 for the second).
- user.name: Contains the name values from the JSON objects.
- user.age: Contains the age values from the JSON objects.
This process efficiently transforms the JSON string into a tabular format suitable for analysis in R.
Conclusion
In this article, I have explored different ways to read JSON files in R using various packages, including rjson, jsonlite, RJSONIO, and tidyjson. For simpler JSON structures, rjson and RJSONIO work well. For more complex data or better performance, jsonlite and tidyjson are recommended. Once you’ve loaded the JSON data, you can easily convert it to a data frame for further analysis.
Happy Learning!!