Spark spark.table() vs spark.read.table()

  • Post category:Apache Spark
  • Post last modified:December 10, 2022

In Spark or PySpark, what is the difference between spark.table() and spark.read.table()? For reading a table there is no practical difference: both methods return the table as a Spark DataFrame.

1. spark.table() vs spark.read.table()

There is no functional difference between the spark.table() and spark.read.table() functions. In fact, spark.read.table() calls spark.table() internally.

It is natural to wonder why Spark provides two syntaxes that do the same thing. The reason is consistency: spark.read returns a DataFrameReader, which is the entry point for reading data sources such as CSV, Parquet, text, and Avro, so it also provides a method for reading a table.
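To illustrate the point about DataFrameReader being a common entry point, here is a minimal sketch that reads the same data once from a Parquet file and once as a table through spark.read. The object name ReaderEntryPoints, the view name people_view, and the temporary directory are assumptions made up for this illustration; a temp view stands in for a real table so the sketch does not need Hive.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReaderEntryPoints {

  // Returns (row count read from Parquet, row count read from the view)
  def readBoth(spark: SparkSession): (Long, Long) = {
    import spark.implicits._
    val people = Seq((1, "Ann"), (2, "Bob")).toDF("id", "name")

    // Hypothetical locations: a fresh temp directory and a temp view
    val path = Files.createTempDirectory("reader-sketch")
      .resolve("people.parquet").toString
    people.write.parquet(path)
    people.createOrReplaceTempView("people_view")

    // Same DataFrameReader entry point, two different sources
    val fromFile: DataFrame = spark.read.parquet(path)
    val fromTable: DataFrame = spark.read.table("people_view")

    (fromFile.count(), fromTable.count())
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ReaderEntryPoints")
      .getOrCreate()
    println(readBoth(spark))
    spark.stop()
  }
}
```

Seen this way, table() is simply one more source on the reader, next to parquet(), csv(), and the others.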

2. spark.table() Usage

Here, spark is a SparkSession object and table() is a method of the SparkSession class, defined as shown below.


// Defined in org.apache.spark.sql.SparkSession

def table(tableName: String): DataFrame = {
  table(sessionState.sqlParser.parseTableIdentifier(tableName))
}

3. spark.read.table() Usage

Here, spark is a SparkSession object, read returns a DataFrameReader, and table() is a method of the DataFrameReader class, defined as shown below. Notice that the method first checks that no user-specified schema was set on the reader and then calls the SparkSession.table() method described above.


// Defined in org.apache.spark.sql.DataFrameReader

def table(tableName: String): DataFrame = {
   assertNoSpecifiedSchema("table")
   sparkSession.table(tableName)
}
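The assertNoSpecifiedSchema("table") line is the one visible behavioral difference between the two calls: a schema set on the reader is rejected before the table is read. The sketch below tries to show that under some assumptions. The object name SchemaCheckSketch and the view name demo_tbl are hypothetical, and a temp view is used in place of a real table so that no Hive setup is needed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}
import scala.util.Try

object SchemaCheckSketch {

  // Registers a small temp view (hypothetical name) to stand in for a table
  def registerDemoView(spark: SparkSession): Unit = {
    import spark.implicits._
    Seq((1, "a"), (2, "b")).toDF("id", "name")
      .createOrReplaceTempView("demo_tbl")
  }

  // True if read.schema(...).table(...) throws, which is what the
  // assertNoSpecifiedSchema call in the snippet above suggests
  def schemaIsRejected(spark: SparkSession): Boolean = {
    val schema = StructType(Seq(StructField("id", IntegerType)))
    Try(spark.read.schema(schema).table("demo_tbl")).isFailure
  }

  // True if both plain calls succeed when no schema is specified
  def plainReadsSucceed(spark: SparkSession): Boolean =
    Try(spark.read.table("demo_tbl")).isSuccess &&
      Try(spark.table("demo_tbl")).isSuccess

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SchemaCheckSketch")
      .getOrCreate()
    registerDemoView(spark)
    println(s"schema rejected: ${schemaIsRejected(spark)}")
    println(s"plain reads succeed: ${plainReadsSucceed(spark)}")
    spark.stop()
  }
}
```

In everyday use this check rarely matters, since a table already carries its own schema and there is usually no reason to set one on the reader.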

4. Example Spark Read Table

The example below shows how to read a Hive table into a Spark DataFrame using both the spark.read.table() and spark.table() methods.

import org.apache.spark.sql.SparkSession

object ReadHiveTable extends App {

  // Create SparkSession with Hive support enabled
  val spark = SparkSession.builder()
    .master("local[*]")
    .appName("SparkByExamples.com")
    .enableHiveSupport()
    .getOrCreate()

  // Read table using spark.read.table()
  val df = spark.read.table("emp.employee")
  df.show()

  // Read table using spark.table()
  val df2 = spark.table("emp.employee")
  df2.show()
}

Both show() calls in the example above yield the same output.
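If you want to check this programmatically rather than by eye, one possible approach is to compare the schemas and the rows of the two DataFrames. The sketch below is an illustration under assumptions, not part of the original example: the object name CompareTableReads and the view name employee_view are made up, and a temp view replaces the Hive table emp.employee so the code runs without a metastore.

```scala
import org.apache.spark.sql.SparkSession

object CompareTableReads {

  // True if both calls return the same schema and the same set of rows.
  // Note: except() is a set difference, so duplicate rows are not counted.
  def sameResult(spark: SparkSession, tableName: String): Boolean = {
    val viaReader = spark.read.table(tableName)
    val viaSession = spark.table(tableName)
    viaReader.schema == viaSession.schema &&
      viaReader.except(viaSession).count() == 0 &&
      viaSession.except(viaReader).count() == 0
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("CompareTableReads")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the employee table
    Seq((1, "James"), (2, "Maria")).toDF("id", "name")
      .createOrReplaceTempView("employee_view")

    println(sameResult(spark, "employee_view"))
    spark.stop()
  }
}
```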

5. Conclusion

In this article, you have learned the difference between the spark.table() and spark.read.table() methods. Both read a table into a DataFrame, and spark.read.table() simply delegates to spark.table().

NNK
