Spark show() – Display DataFrame Contents in Table

The Spark/PySpark DataFrame show() method displays the contents of a DataFrame in a table of rows and columns. By default, it shows only the first 20 rows and truncates column values longer than 20 characters.

1. Spark DataFrame show() Syntax & Example

1.1 Syntax


  def show()
  def show(numRows : scala.Int)
  def show(truncate : scala.Boolean)
  def show(numRows : scala.Int, truncate : scala.Boolean)
  def show(numRows : scala.Int, truncate : scala.Int)
  def show(numRows : scala.Int, truncate : scala.Int, vertical : scala.Boolean)

1.2 Example


import spark.implicits._
val columns = Seq("Seqno","Quote")
val data = Seq(("1", "Be the change that you wish to see in the world"),
    ("2", "Everyone thinks of changing the world, but no one thinks of changing himself."),
    ("3", "The purpose of our lives is to be happy."),
    ("4", "Be cool."))
val df = data.toDF(columns:_*)
df.show()
//+-----+--------------------+
//|Seqno|               Quote|
//+-----+--------------------+
//|    1|Be the change tha...|
//|    2|Everyone thinks o...|
//|    3|The purpose of ou...|
//|    4|            Be cool.|
//+-----+--------------------+

As you can see above, values in the Quote column are truncated at 20 characters. Let’s see how to display the full column contents.


//Display full column contents
df.show(false)

//+-----+-----------------------------------------------------------------------------+
//|Seqno|Quote                                                                        |
//+-----+-----------------------------------------------------------------------------+
//|1    |Be the change that you wish to see in the world                              |
//|2    |Everyone thinks of changing the world, but no one thinks of changing himself.|
//|3    |The purpose of our lives is to be happy.                                     |
//|4    |Be cool.                                                                     |
//+-----+-----------------------------------------------------------------------------+

By default, the show() method displays only the first 20 rows of a DataFrame. The example below limits the output to 2 rows and shows the full column contents. Our DataFrame has just 4 rows, so I can’t demonstrate more than 4 rows here. If you have a DataFrame with thousands of rows, try changing the value from 2 to 100 to display more than 20 rows.


// Display 2 rows and full column contents
df.show(2,false)

//+-----+-----------------------------------------------------------------------------+
//|Seqno|Quote                                                                        |
//+-----+-----------------------------------------------------------------------------+
//|1    |Be the change that you wish to see in the world                              |
//|2    |Everyone thinks of changing the world, but no one thinks of changing himself.|
//+-----+-----------------------------------------------------------------------------+
//only showing top 2 rows

You can also truncate column values at a desired length by passing that length as the second argument.


// Display 2 rows and truncate column values to 25 characters
df.show(2,25)

//+-----+-------------------------+
//|Seqno|                    Quote|
//+-----+-------------------------+
//|    1|Be the change that you...|
//|    2|Everyone thinks of cha...|
//+-----+-------------------------+
//only showing top 2 rows

Finally, let’s see how to display the DataFrame vertically, one record at a time.


// Display DataFrame rows & columns vertically
df.show(3,25,true)

//-RECORD 0--------------------------
// Seqno | 1                         
// Quote | Be the change that you... 
//-RECORD 1--------------------------
// Seqno | 2                         
// Quote | Everyone thinks of cha... 
//-RECORD 2--------------------------
// Seqno | 3                         
// Quote | The purpose of our liv... 
//only showing top 3 rows

2. PySpark DataFrame show() Syntax & Example

Let’s assume you have a DataFrame similar to the one above. In PySpark, show() is a single method with default argument values rather than a set of overloads, so you pass truncate and vertical by name whenever you skip the arguments that come before them.

2.1 Syntax


def show(self, n=20, truncate=True, vertical=False):
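To make the truncate parameter more concrete: to my understanding, PySpark accepts either a boolean or an integer for it. True means the default width of 20 characters, False turns truncation off, and an integer sets the width directly. The sketch below mirrors that mapping in plain Python so it runs without a Spark session. effective_truncate_width is a hypothetical helper written for illustration and is not part of the PySpark API.

```python
def effective_truncate_width(truncate=True):
    """Map show()'s truncate argument to a character width (0 = no truncation).

    Hypothetical helper for illustration only; it is not a PySpark function.
    """
    # bool must be checked before int because bool is a subclass of int in Python
    if isinstance(truncate, bool):
        return 20 if truncate else 0
    return int(truncate)


print(effective_truncate_width())       # default, same as truncate=True
print(effective_truncate_width(False))  # full column contents
print(effective_truncate_width(25))     # custom width
```

This is why df.show(truncate=False) and df.show(2, truncate=25) can share one parameter: the type of the value decides how it is read.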

2.2 Examples


# Show full contents of DataFrame (PySpark)
df.show(truncate=False)

# Show top 2 rows and full column contents (PySpark)
df.show(2,truncate=False) 

# Shows top 2 rows and only 25 characters of each column (PySpark)
df.show(2,truncate=25) 

# Shows rows vertically (one line per column value) (PySpark)
df.show(n=3,truncate=25,vertical=True)
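The truncated values in the outputs above end in "..." because, as far as I understand Spark's behaviour, a value longer than the limit is cut to the limit minus three characters and "..." is appended, so the result is exactly as wide as the limit. A minimal plain-Python sketch of that rule follows. truncate_cell is a hypothetical name used for illustration, and the handling of limits below 4 (a plain cut with no dots) is my understanding of Spark rather than something shown in this article.

```python
def truncate_cell(value, width=20):
    """Shorten a cell the way show() appears to (hypothetical helper, not a Spark API)."""
    text = str(value)
    if width <= 0 or len(text) <= width:
        return text              # 0 means "no truncation"; short values pass through
    if width < 4:
        return text[:width]      # assumption: too narrow for "...", so just cut
    return text[:width - 3] + "..."


quote = "Be the change that you wish to see in the world"
print(truncate_cell(quote))        # Be the change tha...
print(truncate_cell(quote, 25))    # Be the change that you...
print(truncate_cell("Be cool."))   # Be cool.
```

The first two results match the Quote column in the show() and show(2,25) outputs earlier in the article, while short values such as "Be cool." pass through unchanged.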

Happy Learning !!

NNK
