Spark DataFrame – Show Full Column Contents

Problem: In Spark or PySpark, DataFrame show() truncates column content longer than 20 characters. How do you show the full column content of a DataFrame in the output?

1. Solution: Show Full Contents of a DataFrame in Spark and PySpark

By default, the show() method of a Spark or PySpark DataFrame truncates any column value longer than 20 characters. To display the full contents, pass false as the truncate argument: show(false) in Scala/Java, or show(truncate=False) in PySpark. Some examples follow.

1.1 Spark with Scala/Java


// Shows only 20 characters for each column (Scala/Java)
df.show(true)

// Show full column contents of DataFrame (Scala/Java)
df.show(false)

// Show top 5 rows and full column contents of DataFrame (Scala/Java)
df.show(5, false)

1.2 PySpark (Spark with Python)


# Show full contents of DataFrame (PySpark)
df.show(truncate=False)

# Show top 5 rows and full column contents (PySpark)
df.show(5, truncate=False)

# Show top 5 rows and only 10 characters of each column (PySpark)
df.show(5, truncate=10)

# Show rows vertically (one line per column value) (PySpark)
df.show(vertical=True)

Let’s look at an example in Scala. First, let’s create a DataFrame with some long data in a column.


import org.apache.spark.sql.SparkSession

// Create a local SparkSession
val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("SparkByExamples.com")
    .getOrCreate()

// Needed for the toDF() conversion on a Seq
import spark.implicits._
val columns = Seq("Seqno", "Quote")
val data = Seq(("1", "Be the change that you wish to see in the world"),
    ("2", "Everyone thinks of changing the world, but no one thinks of changing himself."),
    ("3", "The purpose of our lives is to be happy."))
val df = data.toDF(columns: _*)

// Default show(): values longer than 20 characters are truncated
df.show()

This yields the output below.


// Output:
+-----+--------------------+
|Seqno|               Quote|
+-----+--------------------+
|    1|Be the change tha...|
|    2|Everyone thinks o...|
|    3|The purpose of ou...|
+-----+--------------------+
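Note that each truncated value in the output is exactly 20 characters wide, including the trailing "...". The sketch below models that rule in plain Python to make the arithmetic explicit. It is only an illustration of the behavior as I understand it, not Spark's actual code, and the helper name truncate_cell is hypothetical.

```python
def truncate_cell(value: str, truncate: int = 20) -> str:
    """Illustrative model of how show() shortens a single cell (assumption, not Spark source)."""
    if truncate > 0 and len(value) > truncate:
        if truncate < 4:
            # Too narrow to fit an ellipsis, so the string is simply cut.
            return value[:truncate]
        # Keep truncate - 3 characters and append "..." so the total width is `truncate`.
        return value[: truncate - 3] + "..."
    # truncate <= 0 (what truncate=False maps to) or a short value: leave unchanged.
    return value


quote = "Be the change that you wish to see in the world"
print(truncate_cell(quote))      # Be the change tha...
print(truncate_cell(quote, 10))  # Be the ...
print(truncate_cell(quote, 0))   # the full quote, as with truncate=False
```

This is why the first row shows "Be the change tha...": 17 characters of the quote plus the 3-character ellipsis.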

By default, the show() method truncates long column values. You can change this behavior by passing the boolean value false to show(), which displays the full content.


df.show(false)

This yields the output below.


// Output:
+-----+-----------------------------------------------------------------------------+
|Seqno|Quote                                                                        |
+-----+-----------------------------------------------------------------------------+
|1    |Be the change that you wish to see in the world                              |
|2    |Everyone thinks of changing the world, but no one thinks of changing himself.|
|3    |The purpose of our lives is to be happy.                                     |
+-----+-----------------------------------------------------------------------------+

2. PySpark Show Full Contents of a DataFrame

Assume you have a DataFrame similar to the one above. In PySpark, the syntax for showing the full column contents is slightly different: pass truncate=False to the show() method.


df.show(truncate=False)

This yields the same output as above.

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.