  • Home
  • About
  • Write For US
|       { One stop for all Spark Examples }
Spark By {Examples}
  • Spark
    • Spark RDD
    • Spark DataFrame
    • Spark SQL Functions
    • What’s New in Spark 3.0?
    • Spark Streaming
    • Apache Spark Interview Questions
  • PySpark
  • Pandas
  • R
    • R Programming
    • R Data Frame
    • R dplyr Tutorial
    • R Vector
  • Snowflake
  • Hive
  • Interview Q
    • Spark Interview Questions
  • More
    • Kafka
    • NumPy
    • H2O.ai
    • Apache Hadoop
    • Apache HBase
    • Apache Cassandra
    • H2O Sparkling Water
    • Scala Language
    • Python
PySpark

PySpark Distinct to Drop Duplicate Rows

The PySpark distinct() function removes rows that are duplicated across all columns of a DataFrame, while dropDuplicates() removes rows that are duplicated on a selected subset of one or more columns. In this…

8 Comments
August 12, 2020
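
The difference between the two calls described in the PySpark entry above can be modelled in plain Python. This is only a minimal sketch under stated assumptions: the sample rows, the column names, and the helpers distinct_rows and drop_duplicates are hypothetical and are not part of PySpark. The comments name the DataFrame calls they stand in for.

```python
# Plain-Python model of the row-level semantics; this is not the Spark API.
# In PySpark the corresponding calls would be df.distinct() and
# df.dropDuplicates(["department", "salary"]).


def distinct_rows(rows):
    """Keep one copy of each row whose values match on every column."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(sorted(row.items()))  # the whole row is the identity
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


def drop_duplicates(rows, subset):
    """Keep one full row per distinct combination of the `subset` columns.

    This model keeps the first matching row it sees. On a distributed
    DataFrame, which of the matching rows survives should not be relied on.
    """
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[col] for col in subset)  # only the subset is compared
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Hypothetical sample data, invented for illustration.
rows = [
    {"name": "James", "department": "Sales", "salary": 3000},
    {"name": "James", "department": "Sales", "salary": 3000},   # exact duplicate
    {"name": "Robert", "department": "Sales", "salary": 3000},  # same department and salary
    {"name": "Maria", "department": "Finance", "salary": 3900},
]

all_column_result = distinct_rows(rows)
subset_result = drop_duplicates(rows, ["department", "salary"])

print(len(all_column_result))              # exact duplicate removed
print([r["name"] for r in subset_result])  # one row per (department, salary)
```

Comparing on a subset removes more rows than comparing on every column, because rows that differ only in the ignored columns are treated as duplicates.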
Apache Spark

Spark SQL – How to Remove Duplicate Rows

Duplicate rows can be removed from a Spark SQL DataFrame using the distinct() and dropDuplicates() functions. distinct() removes rows that have the same values on all…

4 Comments
December 25, 2019
Apache Spark

Spark SQL – Get Distinct Multiple Columns

In this Spark SQL tutorial, you will learn different ways to get the distinct values in every column, or in a selected set of columns, of a DataFrame using methods available on DataFrame…

2 Comments
December 24, 2019
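
The two goals named in the entry above, distinct combinations over selected columns and distinct values within each column, can be sketched in plain Python. The helpers distinct_on_columns and distinct_per_column, the sample rows, and the column names are assumptions made for illustration and are not Spark functions. In Spark the first case roughly corresponds to selecting the columns and then calling distinct() on the result.

```python
# Plain-Python model of the semantics; this is not the Spark API.
# Roughly analogous DataFrame call for the first helper:
#   df.select("department", "salary").distinct()


def distinct_on_columns(rows, columns):
    """Project each row onto `columns`, then keep the unique combinations.

    Unlike a subset-based duplicate drop, the result holds only the
    selected columns, not the full rows.
    """
    seen = set()
    result = []
    for row in rows:
        combo = tuple(row[col] for col in columns)
        if combo not in seen:
            seen.add(combo)
            result.append(dict(zip(columns, combo)))
    return result


def distinct_per_column(rows):
    """Collect the distinct values of every column independently."""
    values = {}
    for row in rows:
        for col, val in row.items():
            bucket = values.setdefault(col, [])
            if val not in bucket:
                bucket.append(val)
    return values


# Hypothetical sample data, invented for illustration.
rows = [
    {"name": "James", "department": "Sales", "salary": 3000},
    {"name": "Robert", "department": "Sales", "salary": 3000},
    {"name": "Maria", "department": "Finance", "salary": 3900},
]

pairs = distinct_on_columns(rows, ["department", "salary"])
per_column = distinct_per_column(rows)

print(pairs)
print(per_column)
```

The first helper treats the selected columns as a unit, so a value pair counts once. The second treats each column on its own, so the per-column lists can have different lengths.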

About SparkByExamples.com

SparkByExamples.com is a Big Data and Spark examples community page. All examples are simple, easy to understand, and well tested in our development environment.
+1 (949) 345-0676
Desert Bloom
Irvine, CA 92618
USA
Copyright sparkbyexamples.com