Using Spark Streaming we can read from and write to Kafka topics in TEXT, CSV, AVRO, and JSON formats. In this article, we will learn, with a Scala example, how to stream Kafka messages in JSON format using the from_json() and to_json() SQL functions.
Apache Spark Streaming is a scalable, high-throughput, fault-tolerant stream processing system that supports both batch and streaming workloads. It is an extension of the core Spark API that processes real-time data from sources like Kafka, Flume, and Amazon Kinesis, to name a few. The processed data can be pushed to other systems like databases, Kafka, live dashboards, etc.
Prerequisites
If you don’t have a Kafka cluster set up, follow the articles below to set up a single-broker cluster and get familiar with creating and describing topics.
Table of contents
- Write JSON to Kafka using producer shell
- Streaming from Kafka topic
- Write Streamed JSON to Console
- Write Streamed JSON to Kafka topic
- Read JSON from Kafka using consumer shell
1. Run Kafka Producer Shell
First, let’s produce some JSON data to the Kafka topic "json_topic". The Kafka distribution comes with a Kafka Producer shell; run this producer and input the JSON data from person.json. Just copy one line at a time from the person.json file and paste it on the console where the Kafka Producer shell is running.
Note: By default, Kafka automatically creates a topic when you write a message to it. However, you can also create a topic manually and specify your own partition count and replication factor.
bin/kafka-console-producer.sh \
--broker-list localhost:9092 --topic json_topic
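If you prefer to create the topic manually, you could use the kafka-topics.sh tool that ships with Kafka. A minimal sketch (the partition and replication-factor values are illustrative; older Kafka releases use --zookeeper localhost:2181 instead of --bootstrap-server):

bin/kafka-topics.sh --create \
--bootstrap-server localhost:9092 \
--topic json_topic \
--partitions 3 --replication-factor 1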
2. Streaming With Kafka
2.1. Kafka Maven dependency
In order to stream data from a Kafka topic, we need the Kafka client Maven dependency below. Use the version that matches your Kafka and Scala versions.
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
    <version>2.4.0</version>
</dependency>
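If your build uses sbt instead of Maven, the equivalent dependency (assuming the same Spark version) would be:

// sbt appends the Scala version (e.g. _2.11) to the artifact name via %%
libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.4.0"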
2.2 Spark Streaming Scala example
Spark Streaming uses readStream() on SparkSession to load a streaming Dataset from Kafka. The option startingOffsets set to earliest reads all data already available in Kafka at the start of the query; we may not use this option that often. The default value for startingOffsets is latest, which reads only new data that has not yet been processed.
val df = spark.readStream
.format("kafka")
.option("kafka.bootstrap.servers", "192.168.1.100:9092")
.option("subscribe", "json_topic")
.option("startingOffsets", "earliest") // From starting
.load()
df.printSchema()
Since there are multiple sources to stream from, we need to explicitly state that we are streaming from Kafka with format("kafka"), provide the Kafka bootstrap servers, and subscribe to the topic we are streaming from using option().
df.printSchema() returns the schema of the streaming data from Kafka. The returned DataFrame contains all the familiar fields of a Kafka record and its associated metadata.
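For reference, the printed schema should look similar to the following, which is the standard column layout Spark exposes for a Kafka source:

root
 |-- key: binary (nullable = true)
 |-- value: binary (nullable = true)
 |-- topic: string (nullable = true)
 |-- partition: integer (nullable = true)
 |-- offset: long (nullable = true)
 |-- timestamp: timestamp (nullable = true)
 |-- timestampType: integer (nullable = true)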
3. Spark Streaming Write to Console
Since the value column is in binary, we first need to convert the binary value to a String using selectExpr().
val personStringDF = df.selectExpr("CAST(value AS STRING)")
Now, extract the JSON String from the value column and convert it to DataFrame columns using a custom schema.
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}
val schema = new StructType()
.add("id", IntegerType)
.add("firstname", StringType)
.add("middlename", StringType)
.add("lastname", StringType)
.add("dob_year", IntegerType)
.add("dob_month", IntegerType)
.add("gender", StringType)
.add("salary", IntegerType)
val personDF = personStringDF.select(from_json(col("value"), schema).as("data"))
.select("data.*")
personDF.writeStream
.format("console")
.outputMode("append")
.start()
.awaitTermination()
The complete Streaming Kafka Example code can be downloaded from GitHub. After downloading, import the project into your favorite IDE and change the Kafka broker IP address to your server IP in the SparkStreamingConsumerKafkaJson.scala program. When you run this program, you should see Batch: 0 with data. As you input new data (from step 1), the results get updated with Batch: 1, Batch: 2, and so on.
Batch: 0
+---+---------+----------+--------+----+------+------+
| id|firstname|middlename|lastname| dob|gender|salary|
+---+---------+----------+--------+----+------+------+
| 1| James | | Smith|null| M| 3000|
| 2| Michael | Rose| |null| M| 4000|
| 3| Robert | |Williams|null| M| 4000|
| 4| Maria | Anne| Jones|null| F| 4000|
+---+---------+----------+--------+----+------+------+
Batch: 1
+---+---------+----------+--------+---+------+------+
| id|firstname|middlename|lastname|dob|gender|salary|
+---+---------+----------+--------+---+------+------+
+---+---------+----------+--------+---+------+------+
Batch: 2
+---+---------+----------+--------+----+------+------+
| id|firstname|middlename|lastname| dob|gender|salary|
+---+---------+----------+--------+----+------+------+
| 1| James | | Smith|null| M| 3000|
+---+---------+----------+--------+----+------+------+
4. Spark Streaming Write to Kafka Topic
Note that in order to write Spark Streaming data to Kafka, the value column is required; all other fields are optional. The key and value columns are binary in Kafka; hence, they should first be converted to String before processing. If a key column is not specified, a null-valued key column will be automatically added.
Let’s produce the data to the Kafka topic "json_data_topic". Since we are processing JSON, let’s convert the data to JSON using the to_json() function and store it in a value column.
df.selectExpr("CAST(id AS STRING) AS key", "to_json(struct(*)) AS value")
.writeStream
.format("kafka")
.outputMode("append")
.option("kafka.bootstrap.servers", "192.168.1.100:9092")
.option("topic", "json_data_topic")
.start()
.awaitTermination()
Use writeStream.format("kafka") to write the streaming DataFrame to a Kafka topic. Since we are just reading a file (without any aggregations) and writing it as-is, we use outputMode("append"). OutputMode specifies what data will be written to the sink when new data is available in a DataFrame/Dataset.
5. Run Kafka Consumer Shell
Now run the Kafka consumer shell program that comes with Kafka distribution.
bin/kafka-console-consumer.sh \
--bootstrap-server localhost:9092 --topic json_data_topic
As you feed more data (from step 1), you should see JSON output on the consumer shell console.
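With the first person record from step 1, the consumer output might look like the line below (this is illustrative; the exact fields and their order depend on your person.json columns, since to_json(struct(*)) serializes every DataFrame column in order):

{"id":1,"firstname":"James ","middlename":"","lastname":"Smith","gender":"M","salary":3000}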
Next Steps
You can also read the articles Streaming JSON files from a folder and from TCP socket to learn different ways of streaming.
I would also recommend reading Spark Streaming + Kafka Integration and Structured Streaming with Kafka for more knowledge on structured streaming.
Happy Learning !!
Related Articles
- Kafka consumer and producer example with a custom serializer
- Spark Streaming – Kafka messages in Avro format
- Spark SQL Batch Processing – Produce and Consume Apache Kafka Topic
- How to Create and Describe a Kafka topic
- Spark Create DataFrame with Examples
- Kafka Delete Topic and its messages
- How to Setup a Kafka Cluster (step-by-step)
- Apache Kafka Producer and Consumer in Scala