How to Set Up a Kafka Cluster (step-by-step)

This article gives step-by-step instructions for installing, setting up, and running an Apache Kafka cluster on Ubuntu. It then tests the setup with the producer and consumer shell scripts that ship with the Kafka distribution and shows how to create and describe a topic.

Prerequisites:

  • JDK 8 and above
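
The JDK requirement can be checked before going further. The snippet below is a minimal sketch, not part of the Kafka distribution: `java_major_version` is a hypothetical helper that extracts the major version from strings such as `1.8.0_292` (legacy numbering) or `11.0.2`, and the check only runs if `java` is found on the PATH.

```shell
# Hypothetical helper: extract the major Java version from a version string.
java_major_version() {
  v=$1
  case "$v" in
    1.*) v=${v#1.} ;;   # legacy scheme: 1.8.0_292 -> 8.0_292
  esac
  echo "${v%%[._-]*}"   # keep the digits before the first '.', '_' or '-'
}

# Read the version string from the installed JVM, if there is one.
if command -v java >/dev/null 2>&1; then
  raw=$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}')
  major=$(java_major_version "$raw")
  if [ "${major:-0}" -ge 8 ]; then
    echo "Java $raw satisfies the JDK 8+ requirement"
  else
    echo "Java $raw is older than JDK 8" >&2
  fi
else
  echo "java not found on PATH; install JDK 8 or later first" >&2
fi
```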

Install and Set Up the Kafka Cluster

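Download Apache Kafka. This article uses version 2.1.0 built for Scala 2.11; if the mirror below no longer hosts that release, pick a current link from the Apache Kafka downloads page.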


wget http://apache.claz.org/kafka/2.1.0/kafka_2.11-2.1.0.tgz

Once the download is complete, extract the archive with tar, a file archiving tool, and rename the extracted folder to kafka.


tar -xzf kafka_2.11-2.1.0.tgz
mv kafka_2.11-2.1.0 kafka

Add the Kafka bin directory to the PATH environment variable in your .bashrc or .profile file.


vi ~/.bashrc

PATH=$PATH:~/kafka/bin

Now load the updated environment variables into the current session by running the command below.

source ~/.bashrc

If you added the path to the .profile file instead, restart your session by logging out and logging in again.
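
If you would rather script the PATH change than edit the file by hand, the sketch below shows one way to do it. `add_path_once` is a hypothetical helper, not something shipped with Kafka; it appends the export line only when the startup file does not already contain it, so running it twice is harmless.

```shell
# Hypothetical helper: append a PATH entry to a startup file only once.
# Arguments: $1 = startup file, $2 = directory to add (both chosen by the caller).
add_path_once() {
  file=$1
  dir=$2
  line="export PATH=\$PATH:$dir"
  touch "$file"
  if grep -qxF "$line" "$file"; then
    echo "already present in $file"
  else
    printf '%s\n' "$line" >> "$file"
    echo "added to $file"
  fi
}

# Usage (not run here):
#   add_path_once ~/.bashrc "$HOME/kafka/bin" && . ~/.bashrc
```

Afterwards, `command -v kafka-topics.sh` should print the script's location if the PATH entry took effect.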

Note: All commands below are run from the Kafka home directory.

Start ZooKeeper

ZooKeeper is a high-performance coordination service for distributed applications, and Kafka uses it to store the cluster's metadata. The Kafka distribution bundles ZooKeeper, so all we need to do is start the service with its default configuration.

bin/zookeeper-server-start.sh config/zookeeper.properties
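
The broker in the next step needs ZooKeeper to be reachable, so it can help to wait until its port accepts connections. The following is a minimal sketch under two assumptions: ZooKeeper listens on port 2181 (the port used by the commands later in this article), and bash is installed, since the check relies on bash's `/dev/tcp` pseudo-device. `wait_for_port` is a hypothetical helper name.

```shell
# Hypothetical helper: poll a TCP port until it accepts connections
# or the timeout (in seconds, default 30) expires.
wait_for_port() {
  host=$1
  port=$2
  timeout=${3:-30}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT
    if bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      echo "$host:$port is accepting connections"
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "timed out waiting for $host:$port" >&2
  return 1
}

# Usage (not run here), from the Kafka home directory:
#   wait_for_port localhost 2181 30 && bin/kafka-server-start.sh config/server.properties
```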

Start Kafka Broker

A Kafka cluster consists of one or more brokers (Kafka servers). A broker organizes messages into their respective topics and persists them in topic log files, by default for 7 days. Depending on the topic's replication factor, messages are replicated to multiple brokers.

Open another Ubuntu terminal session and start the Kafka server with the default configuration.


bin/kafka-server-start.sh config/server.properties
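
The default configuration lives in config/server.properties, and it is worth checking a few of its values before relying on them. The sketch below uses a hypothetical helper, `get_prop`, to read a single key from a Java-style properties file. In the stock file for this release, `log.retention.hours` is expected to be 168 (the 7-day retention) and `zookeeper.connect` to be `localhost:2181`, but treat those as values to verify rather than guarantees.

```shell
# Hypothetical helper: print the value of one key from a .properties file.
# Arguments: $1 = properties file, $2 = key. Prints nothing if the key is absent.
get_prop() {
  awk -F '=' -v k="$2" '
    $1 == k { sub(/^[^=]*=/, ""); v = $0 }   # keep the last matching assignment
    END     { if (v != "") print v }
  ' "$1"
}

# Usage (not run here), from the Kafka home directory:
#   get_prop config/server.properties log.retention.hours
#   get_prop config/server.properties zookeeper.connect
```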

Create Kafka Topic

All Kafka messages are organized into topics, and topics are partitioned and replicated across multiple brokers in a cluster. A producer sends messages to a topic and a consumer reads messages from a topic. The replication factor defines how many copies of each message are stored, and partitions let you parallelize a topic by splitting its data across multiple brokers.

Open another Ubuntu terminal session and create a Kafka topic named “text_topic” with a replication factor of 1 and 1 partition.


bin/kafka-topics.sh --create --zookeeper localhost:2181 \
--replication-factor 1 \
--partitions 1 \
--topic text_topic
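
Topic creation fails when the replication factor is larger than the number of running brokers, and topic names are limited to ASCII letters, digits, `.`, `_` and `-`. The sketch below checks those rules before the create command is run. `check_topic_args` is a hypothetical helper, and the broker count is supplied by the caller rather than discovered from the cluster.

```shell
# Hypothetical helper: validate topic settings before kafka-topics.sh --create.
# Arguments: $1 = topic name, $2 = partitions, $3 = replication factor,
#            $4 = number of running brokers (supplied by the caller).
check_topic_args() {
  name=$1; partitions=$2; rf=$3; brokers=$4
  case "$name" in
    ''|*[!A-Za-z0-9._-]*)
      echo "invalid topic name: '$name'" >&2; return 1 ;;
  esac
  if [ "$partitions" -lt 1 ]; then
    echo "partitions must be at least 1" >&2; return 1
  fi
  if [ "$rf" -lt 1 ] || [ "$rf" -gt "$brokers" ]; then
    echo "replication factor must be between 1 and $brokers" >&2; return 1
  fi
  echo "ok: $name partitions=$partitions replication-factor=$rf"
}

# Usage (not run here): a single-broker setup allows only replication factor 1.
#   check_topic_args text_topic 1 1 1
```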

List all Topics

Run the command below to list all the topics.

bin/kafka-topics.sh --zookeeper localhost:2181 --list

Describe Topic

Run the command below to describe topics. It returns partition and replication information for every topic; add --topic text_topic to limit the output to a single topic.


bin/kafka-topics.sh --zookeeper localhost:2181 --describe

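The output shows each topic's partition count and replication factor, followed by the leader, replicas, and in-sync replicas (ISR) of every partition.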

Run Kafka Producer

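Run the console producer script that comes with the Kafka distribution, then type messages, one per line. Each line is sent to the topic as a separate message.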


bin/kafka-console-producer.sh \
--broker-list localhost:9092 --topic text_topic

First kafka example
My Message
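
The console producer reads messages from standard input, so the same two messages can also be sent without typing them interactively. This is a minimal sketch: `produce_messages` is a hypothetical wrapper, and `PRODUCER_CMD` is an override invented here only so the function can be dry-run (for example with `cat`) when no broker is available.

```shell
# Hypothetical wrapper: send each argument as one message by writing it
# as a line on the console producer's standard input.
# PRODUCER_CMD is an illustrative override for dry runs, not a Kafka setting.
produce_messages() {
  printf '%s\n' "$@" | ${PRODUCER_CMD:-bin/kafka-console-producer.sh --broker-list localhost:9092 --topic text_topic}
}

# Usage (not run here), from the Kafka home directory with the broker running:
#   produce_messages "First kafka example" "My Message"
```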

Run Kafka Consumer

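Run the console consumer script that comes with the Kafka distribution. With --from-beginning it prints every message already stored in the topic and then keeps waiting for new ones.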


bin/kafka-console-consumer.sh \
--bootstrap-server localhost:9092 --topic text_topic --from-beginning

First kafka example
My Message
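
The consumer above keeps running until it is interrupted with Ctrl+C. For a scripted check it is often more convenient to read a fixed number of messages and exit. The sketch below assumes the console consumer's `--max-messages` and `--timeout-ms` options; `consume_n` is a hypothetical wrapper, and `CONSUMER_CMD` is an override invented here only for dry runs (for example with `echo`).

```shell
# Hypothetical wrapper: read at most N messages from the start of the topic,
# then exit instead of waiting for new messages.
# CONSUMER_CMD is an illustrative override for dry runs, not a Kafka setting.
consume_n() {
  count=$1
  ${CONSUMER_CMD:-bin/kafka-console-consumer.sh} \
    --bootstrap-server localhost:9092 --topic text_topic \
    --from-beginning --max-messages "$count" --timeout-ms 10000
}

# Usage (not run here), from the Kafka home directory with the broker running:
#   consume_n 2
```

The 10-second timeout is an arbitrary choice so that the command also exits when the topic holds fewer messages than requested.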

Conclusion:

You now have Apache Kafka running on your Ubuntu server. You can create Kafka producers and consumers using Kafka clients, which are available for most programming languages. The next step below links to an example in the Scala programming language.

Next Steps:

Kafka consumer producer

NNK
