  • Post category: Hadoop
  • Post last modified: March 27, 2024

When your DataNodes fail to start with a java.io.IOException: Incompatible clusterIDs error, it means you formatted the NameNode without first deleting the files from the DataNodes.

java.io.IOException: Incompatible clusterIDs in /tmp/hadoop-ubuntu/dfs/data: namenode clusterID = CID-7dc253be-a1e4-4bf6-b051-9f495185c892; datanode clusterID = CID-90f3ade0-0287-45be-a1db-e94cf5b3147d
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:736)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:294)

You get this error when the cluster ID of the NameNode and the cluster ID of the DataNode are different. You can find the NameNode's cluster ID in the <dfs.namenode.name.dir>/current/VERSION file and the DataNode's cluster ID in the <dfs.datanode.data.dir>/current/VERSION file.
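You can compare the two values directly on the command line. The sketch below mocks the two VERSION files (using the clusterIDs from the error above) so it runs anywhere; on a real cluster, point the paths at your configured dfs.namenode.name.dir and dfs.datanode.data.dir directories instead.

```shell
# Mock VERSION files for illustration; on a real cluster read
# <dfs.namenode.name.dir>/current/VERSION and
# <dfs.datanode.data.dir>/current/VERSION instead.
mkdir -p /tmp/demo/name/current /tmp/demo/data/current
echo 'clusterID=CID-7dc253be-a1e4-4bf6-b051-9f495185c892' > /tmp/demo/name/current/VERSION
echo 'clusterID=CID-90f3ade0-0287-45be-a1db-e94cf5b3147d' > /tmp/demo/data/current/VERSION

# Extract the clusterID value from each file and compare.
nn_id=$(sed -n 's/^clusterID=//p' /tmp/demo/name/current/VERSION)
dn_id=$(sed -n 's/^clusterID=//p' /tmp/demo/data/current/VERSION)

if [ "$nn_id" != "$dn_id" ]; then
  echo "Incompatible clusterIDs: namenode=$nn_id datanode=$dn_id"
fi
```

If the two IDs differ, the DataNode will refuse to start with exactly the exception shown above.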

Precaution:

Before formatting the NameNode, delete the files under the <dfs.datanode.data.dir>/ directories on all DataNodes.

Solutions:

Below are two solutions to fix this; use the one that suits your needs.

Solution 1 => If you have valid data on the cluster and do not want to delete it, copy the clusterID from the NameNode's VERSION file and paste it into the DataNode's VERSION file.
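Solution 1 can also be scripted. The sketch below uses mock VERSION files with placeholder IDs so it is runnable as-is; on a real cluster, substitute the paths under your dfs.namenode.name.dir and dfs.datanode.data.dir (note that sed -i assumes GNU sed, as found on most Linux distributions).

```shell
# Mock layout; the real files live under <dfs.namenode.name.dir>/current
# and <dfs.datanode.data.dir>/current on the respective nodes.
mkdir -p /tmp/demo1/name/current /tmp/demo1/data/current
echo 'clusterID=CID-namenode-demo' > /tmp/demo1/name/current/VERSION
echo 'clusterID=CID-datanode-demo' > /tmp/demo1/data/current/VERSION

# Read the NameNode's clusterID and write it into the DataNode's VERSION file.
nn_id=$(sed -n 's/^clusterID=//p' /tmp/demo1/name/current/VERSION)
sed -i "s/^clusterID=.*/clusterID=$nn_id/" /tmp/demo1/data/current/VERSION

# The DataNode's VERSION file now carries the NameNode's clusterID.
grep '^clusterID=' /tmp/demo1/data/current/VERSION
```

Repeat the edit on every DataNode that reports the mismatch, then restart the DataNode service.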

Solution 2 => Delete all files from <dfs.datanode.data.dir> on the DataNodes and from <dfs.namenode.name.dir> on the NameNode, then format the NameNode using the command below.

hdfs namenode -format
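The deletion step of Solution 2 can be sketched as below. The directories here are mocked so the snippet runs standalone; on a real cluster they would be your configured dfs.datanode.data.dir and dfs.namenode.name.dir paths, and you would stop HDFS first and run hdfs namenode -format afterwards.

```shell
# Mock storage dirs standing in for <dfs.datanode.data.dir> and
# <dfs.namenode.name.dir>; replace with your configured paths.
DATA_DIR=/tmp/demo2/dfs/data
NAME_DIR=/tmp/demo2/dfs/name
mkdir -p "$DATA_DIR/current" "$NAME_DIR/current"   # simulate existing state

# Wipe both storage directories before reformatting the NameNode.
rm -rf "$DATA_DIR"/* "$NAME_DIR"/*

ls -A "$DATA_DIR" "$NAME_DIR"   # both directories are now empty
```

After formatting, a fresh clusterID is generated, and the emptied DataNode directories will adopt it when the DataNodes next register.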

If this resolves your issue, please leave us a comment. It would be helpful for others.


Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive, and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen @ LinkedIn and Medium

This Post Has 6 Comments

  1. prab

    Thanks for the solution

  2. Rakendu

    Thanks! This saved me! I went for solution 2. I just deleted the datanode and namenode folders under the nodes folder. Thank you!

  3. Anonymous

    You saved me, thank you!

  4. Green Vetal

    Yes it helped for sure :)

  5. NNK

    Glad it helped you and thanks for the comment.

  6. Anonymous

    Hey it worked for me!

Comments are closed.