Hadoop Count Command - Returns HDFS File Size and File Counts


Post author:Prabha
Post category:Apache Hadoop
Post last modified:May 9, 2024

The Hadoop HDFS count option reports the number of directories, the number of files, and the total content size in bytes under a path. Below is a quick look at how to use the count command.

Hadoop fs -count Option

The hadoop fs shell option count returns the number of directories, the number of files, and the number of bytes under the paths that match the specified file pattern.

hadoop fs -count gives the following information. Alternatively, you can use hdfs dfs -count.

Directory count
File count
Content size
Path name


$ hadoop fs -count [-q] [-u] [-t] [-h] [-v] [-x] [-e] /hdfs-file-path
or
$ hdfs dfs -count [-q] [-u] [-t] [-h] [-v] [-x] [-e] /hdfs-file-path

Options:

HDFS Count Options	Description
-q	Shows quotas. Output columns: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME
-u	Limits the output to quotas and usage only. Output columns: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, PATHNAME
-t	Shows the quota and usage for each storage type: “all”, “ram_disk”, “SSD”, “disk” or “archive”. Ignored unless -q or -u is also given.
-h	Shows sizes in a human-readable format.
-v	Displays a header line.
-x	Excludes snapshots from the result calculation. Ignored if -q or -u is given.
-e	Shows the erasure coding policy for each file. Output columns: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, ERASURECODING_POLICY, PATHNAME

Hadoop HDFS File count Options

Hadoop fs -count Options Examples

The examples below show how to use hadoop fs -count with each option.

Example 1: Shows Quotas

A quota is a hard limit on the number of names (files and directories) and on the amount of space used under an individual directory.


$ hadoop fs -count -q /hdfs-file-path
or
$ hdfs dfs -count -q /hdfs-file-path
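As a rough sketch of how the -q columns can be used, the snippet below computes how much of a name quota is consumed. It assumes the column order QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME, and that a directory without a quota prints none and inf in the quota columns. The function name quota_usage and both sample lines are made up for illustration; they are not output from a real cluster.

```shell
# Summarize name-quota usage from `hadoop fs -count -q` output read on stdin.
# Assumed columns: QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA
#                  DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
# Note: paths containing spaces are not handled by this sketch.
quota_usage() {
  awk '{
    if ($1 == "none") { printf "%s: no name quota set\n", $8; next }
    used = $1 - $2
    printf "%s: %d of %d names used (%.1f%%)\n", $8, used, $1, 100 * used / $1
  }'
}

# Hypothetical sample lines, not real cluster output.
echo '1000 706 1073741824 391433422 62 232 227436134 /user' | quota_usage
echo 'none inf none inf 62 232 227436134 /user' | quota_usage
```

On a live cluster the same function could be fed directly, for example hadoop fs -count -q /hdfs-file-path | quota_usage.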

Example 2: Limits the Output to Show Quotas and Usage only


$ hadoop fs -count -u /hdfs-file-path
or
$ hdfs dfs -count -u /hdfs-file-path

Example 3: Shows the Quota and Usage for Each Storage Type

-t shows the quota and usage for each storage type. It takes effect only together with -q or -u; on its own it is ignored.


$ hadoop fs -count -q -t /hdfs-file-path
or
$ hdfs dfs -count -q -t /hdfs-file-path

Example 4: Shows Sizes in a Human-Readable Format

-h shows sizes in a human-readable format (M for megabytes, G for gigabytes, etc.) instead of raw bytes.


$ hadoop fs -count -h /user
   62          232            216.9 M /user
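To get a feel for what -h does to the CONTENT_SIZE column, here is a minimal sketch that converts a raw byte count into a similar form. It assumes binary units (1 K = 1024 bytes) and one decimal place, and it is not guaranteed to match Hadoop's formatting in every case. The helper name human_size is hypothetical, and 227436134 is an illustrative byte count chosen to land on the 216.9 M shown above, not the measured size of /user.

```shell
# Convert a byte count to a human-readable size using 1024-based units.
# Sketch only: Hadoop's own -h formatting may differ in edge cases.
human_size() {
  awk -v b="$1" 'BEGIN {
    split("K M G T P", unit, " ")
    i = 0
    # Divide down until the value fits the current unit.
    while (b >= 1024 && i < 5) { b /= 1024; i++ }
    if (i == 0) printf "%d\n", b
    else printf "%.1f %s\n", b, unit[i]
  }'
}

human_size 52          # prints: 52
human_size 227436134   # prints: 216.9 M
```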

Example 5: Displays Header Line for command output

-v displays a header line that names each column (DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME).


$ hadoop fs -count -v /tmp/data.txt
   DIR_COUNT   FILE_COUNT       CONTENT_SIZE PATHNAME
           0            1                 52 /tmp/data.txt
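Because -v labels the columns, the output is easy to turn into key=value pairs for scripts. The sketch below does that with awk, using the sample output above as input. The function name count_to_kv is hypothetical, and the approach assumes whitespace-separated columns, so path names containing spaces would need extra handling.

```shell
# Pair each header name from `hadoop fs -count -v` with the value beneath it.
# Reads the header line first, then emits one key=value line per data row.
count_to_kv() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
       { for (i = 1; i <= NF; i++)
           printf "%s%s=%s", (i > 1 ? " " : ""), name[i], $i
         print "" }'
}

# Sample taken from the example output above.
sample_output='   DIR_COUNT   FILE_COUNT       CONTENT_SIZE PATHNAME
           0            1                 52 /tmp/data.txt'

printf '%s\n' "$sample_output" | count_to_kv
# prints: DIR_COUNT=0 FILE_COUNT=1 CONTENT_SIZE=52 PATHNAME=/tmp/data.txt
```

On a live cluster, the equivalent would be hadoop fs -count -v /tmp/data.txt | count_to_kv.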

Example 6: Excludes Snapshots from the Result Calculation

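-x excludes snapshots from the result. Without -x (the default), the result is always calculated from all INodes, including all snapshots under the given path. -x is ignored if -q or -u is given.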


$ hadoop fs -count -x /hdfs-file-path
or
$ hdfs dfs -count -x /hdfs-file-path

Example 7: Shows the Erasure Coding Policy

-e adds an ERASURECODING_POLICY column. A path with no erasure coding policy set shows Replicated, as in the output below.


$ hadoop fs -count -e /tmp/data.txt
           0            1                 52 Replicated /tmp/data.txt
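One possible use of the -e output is to list the paths that still rely on plain replication. The sketch below filters on the fourth column, assuming the column order DIR_COUNT, FILE_COUNT, CONTENT_SIZE, ERASURECODING_POLICY, PATHNAME and path names without spaces. The function name replicated_paths is hypothetical; the input line is the sample output shown above.

```shell
# Print the path of every `hadoop fs -count -e` row whose
# ERASURECODING_POLICY column (field 4) reads "Replicated".
replicated_paths() {
  awk '$4 == "Replicated" { print $5 }'
}

echo '           0            1                 52 Replicated /tmp/data.txt' | replicated_paths
# prints: /tmp/data.txt
```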

Tags: HDFS Count Command