The Hadoop HDFS count option reports the number of directories, the number of files, and the total content size in bytes under a path. Below is a quick example of how to use the count command.
$ hadoop fs -count /hdfs-file-path
or
$ hdfs dfs -count /hdfs-file-path
For example, the command hadoop fs -count /tmp/data.txt returns 0 1 52 /tmp/data.txt (0 directories, 1 file, and 52 bytes of content in data.txt). Running -count on a directory works the same way.
The /data directory contains 2 files, so hadoop fs -count /data returns 1 2 775 /data (1 directory, 2 files, and 775 bytes across the 2 files). If the directory has subdirectories, the counts include all files and directories within them as well.
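As a rough sketch of how the default output can be consumed in a script, the snippet below labels the columns of one count line. The helper name label_count is hypothetical, and the sample line is the /data result described above, so the snippet runs without a cluster.

```shell
#!/bin/sh
# Label the DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME columns of
# plain `-count` output. `label_count` is a hypothetical helper name.
label_count() {
    # read splits on whitespace; the last variable keeps the rest of the
    # line, so a path that contains spaces stays intact.
    while read -r dirs files bytes path; do
        printf 'path=%s dirs=%s files=%s bytes=%s\n' "$path" "$dirs" "$files" "$bytes"
    done
}

# Sample line based on the /data example. On a cluster you would pipe the
# real command instead:  hadoop fs -count /data | label_count
printf '%s\n' '           1            2                775 /data' | label_count
```

Reading the columns with read rather than cutting on fixed positions keeps the sketch independent of how wide the command pads each column.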
Now let’s check other count options.
Hadoop fs -count Option
The hadoop fs shell option -count returns the number of directories, the number of files, and the number of bytes under the paths that match the specified file pattern. Alternatively, you can use hdfs dfs -count.
By default, hadoop fs -count gives the following information:
- Directory count
- File count
- Content size
- Path name
$ hadoop fs -count [-q] [-u] [-t] [-h] [-v] [-x] [-e] /hdfs-file-path
or
$ hdfs dfs -count [-q] [-u] [-t] [-h] [-v] [-x] [-e] /hdfs-file-path
Options:
HDFS Count Options | Description |
---|---|
-q | Shows quotas. Output columns: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME |
-u | Limits the output to show quotas and usage only. Output columns: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, PATHNAME |
-t | Shows the quota and usage for each storage type. Ignored unless -q or -u is also given. Possible parameters: “all”, “ram_disk”, “ssd”, “disk” or “archive”. |
-h | Shows sizes in a human-readable format. |
-v | Displays a header line. |
-x | Excludes snapshots from the result calculation. Ignored if -q or -u is given. |
-e | Shows the erasure coding policy for each file. Output columns: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, ERASURECODING_POLICY, PATHNAME |
Hadoop fs -count Options Examples
Below are examples of how to use hadoop fs -count with several options.
Example 1: Shows Quotas
A quota is a hard limit on the number of names and the amount of space used for an individual directory. The -q option adds the quota columns in front of the usual counts.
$ hadoop fs -count -q /hdfs-file-path
or
$ hdfs dfs -count -q /hdfs-file-path
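The -q output has eight columns (QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME), which are easy to misread without a header. The following is a minimal sketch that labels them. The helper name label_quota and the sample line are assumptions made for illustration. The sample imitates a directory with no quotas set, where the quota columns are expected to read none and inf.

```shell
#!/bin/sh
# Label the eight columns of `-count -q` output.
# `label_quota` is a hypothetical helper name.
label_quota() {
    while read -r quota rem_quota space_quota rem_space dirs files bytes path; do
        printf 'path=%s\n' "$path"
        printf '  name quota:  %s (remaining %s)\n' "$quota" "$rem_quota"
        printf '  space quota: %s (remaining %s)\n' "$space_quota" "$rem_space"
        printf '  dirs=%s files=%s bytes=%s\n' "$dirs" "$files" "$bytes"
    done
}

# Hypothetical sample line (not captured from a real cluster). On a cluster:
#   hadoop fs -count -q /hdfs-file-path | label_quota
printf '%s\n' 'none inf none inf 1 2 775 /data' | label_quota
```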
Example 2: Limits the Output to Show Quotas and Usage only
$ hadoop fs -count -u /hdfs-file-path
or
$ hdfs dfs -count -u /hdfs-file-path
Example 3: Shows the Quota and Usage for Each Storage Type
-t shows the quota and usage for each storage type. It takes effect only together with -q or -u, and is ignored otherwise.
$ hadoop fs -count -q -t /hdfs-file-path
or
$ hdfs dfs -count -q -t /hdfs-file-path
Example 4: Shows Sizes in a Human-Readable Format
-h shows file sizes in a human-readable format (K for kilobytes, M for megabytes, G for gigabytes, etc.).
$ hadoop fs -count -h /user
62 232 216.9 M /user
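If a script already holds the raw byte count (from a run without -h), a similar conversion can be done locally. This is only a sketch: human_size is a hypothetical helper, it assumes 1024-based units, and its rounding approximates the style of -h rather than reproducing Hadoop's own formatter.

```shell
#!/bin/sh
# Convert a byte count to a short binary-prefixed size (K, M, G, T).
# `human_size` is a hypothetical helper; its rounding only approximates
# what `-h` prints.
human_size() {
    awk -v n="$1" 'BEGIN {
        split("K M G T", unit, " ")
        i = 0
        # Divide by 1024 until the value fits the current unit.
        while (n >= 1024 && i < 4) { n /= 1024; i++ }
        if (i == 0) printf "%d\n", n
        else printf "%.1f %s\n", n, unit[i]
    }'
}

human_size 52        # below 1024, printed as plain bytes
human_size 1572864   # 1.5 * 1024 * 1024 bytes
```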
Example 5: Displays Header Line for Command Output
-v displays a header line that names each column (DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME).
$ hadoop fs -count -v /tmp/data.txt
DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
0 1 52 /tmp/data.txt
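Because -v prints the column names, a script can pair each name with its value instead of hard-coding column positions. Here is one possible sketch. The helper name pair_with_header is hypothetical, and the sketch assumes the path contains no spaces.

```shell
#!/bin/sh
# Pair each header name printed by -v with the value beneath it.
# `pair_with_header` is a hypothetical helper; it assumes that the path
# contains no whitespace.
pair_with_header() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
         { for (i = 1; i <= NF; i++) printf "%s=%s\n", name[i], $i }'
}

# Input copied from the -v example. On a cluster:
#   hadoop fs -count -v /tmp/data.txt | pair_with_header
printf '%s\n' 'DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME' \
              '0 1 52 /tmp/data.txt' | pair_with_header
```

The same helper should also work when -v is combined with other options such as -q or -e, since it takes the names from whatever header line it receives.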
Example 6: Excludes Snapshots from the Result Calculation
-x excludes snapshots from the result. Without -x (the default), the result is always calculated from all INodes, including all snapshots under the given path.
$ hadoop fs -count -x /hdfs-file-path
or
$ hdfs dfs -count -x /hdfs-file-path
Example 7: Shows the Erasure Coding Policy
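-e adds a column with the erasure coding policy. In the output below, data.txt uses plain replication rather than erasure coding, so the policy column shows Replicated.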
$ hadoop fs -count -e /tmp/data.txt
0 1 52 Replicated /tmp/data.txt
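When -e is run against several paths, the policy column can be used as a filter. The snippet below is a sketch under assumptions: paths_with_policy is a hypothetical helper, and the second sample line (an RS-6-3-1024k file under /tmp/ec) is made up for illustration.

```shell
#!/bin/sh
# Print the paths whose ERASURECODING_POLICY column equals the given policy.
# `paths_with_policy` is a hypothetical helper name.
paths_with_policy() {
    policy=$1
    while read -r dirs files bytes ec path; do
        if [ "$ec" = "$policy" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# First line is from the -e example; the second line is hypothetical.
# On a cluster:  hadoop fs -count -e /tmp/* | paths_with_policy Replicated
printf '%s\n' '0 1 52 Replicated /tmp/data.txt' \
              '0 1 1048576 RS-6-3-1024k /tmp/ec/part-0' | paths_with_policy Replicated
```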