The Hadoop HDFS count option is used to report the number of directories, the number of files, and the total content size in bytes under a given path. Below is a quick example of how to use the count command.
$ hadoop fs -count /hdfs-file-path
or
$ hdfs dfs -count /hdfs-file-path
For example, the command hadoop fs -count /tmp/data.txt returns 0 1 52 (0 – directory count, 1 – file count, 52 – content size in bytes of data.txt). The example below demonstrates using -count on a directory.
The /data directory contains 2 files, so the command returns 1 2 775 (1 – directory count, 2 – file count, 775 – total bytes in the 2 files). If the directory has subdirectories, this command includes the counts of all files within those subdirectories as well.
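To build intuition for what the three columns mean, here is a minimal local-filesystem sketch that computes the same directory count, file count, and content size that -count reports for an HDFS path (the /tmp/count-demo path and file contents are made up for illustration):

```shell
# Create a small demo tree (hypothetical path, 2 files totaling 21 bytes)
demo=/tmp/count-demo
mkdir -p "$demo/sub"
printf 'hello world' > "$demo/a.txt"      # 11 bytes
printf '0123456789' > "$demo/sub/b.txt"   # 10 bytes

# Compute the same three columns that hadoop fs -count prints
dir_count=$(find "$demo" -type d | wc -l)                     # directories, including $demo itself
file_count=$(find "$demo" -type f | wc -l)                    # files
content_size=$(find "$demo" -type f -exec cat {} + | wc -c)   # total bytes across all files
echo "$dir_count $file_count $content_size $demo"
# prints: 2 2 21 /tmp/count-demo
```

Note that, as in the /data example above, the directory count includes the path itself, which is why a directory holding only files still reports a directory count of 1.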
Related: Hadoop HDFS Commands with Examples
Now let’s check other count options.
Hadoop fs -count Option
The hadoop fs shell count option returns the number of directories, the number of files, and the number of bytes under the paths that match the specified file pattern.
hadoop fs -count (or, alternatively, hdfs dfs -count) gives the following information:
- Directory count
- File count
- Content size (bytes)
$ hadoop fs -count [-q] [-u] [-t] [-h] [-v] [-x] [-e] /hdfs-file-path
or
$ hdfs dfs -count [-q] [-u] [-t] [-h] [-v] [-x] [-e] /hdfs-file-path
| HDFS Count Option | Description |
| --- | --- |
| -q | Shows quotas: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME |
| -u | Limits the output to show quotas and usage only: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, PATHNAME |
| -t | Shows the quota and usage for each storage type: all, ram_disk, ssd, disk, or archive. Ignored unless -q or -u is also given. |
| -h | Shows sizes in a human-readable format. |
| -v | Displays a header line. |
| -x | Excludes snapshots from the result calculation. |
| -e | Shows the erasure coding policy for each path: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, ERASURECODING_POLICY, PATHNAME |
Hadoop fs -count Options Examples
Below are examples of how to use hadoop fs -count with its various options.
Example 1: Shows Quotas
A quota is a hard limit on the number of names and the amount of space used for an individual directory; administrators set these limits with hdfs dfsadmin -setQuota and hdfs dfsadmin -setSpaceQuota.
$ hadoop fs -count -q /hdfs-file-path
or
$ hdfs dfs -count -q /hdfs-file-path
Example 2: Limits the Output to Show Quotas and Usage only
$ hadoop fs -count -u /hdfs-file-path
or
$ hdfs dfs -count -u /hdfs-file-path
Example 3: Shows the Quota and Usage for Each Storage Type
-t shows the quota and usage for each storage type. Note that -t is ignored unless -q or -u is also specified.
$ hadoop fs -count -t /hdfs-file-path
or
$ hdfs dfs -count -t /hdfs-file-path
Example 4: Shows Sizes in a Human-Readable Format
-h shows file sizes in a human-readable format (M for megabytes, G for gigabytes, etc.).
$ hadoop fs -count -h /user
62 232 216.9 M /user
Example 5: Displays Header Line for Command Output
-v displays a header line (DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME) above the output.
$ hadoop fs -count -v /tmp/data.txt
DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
0 1 52 /tmp/data.txt
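If you need to consume -count output in a script, the whitespace-separated columns can be split positionally. A small sketch, assuming the sample line 0 1 52 /tmp/data.txt has already been captured from the command above:

```shell
# Hypothetical captured output line from: hadoop fs -count /tmp/data.txt
line='0 1 52 /tmp/data.txt'

# Split the whitespace-separated columns into named variables
set -- $line
dirs=$1; files=$2; bytes=$3; path=$4
echo "dirs=$dirs files=$files bytes=$bytes path=$path"
# prints: dirs=0 files=1 bytes=52 path=/tmp/data.txt
```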
Example 6: Excludes Snapshots from the Result Calculation
-x excludes snapshots from the result. Without this option, the result is always calculated from all INodes, including all snapshots under the given path.
$ hadoop fs -count -x /hdfs-file-path
or
$ hdfs dfs -count -x /hdfs-file-path
Example 7: Shows the Erasure Coding Policy
-e shows the erasure coding policy (for example, Replicated) for each path.
$ hadoop fs -count -e /tmp/data.txt
0 1 52 Replicated /tmp/data.txt