Hadoop fs -du Command
The du command is used to get the HDFS file and directory size. The size reported is the base size of the file or directory before replication. It shows the amount of space, in bytes, used by the files that match the specified file pattern.
The hadoop fs -du command displays the sizes of the files contained in the given directory, or the size of a single file if the path points to a file.
$ hadoop fs -du [-s] [-h] [-v] [-x] URI [URI] /HDFS-Directory
|-s||Instead of showing the size of each individual file that matches the pattern, show the total (summary) size.|
|-h||Used to format the sizes of the files in a human-readable manner rather than the number of bytes.|
|-v||Display the names of columns as a header line.|
|-x||Exclude snapshots from the result calculation.|
Related: Hadoop HDFS Commands with Examples
Hadoop fs -du Command Examples
Below are examples of how to get file and directory sizes using the hadoop fs -du and hdfs dfs -du commands with several options.
$ hadoop fs -du /tmp/
52 52 /tmp/data.txt
0 0 /tmp/export
0 0 /tmp/export_csv
283279596 2476986504 /tmp/hadoop-yarn
224 224 /tmp/hive
In the above example, the data.txt file contains 52 bytes of data, hence its size shows as 52.
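The relationship between the two columns can be illustrated with a minimal Python sketch: the second column is the raw size multiplied by the file's HDFS replication factor. The `disk_space_consumed` helper below is hypothetical, for illustration only, and is not part of any Hadoop API.

```python
# Illustrative sketch: in `hadoop fs -du` output, the second column is the
# raw file size multiplied by the HDFS replication factor of that file.
# `disk_space_consumed` is a hypothetical helper, not a Hadoop API.
def disk_space_consumed(raw_size_bytes: int, replication_factor: int) -> int:
    """Disk space consumed across all replicas (the second du column)."""
    return raw_size_bytes * replication_factor

# /tmp/data.txt above shows 52 in both columns, i.e. a replication factor of 1.
print(disk_space_consumed(52, 1))         # 52
# With the common default replication factor of 3, a 1 GB file
# consumes 3 GB of raw disk space across the cluster.
print(disk_space_consumed(1024 ** 3, 3))  # 3221225472
```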
Example 1: Show the total (summary) size
$ hadoop fs -du -s /tmp/
283279872 2476986780 /tmp
Example 2: Show file sizes in a human-readable format
The first column shows the raw size of the files that users have placed in the various HDFS directories. The second column shows the disk space consumed by those files across all of their replicas in HDFS.
$ hadoop fs -du -h /tmp/
52 52 /tmp/data.txt
0 0 /tmp/export
0 0 /tmp/export_csv
270.2 M 2.3 G /tmp/hadoop-yarn <====== Shown in mega- and gigabytes
224 224 /tmp/hive
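The scaling that -h applies can be approximated with a short Python sketch. This is illustrative only, assuming simple base-1024 scaling; Hadoop's own rounding may differ slightly in edge cases.

```python
def human_readable(num_bytes: float) -> str:
    """Roughly mimic the -h flag: scale byte counts to K/M/G/T units.

    Illustrative sketch assuming base-1024 units and one decimal place,
    matching the sample output above; not Hadoop's actual formatter.
    """
    for unit in ("", " K", " M", " G", " T"):
        if num_bytes < 1024:
            # Hadoop prints plain byte counts without a unit suffix
            return f"{num_bytes:.1f}{unit}" if unit else str(int(num_bytes))
        num_bytes /= 1024
    return f"{num_bytes:.1f} P"

print(human_readable(52))          # 52
print(human_readable(283279596))   # 270.2 M
print(human_readable(2476986504))  # 2.3 G
```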
Example 3: Display the names of columns as a header line
Displays a header line on the output. The header includes SIZE, DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS, and FULL_PATH_NAME.
$ hadoop fs -du -v /tmp/
SIZE DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS FULL_PATH_NAME
52 52 /tmp/data.txt
0 0 /tmp/export
0 0 /tmp/export_csv
283279596 2476986504 /tmp/hadoop-yarn
224 224 /tmp/hive
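Because -v names each column in a header line, the output is easy to post-process. A small Python sketch, using sample text copied from the output above:

```python
# Parse `hadoop fs -du -v` output into dictionaries keyed by the header
# columns. The sample text mirrors the command output shown above.
sample = """\
SIZE DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS FULL_PATH_NAME
52 52 /tmp/data.txt
283279596 2476986504 /tmp/hadoop-yarn"""

lines = sample.splitlines()
header = lines[0].split()

rows = []
for line in lines[1:]:
    # Split into at most three fields so paths with spaces stay intact
    size, consumed, path = line.split(None, 2)
    rows.append(dict(zip(header, (int(size), int(consumed), path))))

print(rows[1]["DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS"])  # 2476986504
```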
Example 4: Exclude snapshots from the result calculation
$ hadoop fs -du [-x] URI [URI] /HDFS-Directory