The Hadoop -du command is used to get the size of HDFS files and directories. The size is the base size of the file or directory before replication, reported in bytes for each file that matches the specified file pattern.
Hadoop fs -du Command
The hadoop fs -du command displays the sizes of the files and directories contained in the given directory, or the size of the file if the path is just a file.
$ hadoop fs -du [-s] [-h] [-v] [-x] URI [URI ...]
or
$ hdfs dfs -du [-s] [-h] [-v] [-x] URI [URI ...]
Options:
DU Options | Description |
---|---|
-s | Rather than showing the size of each individual file that matches the pattern, show the total (summary) size. |
-h | Used to format the sizes of the files in a human-readable manner rather than the number of bytes. |
-v | Display the names of columns as a header line. |
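-x | Exclude snapshots from the result calculation. Without -x (the default), the result is calculated from all INodes, including all snapshots under the given path. |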
Hadoop fs -du Command Examples
Below are examples of how to get file and directory sizes using the hadoop fs -du and hdfs dfs -du commands with several options.
$ hadoop fs -du /tmp/
52 52 /tmp/data.txt
0 0 /tmp/export
0 0 /tmp/export_csv
283279596 2476986504 /tmp/hadoop-yarn
224 224 /tmp/hive
In the above example, the data.txt file contains 52 single-byte characters, so its size is shown as 52 bytes.
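Dividing the second column by the first gives a rough picture of how much the stored data is multiplied on disk. The following is a minimal sketch, assuming the three-column output format shown in the listing above and paths without spaces; `du_ratio` is a hypothetical helper name, not part of Hadoop.

```shell
# Hypothetical helper: for each line of plain (non -h) `hadoop fs -du` output,
# print the path followed by column 2 divided by column 1.
# Entries whose size is 0 are skipped to avoid dividing by zero.
du_ratio() {
  awk '$1 > 0 { printf "%s %.2f\n", $3, $2 / $1 }'
}

# Sample input copied from the listing above. On a cluster you would
# pipe the real command output into du_ratio instead.
du_ratio <<'EOF'
52 52 /tmp/data.txt
0 0 /tmp/export
0 0 /tmp/export_csv
283279596 2476986504 /tmp/hadoop-yarn
224 224 /tmp/hive
EOF
# Prints:
# /tmp/data.txt 1.00
# /tmp/hadoop-yarn 8.74
# /tmp/hive 1.00
```

For a directory, the ratio is an average over everything below it, so it does not have to be a whole number.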
Example 1: Show the total (summary) size
$ hadoop fs -du -s /tmp/
283279872 2476986780 /tmp
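The summary line is the column-wise sum of the per-entry listing shown earlier for /tmp/. As a quick cross-check, the sketch below adds up both columns with awk; `du_total` is a hypothetical helper name, and the input is the earlier listing pasted in rather than live command output.

```shell
# Hypothetical helper: sum the first two columns of plain `hadoop fs -du` output.
# %.0f is used instead of %d so that totals above 2^31 print correctly
# in every awk implementation.
du_total() {
  awk '{ size += $1; disk += $2 } END { printf "%.0f %.0f\n", size, disk }'
}

du_total <<'EOF'
52 52 /tmp/data.txt
0 0 /tmp/export
0 0 /tmp/export_csv
283279596 2476986504 /tmp/hadoop-yarn
224 224 /tmp/hive
EOF
# Prints: 283279872 2476986780
```

Both numbers match the -s output for /tmp above.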
Example 2: Show file sizes in a human-readable format
The first column shows the raw size of the files that users have placed in the HDFS directories. The second column shows the space those files actually consume in HDFS, including all replicas.
$ hadoop fs -du -h /tmp/
52 52 /tmp/data.txt
0 0 /tmp/export
0 0 /tmp/export_csv
270.2 M 2.3 G /tmp/hadoop-yarn <====== Shows in Mega & Giga bytes
224 224 /tmp/hive
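The human-readable values are the byte counts divided by powers of 1024 and rounded to one decimal place. The sketch below reproduces that arithmetic for the numbers in this example; `to_human` is a hypothetical helper, and it only approximates Hadoop's own formatting, so rounding at unit boundaries may differ.

```shell
# Hypothetical helper: convert a byte count to a short human-readable string
# using binary (1024-based) units. Values below 1024 are printed unchanged.
to_human() {
  awk -v b="$1" 'BEGIN {
    b = b + 0                          # force numeric interpretation
    split("K M G T P", unit, " ")
    if (b < 1024) { printf "%d\n", b; exit }
    i = 0
    while (b >= 1024 && i < 5) { b /= 1024; i++ }
    printf "%.1f %s\n", b, unit[i]
  }'
}

to_human 283279596    # prints: 270.2 M
to_human 2476986504   # prints: 2.3 G
to_human 52           # prints: 52
```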
Example 3: Display the names of columns as a header line
With -v, the output starts with a header line naming the columns: SIZE, DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS, and FULL_PATH_NAME.
$ hadoop fs -du -v /tmp/
SIZE DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS FULL_PATH_NAME
52 52 /tmp/data.txt
0 0 /tmp/export
0 0 /tmp/export_csv
283279596 2476986504 /tmp/hadoop-yarn
224 224 /tmp/hive
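When the -v output is fed into other tools, the header line has to be skipped first. The following sketch drops the header and lists the largest entries by the SIZE column; `du_largest` is a hypothetical helper, and it assumes plain byte counts (no -h) and paths without spaces.

```shell
# Hypothetical helper: read `hadoop fs -du -v` output, drop the header line,
# sort by the SIZE column in descending numeric order, and keep the top N
# entries (default 3).
du_largest() {
  awk 'NR > 1' | sort -k1,1nr | head -n "${1:-3}"
}

# Sample input copied from the listing above.
du_largest 2 <<'EOF'
SIZE DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS FULL_PATH_NAME
52 52 /tmp/data.txt
0 0 /tmp/export
0 0 /tmp/export_csv
283279596 2476986504 /tmp/hadoop-yarn
224 224 /tmp/hive
EOF
# Prints:
# 283279596 2476986504 /tmp/hadoop-yarn
# 224 224 /tmp/hive
```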
Example 4: Exclude snapshots from the result calculation
$ hadoop fs -du -x URI [URI ...]
or
$ hdfs dfs -du -x URI [URI ...]