Hadoop FS – How to List Files in HDFS


Hadoop FS provides several file system commands for interacting with the Hadoop Distributed File System (HDFS). Among these, the ls (list) command displays the files and directories in HDFS, along with their permissions, user, group, size, and other details.

To use the -ls command on Hadoop, run either hadoop fs -ls or hdfs dfs -ls; both return the same results.

The hadoop fs -ls command lets you view the files and directories in your HDFS file system, much as the ls command does on Linux, OS X, and other Unix-like systems.
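Because each entry is printed as whitespace-separated columns, the listing is easy to post-process in a script. The sketch below is only an illustration: the sample listing, its owner, group, and paths are made up, and describe_listing is a hypothetical helper name. It assumes the usual column order (permissions, replication, owner, group, size, date, time, path) and paths without spaces.

```shell
# Hypothetical sample of `hadoop fs -ls` output; owners, sizes, and paths are invented.
sample_listing='Found 2 items
drwxr-xr-x   - alice hadoop          0 2023-01-15 10:30 /user/alice/logs
-rw-r--r--   3 alice hadoop       1048 2023-01-16 08:05 /user/alice/data.csv'

# Print owner, group, and size for each entry read on stdin.
# Assumed columns: 1 permissions, 2 replication, 3 owner, 4 group,
# 5 size, 6 date, 7 time, 8 path. Paths containing spaces would be split.
describe_listing() {
  # NF >= 8 skips the "Found N items" header line.
  awk 'NF >= 8 { print $8 ": owner=" $3 ", group=" $4 ", size=" $5 " bytes" }'
}

printf '%s\n' "$sample_listing" | describe_listing
```

On a real cluster the same helper could be fed directly, for example hadoop fs -ls /some/path | describe_listing.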

Hadoop fs -ls Command 

The hadoop fs -ls command defaults to /user/username (the user's home directory), so you can omit the path to view the contents of your home directory.

The hadoop fs -ls command accepts the following options:


$ hadoop fs -ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] <args>
or
$ hdfs dfs -ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] <args>

Options:

HDFS ls Option    Description
-C                Display the paths of files and directories only.
-d                List directories as plain files.
-h                Format file sizes in a human-readable fashion rather than as a number of bytes.
-q                Print ? instead of non-printable characters.
-R                Recursively list the contents of directories.
-t                Sort files by modification time (most recent first).
-S                Sort files by size.
-r                Reverse the order of the sort.
-u                Use the time of last access instead of modification for display and sorting.
-e                Display the erasure coding policy of files and directories.

Hadoop fs -ls options

Hadoop fs -ls Command Examples

Below are examples of how to use the hadoop fs -ls command with several options.

Example 1: Display the Paths of Files and Directories

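The example below prints only the paths of the files and directories under the given path, without permissions, owner, size, or other details. Note that the option is an uppercase -C.


$ hadoop fs -ls -C directory_name
or
$ hdfs dfs -ls -C directory_name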
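Path-only output is convenient for scripting because each line is just a path. As a rough sketch, the helper below filters such a listing by file extension. The name paths_with_extension and the sample paths are hypothetical, and the printf call merely stands in for a real hadoop fs -ls -C call so the snippet runs without a cluster.

```shell
# Keep only paths that end in the given extension.
# Reads one path per line on stdin (the shape of `-ls -C` output).
paths_with_extension() {
  grep -- "\\.$1\$"
}

# Hypothetical path-only listing, standing in for: hadoop fs -ls -C /user/alice
printf '%s\n' /user/alice/data.csv /user/alice/logs /user/alice/notes.txt |
  paths_with_extension csv
```

Against a real cluster, the pipeline would look like hadoop fs -ls -C directory_name | paths_with_extension csv.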

Example 2: List Directories as Plain Files

-d: Directories are listed as plain files, so the command shows the entry for the directory itself rather than its contents.


$ hadoop fs -ls -d directory_name
or
$ hdfs dfs -ls -d directory_name

Example 3: Format File Sizes in a Human-Readable Fashion


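The -h option formats file sizes in a human-readable fashion rather than as a raw number of bytes.


$ hadoop fs -ls -h directory_name
or
$ hdfs dfs -ls -h directory_name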

Example 4: Print ? Instead of Non-Printable Characters

Add the -q option to hdfs dfs -ls to print ? in place of non-printable characters in file and directory names.


$ hadoop fs -ls -q directory_name
or
$ hdfs dfs -ls -q directory_name

Example 5: Recursively List the Contents of Directories

Use the -R option (uppercase) to display the files and subdirectories inside a directory recursively. The lowercase -r is a different option that reverses the sort order.


$ hadoop fs -ls -R directory_name
or
$ hdfs dfs -ls -R directory_name
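A recursive listing mixes directories and files, so a common follow-up is to keep only the files and total their sizes. The snippet below is a sketch under assumptions: the sample listing is invented, files_only and total_bytes are hypothetical helper names, and the size is assumed to be the fifth column.

```shell
# Hypothetical recursive listing, standing in for: hadoop fs -ls -R /user/alice
sample_recursive='drwxr-xr-x   - alice hadoop          0 2023-01-15 10:30 /user/alice/logs
-rw-r--r--   3 alice hadoop        512 2023-01-15 10:31 /user/alice/logs/app.log
-rw-r--r--   3 alice hadoop       1048 2023-01-16 08:05 /user/alice/data.csv'

# Keep regular files only: directory lines start with "d", file lines with "-".
files_only() {
  grep '^-'
}

# Sum the size column (field 5) of the lines read on stdin.
total_bytes() {
  awk '{ sum += $5 } END { print sum + 0 }'
}

printf '%s\n' "$sample_recursive" | files_only | total_bytes
```

With a cluster available, the equivalent would be hadoop fs -ls -R directory_name | files_only | total_bytes.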

Example 6: Sort Files by Modification Time

With the -t option, the listing shows files and directories ordered by modification time, with the most recently modified entries first.


$ hadoop fs -ls -t directory_name
or
$ hdfs dfs -ls -t directory_name
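Since the most recently modified entry comes first, picking the newest item reduces to reading the first entry line. The following is a minimal sketch: first_listed_path is a hypothetical helper, the sample lines are invented, and the path is assumed to be the eighth column with no spaces in it.

```shell
# Print the path (field 8) of the first entry line,
# skipping the "Found N items" header, then stop reading.
first_listed_path() {
  awk 'NF >= 8 { print $8; exit }'
}

# Hypothetical output standing in for: hadoop fs -ls -t /user/alice
printf '%s\n' \
  'Found 2 items' \
  '-rw-r--r--   3 alice hadoop       1048 2023-01-16 08:05 /user/alice/data.csv' \
  'drwxr-xr-x   - alice hadoop          0 2023-01-15 10:30 /user/alice/logs' |
  first_listed_path
```

In real use the helper would sit at the end of hadoop fs -ls -t directory_name | first_listed_path.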

Example 7: Sort Files by Size

The -S option (uppercase) sorts the listing by file size. It can be combined with -h to show the sizes in a human-readable format, and with -r to reverse the order.


$ hadoop fs -ls -S directory_name
or
$ hdfs dfs -ls -S directory_name

Example 8: Reverse the Order of the Sort

The -r option reverses the order of the sort. It is typically combined with a sorting option such as -t or -S.


$ hadoop fs -ls -r directory_name
or
$ hdfs dfs -ls -r directory_name
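The sorting options can be combined in one call, for example -S with -r and -h. The wrapper below is a sketch rather than part of Hadoop: list_by_size and the HADOOP_BIN variable are hypothetical names introduced here, and setting HADOOP_BIN=echo gives a dry run that prints the arguments instead of contacting a cluster.

```shell
# HADOOP_BIN is a hypothetical override used only in this sketch;
# it defaults to the real hadoop command.
HADOOP_BIN="${HADOOP_BIN:-hadoop}"

# List paths sorted by size with human-readable sizes.
# Pass --reverse as the first argument to flip the order that -S produces.
list_by_size() {
  if [ "$1" = "--reverse" ]; then
    shift
    "$HADOOP_BIN" fs -ls -S -r -h "$@"
  else
    "$HADOOP_BIN" fs -ls -S -h "$@"
  fi
}

# Dry run: echo prints the command arguments that would be passed to hadoop.
HADOOP_BIN=echo
list_by_size /user/alice
list_by_size --reverse /user/alice
```

Leaving HADOOP_BIN unset runs the real command, so list_by_size --reverse directory_name would execute hadoop fs -ls -S -r -h directory_name.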

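Example 9: Use Access Time Instead of Modification Time for Display and Sorting

With the -u option, the listing uses the time of last access instead of the modification time for display and sorting.


$ hadoop fs -ls -u directory_name
or
$ hdfs dfs -ls -u directory_name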

Example 10: Display the Erasure Coding Policy of Files and Directories

The -e option displays the erasure coding (EC) policy of files and directories. The policies themselves are set on directories with the hdfs ec command and its options.


$ hadoop fs -ls -e directory_name
or
$ hdfs dfs -ls -e directory_name
