Wednesday, 4 November 2015

Hadoop : HDFS User Commands

Introduction

In this blogpost I'll describe in more detail some of the HDFS commands that you can use to interact with the HDFS filesystem. For this blogpost I've used the Cloudera QuickStart virtual machine, which is freely downloadable from the Cloudera site.

List files in HDFS

The command to list files in HDFS is ls. The command is pretty simple: it lists the files and directories under the path you give it. It looks like this:

hdfs dfs -ls /

This returns the directory listing of the root folder.
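
Two variations on the listing command come in handy quite often. The sketch below wraps them in small shell functions. The function names are my own invention for illustration; -h (human-readable sizes) and -R (recursive listing) are regular options of hdfs dfs -ls.

```shell
# Hypothetical helper names; only the hdfs dfs -ls options are real.

# List a directory with human-readable file sizes.
hdfs_ls_human() {
  hdfs dfs -ls -h "$1"
}

# List a directory and everything below it.
hdfs_ls_recursive() {
  hdfs dfs -ls -R "$1"
}

# Usage (needs a working HDFS client):
#   hdfs_ls_human /
#   hdfs_ls_recursive /user
```

Be careful with the recursive variant on a large cluster, because it walks the whole tree below the given path.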


Make a directory

With mkdir it's possible to make a directory in HDFS. It works much like mkdir in Linux. Below is an example of the command, followed by ls to check the result.

hdfs dfs -mkdir /user/test       
hdfs dfs -ls /user

The result is a listing of /user that now includes the new directory /user/test.
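
If you run mkdir from a script, it helps to check first whether the directory is already there. Below is a minimal sketch of that idea. The function name and the path /user/test/input are assumptions for illustration only; -test -d and -mkdir -p are existing hdfs dfs options.

```shell
# Hypothetical helper: create an HDFS directory only if it does not exist yet.
# -test -d returns exit code 0 when the path is a directory;
# -mkdir -p also creates missing parent directories.
ensure_hdfs_dir() {
  if hdfs dfs -test -d "$1"; then
    echo "already exists: $1"
  else
    hdfs dfs -mkdir -p "$1" && echo "created: $1"
  fi
}

# Usage (needs a working HDFS client):
#   ensure_hdfs_dir /user/test/input
```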


Create a random local file

dd is a simple but powerful tool that copies data from a source to a destination, block by block. Here it reads from /dev/urandom to create a local file filled with random data:

dd if=/dev/urandom of=sample.txt bs=64M count=16

This creates a 1 GB file named sample.txt (16 blocks of 64 MB each).
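
The size follows directly from the dd arguments: block size times block count. The sketch below does the arithmetic and then repeats the command at a smaller scale, so you can check the result without writing a full gigabyte. The file name sample_small.bin is an assumption used only for this check.

```shell
# Total size = block size * block count.
bs_mib=64
count=16
total_mib=$((bs_mib * count))
echo "expected size: ${total_mib} MiB"    # 64 * 16 = 1024 MiB = 1 GiB

# Scaled-down run: 16 blocks of 64 KiB instead of 64 MiB.
# sample_small.bin is a hypothetical file name used only for this check.
dd if=/dev/urandom of=sample_small.bin bs=65536 count=16 2>/dev/null
actual_bytes=$(wc -c < sample_small.bin | tr -d ' ')
echo "small file: ${actual_bytes} bytes"  # 65536 * 16 = 1048576
rm -f sample_small.bin
```

One thing to watch for: on recent Linux kernels a single read from /dev/urandom can return less than the requested 64 MB, in which case dd writes short blocks and the file ends up smaller than expected. With GNU dd, adding iflag=fullblock avoids this. Checking the result with ls -lh sample.txt is a good habit either way.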


Put the file in HDFS

With the put command we can copy the local file to the Hadoop filesystem.

hdfs dfs -put sample.txt /user/test

When we list the folder /user/test (hdfs dfs -ls /user/test), we'll see that the file has been put into this folder.
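
To confirm the upload from a script instead of by eye, you can combine put with an existence test. This is only a sketch: the function name is hypothetical, while -put and -test -e are existing hdfs dfs options.

```shell
# Hypothetical helper: upload a local file and confirm it arrived.
# Arguments: local file, HDFS target directory.
hdfs_put_checked() {
  local_file=$1
  target_dir=$2
  target_path="$target_dir/$(basename "$local_file")"

  hdfs dfs -put "$local_file" "$target_dir" || return 1

  # -test -e returns exit code 0 when the path exists in HDFS.
  if hdfs dfs -test -e "$target_path"; then
    echo "uploaded: $target_path"
  else
    echo "missing after put: $target_path" >&2
    return 1
  fi
}

# Usage (needs a working HDFS client):
#   hdfs_put_checked sample.txt /user/test
```

Note that put refuses to overwrite a file that already exists at the destination. Adding -f to the put command overwrites it.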


FSCK command

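The system utility fsck stands for "File System Consistency Check". The HDFS version checks the health of files in HDFS and reports problems such as missing or under-replicated blocks. Unlike the fsck of a native filesystem, it does not repair the errors it finds. Let's try this: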

hdfs fsck /user/test/sample.txt

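This prints a report for the file, including its total size, the number of blocks, their replication, and an overall status such as HEALTHY.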
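
You can work out ahead of time how many blocks fsck should report for a file. The sketch below assumes a block size of 128 MB (the usual default for dfs.blocksize; your cluster may be configured differently), and the function name is hypothetical.

```shell
# Hypothetical helper: number of HDFS blocks needed for a file.
# Arguments: file size in bytes, block size in bytes.
hdfs_block_count() {
  size=$1
  block=$2
  # Integer ceiling division: a partial last block still counts as a block.
  echo $(( (size + block - 1) / block ))
}

file_bytes=$((1024 * 1024 * 1024))    # the 1 GB sample file
block_bytes=$((128 * 1024 * 1024))    # assumed dfs.blocksize of 128 MB
hdfs_block_count "$file_bytes" "$block_bytes"    # prints 8

# To see the actual blocks and the datanodes holding them:
#   hdfs fsck /user/test/sample.txt -files -blocks -locations
```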



HDFS administrator commands

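The dfsadmin command is meant for HDFS administrators. With the -report option it prints basic filesystem information, such as the configured capacity and the space used, followed by the status of each datanode:

hdfs dfsadmin -report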


Conclusion

This is a small introduction to some of the HDFS commands that you can use in your day-to-day work with Hadoop.

Greetz,
Hennie