Managing Hadoop Compatible Storage

GlusterFS provides compatibility for Apache Hadoop and it uses the standard file system APIs available in Hadoop to provide a new storage option for Hadoop deployments. Existing MapReduce based applications can use GlusterFS seamlessly. This new functionality opens up data within Hadoop deployments to any file-based or object-based application.

Advantages

The following are the advantages of Hadoop Compatible Storage with GlusterFS:

Provides simultaneous file-based and object-based access within Hadoop.
Eliminates the centralized metadata server.
Provides compatibility with MapReduce applications and rewrite is not required.
Provides a fault tolerant file system.

Pre-requisites

The following are the pre-requisites to install Hadoop Compatible Storage :

Java Runtime Environment
getfattr - command line utility

Installing, and Configuring Hadoop Compatible Storage

See the detailed instruction set at https://forge.gluster.org/hadoop/pages/ConfiguringHadoop2

Resources

Apache Hadoop project home
Community Q&A for GlusterFS Betas and Hadoop
Download GlusterFS 3.3 with the Hadoop connector
GlusterFS 3.3 Beta Resource Page
Download GlusterFileSystem (the hadoop plugin) : http://ec2-54-243-59-213.compute-1.amazonaws.com/archiva/browse/org.apache.hadoop.fs.glusterfs/glusterfs-hadoop