HBase Setup Guide

A not-so-quick guide to setting up HDFS-based HBase on a single-node Ubuntu box (pseudo-distributed mode).



Let's start with the installation of Hadoop. This guide is one of the best and works out of the box.
We will install HBase in a similar manner for the hduser user. Download a stable version of HBase from here.



    
    $ cd /usr/local
    $ sudo tar xzf hbase-0.90.4.tar.gz
    $ sudo mv hbase-0.90.4 hbase
    $ sudo chown -R hduser:hadoop hbase




Now we need to make some changes to HBase's config files to tell it about HDFS. Note the hbase.rootdir property: its value uses port 54310 because that is how we configured the Hadoop DFS in /usr/local/hadoop/conf/core-site.xml. Edit /usr/local/hbase/conf/hbase-site.xml to look like the following:



    <configuration>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://localhost:54310/hbase</value>
        <description>The directory shared by RegionServers.
        </description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>The replication count for HLog and HFile storage. Should not be greater
        than HDFS datanode count.
        </description>
    </property>
    </configuration>
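
A mismatch between the port in hbase.rootdir and the one Hadoop actually listens on is easy to miss. The following is a minimal sketch of a check, not part of the original setup: `hdfs_port` is a hypothetical helper, it assumes the `<value>` element sits on the line right after its `<name>` element (as in the file above), and the property name fs.default.name for core-site.xml is an assumption, since that file is not shown in this guide.

```shell
# hdfs_port FILE PROPERTY
# Prints the port from the hdfs:// URI stored under PROPERTY in the XML file FILE.
# Assumes <value> is on the line directly after <name>.
hdfs_port() {
  grep -A1 "<name>$2</name>" "$1" \
    | sed -n 's|.*<value>hdfs://[^:/]*:\([0-9][0-9]*\).*|\1|p'
}

# Usage (fs.default.name is an assumed property name for core-site.xml):
#   hbase_port=$(hdfs_port /usr/local/hbase/conf/hbase-site.xml hbase.rootdir)
#   hdfs_conf_port=$(hdfs_port /usr/local/hadoop/conf/core-site.xml fs.default.name)
#   [ "$hbase_port" = "$hdfs_conf_port" ] && echo "ports match"
```

If the two values differ, HBase will most likely fail to reach HDFS, so it is worth fixing hbase.rootdir before going further.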




We should also set the JAVA_HOME variable in /usr/local/hbase/conf/hbase-env.sh so that the line reads export JAVA_HOME=/usr/lib/jvm/java-6-sun


Important:
This step is really important: if you skip it, you will face errors during the startup of the HMaster or HQuorumPeer daemons (or both), which are critical for HBase. Because HBase in distributed mode runs on top of Hadoop, it depends on the Hadoop libraries, which is why you will find a hadoop-core-x.xx-append-xxxx.jar file in /usr/local/hbase/lib/. The bundled jar has to match the Hadoop version the cluster is actually running, so you should replace it with Hadoop's own jar file.




    $ cd /usr/local/hbase/lib
    $ cp /usr/local/hadoop/hadoop-0.20.2-core.jar .
    $ chmod +wx hadoop-0.20.2-core.jar
    $ mv hadoop-core-x.xx-append-xxxx.jar hadoop.backup
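
Since the exact name of the bundled jar varies between HBase releases, the same swap can be sketched with a glob instead of a hard-coded file name. This is only an illustration under assumptions: `swap_hadoop_jar` is a hypothetical helper, and it appends a .backup suffix rather than using the name hadoop.backup from the commands above, so that several matching jars would not overwrite each other.

```shell
# swap_hadoop_jar LIB_DIR NEW_JAR
# Renames every bundled hadoop-core-*.jar in LIB_DIR to <name>.backup,
# then copies NEW_JAR into LIB_DIR.
swap_hadoop_jar() {
  lib_dir=$1
  new_jar=$2
  [ -f "$new_jar" ] || { echo "missing: $new_jar" >&2; return 1; }
  for jar in "$lib_dir"/hadoop-core-*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    mv "$jar" "$jar.backup"     # no longer ends in .jar
  done
  cp "$new_jar" "$lib_dir"/
}

# Usage with the paths from this guide:
#   swap_hadoop_jar /usr/local/hbase/lib /usr/local/hadoop/hadoop-0.20.2-core.jar
```

Hadoop's own jar is named hadoop-0.20.2-core.jar, which the glob does not match, so running the helper a second time leaves the copied jar alone.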



At this point we can add the HBASE_HOME variable to hduser's .bashrc file and append HBase's bin directory to the PATH environment variable. The hduser's .bashrc file should look like the following:



    # Set Hadoop-related environment variables
    export HADOOP_HOME=/usr/local/hadoop

    # Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
    export JAVA_HOME=/usr/lib/jvm/java-6-sun

    # Set HBase home
    export HBASE_HOME=/usr/local/hbase

    # Add Hadoop and HBase bin/ directories to PATH
    export PATH=$PATH:$HADOOP_HOME/bin:$HBASE_HOME/bin
    ..............
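
After re-reading .bashrc (for example by opening a new terminal), it can be worth confirming that both bin directories really ended up on the PATH. A small sketch, where `path_contains` is a hypothetical helper and not something Hadoop or HBase ships:

```shell
# path_contains DIR [PATH_STRING]
# Succeeds if DIR is one of the colon-separated entries of PATH_STRING
# (defaults to the current $PATH).
path_contains() {
  case ":${2:-$PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage with the directories from this guide:
#   path_contains /usr/local/hadoop/bin || echo "Hadoop bin/ is not on PATH"
#   path_contains /usr/local/hbase/bin  || echo "HBase bin/ is not on PATH"
```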



Let's start our single-node Hadoop cluster and then HBase. To check that everything is up and running, we will use the jps tool.


    $ start-all.sh
    starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-namenode-ubuntu.out
    .........
    $ start-hbase.sh
    localhost: starting zookeeper, logging to /usr/local/hbase/logs/hbase-hduser-zookeeper-jj-ubuntu.out
    ......
    $ jps
    23143 Jps
    22985 HRegionServer
    22817 HMaster
    22767 HQuorumPeer
    5750 SecondaryNameNode
    5399 NameNode
    5838 JobTracker
    5567 DataNode
    6006 TaskTracker
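
Scanning the jps listing by eye gets tedious after a few restarts. Here is a minimal sketch that reports which of the daemons from the listing above are absent; `missing_daemons` is a hypothetical helper, and the list of expected names is simply copied from that listing.

```shell
# Reads `jps` output on stdin and prints each expected daemon that is absent.
missing_daemons() {
  jps_output=$(cat)
  for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker \
                HQuorumPeer HMaster HRegionServer; do
    # Compare the second column exactly, so that "SecondaryNameNode"
    # does not count as "NameNode".
    printf '%s\n' "$jps_output" \
      | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
      || echo "$daemon"
  done
}

# Usage:
#   jps | missing_daemons
```

No output means all eight daemons are running. If HMaster or HQuorumPeer shows up here, the jar replacement step earlier in this guide is the first thing to re-check.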


Now that everything is in place, let's start the HBase shell and create a table.


    $ hbase shell

    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version 0.90.4, r1150278, Sun Jul 24 15:53:29 PDT 2011

    hbase(main):001:0> create 'test','cf'
    0 row(s) in 1.3890 seconds

    hbase(main):004:0> list
    TABLE
    test
    1 row(s) in 0.0120 seconds
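
To confirm that the new table also accepts reads and writes, a few more shell commands can be run against it. The sketch below only generates the commands: `smoke_test_commands` is a hypothetical helper, and the row key row1, the column qualifier a, and the value value1 are made up for illustration. Piping the result into hbase shell assumes the shell reads commands from standard input; typing the same three lines at the hbase(main) prompt works as well.

```shell
# smoke_test_commands TABLE FAMILY
# Prints HBase shell commands that write one cell, read it back, and scan the table.
smoke_test_commands() {
  table=$1
  family=$2
  cat <<EOF
put '${table}', 'row1', '${family}:a', 'value1'
get '${table}', 'row1'
scan '${table}'
EOF
}

# Usage with the table created above:
#   smoke_test_commands test cf | hbase shell
```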


Thanks for your patience in going through this long guide. For suggestions and queries, please leave a comment.

EDIT: If you are done with the single-node setup, take a look at the multiple-node setup guide here.

