HBase: My Study Note
I’ve never used Apache HBase before, and now would be a great time to give it a try. I am also pretty new to Docker, so I figured, why not try them together?
Running HBase (via Docker)
> docker pull oddpoet/hbase-cdh5
> docker run -p 2181:2181 \
-p 65000:65000 \
-p 65010:65010 \
-p 65020:65020 \
-p 65030:65030 \
-h $(hostname) -d oddpoet/hbase-cdh5
Now if we open http://localhost:65010/ and http://localhost:65030/ in a browser, we should see the master and region server status pages.
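If the status pages do not load, it can help to check whether the published ports are reachable at all. The following is a minimal sketch of such a check, not part of the original setup: the class name PortCheck and the one-second timeout are my own assumptions, and the port list is simply copied from the docker run command above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: reports whether the ports published by the container accept TCP connections.
class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, host unreachable, or timed out.
            return false;
        }
    }

    public static void main(String[] args) {
        // Ports taken from the "docker run -p ..." command; 2181 is the ZooKeeper client port.
        int[] ports = {2181, 65000, 65010, 65020, 65030};
        for (int port : ports) {
            String state = isOpen("localhost", port, 1000) ? "open" : "closed";
            System.out.println(port + " -> " + state);
        }
    }
}
```

An open port only means something is listening there. It does not prove that HBase has finished starting, so the status pages remain the better health check.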
Shell Access
The next step is to use the interactive shell:
> docker ps
CONTAINER ID IMAGE COMMAND
5a0ca2bf030e oddpoet/hbase-cdh5 "/bin/bash start.sh"
...
Note the “CONTAINER ID” value, and use it to attach an interactive shell to the running container:
> docker exec -i -t 5a0ca2bf030e /bin/bash
[root@Nings-MacBook-Pro /]#
Now we are inside the container that is running the HBase instance. If we run the command hbase shell now, we can interact with HBase. The following session creates a table, adds some data, and then displays and queries it:
(Please note that these commands can be pretty slow, to the point where they look like they are failing.)
[root@Nings-MacBook-Pro /]# hbase shell
hbase(main):001:0> create 'testTable', 'cf'
0 row(s) in 15.8690 seconds
=> Hbase::Table - testTable
hbase(main):002:0> put 'testTable', 'r1', 'cf:c1', 'v1'
0 row(s) in 0.1400 seconds
hbase(main):003:0> put 'testTable', 'r2', 'cf:c1', 'v2'
0 row(s) in 0.0180 seconds
hbase(main):004:0> scan 'testTable'
ROW COLUMN+CELL
r1 column=cf:c1, timestamp=1477295738212, value=v1
r2 column=cf:c1, timestamp=1477295800618, value=v2
2 row(s) in 0.0320 seconds
hbase(main):005:0> get 'testTable', 'r2'
COLUMN CELL
cf:c1 timestamp=1477295800618, value=v2
1 row(s) in 0.0160 seconds
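To make sense of what put, scan, and get are doing, I find it useful to picture a table as a sorted map from row key to a map of columns. The sketch below is a toy in-memory model under that assumption, not how HBase is actually implemented: the ToyTable class is invented for illustration, and it ignores timestamps and cell versions, regions, and persistence. It also sorts keys as Java strings, whereas HBase compares raw bytes (the two agree for plain ASCII keys like the ones used here).

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical toy model of an HBase table: row key -> ("family:qualifier" -> value).
class ToyTable {
    // Both levels are TreeMaps, so rows and columns are kept in sorted key order.
    private final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    /** Roughly: put 'table', row, column, value */
    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, k -> new TreeMap<>()).put(column, value);
    }

    /** Roughly: get 'table', row (returns an empty map when the row does not exist) */
    Map<String, String> get(String row) {
        NavigableMap<String, String> cells = rows.get(row);
        return cells == null ? new TreeMap<>() : cells;
    }

    /** Roughly: scan 'table' (rows come back in sorted row-key order) */
    Iterable<Map.Entry<String, NavigableMap<String, String>>> scan() {
        return rows.entrySet();
    }

    public static void main(String[] args) {
        ToyTable t = new ToyTable();
        // Inserted out of order on purpose; the scan still returns r1 before r2.
        t.put("r2", "cf:c1", "v2");
        t.put("r1", "cf:c1", "v1");
        for (Map.Entry<String, NavigableMap<String, String>> e : t.scan()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
        // prints:
        // r1 {cf:c1=v1}
        // r2 {cf:c1=v2}
        System.out.println(t.get("r2").get("cf:c1")); // prints: v2
    }
}
```

The sorted row keys are why a scan in the shell session lists r1 before r2 regardless of the order in which the rows were written.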
Java Client
My go-to language is Java, so this study is not complete without a very simple working Java client.
We will be using the official hbase-client library:
(pom.xml)
<dependencies>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>1.2.3</version>
</dependency>
</dependencies>
and the Java code:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class TestApp {
    public static void main(String[] args) throws IOException {
        // Uses hbase-site.xml from the classpath if present; the defaults point at ZooKeeper on localhost:2181
        Configuration config = HBaseConfiguration.create();
        // try-with-resources closes both the table and the connection
        try (Connection connection = ConnectionFactory.createConnection(config);
             Table table = connection.getTable(TableName.valueOf("testTable"))) {
            // Fetch row "r2" and read the cell in column family "cf", qualifier "c1"
            Get g = new Get(Bytes.toBytes("r2"));
            Result r = table.get(g);
            byte[] value = r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("c1"));
            String valueStr = Bytes.toString(value);
            System.out.println("GET: " + valueStr);
        }
    }
}
And when we run this Java code, it prints the value “v2” that we added earlier:
GET: v2
So it works!