HBase is part of the Apache stable of NoSQL databases and is enjoying popularity as part of the Hadoop ecosystem. While HBase is a classic NoSQL database, the Apache Phoenix project provides a SQL layer for HBase, and this is how QuerySurge connects to HBase.
Setting up a Connection to HBase with Phoenix
Connecting to HBase with Phoenix is done using the Connection Extensibility feature of the QuerySurge Connection Wizard. Following are the details you'll need to set up your QuerySurge Connection to HBase with Phoenix.
- Make sure Phoenix is installed and running with your HBase service. Basic installation instructions are available here (look for the section: "Blah, blah, blah - I just want to get started!").
- Obtain the Phoenix JDBC driver from the installation tar. Both phoenix-client.jar and hbase-client.jar should be deployed to your Agent(s).
- Deploy the Phoenix JDBC driver jars to your Agent(s). The procedure for deploying a new driver to a QuerySurge Agent is here (for Agents on Windows) and here (for Agents on Linux).
- Log into QuerySurge as a QuerySurge Admin user, and navigate to the Admin view. Steps for using the Connection Extensibility feature can be found here. To use the Connection Extensibility option in the Connection Wizard with the Phoenix driver, you'll need the following information:
zk_quorum - a comma-separated list of the ZooKeeper servers.
zk_port - the ZooKeeper port (default: 2181).
zk_hbase_path - the path used by HBase to store information about the instance.
On a non-secure cluster the default
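As a rough illustration of how these three values fit together, the Phoenix JDBC driver accepts a URL of the form jdbc:phoenix:<quorum>:<port>:<root node>. The sketch below assembles such a URL in Java. The class name, method name, host names and the "/my-hbase-znode" path are placeholders invented for this example, not values from your cluster or from QuerySurge; substitute your own.

```java
import java.util.List;

public class PhoenixUrlSketch {

    // Joins the three Connection parameters into a Phoenix-style JDBC URL.
    // zkQuorum    - the ZooKeeper servers (zk_quorum)
    // zkPort      - the ZooKeeper port (zk_port)
    // zkHbasePath - the HBase path in ZooKeeper (zk_hbase_path)
    static String buildUrl(List<String> zkQuorum, int zkPort, String zkHbasePath) {
        return "jdbc:phoenix:" + String.join(",", zkQuorum) + ":" + zkPort + ":" + zkHbasePath;
    }

    public static void main(String[] args) {
        // Placeholder hosts and path, for illustration only.
        String url = buildUrl(
                List.of("zk1.example.com", "zk2.example.com"),
                2181,
                "/my-hbase-znode");
        System.out.println(url);
    }
}
```

Seeing the assembled URL can help when a Connection fails to validate: a stray space in the server list or a missing leading slash in the path is easier to spot in the full string than in the separate fields.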
When you've entered your information, the Connection Wizard will look similar to this:
If you have a Test Query for HBase, feel free to enter it to help verify that your Connection parameters are correct. It should be a standard query that returns a small amount of information - one row is enough.
- If you entered a Test Query, you can use the Test Connection button to test whether your Connection is set up properly. Once your driver is set up, you should be able to write SQL queries against your HBase data in QuerySurge.
The Phoenix SQL grammar, function calls and data types are described here, here and here.
If a Phoenix VARCHAR column is created with an explicit size, the Phoenix driver returns the configured column size in the query metadata, as expected. However, Phoenix also allows a VARCHAR column to be created with no defined size. In this case, the Phoenix driver returns a static size of 40 (bytes) in the query metadata regardless of the actual data size. This can easily cause a run to fail, because QuerySurge relies on the query metadata to handle the result set: if VARCHAR data larger than 40 bytes comes back, QuerySurge still expects it to be no larger than 40 bytes, and a fatal error occurs.
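To make the 40-byte limit concrete, here is a minimal Java sketch that checks whether a string value would exceed the size reported for an unsized VARCHAR column. It is not part of QuerySurge or Phoenix; the class and method names are hypothetical, and it assumes the size is measured in UTF-8 bytes, which may not match how your data is encoded.

```java
import java.nio.charset.StandardCharsets;

public class VarcharSizeCheck {

    // Size the Phoenix driver reports for a VARCHAR with no defined size,
    // as described above.
    static final int REPORTED_SIZE_BYTES = 40;

    // Returns true when the value is longer than the reported size.
    // Assumption: length is counted in UTF-8 bytes, not characters.
    static boolean exceedsReportedSize(String value) {
        if (value == null) {
            return false;
        }
        return value.getBytes(StandardCharsets.UTF_8).length > REPORTED_SIZE_BYTES;
    }

    public static void main(String[] args) {
        String shortValue = "a".repeat(40);   // exactly 40 bytes
        String longValue = "a".repeat(41);    // 41 bytes
        System.out.println(exceedsReportedSize(shortValue));
        System.out.println(exceedsReportedSize(longValue));
    }
}
```

Note that multi-byte characters count more than once, so a value with fewer than 40 characters can still exceed 40 bytes.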