Configuring QuerySurge Connections: Hadoop Hive
When you create a QuerySurge Connection, the Connection Wizard will guide you through the process. Different types of QuerySurge connections require different types of information.
For a Hive Connection, you will need the following information (check with a Hive administrator or other knowledgeable resource in your organization):
- Server Name or IP address of the Hive Server (e.g. myHive.mycompany.com, or 192.168.0.255)
- The Port for your Hive server (10000 is the default port)
- The Hive Database name
- Database login credentials (ID and Password)
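For orientation, the server, port, and database values above correspond to the parts of a HiveServer2-style JDBC URL. The sketch below assumes the conventional `jdbc:hive2://host:port/database` form accepted by the Apache Hive JDBC driver. The function name `hive_jdbc_url` and the sample values are illustrative only; the Connection Wizard collects these values itself, so this shows only how the pieces fit together.

```python
def hive_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    """Assemble a HiveServer2-style JDBC URL from the connection details.

    Assumption: the standard jdbc:hive2:// scheme; 10000 is the default
    Hive port mentioned above, and "default" is Hive's default database.
    """
    if not host:
        raise ValueError("host must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"jdbc:hive2://{host}:{port}/{database}"


# Example with the sample server name from the list above.
print(hive_jdbc_url("myHive.mycompany.com"))
# prints: jdbc:hive2://myHive.mycompany.com:10000/default
```

Login credentials are supplied separately in the wizard and are deliberately left out of the URL here.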
Note: Hive drivers are bundled in the QuerySurge installer, so you can install a Hive driver from the Installer. However, because the bundled driver version may not match your Hive version, we recommend copying the driver jar files from your Hive server to your QuerySurge Agent instead. Get the following JDBC driver jars for your Hive distro from your Hive server (asterisks stand for distribution-specific version strings):
hive-jdbc-***-standalone.jar
hadoop-common-***.jar
For HDP distros ('X's refer to version numbers):
/usr/hdp/X.X.X.X-XXXX/hive/lib/hive-jdbc-X.X.X-standalone.jar
/usr/hdp/X.X.X.X-XXXX/hadoop/client/hadoop-common-X.X.X.X.X.X.X-XXXX.jar
For CDH distros ('X's refer to version and distro numbers):
/usr/lib/hive/lib/hive-jdbc-X.X.X-cdhX.X.X-standalone.jar
/usr/lib/hadoop/hadoop-common-X.X.X-cdhX.X.X.jar
For other distros:
If your distro and version are different, you'll need to find paths and files corresponding to your version on your Hive server.
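If you are unsure where the jars live on your Hive server, a small search script can help. The following is a minimal sketch, assuming Python 3 is available on the server and that the file names follow the two patterns listed above. The function name `find_driver_jars` and the search root are hypothetical choices for illustration, not part of QuerySurge or Hive.

```python
from pathlib import Path
from typing import Dict, List

# File name patterns taken from the jar names listed earlier in this article.
JAR_PATTERNS = ("hive-jdbc-*-standalone.jar", "hadoop-common-*.jar")


def find_driver_jars(root: str) -> Dict[str, List[str]]:
    """Recursively search `root` and return the files matching each pattern.

    Note: the hadoop-common pattern is broad and may also match related
    jars (for example test jars), so review the results by hand.
    """
    base = Path(root)
    return {
        pattern: sorted(str(p) for p in base.rglob(pattern) if p.is_file())
        for pattern in JAR_PATTERNS
    }
```

For example, `find_driver_jars("/usr/hdp")` or `find_driver_jars("/usr/lib")` would narrow the search to the HDP or CDH locations shown above. Searching a large root such as `/` can take a while.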
Note: For instructions on deploying driver jar files to Agents, see the QuerySurge articles on driver deployment for Windows and for Linux.
Launch the Add Connection Wizard
- Log into QuerySurge as an Admin user.
- To configure a Connection, select Configuration > Connection in the Administrative View tree (at the left).
- Click on the Add button at the bottom left of the page to launch the Add Connection Wizard. Click Next.
Note: Check the Advanced Mode checkbox for access to advanced features.
- Provide a name for your connection. Select Hive as the Data Source. Click Next.
- Once you have selected your Data Source, the Wizard lists the information you are likely to need to create your Connection. When you have collected this information, click Next.
- Provide the connection information for Hive: the server name or IP address, the port (the default Hive port is populated automatically), the database name, and login credentials if required. Click Next.
Required fields for your Connection Type are marked with an asterisk (*).
- You can provide an optional Test Query if you want to test your Connection. A simple query such as "SELECT * FROM myhivetable LIMIT 1" is suggested. Click Next.
- If you entered a Test Query, you can click on Test Connection (A).
Note: You must have an Agent running with the driver for this Connection deployed in order to test the Connection.
- Save the Connection (B).
- Congratulations! You’ve created a QuerySurge Connection. Again, make sure that you have deployed the driver files for this connection to all your QuerySurge Agents.
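If Test Connection fails in the steps above, one thing worth ruling out is basic network reachability from the Agent machine to the Hive server. The following is a minimal sketch, assuming Python 3 is available on the Agent machine. The function name `can_reach` is hypothetical, and a successful TCP connection shows only that the port is open, not that the driver or credentials are correct.

```python
import socket


def can_reach(host: str, port: int = 10000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    10000 is the default Hive port mentioned in this article; pass your
    own port if your Hive server uses a different one.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `can_reach("myHive.mycompany.com")` from the Agent machine returns `False` when a firewall, a wrong port, or a stopped Hive service blocks the connection, which helps separate network problems from driver problems.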
Comments
Hi,
I set up the connection, but when I view the data after my QueryPair runs, the data from Hive (my Target) is shown as {clob}. I am fairly sure the data is being compared, because I had both pass and fail scenarios, but I am unable to review the data for the failures or in the Target data view.
Is this something you have ever come across?
Thanks,
Rajesh