Connecting QuerySurge to Azure Databricks
Azure Databricks is an increasingly popular business tool, and connecting it to QuerySurge is an effective way to improve data analytics. Like every QuerySurge connection, this one uses a JDBC driver. For this article, we use the JDBC driver offered by Databricks, which is available for download here.
Setting up a QuerySurge Connection with the Databricks JDBC Driver
Note: A Databricks JDBC driver needs to be deployed on all Agents that you plan to issue Databricks queries from. Instructions for deploying drivers can be found here (Windows) and here (Linux). The driver file name is SparkJDBC4x.jar, where '4x' specifies the JDBC version for the driver.
1. Log in to QuerySurge with a QuerySurge Admin login.
2. Navigate to the Administration page by clicking the Administration icon in the bottom toolbar.
3. Under the Administration Tree to the left, select Connections.
4. On the bottom left-hand side of the window, click Add.
6. The Connection Wizard will launch. Under Add Connection click Next.
Note: For advanced features, click the Advanced Mode checkbox.
7. Enter a name for your Connection in the Connection Name field. This is how your Connection will be listed in QuerySurge. Under Data Source, use the drop-down menu and scroll down to select * All Other JDBC Connections (Connections Extensibility). Click Next.
Note: Be sure to fill out all fields marked with an asterisk (*).
8. Provide the Driver Class for the JDBC driver (com.simba.spark.jdbc.Driver) in the Driver Class field. Click Next.
Note: Earlier versions of the driver used driver class names com.simba.spark.jdbc4.Driver or com.simba.spark.jdbc41.Driver. Check your driver documentation.
9. For the Connection URL, you'll need the proper URL for your Databricks instance. The URL template is as follows:
To build the URL, replace <server-hostname>, <port>, <http-path> and <personal-access-token> with the values specific to your environment.
The first three values (<server-hostname>, <port>, and <http-path>) can be found on the cluster detail page under the JDBC/ODBC tab in Azure (use the first URL option under "JDBC URL").
Note that the <personal-access-token> in the template URL must also be replaced with your Personal Access Token. Instructions for generating a new access token in Azure can be found here.
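The substitution described above can be sketched in a few lines of Python. This is only an illustration: the template string is an assumption based on the URL form Databricks documents for the Simba Spark JDBC driver, so compare the result against the "JDBC URL" shown on your cluster's JDBC/ODBC tab before relying on it. The helper names (build_databricks_url, unfilled_placeholders) and all example values are hypothetical.

```python
# Assumed URL template for the Simba Spark JDBC driver; verify it against
# the "JDBC URL" shown on the cluster's JDBC/ODBC tab.
URL_TEMPLATE = (
    "jdbc:spark://<server-hostname>:<port>/default;"
    "transportMode=http;ssl=1;httpPath=<http-path>;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>"
)

PLACEHOLDERS = (
    "<server-hostname>",
    "<port>",
    "<http-path>",
    "<personal-access-token>",
)


def build_databricks_url(server_hostname, port, http_path, personal_access_token):
    """Return the template with every placeholder replaced by the given value."""
    values = dict(
        zip(PLACEHOLDERS, (server_hostname, port, http_path, personal_access_token))
    )
    url = URL_TEMPLATE
    for placeholder, value in values.items():
        text = str(value).strip()
        if not text:
            # An empty value would silently produce a broken URL.
            raise ValueError(f"missing value for {placeholder}")
        url = url.replace(placeholder, text)
    return url


def unfilled_placeholders(url):
    """List the template placeholders that still appear in a URL."""
    return [p for p in PLACEHOLDERS if p in url]


if __name__ == "__main__":
    # All values below are made up for illustration.
    url = build_databricks_url(
        server_hostname="example.azuredatabricks.net",
        port=443,
        http_path="sql/protocolv1/o/0/0000-000000-example0",
        personal_access_token="dapiEXAMPLE",
    )
    print(url)
    print("Unfilled placeholders:", unfilled_placeholders(url))
```

Running unfilled_placeholders on a URL before pasting it into the Wizard is a quick way to catch a placeholder that was left in by mistake. Because the finished URL contains the access token, treat it like a password.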
Once you have substituted your access token in the URL, copy the URL into the Wizard.
Optionally, insert a Test Query to run in order to verify the connection details. Click Next.
10. You have the option of entering a test query to verify the Connection setup. A query that returns a small amount of information is adequate (for example, one column and one row). Click Test Connection. A status message will confirm whether or not the connection was successful.
11. Click Save to save the Connection in QuerySurge. The Connection can be saved whether or not you tested it, and whether or not the information you entered is correct. We recommend testing the Connection when you create it, so that there is no confusion about which Connections work and which do not. The Connection can now be used in a QueryPair.