Parquet is an open-source column-oriented file storage format. QuerySurge customers who use Parquet files in their data architectures can connect QuerySurge to their files for data testing purposes. In this article, we show the connection setup for the CData JDBC driver.
Note: RTTS, the vendor of QuerySurge, partners with CData to make a broad range of JDBC drivers available to QuerySurge users. For information about our partnership, click here. See all of CData's JDBC offerings here. For questions related to ordering, contact us here.
Adding a Parquet file Connection using the CData JDBC Driver
To connect to a Parquet file data source, you'll need to use the Connection Extensibility feature of the QuerySurge Connection Wizard.
- Download the Parquet JDBC driver from CData.
Note: You'll need to deploy a copy of the driver to each of your Agents that you plan to test Parquet files on.
- Deploy the Parquet JDBC driver to your Agent(s). Unzip the driver download and run the CData installer (setup.jar) on each Agent box where you intend to use the driver; instructions can be found in the readme.txt file. We recommend that you install the driver in the default location. Once installed, copy the driver file(s) and the license file from the installation directory to your QuerySurge Agent jdbc directory (see the relevant Knowledge Base article for deploying drivers to Agents on Windows or Agents on Linux).
Note: For this example the deployed files are: a) cdata.jdbc.parquet.jar and b) cdata.jdbc.parquet.lic
- Log in to QuerySurge as a QuerySurge Admin user, and navigate to the Admin view. Select Connections in the leftnav tree, and click the Add button (at the bottom left of the main panel). Leave the Advanced Mode checkbox unchecked. You'll need the following information for the next few steps.
Driver Class: cdata.jdbc.parquet.ParquetDriver
Connection URL template: jdbc:parquet:URI=C:/folder/yourfile.parquet;
Note: You'll need to provide a URL with the actual path to your Parquet file.
- Enter a Connection Name of your choice, and select All Other JDBC Connections (Connection Extensibility) from the Data Source dropdown menu. Click the Next button.
- Enter the Driver Class and click the Next button.
- Enter the Connection URL and an optional Test Query (recommended). A test query need only return one row and one column. Click Next.
- If you entered a Test Query, you can use the Test Connection button to test whether your Connection is set up properly. Once you have verified the connection details, save the Connection.
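The Connection URL used in the steps above follows the template jdbc:parquet:URI=<path>;. As a minimal sketch, the following Java helper fills in that template from a file path. The class and method names (ParquetConnectionUrl, build) are hypothetical and not part of QuerySurge or the CData driver; the only thing taken from this article is the URL template itself. Converting backslashes to forward slashes is an assumption made to match the template's style.

```java
import java.util.Objects;

// Hypothetical helper for illustration; not part of QuerySurge or the CData driver.
final class ParquetConnectionUrl {

    // Prefix taken from the Connection URL template in this article.
    private static final String PREFIX = "jdbc:parquet:URI=";

    // Builds a Connection URL for the given Parquet file path.
    static String build(String parquetPath) {
        Objects.requireNonNull(parquetPath, "parquetPath");
        String trimmed = parquetPath.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("Parquet file path must not be empty");
        }
        // Assumption: use forward slashes, as the article's template does.
        String normalized = trimmed.replace('\\', '/');
        return PREFIX + normalized + ";";
    }

    public static void main(String[] args) {
        // Prints: jdbc:parquet:URI=C:/folder/yourfile.parquet;
        System.out.println(build("C:\\folder\\yourfile.parquet"));
    }
}
```

The resulting string is what you would paste into the Connection URL field of the Connection Wizard.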
You're all set! You can now write and execute QueryPairs against your Parquet file of choice. The CData documentation discusses different driver configuration options, advanced settings, and supported SQL syntax and functions.
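If the Test Connection button reports a failure, it can help to check the driver and URL outside QuerySurge. The sketch below uses only the standard java.sql API. It assumes that cdata.jdbc.parquet.jar and its license file are on the classpath, and that you pass your own Connection URL and Test Query as arguments. The class name ParquetConnectionCheck and its methods are hypothetical, and which test queries the driver accepts is covered by the CData documentation, not by this example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical smoke test for illustration; not part of QuerySurge or the CData driver.
final class ParquetConnectionCheck {

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Opens a connection, runs the test query, and reports whether it returned a row.
    static boolean returnsAtLeastOneRow(String url, String testQuery) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(testQuery)) {
            return rs.next();
        }
    }

    public static void main(String[] args) throws SQLException {
        // Driver Class as given in this article.
        String driverClass = "cdata.jdbc.parquet.ParquetDriver";
        if (!isClassAvailable(driverClass)) {
            System.out.println("Driver class not found on the classpath: " + driverClass);
            return;
        }
        if (args.length < 2) {
            System.out.println("Usage: ParquetConnectionCheck <connection-url> <test-query>");
            return;
        }
        boolean ok = returnsAtLeastOneRow(args[0], args[1]);
        System.out.println(ok ? "Test query returned a row." : "Test query returned no rows.");
    }
}
```

A missing driver class usually points to a deployment problem (the jar is not where the program or Agent expects it), while an SQLException points to the Connection URL, the license file, or the test query.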