QuerySurge Technical Whitepaper No. 9
In rare instances, you may find the need to write SQL joins against two or more XML documents. An ETL process, for example, might use multiple XML source files to load a target database table. The setup in QuerySurge, however, can get a bit complicated as this will require multiple table definitions in a single schema file.
The QuerySurge Schema File
Your QuerySurge schema file is an XML document with a root tag of <schema> and (ordinarily) a single child tag of <table>. In situations that necessitate a SQL join, we must add one or more additional <table> tags to the schema file. Each tag, of course, represents a separate XML file that QuerySurge will validate, and therefore must have unique name, file, and path attributes.
In this example, we are using two XML files – customer-data.xml and order-data.xml. We are using a single schema file – customers-orders-schema.xml – to define the data files. The full schema file can be seen in the screenshot below:
In this example, the name attribute represents the table name you will use in your SQL statements; the file attribute is the file name of the XML data file; and the path attribute is the parent path to each record tag (i.e., each tag in the XML data file that QuerySurge will view as a row in the table).
The XML Data Files
Your XML data files, which QuerySurge will convert to database table, must match the path structure defined in the aforementioned schema file. The data types must also match what was specified in the schema file. Below are the customers and orders XML files that we will be using in this example:
The customer-data.xml data file:
The order-data.xml data file:
Joining the XML Files
Two different JOIN syntaxes are supported by the XML driver packaged with QuerySurge. You can include your JOIN statement in the WHERE clause, like so:
SELECT * FROM customers AS c, orders AS o
WHERE c.customer_id = o.customer_id
Alternatively, you can write your JOIN like this:
SELECT * FROM customers AS c JOIN orders AS o ON c.customer_id = o.customer_id
Below you can see the query editor and query results screenshots: