QuerySurge Support for multilevel XML Hierarchy
XML documents typically store data in nodes of the XML ‘tree’; often, the tree structure is simple, however for many complex business applications, there are multiple values that have complex interrelationships, and these relationships are represented in the XML hierarchy. In this example, we look at the setup for an XML of moderate complexity to show how highly complex XML trees are handled.
Setting up the XML Schema File
We’ll use the sample “rates” XML below to illustrate the setup. In the sample XML, the <rate> tag is the “major” tag; therefore each <rate> tag will be viewed as a row. Each child tag of <rate> will be a column. As is clear from Sample 1, the information is embedded in an XML hierarchy that puts the country name information in child tags of the <Country> tag, and puts the quotes in a child <ratequote> tag.
<?xml version="1.0" encoding="UTF-8"?>
Sample 1. An XML with multilevel hierarchical structure
The mapping information that the XML driver requires is stored in a “Schema” file, which is a separate, custom XML file that the driver reads to process your XML data file. The Schema XML for each “data” XML must be created by hand and deployed with the driver. You can create a Schema file using any text or XML editor. A Schema XML starts with a root <schema> tag, which contains a <table> tag that itself contains one or more child <column> tags. Each of these tags defines the specifics of the XML-to-table mapping. For our sample XML (above), the Schema file is:
<column name="Currency" path="Currency" type="Varchar" size="128"/>
<column name="YESTERDAY" path="ratequote/YESTERDAY" type="Decimal"/>
<column name="TODAY_" path="ratequote/TODAY_" type="Decimal"/>
Sample 2. Schema File illustrating XML mappings (see Sample 1)
There are several things to note about the Schema file. First, in the <table> tag, the name attribute designates the name you will use in your SQL to query your XML. The file attribute gives the location of the XML data file (if the Schema XML and the data XML are both in the same directory, no file path is required for the data XML). Finally, the path attribute of <table> tag (actually an XPath instruction) defines the tag that will be considered a “row” in the table view of the XML.
Each of the <column> tags defines a column in the table view of the XML, as the name implies. In this example, we will map all of the data-bearing XML nodes to columns. The name attribute of each <column> tag assigns the column a name that you can use in your SQL. The path attribute is an XPath instruction that locates the XML object being mapped to the specified column. The type attribute gives the data type (Varchar and Decimal types are shown above; standard type designations may be used, including other numeric types, date types, etc.). The size attribute gives the Varchar size of the column.
Note: If you omit a size attribute for a Varchar column, the driver will use its default size, which will make all your Varchar columns into CLOB types in QuerySurge (2 GB wide).
To define a node-to-column mapping, the Schema XML simply uses the XPath designation for the path attribute of the <column>. For example, consider the Country column:
<column name="Country" path="Country/name" type="Varchar" size="128"/>
By setting the path attribute to “Country/name”, we have identified the child <name> tag of the <Country> tag in the data XML with the column name Country.
Now, consider a deeper mapping, which we can illustrate with the PCTCHANGE column:
<column name="PCTCHANGE" path="ratequote/delta/PCTCHANGE" type="Decimal"/>
Here, we have set the path attribute to “ratequote/delta/PCTCHANGE”, which maps the PCTCHANGE tag of the <PCTCHANGE> tag in the data XML to the column PCTCHANGE.
For this example, save the Schema XML to a file curr-xchg-rate-schema_hier.xml and the data XML to a file curr-xchg-rate-data_hier.xml. Put both files in the same directory, and make a note of the directory path, which will be required to set up the XML Connection in QuerySurge.
Note that what we have done in this example is to use XPath statements to specify the components of a "column" for our query against the XML. That really is all that is involved - once you specify the columns that you are interested in via XPath, QuerySurge will "read" your XML according to your specification, and you're ready to write SQL against the XML.
Querying your XML with SQL
When you query your XML, QuerySurge uses your XPath instructions to read your XML as if it were a database table. As noted, you use the name attribute of the table tag for your table name in your SQL. A robust SQL syntax is available. A sample query for the example shown here might be:
from `curr-xchg-rates_hier` where YESTERDAY > 80.0
Sample XML Files
Sample XML and schema files for this article can be downloaded here.