With QuerySurge 10.0+, we offer additional support for Timestamp data with new options for Timestamps with Timezones. Timestamp with Timezone data is stored by QuerySurge as a string type in order to preserve the timezone part of the data as you have captured it. In this release, you have the option to configure your QueryPairs to treat this data as a string during analysis (the default) or as timestamp data, in which case equivalent times, regardless of timezone, will match.
In this article, we describe these options for handling timestamp data. To illustrate, we'll use the following timestamps:
Source Timestamp | Target Timestamp |
2007-04-15 15:10:01-05:00 | 2007-04-15 12:10:01-08:00 |
Clearly, if we compare these as strings, they will not match, and a QueryPair returning these values will fail. However, if we configure QuerySurge to convert these values to Timestamp objects during analysis, because each represents the same time (taking into account the specified timezone), a QueryPair returning this data would pass.
True Timestamp with Timezone Data
If incoming timestamp data is a true timestamp with timezone type in your Source or Target, and you are comfortable treating it via a string during QuerySurge's analysis, then leave your QueryPair on the default settings. However, if you want to treat the data as true timestamp with timezone data during analysis, you can configure this on your QueryPair (see below).
Timestamp with Timezone Data as Strings
Alternately, if your data is stored as a string type in your Source or Target, you have a couple of options to choose from. The first, and simplest, option, is to leave it as it is, using the default settings on your QueryPair. This means that QuerySurge will treat this timestamp with timezone data as it does any other string data in the analysis phase of your executions. Considering the sample data above, this would mean the Source and Target would not match, and the QueryPair returning this data would fail.
If, however, you want QuerySurge to treat your data as a true timestamp during analysis, you can configure your QueryPair to treat the data as a timestamp with timezone (as above). Once you have done this, after execution, your data will still be stored as a string type, so it will appear with its timezone preserved, but it will be compared as a timestamp with timezone type. In the sample data above, the Source timestamp (in EST) and Target timestamp (in PST) would match, and the QueryPair returning this data would pass.
Note: At release time, QuerySurge support two formats for "timestamp-with-timezone-as-string" processing: a) the ISO 8601 standard or b) the related space-delimited format. If your timestamp-as-string data is in a different format, you will need to reformat it in your SQL.
Supported Timestamp with Timezone Formats | |
ISO 8601 | Space-delimited "ISO 8601" |
2007-04-15T15:10:01-05:00 | 2007-04-15 15:10:01-05:00 |
2007-04-15T15:10:01+01:00 | 2007-04-15 15:10:01+01:00 |
Specifying a Timstamp with Timezone Column in a QueryPair
To identify one or more columns to be treated a a timestamp with timezone in the comparison, open a QueryPair with timestamp with timezone columns and navigate to the Properties tab. Expand the Timestamp Columns fieldset. For the example below, we have the following columns: ID, SOURCE_ORDER_ID, and ORDER_DT.
The Timestamp Column Origin indicates whether the column name is being parsed from the Source of Target query. Whichever origin you choose, the column you specify and the column in the equivalent ordinal position in the opposing query will be treated as a Timestamp objects during analysis.
Adding Timestamp Columns
Select Source or Target as your Origin, and click on the gear icon in the toolbar to automatically parse and add every column from your selected origin query.
Alternatively, click on the plus icon in the toolbar to manually add column names.
Use the toggle buttons associated with each added column in the grid to ignore it during analysis.
Click on the Save button in the lower right of the panel.
Removing Columns
Use the minus icon next to each added column in the grid to remove it. Alternatively, use the X icon in the toolbar to remove all columns.
Click on the Save button in the lower right of the panel.
Restrictions
There are a few restrictions that apply to Timestamp Columns, primarily surrounding the concurrent use of shared key column definitions, column threshold options, and Timestamp columns.
The restrictions are simple.
- If a column is configured as a shared key, it cannot be treated as a timestamp during analysis.
- If a column has a threshold applied to it, it cannot be treated as a timestamp during analysis.
A Worked Example
Consider the following two data sets:
Source |
Target |
||||
id | source_order_id | order_dt | id | target_order_id | order_ts |
1 | 100 | 2007-04-15 15:10:01-05:00 | 1 | 100 | 2007-04-15 12:10:01-08:00 |
2 | 200 | 2007-04-14 12:11:00-05:00 | 2 | 200 | 2007-04-14 12:11:00-04:00 |
3 | 300 | 2007-04-15 15:10:01-05:00 | 3 | 300 | 2007-04-15 15:10:01-05:00 |
Note the following:
- Row 1 - Source and Target differ as strings, but refer to the same timestamp.
- Row 2 - Source and Target differ as strings, and also differ as timestamps.
- Row 3 - Source and Target are the same both as strings and and as timestamps.
So, if the data is timestamp+timezone-as-string data, and analysis is run in the default (all strings) mode, both row 1 and row 2 will fail. However, if analysis is run with col 3 configured as a timestamp with timezone, only row 2 will fail.
Execute the QueryPair
Default Timestamp Configuration (Timestamps with Timezones treated as strings in analysis)
When we execute the QueryPair with no timestamp configured, we get:
As expected, we see two rows failing when the timestamps with timezones are treated as strings during analysis.
True Timestamp with Timezone Configuration (Data treated as true Timestamps with Timezones in analysis)
However, if we configure the third column to be handled as a Timestamp with Timezone in analysis, only a single row fails:
Row 1 no longer fails, because the timestamp+timezone combination result in matching times during analysis. Row 2 continues to fail, because these timestamp+timezone values are neither equivalent as string data nor as timestamp data.
Comments
0 comments
Article is closed for comments.