QuerySurge Queries - Best Practices
QuerySurge is designed based on the querying across two Data Endpoints (“Source” and “Target” Queries). These endpoints could be the two “ends” of an ETL “leg” or they could be the two sides of a Data Migration project. Each query in a Test QueryPair needs to return logically equivalent data in a comparable format. Therefore, Test Queries written for QuerySurge need to spoof any data transformations that occur across the Data Endpoints (ETL “leg” or the Migration endpoints) being tested.
In general, each member of a Test Query Pair should match it's counterpart in the following characteristics in order to support QuerySurge data Analysis:
- Queries must match in column order logically, meaning the first column in the Source Query matches the first column in the Target Query and so on. If columns do not follow the same logical sequence in both Source Query and Target Query, QuerySurge data analysis will score the test as failing.
- Each member of a QueryPair should employ matching key columns. This will make each of your rows unique which allows QuerySurge to give you the best insight into your data comparisons. If you do not include key columns in your queries, QuerySurge will attempt to compare your data, however there may be situations where it cannot properly identify mismatches in your data.
- Queries should also match in row order (although the default is to try to ignore row order during Analysis). This is usually managed with an ORDER BY clause.
- Queries must match in the number of columns and the number of rows returned, or the comparison will fail.
- Queries must match in column data type or the comparison will fail. The type matching is at a generic level. QuerySurge treats all string, whole numbers and decimal types as comparable.
Query Notes and Considerations
SQL ALIASING
- Use of Aliases. Users are generally advised to alias all compound columns, tables and views. QuerySurge will handle most situations where no aliases are provided; however, there is the possibility where the lack of an alias could cause a problem with the data comparison.
- Identically Named Columns in Single Query. QuerySurge will attempt to support columns with the same name in a single query by appending a counter to the matching column name when one is found. However, it is best for the user to provide unique column aliases for all columns (as noted above) and especially for columns with identical names.
- Identically Named Columns across a QueryPair. Because each Query in a QueryPair must use the same logical order for its columns (See QuerySurge Queries – Basics), it is strongly advised to use the same aliases for each member. This means that Target Query column 1, which corresponds to Source Query column 1, should use the same alias as Source Query column 1. This is true for the remaining columns.
- Alias Length. QuerySurge currently can accept a maximum column name or alias length of 64 characters. Because QuerySurge may modify a column name or alias under certain circumstances, it is best to keep your aliases as short as possible, and, of course, well under 64 characters.
QUERYSURGE RESERVED WORDS AND TERMS
- QuerySurge reserves the ‘$$’ sequence for column names or aliases. Do not use any column names or aliases in your SQL that contain ‘$$’ (for example: an alias such as ‘$$id’ is illegal). If you need to work with column names that contain ‘$$’, you will need to alias them to use other characters.
- QuerySurge reserves curly braces ({…}) in queries. They should not be used in any Test QueryPair as the results may be unpredictable.
REUSABLE QUERY SNIPPETS
- Snippets should be created before they are used in queries. If you try to enter a snippet name in a query that doesn’t exist, QuerySurge will alert you and ask you to save the Snippet before you save the query.
- Snippet names may not contain curly braces, tabs or line breaks.
- Snippets may not be used inside single quotes. The quotes need to be part of the Snippet.
For example: select * from table t where t.text = '{my snippet}' where {my snippet} = why a duck? is not legal.
This is in the proper format is: select * from table t where t.text = {my snippet} where {my snippet} = 'why a duck?'
QUERIES AND DATA TYPES
- Type Translation. QuerySurge translates the types it receives from your Source and Target databases to analogous types in it's database.
- Type Agreement. As part of the Analysis procedure, QuerySurge checks Source column type against Target column type, using the translated types in it's database. If the types do not agree, QuerySurge will list the test as failed. The type matching is at a generic level - for example, QuerySurge treats all string types as comparable, all whole numbers as comparable and all decimal types as comparable.
- If you want to ignore data type differences for a given QueryPair, either use a cast in either Source or Target Query or both (for specific columns) or use the Data Type Checking feature in the QueryPair Properties tab.
Definitions of Reporting Metrics
Data Failure: In rows of a ResultSet that QuerySurge successfully paired, a mismatch in data values in corresponding columns occurred.
Row Count Difference: The two Queries in a QueryPair returned different numbers of rows of data
Data Type Mismatch: The data returned in two corresponding columns from each member of a QueryPair differed in data type.
Pass/Fail Rate or % : The percentage of QueryPairs in a Scenario that pass.
Data Reliability: The number of Data Failures in a Scenario compared to the total number of data compared.
Note: A Scenario can have a relatively low Pass/Fail Rate while having a high Reliability. This is because only one datum needs to differ between Source and Target Query ResultSets for a Test QueryPair to fail. QueryPairs with an outcome of Error are excluded from the Reliability calculation.
SQL Exception: One or both Queries in a QueryPair returned an Exception
Comments
0 comments
Please sign in to leave a comment.