Versions:2.0+
The CLOB Data Type - Character Large OBjects
Most database and data store products have a data type for storing large text or character objects, often called CLOBs (Character Large OBject). These objects are typically 2 GB of character data or more, depending on the specific data technology you are using. Obviously, a resultset with even a single column of CLOBs may represent significantly large data payload, and a resultset with multiple CLOB columns may represent an extremely large payload.
Note: Your specific technology may use a different term for this data type. For example, Oracle calls this type a CLOB, while in MySQL, it is a LONGTEXT type, while in Hive, it is a STRING type.
Note: CLOB columns cannot be used as Shared Keys in QuerySurge. However, if you have a short unique character sequence in the CLOB, you can use the appropriate syntax for your SQL dialect to create a CHAR or VARCHAR column in your resultset that can be used as a Shared Key.
QuerySurge and CLOBs
QuerySurge handles data typing by checking query data types from the query metadata for each of the columns in your queries. So, if you have a CLOB column in a table, it will be typed as a CLOB by QuerySurge because the metadata will indicate that it is a CLOB. In addition, if your SQL uses a string handling function to manipulate a column, it may return a CLOB type even when the underlying column is a different type. When your query returns a CLOB column, instead of pushing the entire data set to your browser, which may have significant negative network and/or browser impacts (it could be ≥ 2GB of text per row), QuerySurge simply shows you a "{clob}" notation when you view your data:
It is important to note that, when QuerySurge shows you the "{clob}" notation, your data exists in QuerySurge and is available, and the QuerySurge comparison has run on your data. When your results contain the "{clob}" notation, you can view your CLOB data in QuerySurge a few different ways.
View CLOBs Cell-by-Cell
You can see CLOB data cell-by-cell, by choosing a cell in your data popup, and right-clicking on it. You'll see a context menu that either lets you view the data in a popup, or save it to a file:
This option, as we have emphasized, only lets you see the data cell-by-cell.
View CLOBs by CLOB Peeking
Versions:7.1+
QuerySurge has a CLOB Peeking option - you can "peek" your CLOB data (the default is to peek the first 100 characters) so that you can see the initial character data in your CLOB columns. The advantage is that QuerySurge peeks CLOBs on all displayed rows, so you can get an idea of what the data looks like at a glance:
You can find a full description of this feature in our "QueryPair Results Viewing Options" article.
A Workaround
In many cases, the data stored in a CLOB column is nowhere near the size maximum of the column. If you know that there is a hard ceiling for the maximum size of your text data in a CLOB column, you can simply cast it in your SQL to a smaller character type, such as a VARCHAR. For example, if you knew that no data in a CLOB column exceeded 512 bytes, you could use the following:
SELECT CAST(myclobcol as VARCHAR(512)) FROM mytable...
Because the data returns to QuerySurge from this query as a VARCHAR type and not a CLOB type, it will display the data values and not the "{clob}" notation.
Note: This workaround can only work if you know of a hard maximum size for the actual data in the CLOB column. Otherwise, data could be truncated or otherwise mangled.
Flat Files: A Special Case
If you are encountering CLOBs with a Flat File Connection, the default size maximum of your VARCHAR columns may be larger than it needs to be for your data. This situation is similar to the one described above, but instead of casting each column, you can set the maximum string width on the Connection. This is done by editing your Flat File Connection and selecting the Advanced Mode checkbox.
Now, when you click the Next button through the Connection details, you will see the options to edit the Max String Width field (under String Options). You can lower this value based on the size of the strings in your data.
A Final Important Note: Coerced to CLOB
There is one condition where QuerySurge will automatically change your character or text data types from the actual type indicated in your query metadata to a CLOB. This type coercion occurs when QuerySurge estimates that your resultset will exceed the allowable row size limit for a resultset (64KB per row). Since CLOBs (paradoxically) use less storage in the resultset table, coercion to a CLOB type results in a space savings for the table.
This condition usually occurs when you have large numbers of columns in your query, especially if there are multiple and large character columns in the resultset. The simplest workaround is to break your QueryPair into multiple QueryPairs. For example, if you have a QueryPair with 80 columns in which all of the text columns are returning as CLOB types (and you know that the columns are not CLOB types in the underlying Source/Target databases) try breaking this QueryPair into four QueryPairs of 20 columns each.
Comments
0 comments
Please sign in to leave a comment.