There are a number of standard mainframe data sources that are typically accessed for data testing as part of ETL, Data Warehouse or Big Data testing: DB2 tables, exported flat files, EBCDIC-encoded files, VSAM files and database logs. Below, we briefly describe how to access each of these for data testing, in terms of QuerySurge setup.
DB2 on the Mainframe
For DB2 instances on the mainframe, QuerySurge connects via JDBC in the same way that QuerySurge connects to any database. IBM has multiple JDBC drivers available for DB2, depending on the version and the DB2 "flavor".
In terms of setup, you will need to obtain a DB2 JDBC driver from your DBA or other admin, or from IBM, as RTTS (the makers of QuerySurge) does not have redistribution rights from IBM for these drivers. Once you have obtained the driver, deployment instructions for QuerySurge Agents on Windows are here, and for Agents on Linux, instructions are here. Remember, you will need to deploy the driver to all of your Agents.
After you have deployed your drivers, you can use the QuerySurge Connection Wizard to set up Connections. If you have deployed the IBM DB2 Universal JDBC driver, the Connection Wizard supports it directly. If you are using a different IBM JDBC driver for DB2, use QuerySurge's Connection Extensibility option instead; in that case, you'll need to consult the driver documentation for the Driver Class, the JDBC URL syntax, and any other specifics the Connection requires.
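As a sketch of the details the Connection setup asks for, the snippet below shows the Driver Class and URL shape for the IBM type 4 (JCC) driver. The host, port, and database names are placeholders, not values from this article; confirm the exact syntax against your driver's documentation.

```python
# Connection details for the IBM Data Server (JCC) driver, illustrated.
# Host, port, and database below are placeholders -- use your own values.

DRIVER_CLASS = "com.ibm.db2.jcc.DB2Driver"  # type 4 (network) JCC driver class

def db2_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a type 4 DB2 JDBC URL: jdbc:db2://host:port/database"""
    return f"jdbc:db2://{host}:{port}/{database}"

url = db2_jdbc_url("mainframe.example.com", 50000, "TESTDB")
print(url)  # jdbc:db2://mainframe.example.com:50000/TESTDB
```

Port 50000 is a common DB2 default, but your instance may listen elsewhere; your DBA can confirm the values for your environment.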
Flat Files on the Mainframe
Mainframe operations often include regular flat file dumps, whether created for ETL or for other purposes. Whatever the reason, flat files are useful targets for ETL and Data Warehouse testing.
If there is already a process for moving data in a standard ASCII flat file format off your mainframe, then you can use those files with QuerySurge as part of your data testing. QuerySurge ships with a flat file JDBC driver that handles both delimited and fixed width files. Your files appear like database tables to QuerySurge, and you can use a robust SQL dialect to query against them.
Testing with files does require setup within QuerySurge, because QuerySurge needs to know how to read the files in order to interpret them as database tables. For delimited files, Connection setup information is here, and for fixed width files, setup information is here.
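To make the fixed-width case concrete, the sketch below shows how a column layout turns one fixed-width record into named columns; this is the kind of layout information the Connection setup captures. The column names and offsets are hypothetical, for illustration only.

```python
# Sketch: mapping a fixed-width record onto named columns.
# The layout (name, start, width) is hypothetical.

LAYOUT = [("cust_id", 0, 6), ("name", 6, 20), ("balance", 26, 10)]

def parse_fixed_width(line: str) -> dict:
    """Slice one fixed-width record into a column -> value dict."""
    return {col: line[start:start + width].strip()
            for col, start, width in LAYOUT}

record = "000123" + "John Smith".ljust(20) + "0001500.25"
print(parse_fixed_width(record))
# {'cust_id': '000123', 'name': 'John Smith', 'balance': '0001500.25'}
```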
EBCDIC Files on the Mainframe
If your organization already has a process in place to build and move flat files off your mainframe, there may be a file encoding issue: EBCDIC is the IBM standard encoding for mainframe files, and EBCDIC files need conversion to ASCII before QuerySurge can process them. This conversion can usually be fully automated, because QuerySurge's flat file JDBC driver can take custom "plug-in" functions, which can be run along with the file query.
Because EBCDIC-to-ASCII conversions may be complex (depending on how your EBCDIC files are built), it is helpful to know whether a conversion tool or process is already available in your organization. If a conversion protocol is already available, it is frequently simplest to call it from a custom plug-in function. The file conversion then runs in place, triggered by QuerySurge's execution, and once it completes, your query runs automatically against the converted file.
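For simple character data, the core of such a conversion can be as small as the sketch below, which uses Python's built-in cp037 codec (EBCDIC, US/Canada code page). This is an illustration, not the article's plug-in mechanism: real mainframe files may use a different code page, and binary fields such as packed-decimal (COMP-3) need field-level handling rather than a whole-file decode.

```python
# Minimal EBCDIC-to-ASCII conversion sketch using the stdlib cp037 codec.
# Assumes plain character data; packed/binary fields need special handling.

def ebcdic_to_ascii(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes into an ASCII-compatible string."""
    return raw.decode(codepage)

# 0xC1 0xC2 0xC3 are 'A' 'B' 'C' in EBCDIC code page 037
print(ebcdic_to_ascii(b"\xc1\xc2\xc3"))  # ABC
```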
VSAM Files on the Mainframe
The need to access data in VSAM files is a common one in mainframe ETL testing. There are a couple of options for implementing tests with data in VSAM files.
VSAM files can be handled in a manner similar to the procedure used for flat files (see above), via a file-specific set of conversions:
- The VSAM files need to be converted to a 'conventional' flat file layout,
- Most likely, an EBCDIC-to-ASCII conversion is required,
- The files need to be moved off the mainframe to a location where QuerySurge can access them.
With VSAM files, the initial conversion to a flat layout may be performed on the mainframe using JCL. Then either the files are moved off the mainframe (usually via FTP) and the EBCDIC-to-ASCII conversion is applied outside the mainframe environment, or the EBCDIC-to-ASCII conversion is done on the mainframe before the files are moved off. As noted above, QuerySurge's flat file driver can accommodate a custom plug-in function, which in many cases can log into the mainframe to trigger the conversion and FTP processes. Where this is possible, the full testing flow, from VSAM file to converted, accessible, tested flat file, can be automated from QuerySurge.
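The off-mainframe conversion step in this flow can be sketched as follows: read a dump of fixed-length EBCDIC records and write delimited ASCII lines that a flat file driver can query. The record length, code page, and field offsets are assumptions for illustration; a real VSAM copybook defines these.

```python
# Sketch of the conversion step a plug-in function might trigger: turn a
# fixed-length EBCDIC record dump into pipe-delimited ASCII. Record length,
# code page, and field offsets are illustrative assumptions.
import io

RECORD_LEN = 20              # assumed fixed record length
FIELDS = [(0, 8), (8, 20)]   # assumed (start, end) offsets of two fields

def convert_dump(ebcdic_stream, ascii_out, codepage="cp037"):
    """Decode fixed-length EBCDIC records into pipe-delimited ASCII lines."""
    while True:
        record = ebcdic_stream.read(RECORD_LEN)
        if len(record) < RECORD_LEN:
            break  # stop at end of data (ignore a short trailing fragment)
        text = record.decode(codepage)
        ascii_out.write("|".join(text[s:e].strip() for s, e in FIELDS) + "\n")

# Example: one 20-byte record, "KEY00001" + "VALUE ONE" padded, in EBCDIC
raw = ("KEY00001" + "VALUE ONE".ljust(12)).encode("cp037")
out = io.StringIO()
convert_dump(io.BytesIO(raw), out)
print(out.getvalue())  # KEY00001|VALUE ONE
```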
Alternatively, VSAM data can be accessed using a JDBC-based approach. In this case, VSAM file access from the QuerySurge perspective is broadly similar to JDBC access to other data stores. Several driver products provide JDBC access to VSAM files on the mainframe, including:
- CONNX for VSAM-MVS
- VSAM Data Connectivity by Attunity
- Rocket Shadow z/SQL (z/OS)
- IBM z/VSE JDBC solution described here and here
- IBM z/OS Connect Enterprise Edition to expose data via REST, accessible via a REST JDBC driver
Note: RTTS, the vendor of QuerySurge, does not have distribution rights for these products, and has no relationship with the vendors regarding these products. If you download a product, your license is with the vendor and not with RTTS. Support for these products is not provided by RTTS.
DB2 Database Logs
Database logs may play a role in testing. If logs are used as part of the ETL, and are processed into external databases or into a Hadoop data lake, then log exports off the mainframe may be useful as an entry point for testing. All of the conversion issues discussed above will apply, however. If log replication software is in use that employs message queue software to process the logs, then one option would be to use QuerySurge to check messages on the queue as a data Source. This would likely require a custom driver plug-in implementation (depending on the message queue technology) to pull the information from the queue and process it as a file.
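The queue-as-Source idea can be sketched as below: drain pending messages from a queue and collect them as flat-file rows for a query to run against. Python's queue.Queue stands in for the real message queue client, since the actual technology varies; all names here are illustrative.

```python
# Sketch of the queue-to-file idea: drain messages from a queue into rows
# that can be written as a flat file. queue.Queue is a stand-in for the
# real message queue client; names are illustrative.
import queue

def drain_to_rows(msg_queue):
    """Pull all pending messages and return them as flat-file rows."""
    rows = []
    while True:
        try:
            rows.append(msg_queue.get_nowait())
        except queue.Empty:
            return rows

q = queue.Queue()
for msg in ("101|loaded", "102|loaded", "103|rejected"):
    q.put(msg)
print(drain_to_rows(q))  # ['101|loaded', '102|loaded', '103|rejected']
```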