The WHERE Constraint

The Database Connectors WHERE constraint is an external variable that gives the developer some control over the data searches that result from database queries. It can help to improve performance in some situations. This section describes purpose of the WHERE constraint and shows how it is implemented.

COBOL data processing is based on keyed READ operations following a positioning operation. Records are read until the key value being processed changes. Because traditional COBOL data processing is based on a B+ tree file system, the overhead for such operations is relatively minor.

RDBMS data processing introduces a new level of complexity to data processing. The database's query optimizer receives the SQL query for the COBOL operation being performed and then builds a working set of data that satisfies that query. Because the database optimizer has many different possible execution methods, this can result in poor performance if the optimizer chooses a query execution path that is less than optimal.

Performance degradation may also result from the fact that queries generated by COBOL operations result in unbounded index queries. Unbounded queries are generated because COBOL positioning operations (Start Not Less Than and Start Not Greater Than) provide only one of the bounding conditions for the working set, instead of both an upper and lower boundary.

As an example, consider the case in which an application needs to process all items in a warehouse on aisle 17, shelf 8, and bin 2. If each of these items is a field in a key, the COBOL program might generate the following query for a READ operation:

Note: The following example applies to the Database Connectors for Oracle product. The SQL generated will be different for the different interfaces.
 
SELECT * FROM warehouse_items WHERE 
   aisle = 17 and 
   shelf = 8 and 
   bin = 2 
ORDER BY aisle ASC, shelf ASC, bin ASC;

This query achieves the desired result but has one problem. For the COBOL program to end its processing, it must read a record that has a new value for "bin". The COBOL application has no way of specifying an upper boundary for the read operation, so when all rows of data from bin 2 have been read, the Connector will attempt to read the next record by generating the following query:

SELECT * FROM warehouse_items WHERE 
   aisle = 17 and 
   shelf = 8 and 
   bin > 2 
ORDER BY aisle ASC, shelf ASC, bin ASC;

This query will cause the database query optimizer to gather all records pertaining to items on the remainder of shelf 8 to build its working set. This is excessive from the COBOL application's point of view, because the COBOL program needs only the first record of the working set to determine that it has finished processing.

This problem can be even more serious if the application is processing the last bin on a shelf. Because there are no more bins on that shelf, the query would drop down a level and generate the following:

SELECT * FROM warehouse_items WHERE 
   aisle = 17 and 
   shelf > 8 
ORDER BY aisle ASC, shelf ASC, bin ASC;

This would select all items on the remainder of that aisle of the warehouse, which could be a very large working set if each aisle had 130 shelves!

In reality, most of the time the database query optimizer will not build the entire working set if it has been properly tuned, but will instead perform INDEXED READS to process the query. This means that the query optimizer will traverse an index tree to retrieve the records, much as COBOL index files do, as opposed to using combinations of indexes and sort and merge operations.

It can be helpful for the COBOL developer to influence precisely which information is to be returned. If the application developer knows at compile time (or before the query is executed) the precise scope of the record processing that needs to be performed by the read operations, the developer can more finely tune the information being retrieved.

Database Connectors provides a method by which the COBOL programmer can provide additional information to the database query optimizer by providing more specific selection information to the Connector. This selection information is added to the WHERE clause of the SQL queries generated by the Connector. This can be particularly useful in providing upper boundaries to queries being generated, with the result that the working set is smaller.

The developer may provide upper boundaries on the key segments for a select, or any other selection criteria needed to constrain the working set to just the desired subset of rows. This additional information is added to generated queries with the AND condition. It is not possible for the application developer to specify a larger working set than would otherwise have resulted. The developer may only constrain the working set to a smaller subset.