Accessing Numeric Value Somewhere in a DB2 Column

We recently had some inconsistent data in our database that required a pretty special query. The column contained what should have been a three byte marketing territory number, but depending on what application/user entered the data, the value could be stored almost anywhere in the column, and be 1 to 3 bytes. The following query gets that numeric value.

LPAD(
  TRANSLATE(
    REPLACE(
      TRANSLATE(MISC1_KEY || MISC2_KEY, ' ', X'00'),
    ' ', ''),
  '', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'),
3, '0') AS TERRITORY,

The query has four steps, working from the inside out:
  1. Combine the two columns together and replace and null values with a space, X'00' is the hex code for null.
  2. Remove any spaces from the column, originally I wanted to do this with the next TRANSLATE step, but for some reason the spaces seem to be ignored with TRANSLATE
  3. Remove any alpha characters, the column also sometimes contained state abbreviations. It would be nice if we could use something like ^[0..9], but DB2 doesn't support that, and on z/OS it doesn't even support hex ranges. 
  4. Finally pad the number with leading zeros to make it three bytes long. 
It's quite a few steps, but the end result is a consistent three byte code. Especially useful when used with a CTE or a sub-select.

Post a Comment