Groovy Copybook Slurper


In order to call CICS from Java, we've traditionally used RDz to generate a J2C bean via an Ant script. But I no longer have RDz (it doesn't support Groovy and always runs on an older version of Eclipse), so my only alternative was to use JCL to generate a bean via IBM's JZOS library.

I'm still not a big fan: it takes time to run the JCL, download the generated file from the mainframe, and tweak the object names, packages, etc. What if I could just declare my copylib structure in my Java code and call CICS? To that end, I've been working on a parser that converts a byte stream to and from a map.

Leaving the CICS call out of it for the moment, here's an idea of what the read code would look like.

def slurper = new CopybookSlurper("""\
01 TESTING-COPYBOOK.
   03 STATES OCCURS 10 TIMES .
      05 STATE PIC XX.
""")
def data = slurper.getReader('MIOHINWINVCAKYFLNCSC'.getBytes())
data.STATES.each {
  print it.STATE  //will print each two byte state
}
assert 'IN' == data.STATES[2].STATE //zero-based index into the STATES list

And here's the write code

def slurper = new CopybookSlurper("""\
  01 TOP-LEVEL.
     03 STATES OCCURS 2 TIMES.
        05 STATE-NUM        PIC X(3).
        05 STATE-NAME       PIC X(10).
        05 NUMERIC-STATE    PIC 9(2). """)
    
def writer = slurper.getWriter(new byte[slurper.length])
writer.with {
  STATES[0].with {
    STATE_NUM = '1'
    STATE_NAME = 'MICHIGAN'
    NUMERIC_STATE = 2
  }
  STATES[1].with {
    STATE_NUM = '2'
    STATE_NAME = 'OHIO'
    NUMERIC_STATE = 3
  }
}
assert new String(writer.getBytes(), 'IBM-37') ==
       '1  MICHIGAN  022  OHIO      03'

The slurper name and map structure are inspired by Groovy's XmlSlurper and JsonSlurper. My parser is written in Groovy to take advantage of the dynamic nature of maps and closures. I used the JZOS library under the covers to parse the individual data elements, so it can handle binary fields (COMP-3 stuff). Unfortunately, JZOS didn't provide any simple hooks for getting a data type from a PIC definition, so I had to rewrite that logic. I know my parser isn't as robust as the IBM version (it doesn't handle some of the weird data types, for instance), but it's good enough for the normal copybooks that we'd be using from the web.
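To give a feel for what "getting a data type based on a PIC definition" involves, here's a minimal sketch that works out the byte length of a simple PIC clause. The picLength helper is hypothetical and is not the parser's actual code; it assumes plain DISPLAY fields made of X, A and 9 symbols and ignores signs, implied decimals and COMP usages.

```groovy
// Hypothetical helper, not the slurper's real implementation.
// Handles only DISPLAY-usage pictures built from X, A and 9, e.g. "XX", "X(10)", "9(2)".
int picLength(String pic) {
    // Each match is one picture symbol with an optional repeat count, such as X(10).
    def lengths = pic.toUpperCase().findAll(/([XA9])(?:\((\d+)\))?/) { full, symbol, count ->
        count ? count.toInteger() : 1
    }
    return lengths.sum() ?: 0
}

assert picLength('XX') == 2      // STATE in the read example
assert picLength('X(3)') == 3    // STATE-NUM in the write example
assert picLength('X(10)') == 10  // STATE-NAME
assert picLength('9(2)') == 2    // NUMERIC-STATE
```

Adding up those lengths for one STATES entry in the write example gives 15 bytes, which matches the 30-character string asserted for two occurrences.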

I also wanted the map to be lazy, so that bytes are only parsed when requested. After all, COBOL copybooks sometimes hold a ton of data, and I only want a small piece. So rather than parsing each field right away, the parser creates a map of closures with instructions on how to parse the data. Likewise, the writer stores a closure for each variable with instructions on how to write data to a shared byte array. The getReader method is shown below. There's a lot of complexity involved in occurrences and groups that you can't see here, but I just wanted to show the mapping of variable names to closures.

public Map<String, ?> getReader(byte [] bytes) {
  Map readers = [:]
  fieldProcessors.each { DataVariable variable ->
    if (variable.read) {
      readers[variable.name] = { variable.read(bytes) }
    }
  }
  return new LazyProperties(readers, bytes)
}

The LazyProperties class extends HashMap and invokes the closures from the map when referenced, then caches the results for the next call. The getWriter method works in a similar fashion with closures, though it doesn't cache results because any call to the write methods should write data to the byte stream.
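A minimal sketch of that idea, assuming nothing about the real class beyond what's described above: LazyMap is a hypothetical stand-in for LazyProperties, and the sample field names and bytes are made up for illustration.

```groovy
// Hypothetical stand-in for LazyProperties: values start out as closures and
// are replaced by their results the first time they are read.
class LazyMap extends HashMap<String, Object> {
    LazyMap(Map<String, Closure> readers) {
        super(readers)
    }

    @Override
    Object get(Object key) {
        def value = super.get(key)
        if (value instanceof Closure) {
            value = value.call()            // parse on first access
            super.put(key as String, value) // cache the result for the next call
        }
        return value
    }
}

int parses = 0  // counts how often a closure actually runs
byte[] bytes = 'MIOH'.getBytes('US-ASCII')
def data = new LazyMap([
    FIRST : { parses++; new String(bytes, 0, 2, 'US-ASCII') },
    SECOND: { parses++; new String(bytes, 2, 2, 'US-ASCII') }
])

assert parses == 0         // nothing parsed yet
assert data.FIRST == 'MI'  // first read runs the closure
assert data.FIRST == 'MI'  // second read comes from the cache
assert parses == 1         // SECOND was never parsed
```

Because Groovy routes property access on a Map through get, overriding that one method is enough for data.FIRST to trigger the parse.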

That's what I have working so far. The next step is to look into the javax.resource.cci.Record and javax.resource.cci.Streamable interfaces, and perhaps Spring's org.springframework.jca.cci.core.support.CciDaoSupport, to call down to CICS.

Generating Lots of Values From a DB2 Sequence Object

I was working on a Groovy script that inserts lots of data into a database, and for each record I needed to generate a new key from a sequence object while keeping track of the old one. That meant invoking NEXTVAL for each key, but considering how many thousands of records I was inserting, I didn't want to call the database for each key individually.

There's no way to batch up SELECT statements, but I thought I could perhaps generate a big SELECT statement that includes all the keys I needed.

SELECT 1 AS OLD_KEY, MYSEQUENCE.NEXTVAL FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT 2 AS OLD_KEY, MYSEQUENCE.NEXTVAL FROM SYSIBM.SYSDUMMY1

The problem with this approach is that when you request NEXTVAL multiple times in the same SELECT statement, it returns the same value each time. So I ended up generating one new key assigned to all of my old keys. No good, so I had to try something else. I decided to insert into a GLOBAL TEMPORARY TABLE.

db.execute('''DECLARE GLOBAL TEMPORARY TABLE SESSION.GEN_IDS (
  ORIG_ID DECIMAL(12,0), NEW_ID DECIMAL(12,0)) ON COMMIT PRESERVE ROWS''')

db.withBatch('INSERT INTO SESSION.GEN_IDS VALUES (?, MYSEQ.NEXTVAL)') { BatchingPreparedStatementWrapper ps ->
  oldKeys.each { ps.addBatch(it) }
}

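// Query through the same connection: a declared temporary table is only
// visible to the session that created it.
def mappings = db.rows('SELECT * FROM SESSION.GEN_IDS').collectEntries { Map data ->
  [(data.ORIG_ID): data.NEW_ID]
}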

db.execute('DROP TABLE SESSION.GEN_IDS')

Problem solved. The collectEntries method builds a map from each old key to its new key. Of course, I then learned about the JDBC getGeneratedKeys method, which makes this approach much less useful, but I'm still glad I learned how to use GLOBAL TEMPORARY TABLEs.
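For anyone who hasn't used collectEntries, here's a small sketch of what that last step does, run against a plain list of maps instead of a database. The key values are made up for illustration.

```groovy
// Stand-in for the rows returned from SESSION.GEN_IDS; the values are invented.
def rows = [
    [ORIG_ID: 101, NEW_ID: 5001],
    [ORIG_ID: 102, NEW_ID: 5002]
]

// Each row becomes one entry: old key -> new key.
def mappings = rows.collectEntries { Map row -> [(row.ORIG_ID): row.NEW_ID] }

assert mappings == [101: 5001, 102: 5002]
assert mappings[101] == 5001
```

The parentheses around row.ORIG_ID matter: without them Groovy would treat the key as a literal string rather than evaluating the expression.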

Accessing a Numeric Value Somewhere in a DB2 Column

We recently had some inconsistent data in our database that required a pretty special query. The column contained what should have been a three-byte marketing territory number, but depending on which application or user entered the data, the value could be stored almost anywhere in the column and be 1 to 3 bytes long. The following query gets that numeric value.

LPAD(
  TRANSLATE(
    REPLACE(
      TRANSLATE(MISC1_KEY || MISC2_KEY, ' ', X'00'),
    ' ', ''),
  '', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'),
3, '0') AS TERRITORY,

The query has four steps, working from the inside out:
  1. Combine the two columns and replace any null bytes with a space. X'00' is the hex code for null.
  2. Remove any spaces from the column. Originally I wanted to do this with the next TRANSLATE step, but for some reason the spaces seem to be ignored with TRANSLATE.
  3. Remove any alpha characters, since the column also sometimes contained state abbreviations. It would be nice if we could use something like [^0-9], but DB2 doesn't support that, and on z/OS it doesn't even support hex ranges.
  4. Finally, pad the number with leading zeros to make it three bytes long.
It's quite a few steps, but the end result is a consistent three-byte code, which is especially useful in a CTE or a sub-select.
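To make the four steps easier to follow, here's the same clean-up sketched in Groovy on in-memory strings. The territory helper and the sample values are hypothetical and only illustrate the intended result; this isn't a replacement for the SQL. One difference to keep in mind: padLeft never shortens a value that's already longer than three characters.

```groovy
// Hypothetical helper that mirrors the four SQL steps on a plain string.
String territory(String misc1, String misc2) {
    String combined = (misc1 + misc2).replace('\u0000', ' ')  // 1. null bytes become spaces
    String noSpaces = combined.replace(' ', '')                // 2. drop the spaces
    String digits = noSpaces.replaceAll(/[A-Z]/, '')           // 3. drop the letters
    return digits.padLeft(3, '0')                              // 4. left-pad with zeros
}

assert territory('MI 7', '\u0000\u0000') == '007'
assert territory('  42', 'OH') == '042'
assert territory('123', '') == '123'
```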