Groovy Overload (of Operators)

I was recently working on a pretty long class that mapped a web-service response to a set of 'activity codes' in our Java application. The class ends up mapping some 60+ different codes based on various scenarios, and appends extra data for some of them. The code was originally written in Java and pretty well organized (okay, I wrote it, so I may be biased), but it was very wordy; there were a lot of repeated words/verbs in the class.

So, I set about rewriting this class in Groovy. It wasn't just the wordiness; there were some additional complexities to the rules that I needed to apply, and I wanted to make this monstrosity easier to understand. I made use of Groovy's ability to use Strings in switch statements (coincidentally, we were given the okay to use Java 7 the following week...), which helped (previously I had been converting the Strings into enums), along with Groovy's implicit calls to setters and its concise map notation. All of that made a big difference. I also used static imports for my enums so the names were shorter. So the code looked like this in the end:

details.activityCode = PAYMENT_TRANSFER_IN
details.activityProperties[REVERSAL_REAPPLIED_TO] = pay.transferAcct
details.activityProperties[ACCOUNT_NUMBER] = pay.transferAcct
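All of those assignments live inside one big String switch on the web-service code; roughly like this (a hypothetical fragment, the transCode field and code values here are invented for illustration):

switch (pay.transCode) {  // hypothetical field and code values
  case 'XFR-IN':
    details.activityCode = PAYMENT_TRANSFER_IN
    details.activityProperties[ACCOUNT_NUMBER] = pay.transferAcct
    break
  case 'XFR-REV':
    details.activityCode = PAYMENT_TRANSFER_IN
    details.activityProperties[REVERSAL_REAPPLIED_TO] = pay.transferAcct
    break
  // ...60+ more cases
}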

But I wanted to go one step further. Every single one of our activities had an activityCode, that's the primary data we're mapping, and activityProperties was the only map in the object. So saying .activityCode and .activityProperties was redundant. What I really wanted was something more concise, like this:

details << PAYMENT_TRANSFER_IN
details[REVERSAL_REAPPLIED_TO] = pay.transferAcct
details[ACCOUNT_NUMBER] = pay.transferAcct

And thanks to Groovy's ability to add methods that override operators, that was an easy task. These are the relevant parts of my POJO that support this notation:

private ActivityCode activityCode;
private Map<ActivityType, String> activityProperties = new HashMap<>();

//Overrides the '<<' operator in Groovy code to set the account activity code.
public void leftShift(ActivityCode value) {
  this.setActivityCode(value);
}

// Used with Groovy map notation to get activityProperties values.
public String getAt(Object key) {
  return this.activityProperties.get(key);
}

// Used with Groovy map notation to set activityProperties values.
public String putAt(ActivityType key, Object value) {
  if (value != null) {
    return this.activityProperties.put(key, String.valueOf(value));
  }
  return null;
}

Note: I could add leftShift methods for other types too, like the amount, which is also often set on these objects. But I still want the code to be readable, and that's the whole reason I was exploring this change, so since the activity code is the one that gets used repeatedly, that's the only value for which I implemented leftShift.

The final effect is much less code to read through, which makes understanding this massive class considerably easier.

Groovy Tabular Data DSL

After using Spock, I became really interested in tabular input of data, not just in my tests but in my services. The project I'm working on has a lot of transaction data with various associated properties; transaction type 1 has fee type A associated with it, for example.

Traditionally I've done that type of property mapping using enums, or maybe a switch statement. This works well, but there's still a good amount of code, and I'd like to make the mapping as readable as possible for the next guy. Especially since, in the particular case I'm working on, the enum has some 700+ transaction types and only a handful of them need additional properties.

Groovy made property mapping easier with its slick map syntax, and that worked really well too, but I kept thinking about Spock and that tabular data syntax, which I thought was even more readable than a map. I had this:

[Transaction.SERVICE_CHARGE : 
  [feeType: FeeType.SERVICE_FEE,
   waiveTransaction : Transaction.SERVICE_CHARGE_WAIVED,
   reapplyTransaction : Transaction.SERVICE_CHARGE_REAPPLIED,
   desc: 'PAID SERVICE CHARGE'
  ],
Transaction.NSF_FEE_CHARGE : 
  [feeType: FeeType.NSF_FEE,
   waiveTransaction : Transaction.NSF_WAIVED,
   reapplyTransaction : Transaction.NSF_REAPPLIED,
   desc: 'NSF CHARGE'
  ]
]

But I was really hoping to see something more like this (values omitted for formatting):

transType                  | feeType             | desc
Transaction.SERVICE_CHARGE | FeeType.SERVICE_FEE | 'PAID SERVICE CHARGE'
Transaction.NSF_FEE_CHARGE | FeeType.NSF_FEE     | 'NSF CHARGE'

During my research I ran across a blog entry by Christian Baranowski that did an excellent job of showing how to use Groovy categories and property overriding to create a simple table DSL. I took what he did and added a few more features for my own benefit, so my version supports three different outputs:

* data - returns a list of lists, just giving the values from the table
* dataWithTitleRow - returns a list of maps, one per row, using the title row's column names as the keys
* dataWithTitleRow(def key) - returns a map keyed by the value of the column identified by the 'key' parameter

def data = Table.dataWithTitleRow('transType') {
transType                  | feeType             | desc
Transaction.SERVICE_CHARGE | FeeType.SERVICE_FEE | 'PAID SERVICE CHARGE'
Transaction.NSF_FEE_CHARGE | FeeType.NSF_FEE     | 'NSF CHARGE'
}
assert data[Transaction.SERVICE_CHARGE].desc == 'PAID SERVICE CHARGE'
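The other two outputs behave the way you'd expect; these examples (lifted from the javadoc in the class below) show them:

def rows = Table.data {
  Donald | Duck  | Mallard
  Mickey | Mouse | Rodent
}
assert rows.first() == ['Donald', 'Duck', 'Mallard']

def mapped = Table.dataWithTitleRow {
  firstName | lastName | type
  Donald    | Duck     | Mallard
  Mickey    | Mouse    | Rodent
}
assert mapped.first() == [firstName: 'Donald', lastName: 'Duck', type: 'Mallard']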

Here's what my version looks like:

import groovy.transform.Canonical

/**
 * Parses tabular data into 'rows' of just values, or mapped values (if there
 * is a title row in the data), or a map of keyed values.
 *
 * <p>
 * Based on blog entry at
 * http://tux2323.blogspot.com/2013/04/simple-table-dsl-in-groovy.html
 */
public class Table {
  /**
   * The Groovy category feature used to implement this DSL uses static
   * methods, so thread local is used to store off the parsed content as it
   * is processed.
   */
  private static ThreadLocal<List> PARSING_CONTEXT = new ThreadLocal<List>()

  /**
   * Parses a list of tabular data.
   *
   * <pre>
   * def data = Table.data {
   * Donald | Duck | Mallard
   * Mickey | Mouse | Rodent
   * }
   * assert data.first() == ['Donald', 'Duck', 'Mallard']
   * </pre>
   *
   * @param tabularData contains all the data to parse delimited by |
   * @return a list of lists of data
   */
  public static List data(Closure tabularData) {
    PARSING_CONTEXT.set([])
    use(Table) {
      tabularData.delegate = new PropertyVarConvertor()
      tabularData.resolveStrategy = Closure.DELEGATE_FIRST
      tabularData()
    }
    return PARSING_CONTEXT.get().collect { Row row -> row.values }
  }

  /**
   * Parses a list of tabular data with a title row, returns a list of
   * mapped properties.
   *
   * <pre>
   * def data = Table.dataWithTitleRow {
   * firstName | lastName | type
   * Donald | Duck | Mallard
   * Mickey | Mouse | Rodent
   * }
   * assert data.first() ==
   *   [firstName: 'Donald', lastName: 'Duck', type: 'Mallard']
   * </pre>
   *
   * @param tabularData contains all the data to parse delimited by |
   * @return a list of maps, one per data row, keyed by the title row's
   * column names
   */
  public static List dataWithTitleRow(Closure tabularData) {
    List rows = data(tabularData)
    def titleRow = rows.first()

    return rows[1..<rows.size()].collect { List row ->
      def mappedRows = [:]
      row.eachWithIndex { value, index ->
        mappedRows[titleRow[index]] = value
      }
      return mappedRows
    }
  }

  /**
   * Parses a list of tabular data with a title row and specifies the column
   * that should be used as a key in the output map.
   *
   * <pre>
   * def data = Table.dataWithTitleRow('firstName') {
   * firstName | lastName | type
   * Donald | Duck | Mallard
   * Mickey | Mouse | Rodent
   * }
   * assert data['Donald'] ==
   *   [firstName: 'Donald', lastName: 'Duck', type: 'Mallard']
   * </pre>
   *
   * @param key the name of the column that should be used as the key in
   * the output map
   * @param tabularData contains all the data to parse delimited by |
   * @return a map of row maps, keyed by each row's value in the 'key' column
   */
  public static Map dataWithTitleRow(def key, Closure tabularData) {
    Map keyed = [:]
    dataWithTitleRow(tabularData).each {
      keyed[it[key]] = it
    }
    return keyed
  }

  /**
   * Groovy treats a new line as the end of a statement (with some exceptions),
   * so each new line will invoke this method which creates a new table row
   * which then is used to 'or' each value in the row together.
   *
   * @param self the left argument in the OR operator, the current value
   * @param arg the right argument in the OR operator, the next value
   *
   * @return a reference to the row, so the next value can be 'or-ed' onto it.
   */
  public static Row or(self, arg) {
    def row = new Row([self])
    PARSING_CONTEXT.get().add(row)
    return row.or(arg)
  }

  /**
   * Implements the 'or' operator to append each value in a row to a list.
   * Returns a reference to itself so the next or operation can append the
   * next value.
   */
  @Canonical
  static class Row {
    List values = []
    Row or(arg) {
      values << arg
      return this
    }
  }

  /**
   * Handler to treat any properties that cannot be found as strings.
   */
  static class PropertyVarConvertor {
    def getProperty(String property) {
      return property
    }
  }
}

Generate COBOL Copybook Output From DB2 Using JCL

Several of my recent posts have involved data manipulation, and for the most part I've been using Groovy. What I really like about Groovy is writing less code, but there are times when other tools, JCL in this case, can help me write even less.

IBM provides two programs, IKJEFT01 and DSNTIAUL, which allow you to pipe the results of a SQL statement into an output stream (SYSPRINT or a flat file). So, assume I had a simple COBOL copybook like this:

01 MY-NAME-FILE.
   03 FIRST  PIC X(10).
   03 MIDDLE PIC X(10).
   03 LAST   PIC X(10).

I could easily pipe the first, middle, and last name from a DB2 table into a flat file using JCL that looked something like this (fill in your own DBNAME and PLANNAME values):

//TESTJOB JOB PROD,'USER',CLASS=A,MSGCLASS=V,REGION=0M,
// NOTIFY=&SYSUID                                        
//COLUMNS  EXEC PGM=IKJEFT01                             
//SYSTSPRT DD   SYSOUT=*                                 
//SYSPRINT DD   SYSOUT=*                                 
//SYSUDUMP DD   SYSOUT=*                                 
//SYSREC00 DD   DSN=T01.MYDATASET,DISP=(NEW,CATLG),
//             UNIT=SYSDA,SPACE=(TRK,(5,5))
//SYSPUNCH DD DUMMY                                      
//SYSTSIN  DD *                                          
DSN SYSTEM(DBNAME)                                         
RUN PROGRAM(DSNTIAUL) PLAN(PLANNAME) PARMS('SQL') -     
    LIB('DBNAME.RUNLIB.LOAD')                           
//SYSIN    DD *                                          
SELECT  LPAD(FIRST_NAME, 10, ' ')
     || LPAD(MIDDLE_NAME, 10, ' ') 
     || LPAD(LAST_NAME, 10, ' ')
FROM MY_NAME_TABLE; 
/*                                                      

Okay, it's not the most intuitive, but it is a simple script that can be run as part of a job stream and creates a file compatible with the copybook I provided above. The SQL concatenates the three columns together and pads the fields with spaces. The padding may be unnecessary depending on how the columns are defined, but the point is that the records written to the file will be the 30 bytes that match the copybook.
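As a quick sanity check from the Java side, carving one of those 30-byte records back into its fields is just a matter of copybook offsets; here's a minimal Groovy sketch (the names, and building the record in memory rather than reading the dataset, are my own choices for illustration):

// One 30-byte record, laid out the way the LPAD-ed SQL above would produce
def record = ('      JOHN' + '         Q' + '       DOE').getBytes('IBM-37')
assert record.length == 30
// Field offsets come straight from the copybook: three PIC X(10) fields
def first  = new String(record, 0, 10, 'IBM-37').trim()
def middle = new String(record, 10, 10, 'IBM-37').trim()
def last   = new String(record, 20, 10, 'IBM-37').trim()
assert first == 'JOHN' && middle == 'Q' && last == 'DOE'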

But what happens with a slightly more complex copybook, specifically one involving binary data?

01 MY-NAME-FILE.
   03 FIRST  PIC X(10).
   03 MIDDLE PIC X(10).
   03 LAST   PIC X(10).
   03 ZIP    PIC 9(9) COMP-3.

Yikes! Binary data. How can you format a column in a SELECT statement to produce binary data that could be written to a file? Well, there is the HEX function in DB2, so you could potentially create a binary string and figure out how to convert it, but the IBM programs make it much easier than that. Using the same JCL above, change the SQL to this:

//SYSIN    DD *                                          
SELECT FIRST_NAME,
       MIDDLE_NAME,
       LAST_NAME,
       ZIP_CODE
FROM MY_NAME_TABLE; 
/*    

I'm making an assumption here that the ZIP_CODE column is defined in the database as DECIMAL(9, 0), because if it is, the default formatting of DSNTIAUL will write ZIP_CODE out as packed-decimal data that fits perfectly into a PIC 9(9) COMP-3 variable (an INTEGER column would unload as a binary fullword, which matches COMP rather than COMP-3). This is different than the default formatting used by tools on a Windows platform, like, say, Data Studio, but because DSNTIAUL is written to run on z/OS, it assumes that COBOL datatypes will be used.
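To make 'fits perfectly' concrete, here's a small Groovy sketch of the packed-decimal layout for a positive DECIMAL(9, 0) value: two digits per byte, with the final nibble holding the sign (C for positive), five bytes in all.

def zip = 123456789
def nibbles = String.format('%09d', zip) + 'C'  // nine digits, then the sign nibble
def bytes = (0..4).collect { i ->
  (byte) Integer.parseInt(nibbles.substring(i * 2, i * 2 + 2), 16)
}
assert bytes.collect { String.format('%02X', it) } == ['12', '34', '56', '78', '9C']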

If your data isn't defined as a packed or binary type, then it actually becomes trickier, and you might have to do something like REPLACE(CHAR(ZIP_CODE), '.', '') to get rid of the decimal point. I didn't get very far into that, because the dataset I needed to produce uses COMP-3 fields for its numeric data.

COBOL Data Type File Formatting

Since a lot of existing COBOL jobs take flat files as input and produce flat files as output, I've been doing some experimenting with data manipulation in COBOL-friendly formats. Much of the formatting is pretty straightforward: if you have a variable defined as PIC X(10), it's going to take up 10 bytes in the flat file and likely be right-padded with spaces. And, since my company runs on a z/OS platform, it's going to use the EBCDIC character encoding.
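For example, here's a minimal sketch of that PIC X convention, using the same IBM-37 code page as the script below:

def field = 'SMITH'.padRight(10)       // left-justified, right-padded with spaces
def bytes = field.getBytes('IBM-37')   // encode as EBCDIC
assert bytes.length == 10
assert bytes[5] == (byte) 0x40         // the EBCDIC space is X'40'
assert new String(bytes, 'IBM-37').trim() == 'SMITH'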

However, where it gets interesting is the formatting of numbers using signs, decimal points, and binary data. I wrote a little script that uses JZOS to format data into a copybook-equivalent byte stream, and tried a couple of simple test cases to see what happened.

['99V99', 'S99V99', 'S99V99 COMP-3'].each { format ->
  def slurper = new CopybookSlurper("""
  01 TESTING.
     03 BEGIN   PIC X(6).
     03 MIDDLE  PIC ${format}.
     03 END     PIC X(4). """)

  def writer = slurper.getWriter(new byte[slurper.length])
  writer.BEGIN = 'BEGIN|'
  writer.MIDDLE = 43.86
  writer.END = '|END'
  println new String(writer.getBytes(), 'IBM-37')
}

After the script ran, I had the following outputs.

BEGIN|4386|END
BEGIN|438F|END
BEGIN|??%|END

So, the decimal point is assumed; it doesn't show up in the output at all. When I add the 'S' to get a sign, it changes the last digit of the number: half of the last byte is used to hold the sign, so the '6' gets converted into a binary value, which in this case happens to print as an 'F'. And of course binary data doesn't really look like anything if you try to print it as a String; it doesn't even reliably take up the same number of characters.
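You can verify that sign 'overpunch' with a couple of one-liners (same IBM-37 code page as above): the digit '6' is X'F6' in EBCDIC, and flipping the zone nibble to C for a positive sign gives X'C6', which is the EBCDIC letter 'F'.

assert '6'.getBytes('IBM-37')[0] == (byte) 0xF6  // zoned digit 6
assert 'F'.getBytes('IBM-37')[0] == (byte) 0xC6  // same digit with a C (positive) zone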

I probably knew this at one point, back when I was in COBOL training, but it was helpful to have a refresher course.