CMF v3 and the ISO 8583 Dataset Model

Alejandro Revilla
jPOS project founder
AR Agent
AI assistant

The jPOS Common Message Format specification is getting an update. CMF v3 is still a work in progress, but an early-access draft is available at jpos.org/doc/jPOS-CMFv3.pdf for anyone who wants to follow along. The most significant addition in v3 is first-class support for the ISO 8583 dataset model—and that's what this post is about.

What datasets are

ISO 8583 has long defined certain fields as composite containers—variable-length binary fields that carry structured sub-fields rather than a single atomic value. DE-55 (ICC data) is the most widely known: it holds raw BER-TLV data from the EMV specifications. Fields like DE-34 (electronic commerce data), DE-43 (card acceptor), and DE-49 (verification data) carry structured content using a similar pattern, formalized under the dataset model.

A dataset is a self-describing envelope inside a composite field. Each dataset has:

  • an identifier (one byte, 0x01–0xFE) that says what kind of data it contains
  • a length (2 bytes, big-endian) indicating the size of its content
  • a payload encoded according to the dataset's format

Multiple datasets can appear in sequence inside a single field. DE-49, for example, can carry a dataset 0x01 for TLV-encoded currency data and a dataset 0x71 for bitmap-structured verification data, back to back in the same wire bytes.
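
To make the envelope layout concrete, here is a minimal sketch that walks the content of a composite field and splits it into its datasets. It relies only on the layout described above (one identifier byte, a 2-byte big-endian length, then the payload). The class and method names are hypothetical and are not part of jPOS, and the payload is left undecoded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not a jPOS class.
final class DatasetEnvelopeSketch {

    /** One envelope: the identifier byte and the still-encoded payload. */
    static final class Envelope {
        final int id;
        final byte[] payload;

        Envelope(int id, byte[] payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    /** Walks the field content and returns the envelopes in wire order. */
    static List<Envelope> split(byte[] field) {
        List<Envelope> out = new ArrayList<>();
        int pos = 0;
        while (pos < field.length) {
            if (field.length - pos < 3) {
                throw new IllegalArgumentException("truncated dataset header at offset " + pos);
            }
            int id = field[pos] & 0xFF;                                        // 1-byte identifier
            int len = ((field[pos + 1] & 0xFF) << 8) | (field[pos + 2] & 0xFF); // 2-byte big-endian length
            pos += 3;
            if (len > field.length - pos) {
                throw new IllegalArgumentException(
                    "dataset 0x" + Integer.toHexString(id) + " overruns the field");
            }
            out.add(new Envelope(id, Arrays.copyOfRange(field, pos, pos + len)));
            pos += len;                                                        // next envelope starts here
        }
        return out;
    }
}
```

Feeding it the bytes 01 00 02 AA BB 71 00 01 31 yields two envelopes: dataset 0x01 with a 2-byte payload and dataset 0x71 with a 1-byte payload. Decoding each payload as TLV or DBM is a separate step.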

Two encoding formats: TLV and DBM

ISO 8583 defines two encoding formats for dataset payloads.

TLV (Tag-Length-Value) is BER-TLV encoding, the same format used by EMV and ISO/IEC 7816. Tags are 1, 2, or 3 bytes: a leading byte whose low 5 bits are all set (0x1F) signals that more tag bytes follow, with bit 7 (the high bit) of each subsequent byte acting as the continuation flag. Lengths use BER definite form: values up to 127 fit in a single byte; for longer values, the first byte has bit 7 set and the lower 7 bits indicate how many additional bytes encode the actual length (so a 300-byte value needs two additional length bytes after the indicator byte). Values are raw bytes. DE-34 dataset 0x01 (authentication request data) and DE-55 (ICC data) both use TLV.
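
The tag and length rules are easier to follow in code. The sketch below decodes a single BER-TLV element under the rules just described, assuming tags of at most three bytes and definite-form lengths only. BerTlvSketch is a hypothetical name used for illustration; it is not the TLV implementation that ships with jPOS.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not the jPOS TLV classes.
final class BerTlvSketch {
    final int tag;       // tag bytes packed into an int, e.g. 0x9F26
    final byte[] value;  // raw value bytes
    final int next;      // offset of the first byte after this element

    private BerTlvSketch(int tag, byte[] value, int next) {
        this.tag = tag;
        this.value = value;
        this.next = next;
    }

    /** Decodes the element that starts at offset pos. */
    static BerTlvSketch read(byte[] buf, int pos) {
        int tag = buf[pos++] & 0xFF;
        if ((tag & 0x1F) == 0x1F) {            // low 5 bits all set: more tag bytes follow
            int b;
            do {
                b = buf[pos++] & 0xFF;
                tag = (tag << 8) | b;
            } while ((b & 0x80) != 0);         // high bit set: yet another tag byte follows
        }
        int len = buf[pos++] & 0xFF;
        if ((len & 0x80) != 0) {               // long form: low 7 bits count the length bytes
            int count = len & 0x7F;
            if (count < 1 || count > 3) {
                throw new IllegalArgumentException("unsupported length form: " + count);
            }
            len = 0;
            for (int i = 0; i < count; i++) {
                len = (len << 8) | (buf[pos++] & 0xFF);
            }
        }
        if (len > buf.length - pos) {
            throw new IllegalArgumentException("value overruns the buffer");
        }
        return new BerTlvSketch(tag, Arrays.copyOfRange(buf, pos, pos + len), pos + len);
    }
}
```

For the bytes 9F 26 08 followed by eight value bytes, read() reports tag 0x9F26 and an 8-byte value. A 300-byte value is announced as 82 01 2C: the indicator byte 0x82, then two bytes holding 0x012C.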

DBM (Dataset Bitmap) carries structured fields using an ISO 8583-style bitmap. The payload starts with a 2-byte bitmap whose bits announce the presence of the corresponding elements; element values follow in order. Bit 1 of each word is a continuation bit—if set, the next byte extends the bitmap. DBM datasets can also carry trailing TLV elements after the bitmap section for extended or proprietary data.

DE-34 datasets 0x73–0x77 (authentication response data) and DE-49 dataset 0x71 (verification data) use DBM. DE-55 uses TLV throughout.

What it looks like in jPOS

CMF v3 support in jPOS introduces ISODatasetField, ISODataset, and DatasetPackager. The ISOMsg API extends to dataset paths using dot notation: "field.datasetId.elementId". The same path syntax works for both with() and get().

Building a message

import org.jpos.iso.ISOMsg;
import org.jpos.iso.ISOUtil;
import org.jpos.iso.packager.GenericPackager;

ISOMsg msg = new ISOMsg("0100");
msg.setPackager(new GenericPackager("jar:packager/cmfv3.xml"));

msg.with("55.0x9F26", ISOUtil.hex2byte("1122334455667788"))  // ICC: TLV tag 9F26
   .with("55.0x9F10", ISOUtil.hex2byte("06011203A0B800"))    // ICC: TLV tag 9F10
   .with("55.0x95", ISOUtil.hex2byte("0000000000"))          // ICC: TLV tag 95
   .with("49.0x71.1", "1")                                   // Verification: DBM dataset 0x71, element 1
   .with("49.0x71.2", "1234");                               // Verification: DBM dataset 0x71, element 2

The path "55.0x9F26" addresses field 55, TLV element with tag 0x9F26. The path "49.0x71.2" addresses field 49, dataset 0x71, DBM element at bit position 2.
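
As an illustration of the notation only, the following sketch splits a path into its numeric components, reading 0x-prefixed parts as hexadecimal and everything else as decimal. This is an assumption about how the syntax reads, written as a hypothetical standalone helper; it is not how jPOS resolves paths internally.

```java
// Hypothetical helper for illustration; jPOS resolves paths internally.
final class DatasetPathSketch {

    /** Splits a dotted path such as "49.0x71.2" into its numeric components. */
    static int[] parse(String path) {
        String[] parts = path.split("\\.");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i];
            boolean hex = part.regionMatches(true, 0, "0x", 0, 2);
            out[i] = hex
                ? Integer.parseInt(part.substring(2), 16)  // dataset id or TLV tag
                : Integer.parseInt(part);                  // field number or DBM element
        }
        return out;
    }
}
```

With this reading, "55.0x9F26" becomes {55, 0x9F26} and "49.0x71.2" becomes {49, 0x71, 2}.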

Reading a message

getString() and getBytes() accept the same paths:

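byte[] packed = msg.pack();               // wire image of the message built above

ISOMsg unpacked = new ISOMsg();
unpacked.setPackager(msg.getPackager());  // same cmfv3.xml packager used to build the message
unpacked.unpack(packed);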

// TLV elements from DE-55
byte[] cryptogram = unpacked.getBytes("55.0x9F26");
byte[] iad = unpacked.getBytes("55.0x9F10");

// DBM elements from DE-49 dataset 0x71
String flag = unpacked.getString("49.0x71.1");
String amount = unpacked.getString("49.0x71.2");

No casting, no intermediate objects. The path resolves through the field, the dataset, and the element in one call.

Mixed datasets in a single field

A single field can hold multiple datasets, mixing TLV and DBM in the same wire encoding:

msg.with("49.0x01.0x5F2A", ISOUtil.hex2byte("0840"))  // TLV dataset 0x01: transaction currency code
   .with("49.0x71.1", "1")                            // DBM dataset 0x71: element 1
   .with("49.0x71.2", "USD");                         // DBM dataset 0x71: element 2

Reading back:

byte[] currency = unpacked.getBytes("49.0x01.0x5F2A");
String element1 = unpacked.getString("49.0x71.1");
String element2 = unpacked.getString("49.0x71.2");

Backward compatibility with DE-55

One design goal of CMF v3 is that the wire bytes for DE-55 are identical between the legacy packager (cmf.xml) and the new dataset packager (cmfv3.xml). The ICC data that was previously an opaque ISOBinaryField containing raw BER-TLV is now an ISODatasetField containing a TLV ISODataset, but the bytes on the wire are unchanged. Messages from existing integrations that write DE-55 with the old packager can be read by a system using the new one without protocol negotiation.

ISO 20022 transport

CMF v3 also defines DE-114 as the ISO 20022 transport field—a slot for carrying a complete ISO 20022 XML document (UTF-8 encoded, full Document element with namespace URI identifying the message type) alongside the ISO 8583 message. This enables jPTS to act as a bridge between ISO 8583 and ISO 20022 messaging without a separate protocol translation layer.
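
A receiver that wants to route on the carried ISO 20022 message could inspect the root element's namespace. The sketch below uses only the JDK's XML parser and assumes the usual ISO 20022 namespace form, urn:iso:std:iso:20022:tech:xsd: followed by the message identifier. The class name is hypothetical, the pacs.008.001.08 identifier mentioned afterwards is just an example, and how the field content is obtained from the ISOMsg is left out.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical helper for illustration; not part of jPOS or jPTS.
final class Iso20022PayloadSketch {
    // Assumed namespace prefix, following the usual ISO 20022 schema naming.
    private static final String NS_PREFIX = "urn:iso:std:iso:20022:tech:xsd:";

    /** Returns the message identifier taken from the Document element's namespace. */
    static String messageType(byte[] content) {
        Element root;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations, since the content comes from the network.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(content))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed XML document", e);
        }
        String ns = root.getNamespaceURI();
        if (!"Document".equals(root.getLocalName()) || ns == null) {
            throw new IllegalArgumentException("root is not a namespaced Document element");
        }
        // Strip the assumed prefix when present; otherwise return the URI unchanged.
        return ns.startsWith(NS_PREFIX) ? ns.substring(NS_PREFIX.length()) : ns;
    }
}
```

For a document whose root is declared with xmlns="urn:iso:std:iso:20022:tech:xsd:pacs.008.001.08", messageType() returns pacs.008.001.08.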

The CMF v3 draft covers the dataset model in detail and the DE-114 transport mechanism. It is a work in progress—sections are still being written and some field definitions are pending—but it is already usable as a reference for implementation work.

Early-access draft: jpos.org/doc/jPOS-CMFv3.pdf