The binary file comparison compares files located in two folders that share the same structure. More precisely, it compares files that have the same relative path inside the two folders.
Binary files are arbitrary files whose format is described in a separate file, called a fileformat file. Fileformat files are stored in a dedicated fileformat folder.
The fileformat files are XML description files that describe the fields in the file, their length and their format. The simplest file looks like the following, though other attributes are allowed:
<?xml version="1.0" encoding="ASCII"?>
<FileFormat ConversionTable="Cp037">
<RecordFormat>
<FieldFormat Name="NAME" Type="X" Size="15" />
<FieldFormat Name="CODE" Type="9" Size="5" />
<FieldFormat Name="AMOUNT" Type="9" Size="12" Decimal="2" Signed="true" />
</RecordFormat>
</FileFormat>
FileFormat | Mandatory. The main xml tag. | ||
ConversionTable | Mandatory. The used character set. | ||
headerSize | Optional, numeric, defaults to 0. Instructs the parser to skip the first x characters, which form a file header that does not follow the record format. | ||
newLineSize | Optional, numeric, defaults to 0. Instructs the parser to skip x characters after each record, for files that store one record per line. When used, it is usually set to 1 (Unix \n) or 2 (Windows \r\n). | ||
distinguishedFieldSize | Optional, numeric, defaults to 0. Mandatory if several record formats are used. | See Multiple record formats | |
hasBdw | Optional, boolean, defaults to false. Specifies that the file uses Block Descriptor Words (BDW). When enabled, the tool reads the 4-byte BDW at the beginning of each block and automatically enables RDW processing as well. | See Variable-length record files (V and VB) | |
hasRdw | Optional, boolean, defaults to false. Specifies that the file uses Record Descriptor Words (RDW). When enabled, the tool reads the 4-byte RDW at the beginning of each record. | See Variable-length record files (V and VB) | |
RecordFormat | Mandatory. May have several of them if multiple record formats are used. | See Multiple record formats | |
cobolRecordName | Optional string. Mandatory if several record formats are used; identifies the record format. | See Multiple record formats | |
distinguishedFieldValue | Optional pattern. Mandatory if several record formats are used; determines which format applies to a record. | See Multiple record formats | |
FieldsGroup | Optional. Even if there are groups in the legacy structure, this tag is only required when the group bears an ‘Occurs’ attribute. Otherwise, the group can be “flattened”. | See Repeated data | |
Name | Mandatory in a group. Identifies the group. | ||
Occurs | Optional, numeric, defaults to 1. Used when the group is repeated. | See Repeated data | |
DependingOn | Optional string. Used when the number of occurrences is variable. | See Repeated data | |
FieldFormat | Mandatory. Describes each field to compare. | ||
Name | Mandatory. Identifies the field. | ||
Type | Mandatory. Determines the field type, based on Cobol pictures. | See Field formats | |
Size | Mandatory. Determines the total field size (integral + decimal). | See Field formats | |
Decimal | Optional, numeric, defaults to 0. Determines the number of decimal digits. | See Field formats | |
Signed | Optional, boolean, defaults to false. Determines whether the number is signed. | See Field formats | |
Occurs | Optional, numeric, defaults to 1. Used when the field is repeated. | See Repeated data | |
DependingOn | Optional string. Used when the number of occurrences is variable. | See Repeated data | |
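As an illustration of the optional FileFormat attributes listed above, the sketch below describes a hypothetical file that starts with an 80-character header and stores one record per line with Windows line endings. The field names and the values 80 and 2 are assumptions chosen for the example, not values taken from a real file.

```xml
<?xml version="1.0" encoding="ASCII"?>
<!-- Hypothetical layout: skip an 80-character header, then skip 2 characters (\r\n) after each record -->
<FileFormat ConversionTable="Cp037" headerSize="80" newLineSize="2">
  <RecordFormat>
    <FieldFormat Name="NAME" Type="X" Size="15" />
    <FieldFormat Name="CODE" Type="9" Size="5" />
  </RecordFormat>
</FileFormat>
```

With this description, each record should occupy 20 characters (15 + 5), followed by the 2 skipped line-ending characters.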
The field formats use a terminology similar to that of Cobol. The main difference is that the “Size” attribute corresponds to the total size (integral and decimal digits together).
For instance, this definition: NUM1 PIC S9(6)V9(2) corresponds to (notice the 8): <FieldFormat Name="NUM1" Type="9" Size="8" Decimal="2" Signed="true" />
The handled types are: X, 9, 1, 2, 3, 5, 7, 8, B, H, T, Z.
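To make the Size convention concrete, here is a short sketch that maps a few Cobol pictures to FieldFormat entries. The field names are hypothetical, and only the X and 9 types already used on this page appear. Each Cobol picture is shown as a comment above its assumed equivalent.

```xml
<RecordFormat>
  <!-- LABEL PIC X(20): 20 alphanumeric characters -->
  <FieldFormat Name="LABEL" Type="X" Size="20" />
  <!-- QTY PIC 9(5): 5 unsigned digits, no decimals -->
  <FieldFormat Name="QTY" Type="9" Size="5" />
  <!-- RATE PIC 9(3)V9(2): 3 integral + 2 decimal digits, so Size is 5 -->
  <FieldFormat Name="RATE" Type="9" Size="5" Decimal="2" />
  <!-- BALANCE PIC S9(9)V9(2): 9 integral + 2 decimal digits, signed, so Size is 11 -->
  <FieldFormat Name="BALANCE" Type="9" Size="11" Decimal="2" Signed="true" />
</RecordFormat>
```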
The “Occurs” attribute specifies that a field is repeated.
<FieldFormat Name="ADDRLINE" Type="X" Size="30" Occurs="3" />
The “DependingOn” attribute can be used if the number of occurrences depends on another field.
<FieldFormat Name="CNT" Type="9" Size="2" />
<FieldFormat Name="ARRAY" Type="X" Size="20" Occurs="50" DependingOn="CNT" />
In this case, the “Occurs” value corresponds to the maximum value.
If the file uses RDW (Record Descriptor Word) or BDW (Block Descriptor Word), the tool can read their values and make them available as reserved field names that can be used with DependingOn. See Variable-length record files (V and VB) for details.
These attributes can also apply to a fields group:
<FieldFormat Name="CNT" Type="9" Size="2" />
<FieldsGroup Name="ARRAY" Occurs="50" DependingOn="CNT">
<FieldFormat Name="KEY" Type="9" Size="1" />
<FieldFormat Name="VALUE" Type="X" Size="10" />
</FieldsGroup>
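The following sketch shows how a legacy structure with groups might be described. It assumes a hypothetical Cobol record in which IDENT is a plain group and PHONE is a group repeated 3 times. Following the FieldsGroup description in the attribute table, only the repeated group needs a FieldsGroup tag, while the plain group can be flattened. All names are illustrative.

```xml
<!--
  Assumed Cobol structure:
    01 CUSTOMER.
       05 IDENT.
          10 FIRST-NAME PIC X(10).
          10 LAST-NAME  PIC X(20).
       05 PHONE OCCURS 3.
          10 PHONE-TYPE PIC X(1).
          10 PHONE-NUM  PIC 9(10).
-->
<RecordFormat>
  <!-- IDENT has no Occurs, so its fields are listed directly (flattened) -->
  <FieldFormat Name="FIRST-NAME" Type="X" Size="10" />
  <FieldFormat Name="LAST-NAME" Type="X" Size="20" />
  <!-- PHONE is repeated a fixed number of times, so it needs a FieldsGroup with Occurs -->
  <FieldsGroup Name="PHONE" Occurs="3">
    <FieldFormat Name="PHONE-TYPE" Type="X" Size="1" />
    <FieldFormat Name="PHONE-NUM" Type="9" Size="10" />
  </FieldsGroup>
</RecordFormat>
```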
The tool supports comparing variable-length (V) and variable-length blocked (VB) binary files by handling Block Descriptor Words (BDW) and Record Descriptor Words (RDW).
When hasBdw or hasRdw is enabled (either in the configuration file or in the fileformat file), the tool reads the descriptor words and makes their values available through reserved field names that can be used in the fileformat file:
Reserved field names
When hasBdw is enabled:
| Reserved name | Description |
_BDW_ | The raw BDW value (block length including the 4-byte BDW itself) |
_BLOCK_SIZE_ | The block data size (BDW value minus 4) |
When hasRdw is enabled:
| Reserved name | Description |
_RDW_ | The raw RDW value (record length including the 4-byte RDW itself) |
_RECORD_SIZE_ | The record data size (RDW value minus 4) |
These reserved field names should be declared as FieldFormat entries in the fileformat file. The Type and Size attributes are present for structural consistency but their values are ignored — the actual values come from the parsed BDW/RDW. The RDW reserved names (_RDW_ and _RECORD_SIZE_) can also be used in the DependingOn attribute of another field to define variable-length records. The BDW reserved names (_BDW_ and _BLOCK_SIZE_) are available for display in the comparison report but cannot be used in DependingOn.
Example: VB file (variable-length blocked)
For files that have both block and record descriptors, set hasBdw="true" and hasRdw="true":
<?xml version="1.0" encoding="ASCII"?>
<FileFormat ConversionTable="IBM-1147" hasBdw="true" hasRdw="true">
<RecordFormat cobolRecordName="RAW-RECORD-VARIABLE">
<FieldFormat Name="_BDW_" Type="B" Size="2" />
<FieldFormat Name="_BLOCK_SIZE_" Type="B" Size="2" />
<FieldFormat Name="_RDW_" Type="B" Size="2" />
<FieldFormat Name="_RECORD_SIZE_" Type="B" Size="2" />
<FieldFormat Name="REC-DATA" Type="X" Size="1" Occurs="80" DependingOn="_RECORD_SIZE_" />
</RecordFormat>
</FileFormat>
In this example, REC-DATA has a variable length that depends on _RECORD_SIZE_ (the RDW value minus 4). The Occurs="80" defines the maximum possible size.
Example: V file (variable-length, not blocked)
For files that have record descriptors but no block descriptors, set hasBdw="false" (or omit it) and hasRdw="true":
<?xml version="1.0" encoding="ASCII"?>
<FileFormat ConversionTable="IBM-1147" hasRdw="true">
<RecordFormat cobolRecordName="RAW-RECORD-VARIABLE">
<FieldFormat Name="_RDW_" Type="B" Size="2" />
<FieldFormat Name="_RECORD_SIZE_" Type="B" Size="2" />
<FieldFormat Name="REC-DATA" Type="X" Size="1" Occurs="80" DependingOn="_RECORD_SIZE_" />
</RecordFormat>
</FileFormat>
Notes
When hasBdw is enabled, hasRdw is automatically enabled as well (VB files always have both BDW and RDW).
The hasBdw and hasRdw settings can be specified at two levels: in the configuration file (bacmp) and in the fileformat file (XML).
When the record can have various formats, multiple record formats can be used. In this case, the tool first parses the record to determine which record format is suitable, using a pattern.
By default, one HTML page will be generated, as a report, for each record format. This way the HTML table describing the columns will be consistent in each page.
The amount of data that must be read to verify the pattern must be set at the <FileFormat> level, in the distinguishedFieldSize attribute, and each <RecordFormat> must bear a pattern in its distinguishedFieldValue attribute.
In the example below, the tool is instructed to check whether the 16th character of the record is an A or a B (in which case it matches record format RF1) or a Z (in which case it matches RF2). Here, 16 characters must be read, but the first 15 can be anything.
<FileFormat ConversionTable="Cp037" distinguishedFieldSize="16">
<RecordFormat cobolRecordName="RF1" distinguishedFieldValue=".{15}(A|B)">
...
</RecordFormat>
<RecordFormat cobolRecordName="RF2" distinguishedFieldValue=".{15}(Z)">
...
</RecordFormat>
</FileFormat>
A minimal bacmp configuration file for binary files looks like the following. It scans the default left and right folders and compares each pair of files with the same name, using the fileformat file with the same name found in the default fileformat folder.
{"comparisonType": "Binary"}
The files to compare and the corresponding fileformat files will be matched this way:
For each file, you can specify which columns are business keys, rejected columns or removed columns. You can also manually specify the fileformat file, thus overriding the matching logic explained in the previous section.
Business keys allow you to specify which records are the same in the left and right files (see Business keys for an example; it works the same way as for databases).
Removed columns do not appear at all in the comparison; rejected columns appear, but their values are replaced by ‘(REJECTED)’.

Please note that the file list that appears in the ‘details’ here does not restrict the list of compared files. All files in the folders are compared; the listed files are simply those that need additional details. To compare only some files, you can change the left and right folders or use a file filter. See Additional properties for more details.
In addition to the configurations above, there are additional properties. Most have default values that are used if the property does not appear in the file, but they can also be set to another value. Others are not used if left unspecified.
Default values are written in green below when they exist, a red ‘no value’ otherwise.
fileFilter | Comma-separated extension list to compare. For instance, setting to "ebc" will compare only the ebc files in the folders. | |
leftFolder | Customize the left folder for the files to compare | |
rightFolder | Customize the right folder for the files to compare | |
fileformatFolder | Customize the folder containing the fileformat files for the files to compare | |
splitRecordFormats | Instructs the tool to handle, compare and report distinct record formats separately. Value is true by default, and it is highly recommended to leave it this way. | See Multiple record formats |
hasBdw | Specifies that the files use Block Descriptor Words (BDW). Defaults to false. When enabled, RDW is also automatically enabled. The BDW and block size values are read and can be referenced in the fileformat file using reserved field names. | See Variable-length record files (V and VB) |
hasRdw | Specifies that the files use Record Descriptor Words (RDW). Defaults to false. The RDW and record size values are read and can be referenced in the fileformat file using reserved field names. | See Variable-length record files (V and VB) |
isVBColumnsToAggregate | Enables aggregation of large variable-length (VB) columns into single formatted columns with lazy loading support. When enabled, VB fields exceeding the minVBOccursToAggregate threshold are consolidated into a single column instead of creating individual columns for each array element. This prevents browser freezing when viewing HTML reports with large arrays. Default value is false. | |
minVBOccursToAggregate | Defines the minimum number of occurrences (OCCURS clause value) required for a VB field to be aggregated. Only applies when isVBColumnsToAggregate is set to true. VB fields with occurrences greater than this value will be aggregated into a single formatted column with progressive rendering. VB fields with occurrences equal to or less than this value will display as individual columns. Default value is 100. |
Note: hasBdw and hasRdw can be set both at the configuration file level (bacmp) and at the fileformat file level (XML). The fileformat-level setting takes precedence for individual files.