The database comparison will compare, in two databases, elements with the same structure. More precisely, it will run a given sql query in both databases and compare the query results.
Sql query files are stored in the ‘queries’ folder, referenced in the comparison file and run on both databases. The simplest use is to have sql queries that perform a Select all columns in a single table, but more complex queries can be used.
A minimal bacmp configuration file for database will be as below
The "leftDb"
and "rightDb"
represent the connection to a database. "dbName"
or "serviceName"
, "hostName"
, "port"
correspond to the database url. For instance:
jdbc:oracle:thin:@hostName:port:dbName
jdbc:oracle:thin:@hostName:port/serviceName
jdbc:postgresql://hostName:port/dbName
"user"
and "pwd"
are the connection credentials. The "schema"
to use depends on the database. In most database systems, it represents the database subdivision where the tables are stored. For instance, in Postgresql, it is most often “public”, in Oracle, it is usually the same as the user name.
The "type"
is the database type. Supported values are:
An alternative way to describe your database in the bacmp file can be to use the optional "customDbConnectionUrl"
field. With this field you can directly provide the connection string for the database. In this case you can remove the "dbName"
, "hostName"
and "port"
fields. If the user and password are stored in your connection string, you have to remove those fields from the database configuration.
Below is an example of a simple bacmp file using two way to connect to the same database with "customDbConnectionUrl"
:
{
...
"leftDb": {
"customDbConnectionUrl": "jdbc:oracle:thin:MyUser/PASSW0RD@hostname:1234:mydatabase",
"schema": "sys",
},
"rightDb": {
"customDbConnectionUrl": "jdbc:oracle:thin:@hostname:1234:mydatabase",
"pwd": "PASSW0RD",
"schema": "sys",
"user": "MyUser"
},
...
}
It's also possible to compare databases through AWS Secrets Manager. In your .bacmp file, give the "secretName"
of your database and nothing else. Then paste your credentials of the account that have access to this AWS Secrets Manager database in your environment and launch the Compare Tool.
"rightDb": {
"dataBaseAWSSecretName": "secret/mysecret"
},
Note that you can add any of the usual parameter concerning the database, like "customDbConnectionUrl"
or "serviceName"
as a secret value in AWS Secrets Manager depending on how you want to connect to your database.
For a given connection to work properly, the corresponding driver must be in the home folder:
The current compatible version of the drivers are:
These jars are available through the links below.
MSSQL - https://go.microsoft.com/fwlink/?linkid=2259203
Note: Unzip the downloaded sqljdbc_12.6.0.0_enu.zip
and find mssql-jdbc-12.6.0.jre11.jar
under the enu > jars
folder.
ORACLE - https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/12.2.0.1/
POSTGRES - https://jdbc.postgresql.org/download/postgresql-42.6.2.jar
"selectedElements“: ["...", "...“]
specifies the elements that must be compared.
This corresponds to query files residing in the ‘queries’ folder. For instance, if you have this ‘queries’ folder:
And you have this configuration: "selectedElements": ["mytable1", "mytable2", "mytable4"]
This means you will compare mytable1, mytable2 and mytable4, but not mytable3. The specified queries are run on both the “left” database and the “right” database, and the results are compared.
The Sql query files must be SELECT queries returning the data to compare.
They will often be simple SELECT on a table, but might be more complex. See Complex queries.
There is no requirement that the sql file name corresponds to the table name.
For instance, you may have a query that selects data with a restricting where clause: SELECT … FROM MYTABLE WHERE category = 'USR'
In this case, you may want to call the sql file ‘mytable1-usr.sql’ and use this name in the "selectedElements"
tag.
In addition to the SQL query, the sql file may contain a business key definition with the syntax -- Business_key: <column-name>;
after the query. For instance: SELECT id, firstname, name FROM PERSON;
-- Business_key: id;
The business key definition allows to enforce a comparison between two identified rows.
With ‘id’ as a business key, there may be two cases:
On the other hand, if the business key is ‘name’ instead of ‘id’, then both cases will show a modified row for business key ‘Kent’.
Notes:
In some situations, the user may want to cleanup the database, or perform any other preliminary sql task, before launching the comparison. In this case, scripts can be put in a ‘presql’ folder beside the ‘queries’ folder.
If a script is found in the folder, it will be run at the beginning of the comparison.
In addition to the mandatory properties shown in Configuration file, there are additional properties. Most have default values that are used if the property does not appear in the file, but can also be specified to another value. Others are not used if left unspecified.
Default values are written in green below when they exist, a red ‘no value’ otherwise Please note that the double backslashes are only the json syntax for a backslash.
Specific to the database comparison
preSqlFolder | Customize the folder containing the presql scripts | See Presql |
queriesFolder | Customize the folder containing the query files | See Elements to compare |
Common properties
comparisonDataStorage | InRam (default) / OnDisk | See Disk mode |
csvSeparator | Internal separator used for the comparison algorithm. Use a character that is not likely to appear in the data. | |
differencesFolder | Intermediate folder (OnDisk mode) | See Disk mode |
leftCsvFolder | Intermediate folder (OnDisk mode) | See Disk mode |
reportFolder | Folder containing the comparison HTML report | |
rightCsvFolder | Intermediate folder (OnDisk mode) | See Disk mode |
keepDifferencesFiles | Intermediate folder cleanup (OnDisk mode) | See Disk mode |
keepExtractedCsvs | Intermediate folder cleanup (OnDisk mode) | See Disk mode |
logLevel | ERROR / WARN / INFO (default) / DEBUG The log is displayed in the console and also in a file beside the CompareTool.exe | |
maximumDifferences | Discard extra errors if too many error lines | See Html options |
reportSideBySide | Another display mode for the HTML report | See Report side by side |
reportSplitLine | Generate sub-reports if too many error lines | See Html options |
reportTitle | Customize the report title | See The index page |
splitLines | Split data before comparison | See Comparison performance |
splitSize | Split data before comparison | See Comparison performance |
timeStampedReportFolder | Append a timestamp to the report folder, to keep history when doing several runs | |
twoPassDifferences | Reconcile data across splits | See Comparison performance |
ignoreBlankLines | Blank lines in files are ignored. Only useful in other comparison types, not database. | |
spacesMode | NOTHING: does nothing TRIM_BLANK: trim the field values if they are blank TRIM_START: removes the initial spaces on each line TRIM_END: removes the final spaces on each line TRIM: removes the initial and final spaces on each line REMOVE: removes all spaces in lines, even in the middle of the line | |
caseInsensitive | Comparison ignores field case |
As stated in Query files, a query file is only a SELECT query, where you can put anything you want. It will often be a simple “select all columns from a table”, but more can be done:
Anyway, keep in mind that the tool will compare the columns returned by the query, which may not be actual columns in a table. The aliases in the select clause (implied or explicit) will be used to compare data and also to supply the headers in the report table.
The same way, the business key “columns” explained in Business keys correspond to such aliases and may in fact be more complex constructs.
Though very awkward, the following query file is valid: SELECT (column1 || column2) as id1, column3, UPPER(column4) as upper4 FROM mytable; -- Business_key: id1,column3;
But keep in mind that the overall performance of your comparison is affected by the query complexity.