Data Loading Step

Introduction

The data loading step populates the target database.
This page lists the properties used to configure the data loading step and includes examples for loading various data types.

Properties

Common properties

All database types

input.folder: The folder containing the data. (Values: String, absolute or relative folder path. Default: none. Mandatory: yes)

  • Plain files must have the name of the table. Folders must have the name of the table; all files directly inside are used (no recursion).
  • Archive files must have the name of the table; all files inside matching the zipContentWildcard property are used.

database: The identifier of the corresponding database configuration step. (Values: String. Default: none. Mandatory: yes)

enable: Whether the step is enabled. (Values: true/false. Default: true. Mandatory: no)

encoding: Data encoding. (Values: String. Default: value defined in the global step. Mandatory: no)

multithread: Number of threads to use for parallel steps; one thread per table to load. (Values: Integer. Default: value defined in the global step. Mandatory: no)

zipContentWildcard: Comma-separated list of wildcards used to filter the content of data archive files. (Values: comma-separated list. Default: *. Mandatory: no)

nbFileThreads: Number of files processed in parallel within a single table thread. For Oracle, this triggers the PARALLEL=true SQL*Loader option, which is incompatible with constraints already existing on the target table. (Values: Integer. Default: 1. Mandatory: no)

convertGraphicDataToFullWidth: Whether to convert half-width characters to full-width for 'Graphic' type columns in the database. When false (default), the original data is migrated as-is from the input file; when true, all 'Graphic' type data is converted to full-width characters and any extra characters are trimmed. (Values: true/false. Default: false. Mandatory: no)

temp.folder: Directory path for temporary files. If not specified, Data Migrator automatically creates and uses a temporary folder within the reverse project directory. (Values: absolute folder path. Default: none. Mandatory: no)

keepDebugFiles: Keep temporary files at the end of the migration. Not available for PostgreSQL, since a pipe mechanism is used; use ebcdicCsvDumpFolder instead. (Values: true/false. Default: false. Mandatory: no)
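For illustration, a hypothetical loading section combining these common properties to load tables from ZIP archives might look as follows (the connection identifier cnxOracle, the folder name, and the wildcard values are placeholders):

[ZIP2DB - Loading1]
database=cnxOracle
input.folder=archives
zipContentWildcard=*.csv,*.dat
multithread=4
enable=true

Here zipContentWildcard restricts the files read from each archive to those matching the listed wildcards.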

Oracle database

oracle.disableSqlldr: Whether to disable the SQL*Loader step, so that only CSV files are generated. (Values: true/false. Default: false. Mandatory: no)

oracle.ctlOptionClauseSqlldr: Specifies command-line parameters in the SQL*Loader control file; refer to the SQL*Loader documentation for more information. Provide each parameter and its value in key=value format; separate multiple parameters with commas, as in key1=value1,key2=value2. (Values: comma-separated key=value list. Default: none. Mandatory: no)

date.format: Date pattern to add to the Date column (DB2) definition. (Values: String. Default: none. Mandatory: no)

time.format: Timestamp pattern to add to the Time column (DB2) definition. (Values: String. Default: none. Mandatory: no)

timestamp.format: Timestamp pattern to add to the Timestamp column (DB2) definition. (Values: String. Default: none. Mandatory: no)

error.maxNb: Maximum number of tolerated errors before the data migration is canceled. (Values: Integer. Default: 50. Mandatory: no)

spannedMode: When this option is active, tables containing BLOBs/CLOBs are considered to have been unloaded with the SPANNED YES option. In this mode, VARCHAR, BLOB and CLOB data carry a dynamic payload length in the 2 (resp. 4) bytes preceding the data. Each BLOB is exported to a separate dat file referenced in the ctl command; CLOBs are exported to a single file, using the configured CSV delimiter, and referenced in the ctl command. (Values: true/false. Default: false. Mandatory: no)
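As a sketch, a hypothetical Oracle loading section passing options to SQL*Loader through oracle.ctlOptionClauseSqlldr could look like this (ERRORS and ROWS are standard SQL*Loader options; the connection identifier and the values shown are examples only):

[CSV2DB - OracleLoading]
database=cnxOracle
input.folder=datas
oracle.ctlOptionClauseSqlldr=ERRORS=100,ROWS=5000
error.maxNb=100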

MSSQL database

mssqlInsertMode: The mode used to load data. By default, Data Migrator uses the BULK INSERT statement. The INSERT INTO statement is intended for POC projects, and the BCP utility is used to support more database setups; to use the utility, see "install BCP utility". (Values: BULK_INSERT, BCP_UTILITY, INSERT_INTO. Default: BULK_INSERT. Mandatory: no)

mssql_server_temp.folder: [DOCKER] Path of the temp.folder on the server where SQL Server is running. (Values: String. Default: temp.folder. Mandatory: no)
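For illustration, selecting the BCP utility to load an SQL Server target could be configured as below (cnxMssql is a placeholder connection identifier, and the folder name is an example):

[CSV2DB - MssqlLoading]
database=cnxMssql
input.folder=datas
mssqlInsertMode=BCP_UTILITY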

CSV properties

csvSeparator: Separator between fields. (Default: , (comma). Mandatory: no)

csvQuote: Quote character. (Default: ' (single quote). Mandatory: no)

csvNull: Replacement character for empty values. (Default: none. Mandatory: no)

csvOneLiner: Determines how Data Migrator handles multi-line data in CSV files when importing into PostgreSQL or Oracle databases. (Values: true/false. Default: false. Mandatory: no)

Purpose:
  • Addresses the challenge of line feeds (newline characters) within column data in databases like Oracle or PostgreSQL.
  • Manages CSV input files where a single table row may span multiple lines due to line feeds in the data.

Behavior:
  • When false (default): reads the CSV file line by line, pushes each line as a separate row to the database; optimized for performance with data lacking line feeds.
  • When true: recognizes multi-line fields using the csvQuote character, merges lines belonging to the same field, and correctly imports data with line feeds.

Usage: set csvQuote so that the entire column data containing line feeds is enclosed in quotes, e.g. "Hello\nWorld" for data spanning two lines.
Additional feature: for NUMERIC columns, commas are replaced with dots in decimal values (PostgreSQL format).
Note: this property applies only to PostgreSQL and Oracle databases.

csvWithHeader: Defines whether the data files contain a header. (Values: true/false. Default: false. Mandatory: no)
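To illustrate csvOneLiner, a sketch configuration for CSV files whose quoted fields contain line feeds might look as follows (the connection identifier cnxPostgres and the folder name are placeholders):

[CSV2DB - MultilineLoading]
database=cnxPostgres
input.folder=datas
csvQuote="
csvSeparator=;
csvOneLiner=true

With csvOneLiner=true, a quoted field such as "Hello\nWorld" spanning two physical lines is merged and loaded as a single value.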

EBCDIC properties

ebcdicOid: Indicates the presence of an OID before each record. (Values: true/false. Default: true. Mandatory: no)

ebcdicOidLength: Length of the OID before each record. (Values: Integer. Default: 2. Mandatory: no)

ebcdicOidForLength: Works with ebcdicOid=true and ebcdicOidLength != 0; takes the OID bytes to compute the length of the current record instead of the length deduced from SQLModel.json. (Values: true/false. Default: false. Mandatory: no)

ebcdicZonedMode: Strategy to apply when handling zoned-decimal fields. (Values: EBCDIC_STRICT, EBCDIC_MODIFIED, AS400. Default: EBCDIC_STRICT. Mandatory: no)

ebcdicCodePointShift: Code point shift to apply; more information is in the FAQ. If you use CUSTOM930, this property is not needed: that charset already includes the code point shift mechanism on its own. (Values: Integer. Default: 384. Mandatory: no)

ebcdicVarcharAsChar: Consider all VARCHAR columns as CHAR while migrating data. (Values: true/false. Default: false. Mandatory: no)

ebcdicByteReplacement: Byte replacement list, given as pairs of to-replace and replacing code points. Before applying the code point shift offset (if any), each byte whose code point matches a to-replace entry in the considered encoding is converted to the corresponding replacing code point. Example: 30:94|XX:YY with CP037 replaces (IRS) by [;]. (Default: empty. Mandatory: no)

ebcdicNullByteIndicatorPositionMode: Where to find the Null Byte Indicator (NBI) in the provided data files for nullable columns. (Values: ALWAYS_LEFT, RIGHT_FOR_DATES, ALWAYS_RIGHT, IGNORED. Default: RIGHT_FOR_DATES. Mandatory: no)

ebcdicEmptyVarcharMode: Behavior to apply for VARCHAR columns with a size of 0. (Values: LOW_VALUE, NULL_VALUE, SINGLE_SPACE, FILL_SPACE. Default: LOW_VALUE. Mandatory: no)

ebcdicVarcharKeepPadding: Read the full VARCHAR definition length, if provided, instead of the useful length given by the first 2 bytes, so that padding characters can be caught. Cannot be used with ebcdicFilesWithVarcharInVB set to true. (Values: true/false. Default: false. Mandatory: no)

ebcdicFilesWithVarcharInVB: For VARCHAR, the 2-byte length is used to read the bytes instead of the column size definition from SQLModel.json. (Values: true/false. Default: false. Mandatory: no)

ignoreDeletedRecords: Set to true to pre-read each record and test whether its bytes are all blanks in the EBCDIC encoding; if so, the tool ignores the record and continues to the next one. This option is useful if the AS/400 REUSEDLT option is set to YES to reuse deleted records. (Values: true/false. Default: false. Mandatory: no)

ebcdicCsvDumpFolder: Absolute or relative path of the folder in which to store the CSV data extracted from the EBCDIC files. More information is on the debug page. (Values: String. Default: none. Mandatory: no)

onyExtractCsv: Instructs Data Migrator to only extract CSV from the EBCDIC data. When true, Data Migrator generates the CSV but does not migrate the data. This property depends on the ebcdicCsvDumpFolder property for the path; if that property is empty or unset, the CSV and cutting log file are generated automatically under the data/extract folder. (Values: true/false. Default: false. Mandatory: no)
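As an assumed example, extracting CSV from EBCDIC files without migrating any data combines onyExtractCsv with ebcdicCsvDumpFolder (the connection identifier, encoding and folder name below are examples only):

[EBCDIC2DB - ExtractOnly]
database=cnxOracle
encoding=CP1145
onyExtractCsv=true
ebcdicCsvDumpFolder=csvDump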

Examples

CSV Data loading

The following settings illustrate the configuration for CSV data loading in Data Migrator:

[CSV2DB - Loading1]
csvQuote="
csvSeparator=;
database=cnxOracle
input.folder=datas
keepDebugFiles=true
multithread=5
temp.folder=dataTemp

This configuration section defines:

  • CSV quote character is set to double quote (")
  • CSV field separator is a semicolon (;)
  • The operation uses the Oracle connection identified by cnxOracle
  • Input CSV files are located in the datas folder
  • Temporary files are kept after migration for debugging
  • 5 threads are used for parallel processing
  • Temporary files are stored in the dataTemp folder

EBCDIC Data loading

The following settings illustrate the configuration for EBCDIC data loading in Data Migrator:

[EBCDIC2DB - Loading1]
database=cnxOracle
encoding=CP1145
ebcdicCodePointShift=0
ebcdicZonedMode=EBCDIC_MODIFIED
input.folder=datas
keepDebugFiles=true
multithread=10
temp.folder=dataTemp 

This configuration section defines:

  • The operation uses the Oracle connection identified by cnxOracle
  • Data encoding is CP1145
  • EBCDIC code point shift is set to 0
  • EBCDIC zoned decimal handling uses the EBCDIC_MODIFIED strategy
  • Input EBCDIC files are located in the datas folder
  • Temporary files are kept after migration for debugging
  • 10 threads are used for parallel processing
  • Temporary files are stored in the dataTemp folder