22. Copy data across databases

22.1. General parameters for the WbCopy command.
22.2. Copying data from one or more tables
22.3. Copying data based on a SQL query
22.4. Update mode
22.5. Synchronizing tables
22.6. Examples

The WbCopy is essentially the command line version of the the DataPumper. For a more detailed explanation of the copy process, please refer to that section. It basically chains a WbExport and a WbImport statement without the need of an intermediate data file. The WbCopy command requires that a connection to the source and target database can be made at the same time from the computer running SQL Workbench/J

[Note]

Some JDBC drivers (e.g. PostgreSQL, jTDS and the Microsoft Driver) read the full result obtained from the database into memory. In that case, copying large results might require a lot of memory. Please refer to the chapter Common problems for details on how to configure the individual drivers if this happens to you.

22.1. General parameters for the WbCopy command.

WbCopy supports conditional execution

ParameterDescription
-sourceProfile

The name of the connection profile to use as the source connection. If -sourceprofile is not specified, the current connection is used as the source.

If the profile name contains spaces or dashes, it has to be quoted.

This parameter supports auto-completion

-sourceGroup

If the name of your source profile is not unique across all profiles, you will need to specify the group in which the profile is located with this parameter.

If the group name contains spaces or dashes, it has to be quoted.

-sourceConnection

Allows to specify a full connection definition as a single parameter (and thus does not require a pre-defined connection profile).

The connection is specified with a comma separated list of key value pairs:

  • username - the username for the connection
  • password - the password for the connection
  • url - the JDBC URL
  • driver - the class name for the JDBC driver. If this is not specified, SQL Workbench/J will try to determine the driver from the JDBC URL
  • driverJar - the full path to the JDBC driver. This not required if a driver for the specified class is already configured

e.g.: "username=foo,password=bar,url=jdbc:postgresql://localhost/mydb"

For a sample connection string please see the documentation for WbConnect.

If this parmeter is specified, -sourceProfile is ignored

-targetProfile

The name of the connection profile to use as the target connection. If -targetProfile is not specified, the current connection is used as the target.

If the profile name contains spaces or dashes, it has to be quoted.

This parameter supports auto-completion

-targetGroup

If the name of your target profile is not unique across all profiles, you will need to specify the group in which the profile is located with this parameter.

If the group name contains spaces or dashes, it has to be quoted.

-targetConnection

Allows to specify a full connection definition as a single parameter (and thus does not require a pre-defined connection profile).

The connection is specified with a comma separated list of key value pairs:

  • username - the username for the connection
  • password - the password for the connection
  • url - the JDBC URL
  • driver - the class name for the JDBC driver. If this is not specified, SQL Workbench/J will try to determine the driver from the JDBC URL
  • driverJar - the full path to the JDBC driver. This not required if a driver for the specified class is already configured

e.g.: "username=foo,password=bar,url=jdbc:postgresql://localhost/mydb"

If this parmeter is specified, -sourceProfile is ignored

-commitEvery The number of rows after which a commit is sent to the target database. This parameter is ignored if JDBC batching (-batchSize) is used.
-deleteTarget

Possible values: true, false

If this parameter is set to true, all rows are deleted from the target table using a DELETE statement before copying the data.

-truncateTable

Possible values: true, false

If this parameter is set to true, all rows are removed from the target table using a TRUNCATE statement before copying the data.

Not all DBMS support the TRUNCATE command. If this option is used for a DBMS that does not support TRUNCATE or where the target table(s) cannot be truncated due to other restrictions, an error will occur.

-mode

Defines how the data should be sent to the database. Possible values are INSERT, UPDATE, 'INSERT,UPDATE' and 'UPDATE,INSERT'. Please refer to the description of the WbImport command for details on.

-syncDelete

If this option is enabled -syncDelete=true, SQL Workbench/J will check each row from the target table if it's present in the source table. Rows in the target table that are not present in the source will be deleted. As this is implemented by checking each row individually in the source table, this can take some time for large tables. This option requires that each table in question has a primary key defined.

Combined with an UPDATE,INSERT or UPDATE,INSERT mode this creates an exact copy of the source table.

If more than one table is copied, the delete process is started after all inserts and updates have been processed. It is recommended to use the -checkDependencies parameter to make sure the deletes are processed in the correct order (which is most probably already needed to process inserts correctly).

To only generate the SQL statements that would synchronize two databases, you can use the command WbDataDiff

-keyColumns

Defines the key columns for the target table. This parameter is only necessary if import is running in UPDATE mode. It is ignored when specifying more than one table with the -sourceTable argument. In that case each table must have a primary key.

It is assumed that the values for the key columns will never be NULL.

-ignoreIdentityColumns

Possible values: true, false

Controls if identity or auto-increment columns will be included in the import.

If this is used, the JDBC driver (of the target database) must correctly report the column to be excluded as an AUTOINCREMENT column. This can be verified in the table definition display of the DbExplorer. If the column is reported with YES for the AUTOINCREMENT property, then this column will be excluded during the import.

-batchSize

Enable the use of the JDBC batch update feature, by setting the size of the batch queue. Any value greater than 1 will enable batch modee. If the JDBC driver supports this, the INSERT (or UPDATE) performance can be increased.

This parameter will be ignored if the driver does not support batch updates or if the mode is not UPDATE or INSERT (i.e. if -mode=update,insert or -mode=insert,update is used).

-commitBatch

Valid values: true, false

When using the -batchSize parameter, the -commitEvery is ignored (as not all JDBC drivers support a COMMIT inside a JDBC batch operation. When using -commitBatch=true SQL Workbench/J will send a COMMIT to the database server after each JDBC batch is sent to the server.

-continueOnError

Defines the behaviour if an error occurs in one of the statements. If this is set to true the copy process will continue even if one statement fails. If set to false the copy process will be halted on the first error. The default value is false.

With PostgreSQL continueOnError will only work, if the use of savepoints is enabled using -useSavepoint=true.

-transactionControl

Possible values: true, false

Controls if SQL Workbench/J handles the transaction for the copy, or if the copy must be committed (or rolled back) manually. If -transactionControl=false is specified, SQL Workbench/J will neither send a COMMIT nor a ROLLBACK at the end. This can be used when multiple copy statements should be run in a single transaction. This can be combined with the cleanup and error scripts in batch mode.

-useSavepoint

Possible values: true, false

Controls if SQL Workbench/J guards every insert or update statement with a savepoint to recover from individual error during import, when continueOnError is set to true.

Using a savepoint for each DML statement can drastically reduce the performance of the import.

-maxRows

Limit the number of rows read from the source table(s) if the value for this parameter is greater than 0 (zero). This limit is always applied to all tables which are specified.

The limiting of rows is done without adding an ORDER BY to the source query. This means that the sample of the rows is not deterministic, when running the statement again, a different set of rows might be selected.

This option is primarily intended for testing purposes.

-trimCharData

Possible values: true, false

If this parameter is set to true, values from CHAR columns will be trimmed from trailing whitespace. This is equivalent to the Trim CHAR data in the connection profile.

-showProgress

Valid values: true, false, <numeric value>

Control the update frequence in the status bar (when running in GUI mode). The default is every 10th row is reported. To disable the display of the progress specify a value of 0 (zero) or the value false. true will set the progress interval to 1 (one).

22.2. Copying data from one or more tables

ParameterDescription
-sourceSchema

The name of the schema to be copied. When using this parameter, all tables from the specified schema are copied to the target. You must specify either -sourceSchema, -sourceTable or -sourceQuery

-sourceTable

The name of the table(s) to be copied. You can either specifiy a list of tables: -sourceTable=table1,table2. Or select the tables using a wildcard: -sourceTable=* will copy all tables accessible to the user. If more than one table is specified using this parameter, the -targetTable parameter is ignored.

-includeViews

Valid values are true or false.

By default only regular tables are taken into account if the source tables are specified through a wildcard. Using -includeViews=true, the search for matching names will include views (in addition to regular tables).

-excludeTables

The tables listed in this parameter will not be copied. This can be used when all but a few tables should be copied from one database to another. First all tables specified through -sourceTable will be evaluated. The tables specified by -excludeTables can include wildcards in the same way, -sourceTable allows wildcards.

-sourceTable=* -excludeTables=TEMP* will copy all tables, but not those starting with TEMP.

This parameter supports auto-completion.

-checkDependencies

Valid values are true or false.

When copying more than one file into tables with foreign key constraints, this switch can be used to import the files in the correct order (child tables first). When -checkDependencies=true is passed, SQL Workbench/J will check the foreign key dependencies for the tables specified with -sourceTable

-targetSchema The name of the target schema into which the tables should be copied. When this parameter is not specified, the default schema of the target connection is used.
-sourceWhere A WHERE condition that is applied to the source table.
-targetTable The name of the table into which the data should be written. This parameter is ignored if more than one table is copied.
-createTarget

If this parameter is set to true the target table will be created, if it doesn't exist. Valid values are true or false.

[Important]

Using -createTarget=true is intended as a quick and dirty way of creating a target table "on the fly" during the copy process. Tables created this way should not be considered "production-ready". The created tables will only have the primary key and not-null constraints created. All other constraints from the source table are ignored.

Because the automatic mapping of table definitions will only work in the most simple cases this feature is not suited to synchronize the table definitions between two different DBMS products.

Because of these limitations this feature can not considered a replacement for a proper schema management. If you have the requirement to keep the schema definition of different DBMS in sync please consider a tool like Liquibase or Flyway. Do not try to use WbCopy for this.

If you want to migrate a table (or several tables) from one DBMS to another, consider using WbSchemaReport together with an XSLT transformation

When using this option with different source and target DBMS, the information about the data types to be used in the target database are retrieved from the JDBC driver. In some cases this information might not be accurate or complete. You can enhance the information from the driver by configuring your own mappings in workbench.settings. Please see the section Customizing data type mapping for details.

If the automatic mapping generates an invalid CREATE TABLE statement, you will need to create the table manually in the target database.

-removeDefaults

Valid values are true or false.

This parameter is only valid in combination with -createTarget=true. If set to true, any default value (or expression) defined on the source table will be ignored when creating the target table. This is useful if the source and target DBMS use different syntax for default values.

-tableType

When -createTarget is set to true, this parameter can be used to control the SQL statement that is generated to create the target table. This is useful if the target table should e.g. be a temporary table

When using the auto-completion for this parameter, all defined "create types" that are configured in workbench.settings (or are part of the default settings) are displayed together with the name of the DBMS they are used for. The list is not limited to definitions for the target database! The specified type must nonetheless match a type defined for the target connection. If you specify a type that does not exist, the default CREATE TABLE will be used.

For details on how to configure a CREATE TABLE template for this parameter, please refer to the chapter Settings related to SQL statement generation

-skipTargetCheck

Normally WbCopy will check if the specified target table does exist. However, some JDBC drivers do not always return all table information correctly (e.g. temporary tables). If you know that the target table exists, the parameter -skipTargetCheck=true can be used to tell WbCopy, that the (column) definition of the source table should be assumed for the target table and not further test for the target table will be done.

-dropTarget

Possible values: false, true, cascade

If this parameter is set to true the target table will be dropped before it is created.

For database systems that support it (Oracle, PostgreSQL), a DROP ... CASCADE is used when the value cascade is specified. If the DBMS does not support cascading drops, this is the same as specifying true.

-columns

Defines the columns to be copied. If this parameter is not specified, then all matching columns are copied from source to target. Matching is done on name and data type. You can either specify a list of columns or a column mapping.

When supplying a list of columns, the data from each column in the source table will be copied into the corresponding column (i.e. one with the same name) in the target table. If -createTarget=true is specified then this list also defines the columns of the target table to be created. The names have to be separated by comma: -columns=firstname, lastname, zipcode

A column mapping defines which column from the source table maps to which column of the target table (if the column names do not match) If -createTarget=true then the target table will be created from the specified target names: -columns=firstname/surname, lastname/name, zipcode/zip Will copy the column firstname from the source table to a column named surname in the target table, and so on.

This parameter is ignored if more than one table is copied.

When using a SQL query as the data source a mapping cannot be specified.
Please check Copying data based on a SQL query for details.

-adjustSequences

Possible values: true, false

For DBMS that support sequences which are associated with a column, this parameter can be used to adjust the next value for the sequence to the maximum value of the imported data. This can also be used to synchronize identity columns for DBMS that allow overriding the generated values.

Currently this is implemented for PostgreSQL, DB2 (LUW), H2 Database and HyperSQL (aka HSQLDB).

-preTableStatement -postTableStatement

This parameter defines a SQL statement that should be executed before the import process starts inserting data into the target table. The name of the current table (when e.g. importing a whole directory) can be referenced using ${table.name}.

To define a statement that should be executed after all rows have been inserted and have been committed, you can use the -postTableStatement parameter.

These parameters can e.g. be used to enable identity insert for MS SQL Server:

-preTableStatement="set identity_insert ${table.name} on"
-postTableStatement="set identity_insert ${table.name} off"

Errors resulting from executing these statements will be ignored. If you want to abort the import in that case you can specify -ignorePrePostErrors=false and -continueOnError=false.

These statements are only used if more than one table is processed.

-runTableStatementOnError

Possible values: true, false

Controls the execution of the post-table statement in case an error occurred while importing the data. By default the post-table statement is executed even if the import was not successful. If this is should not happen, use -runTableStatementOnError=false.

-ignorePrePostErrors

Possible values: true, false

Controls handling of errors for the SQL statements defined through the -preTableStatement and -postTableStatement parameters. If this is set to true (the default), errors resulting from executing the supplied statements are ignored. If set to false then error handling depends on the parameter -continueOnError.

22.3. Copying data based on a SQL query

ParameterDescription
-sourceQuery

The SQL query to be used as the source data (instead of a table).

This parameter is ignored if -sourceSchema or -sourceTable is specified.

-columns

The list of columns from the target table, in the order in which they appear in the source query.

If the column names in the query match the column names in the target table, this parameter is not necessary.

If you do specify this parameter, note that this is not a column mapping. It only lists the columns in the correct order .

22.4. Update mode

The WbCopy command understands the same update mode parameter as the WbImport command. For a discussion on the different update modes, please refer to the WbImport command.

22.5. Synchronizing tables

Using -mode=update,insert ensures that all rows that are present in the source table do exist in the target table and that all values for non-key columns are identical.

When you need to keep two tables completely in sync, rows that are present in the target table that do not exist in the source table need to be deleted. This is what the parameter -syncDelete is for. If this is enabled (-syncDelete=true) then SQL Workbench/J will check every row from the target table if it is present in the source table. This check is based on the primary keys of the target table and assumes that the source table as the same primary key.

Testing if each row in the target table exists in the source table is a substantial overhead, so you should enable this option only when really needed. DELETEs in the target table are batched according to the -batchSize setting of the WbCopy command. To increase performance, you should enable batching for the whole process.

Internally the rows from the source table are checked in chunks, which means that SQL Workbench/J will generate a SELECT statement that contains a WHERE condition for each row retrieved from the target table. The default chunk size is relatively small to avoid problems with large SQL statements. This approach was taken to minimize the number of statements sent to the server.

The automatic fallback from update,insert or insert,update mode to insert mode applies for synchronizing tables using WbCopy as well.

22.6. Examples

22.6.1. Copy one table to another where all column names match

WbCopy -sourceProfile=ProfileA
       -targetProfile=ProfileB
       -sourceTable=the_table
       -targetTable=the_other_table;

22.6.2. Synchronize the tables between two databases

This example will copy the data from the tables in the source database to the corresponding tables in the target database. Rows that are not available in the source tables are deleted from the target tables.

WbCopy -sourceProfile=ProfileA
       -targetProfile=ProfileB
       -sourceTable=*
       -mode=update,insert
       -syncDelete=true;

22.6.3. Copy only selected rows

WbCopy -sourceProfile=ProfileA
       -targetProfile=ProfileB
       -sourceTable=the_table
       -sourceWhere="lastname LIKE 'D%'"
       -targetTable=the_other_table;

This example will run the statement SELECT * FROM the_table WHERE lastname like 'D%' and copy all corresponding columns to the target table the_other_table.

22.6.4. Copy data between tables with different columns

This example copies only selected columns from the source table. The column names in the two tables do not match and a column mapping is defined. Before the copy is started all rows are deleted from the target table.

WbCopy -sourceProfile=ProfileA
       -targetProfile=ProfileB
       -sourceTable=person
       -targetTable=contacts
       -deleteTarget=true
       -columns=firstname/surname, lastname/name, birthday/dob;

22.6.5. Copy data based on a SQL query

When using a query as the source for the WbCopy command, the column mapping is specified by simply supplying the order of the target columns as they appear in the SELECT statement.

WbCopy -sourceProfile=ProfileA
       -targetProfile=ProfileB
       -sourceQuery="SELECT firstname, lastname, birthday FROM person"
       -targetTable=contacts
       -deleteTarget=true
       -columns=surname, name, dob;

This copies the data based on the SELECT statement into the table CONTACTS of the target database. The -columns parameter defines that the first column of the SELECT (firstname) is copied into the target column with the name surname, the second result column (lastname) is copied into the target column name and the last source column (birthday) is copied into the target column dob.

This example could also be written as:

WbCopy -sourceProfile=ProfileA
       -targetProfile=ProfileB
       -sourceQuery="SELECT firstname as surname, lastname as name, birthday as dob FROM person"
       -targetTable=contacts
       -deleteTarget=true