Configuring Replication Package
Replication package is a specific kind of a package, which helps you simply and easily replicate cloud data to relational databases or cloud data warehouses on an ongoing basis with the minimum necessary configuration. Replication offers dynamic, continuous, incremental delivery of high volumes of data with very low latency.
Creating a Replication Package
To create a replication package, click +NEW in the top menu and select Replication in the Integration column. When the package details page opens, perform the following steps:
- Specify a source connection you replicate data from and a target connection you replicate data to;
- Specify the necessary Options — Incremental Updates, Create Tables, Drop Tables, Create Foreign Keys;
- Select objects from the source list;
- Optionally schedule the package for automatic execution.
Creating Source and Target Connection
If you have already created source and target connections, simply select them from the drop-down list under Connection.
If you haven’t created the source or target connection yet, click +New connection at the bottom of the drop-down list, select the connector in the opened Connection window and specify the connection parameters. To create different connections, you need to enter different sets of parameters. See the Connectors section for more details.
Selecting Replication Options
Incremental Updates. If you select this checkbox, Skyvia does not perform a full replication (copying of all the data) each time the package is executed. Instead it performs a full replication only the first time the package is run. During subsequent replications, Skyvia detects data that were changed in your cloud app since the last package execution and then applies these changes to your database. It deletes records that were deleted in the cloud app since the last package execution, updates records that were updated and inserts the newly inserted records. Such replication is known as incremental replication.
Incremental updates are not supported for some objects in certain cloud apps. To learn more, go to Replication Overview.
Create Tables. If you select this checkbox, system will try to create source tables in the target database. If the Incremental Updates checkbox is selected, Skyvia will try to create source tables in the target database only for the first package execution (full replication). You can find more information below.
Drop Tables. If you select this checkbox, Skyvia will try to drop source tables in the target database before creating them. If the Incremental Updates checkbox is selected, Skyvia will try to drop source tables in the target database only for the first package execution (full replication). This checkbox is enabled only if the Create Tables checkbox is selected. You can find more information below.
Create Foreign Keys. If you select this checkbox, system will create the foreign keys in the target database according to relations between the replicated objects in the source CRM. This checkbox is enabled only if the Create Tables checkbox is selected. Note that Skyvia does not create foreign keys for many-to-many relations.
Direct id check. This checkbox determines how Skyvia checks whether a replicated record is already present in the target database. By default (if this check box is not selected) Skyvia queries all the records from the target table to its cache and performs this check against its cache. However, this may be inefficient in case if a current replication run replicates only a few records, but there are a lot of records already in the target table. In this case, select this checkbox, and Skyvia will perform this check directly against the target database. Note that this checkbox is only available in packages, using the new data integration runtime (with Use new runtime check box selected).
When you create a replication package, Incremental Updates, Create Tables and Create Foreign Keys checkboxes are selected by default. Clear them if needed.
Incremental Updates Are Not Selected
When the Incremental Updates checkbox is not selected, the Create Tables and Drop Tables options are applied every time a replication package is run (if checkboxes next to these options are selected).
If the Create Tables checkbox is selected, system will try to create tables for all the selected objects. So, if there is any table with such a name in a target database, the package will fail. If you simply rename tables, which are left from the previous replication, their constraints may still have the same name, and you will get an error about constraint with the same name. It means that if the Create Tables checkbox is selected, and the Drop Tables checkbox is not, there will be no tables with the names of replicated objects in the target database in order for the package to succeed.
If the Drop Tables checkbox is selected, system will try to delete tables for all the replicated objects before creating them. If there is no such a table for any of the objects, the package will fail. It means that if both Create Tables and Drop Tables checkboxes are selected, the tables for ALL the replicated objects must be present in the database in order for the package to succeed.
Please note if there are tables only for part of the replicated objects in the database, there is no working combination for this case. You will have to delete tables manually.
Incremental Updates Are Selected
When the Incremental Updates checkbox is selected, the Create Tables and Drop Tables options are applied only when full replication is performed:
- for the first package run after the package creation;
- for the first package run after resetting the package LastSyncTime parameter.
The LastSyncTime parameter is the parameter that shows sync time since the last replication. When replication is executed first time, Skyvia sets this parameter to the current time, and the next package execution will load the changes made since the time specified in this parameter. You can reset this parameter in order to reload all the data. You can reset it manually in the package details page or Skyvia will offer you to reset this parameter automatically and run the full replication (full replication with optional table creation) whenever you perform any package modification except for changing the package schedule.
If you want to reset the parameter manually, click Parameters in the toolbar on the left. When the parameter editor window opens, click Reset value to reload all the data.
Selecting Data to Replicate
After you have chosen both connections on the left, you can select source objects arranged in an alphabetic order in the table on the right. You can select either some or all of the objects to replicate them to target. The checkbox in the header under the Select Objects title allows selecting all the displayed objects.
Filtering and Selecting Objects
For better visual perception and easier search, Skyvia offers additional filtering settings for objects:
Filtering using Show selected only link. When you apply this filter, system displays only the objects you have selected. To display all objects again, click Show all link.
Filtering by object name. When you enter part of the object name (for example “account”) in the field above the table, system automatically displays all objects containing “account” name. If you want to hide objects with this name, select Hide from the drop-down list. Depending on your needs, you may apply multiple filters at the same time and save the applied filters for your replication package. To do it, enter a part of object name in the objects if name contains field, switch from Show to Hide if necessary, and click Save filter. The filter is displayed. After this you may add more filters in the same way.
After you have selected objects, you can optionally exclude some of their fields, which you do not want to replicate. For this, click the corresponding Edit icon next to the object. When the task editor window opens, clear only those fields of the object, which you do not want to replicate and leave selected only those ones intended for replication. Note that if you want to use Incremental Updates, the object key fields and such additional fields as CreatedDate, UpdatedDate, DateCreated, DateModified, LastModifiedDate, Modified Time, Created Time, etc. must be included. Here is the list of such fields for different cloud sources.
In our screen, we show fields of the Account object — both selected and cleared.
When replicating data Amazon Redshift, replication task editor provides more settings, allowing you to configure the target Redshift table and column parameters. See more details in the Editing Replication Task for Amazon Redshift topic.
Filtering Object Data
Additionally you can filter source data for replication by applying certain conditions to object fields in the task editor window. Click +Condition and set conditions you want to be met during replication. In our example, we want to replicate only accounts of the Customer-Direct type, and we specify it in the filtering area highlighted in blue. To learn more about filters, go to Filter Settings topic.
Metadata Changes and Editing Replication Package
Skyvia does not support automatic detection of metadata changes. If metadata are changed in your cloud data source, a replication package will continue to run knowing nothing about these changes. If a new field is added to a replicated object, the replication will continue working, but the new field will not be replicated. If a replicated field or a replicated object is deleted or renamed in the data source, the replication package will start failing until you perform the necessary changes to it.
In order to detect these metadata changes in a replication package, you need to perform the following steps:
- Clear the connection metadata cache. By default, Skyvia caches metadata of available objects for cloud sources. So if a field is added, removed or changed in the source object or you add a new object, the cache should be refreshed. To do this, you need to open your connection and click the Clear now link. You also may edit and configure your connection for the metadata cache to be periodically refreshed.
Edit your replication package and perform the following steps:
- For each replicated object whose metadata were changed, click Edit and then click the Refresh button in the task editor window. This removes the information about fields that were deleted in the data source from the replication package;
- Select checkboxes for fields added in the data source in the task editor window;
- By default, when editing a replication package, Skyvia displays only previously selected objects. If you want to add a new object to be replicated, click the Show All link;
- Select checkboxes for all the objects you want to add to the package.
After this, the tables must be re-created in order to have columns for the new fields and not to have them for deleted fields. If you haven’t added any new objects to the package, select the Create Tables and Drop Tables checkboxes in order to re-create the tables. You don’t need to clear the Drop Tables checkbox after this even if the Incremental Updates checkbox is selected. In the latter case, the Create Tables and Drop Tables options are applied only for the first replication, or when the LastSyncTime parameter is reset.
However, if you added any new objects to be replicated, you cannot simply select the Drop Tables checkbox. In this case the replication package will try dropping tables for the new objects too, and since there are no such tables in the target database, it will fail. So you may either manually drop all the tables for the package not selecting the Drop Tables checkbox or select this checkbox and create the missing tables with the names of the new cloud objects in the database manually.
Whenever you edit a replication package with the Incremental Updates checkbox selected and modify anything except the package schedule, Skyvia asks you whether to reset the LastSyncTime parameter. If you choose to reset it, Skyvia reloads all the data from the source objects during the first replication after package editing. The Create Tables and Drop Tables options are also applied (if selected) even when the Incremental Updates checkbox is selected too.
If you choose not to reset LastSyncTime, the next replication is performed as usual. Tables will not be dropped/created. However, if you modify the package so that the target database must be modified too for it (for example, added more objects/fields to replication), you will need to modify the structure of your tables manually.