Configuring Replication Package
Replication package is a specific kind of a package, which helps you simply and easily replicate cloud data to relational databases or cloud data warehouses on an ongoing basis with the minimum necessary configuration. Replication offers dynamic, continuous, incremental delivery of high volumes of data with very low latency.
Creating a Replication Package
To create a replication package, click +NEW in the top menu and select Replication in the Integration column. When the package details page opens, perform the following steps:
- Specify a source connection you replicate data from and a target connection you replicate data to;
- Specify the necessary Options — Incremental Updates, Create Tables, Drop Tables, Create Foreign Keys;
- Select objects from the source list;
- Optionally schedule the package for automatic execution.
Creating Source and Target Connection
If you have already created source and target connections, simply select them from the drop-down list under Connection.
If you haven’t created the source or target connection yet, click +New connection at the bottom of the drop-down list, select the connector in the opened Connection window and specify the connection parameters. To create different connections, you need to enter different sets of parameters. See the Connectors section for more details.
Selecting Replication Options
Incremental Updates. If you select this checkbox, Skyvia does not perform a full replication (copying of all the data) each time the package is executed. Instead it performs a full replication only the first time the package is run. During subsequent replications, Skyvia detects data that were changed in your cloud app since the last package execution and then applies these changes to your database. It deletes records that were deleted in the cloud app since the last package execution, updates records that were updated and inserts the newly inserted records. Such replication is known as incremental replication.
Incremental updates are not supported for some objects in certain cloud apps. To learn more, go to Replication Overview.
Create Tables. If you select this checkbox, system will try to create source tables in the target database. If the Incremental Updates checkbox is selected, Skyvia will try to create source tables in the target database only for the first package execution (full replication). You can find more information below.
If the table with the same name already exists in a database, Skyvia will check the structure of that table. If it has the same structure, Skyvia will upsert data from replication object to the selected table. If the table has different structure, you will receive an error. However, if you select Drop Tables checkbox, that table will be deleted and new table with the same name and corresponding structure will be created.
Drop Tables. If you select this checkbox, Skyvia will try to drop source tables in the target database before creating them. If the Incremental Updates checkbox is selected, Skyvia will try to drop source tables in the target database only for the first package execution (full replication). This checkbox is enabled only if the Create Tables checkbox is selected. You can find more information below.
Create Foreign Keys. If you select this checkbox, system will create the foreign keys in the target database according to relations between the replicated objects in the source CRM. This checkbox is enabled only if the Create Tables checkbox is selected. Note that Skyvia does not create foreign keys for many-to-many relations.
When you create a replication package, Incremental Updates, Create Tables and Create Foreign Keys checkboxes are selected by default. Clear them if needed.
Incremental Updates in More Details
When the Incremental Updates checkbox is not selected, the Create Tables and Drop Tables options are applied every time a replication package is run (if the corresponding checkboxes are selected).
When the Incremental Updates checkbox is selected, the Create Tables and Drop Tables options are applied only when full replication is performed:
- for the first package run after the package creation;
- for the first package run after resetting the package LastSyncTime parameter.
The LastSyncTime parameter is the parameter that shows sync time since the last replication. When replication is executed first time, Skyvia sets this parameter to the current time, and the next package execution will load the changes made since the time specified in this parameter. You can reset this parameter in order to reload all the data. You can reset it manually in the package details page or Skyvia will offer you to reset this parameter automatically and run the full replication (full replication with optional table creation) whenever you perform any package modification except for changing the package schedule.
If you want to reset the parameter manually, click Parameters in the toolbar on the left. When the parameter editor window opens, click Reset value to reload all the data.
Additional replication options control how target tables and columns are named. They are useful in a number of cases, for example, if you need to replicate multiple instances of a cloud app to the same database, but to different tables, or when you need to replicate data to preexisting tables with custom names.
- Direct Id Check. This checkbox determines how Skyvia checks whether a replicated record is already present in the target database. By default (if this check box is not selected) Skyvia queries all the records from the target table to its cache and performs this check against its cache. However, this may be inefficient in case if a current replication run replicates only a few records, but there are a lot of records already in the target table. In this case, select this checkbox, and Skyvia will perform this check directly against the target database. Note that this checkbox is only available in packages, using the new data integration runtime (with Use new runtime check box selected).
- Schema. This setting is available only if target is SQL Server. It allows specifying SQL Server schema to load data to. By default, the data are loaded to the default schema of the user, used in the target SQL Server connection. Usually, it is the dbo schema.
- Case. Determines how case is changed when converting a cloud object name to the target database table name. Note that some databases may not support uppercase characters in table names or can be configured to always use lowercase table names. It can take the following values:
- Unchanged — case is unchanged, database table names will have the same case as the original cloud object names.
- Lower — all letters of a cloud object name are converted to lower case.
- Upper — all letters of a cloud object name are converted to upper case.
- Capitalized — the first letter of a cloud object name and the first letter after each underscore or space are converted to upper case, other letters are converted to lowercase. For example, for the FAQ__ka object, a Faq__Ka table is created.
- FirstLetterUppercase — The result name of database table starts with the capital letter, other letters are converted to lower case. For example, for the FAQ__ka object, a Faq__ka table is created.
- Prefix. Adds the specified prefix to the result table name.
- Sanitize Table/Column Names. Enables name sanitization. It removes all non-alphanumeric characters including spaces from the source object name and replacing them with underscores.
- Remove Underscores This checkbox determines whether to remove underscores from the target table name.
Track Target Naming Changes. This checkbox determines whether to track the changes of target naming settings and apply them to an existing database next time the package runs. If the Incremental Updates checkbox is selected, the naming changes are applied using ALTER statements. Otherwise, if Create Tables and Drop Tables checkboxes are selected and naming settings change, Skyvia attempts to drop tables with old names and create tables with new names.
If the Track Target Naming Changes checkbox is not selected, and you change naming settings for an existing replication package with the database already created, you need to manually apply the naming changes to the database before the next replication run; otherwise, the next run will result in an error, because database tables with changed names would not be found. If this checkbox is selected, Skyvia tries to apply the changes automatically.
Note that if this checkbox is selected, Skyvia creates an additional table __skyvia_objects in the target database to store information on naming settings and changes in order to know which naming changes were already applied, and which it needs to apply during the next replication.
Target Naming settings are ignored for replication tasks, in which you have specified a custom target table name manually. The Schema setting (for replication to SQL Server) is applied in this case.
Selecting Data to Replicate
After you have chosen both connections on the left, you can select source objects arranged in an alphabetic order in the table on the right. You can select either some or all of the objects to replicate them to target. The checkbox in the header under the Select Objects title allows selecting all the displayed objects.
Filtering and Selecting Objects
For better visual perception and easier search, Skyvia offers additional filtering settings for objects:
Filtering using Show selected only link. When you apply this filter, system displays only the objects you have selected. To display all objects again, click Show all link.
Filtering by object name. When you enter part of the object name (for example “account”) in the field above the table, system automatically displays all objects containing “account” name. If you want to hide objects with this name, select Hide from the drop-down list. Depending on your needs, you may apply multiple filters at the same time and save the applied filters for your replication package. To do it, enter a part of object name in the objects if name contains field, switch from Show to Hide if necessary, and click Save filter. The filter is displayed. After this you may add more filters in the same way.
After you have selected objects, you can optionally exclude some of their fields, which you do not want to replicate. For this, click the corresponding Edit icon next to the object. When the task editor window opens, clear only those fields of the object, which you do not want to replicate and leave selected only those ones intended for replication. Note that if you want to use Incremental Updates, the object key fields and such additional fields as CreatedDate, UpdatedDate, DateCreated, DateModified, LastModifiedDate, Modified Time, Created Time, etc. must be included. Here is the list of such fields for different cloud sources.
In our screen, we show fields of the Account object — both selected and cleared.
When replicating data Amazon Redshift, replication task editor provides more settings, allowing you to configure the target Redshift table and column parameters. See more details in the Editing Replication Task for Amazon Redshift topic.
Filtering Object Data
Additionally you can filter source data for replication by applying certain conditions to object fields in the task editor window. Click +Condition and set conditions you want to be met during replication. In our example, we want to replicate only accounts of the Customer-Direct type, and we specify it in the filtering area highlighted in blue. To learn more about filters, go to Filter Settings topic.
Setting Custom Table Names
You can also specify a custom table name in the task editor window. Just enter it to the Target name box. In this case, the Target Naming settings are not applied for this task. If you want to return to the generated names, click the button in the Target name box.
You must use only alphanumeric characters, digits, and underscore
Metadata Changes and Editing Replication Package
Skyvia does not support automatic detection of metadata changes. If metadata are changed in your cloud data source, a replication package will continue to run knowing nothing about these changes. If a new field is added to a replicated object, the replication will continue working, but the new field will not be replicated. If a replicated field or a replicated object is deleted or renamed in the data source, the replication package will start failing until you perform the necessary changes to it.
In order to detect these metadata changes in a replication package, you need to perform the following steps:
- Clear the connection metadata cache. By default, Skyvia caches metadata of available objects for cloud sources. So if a field is added, removed or changed in the source object or you add a new object, the cache should be refreshed. To do this, you need to open your connection and click the Clear now link. You also may edit and configure your connection for the metadata cache to be periodically refreshed.
Edit your replication package and perform the following steps:
- For each replicated object whose metadata were changed, click Edit and then click the Refresh button in the task editor window. This removes the information about fields that were deleted in the data source from the replication package;
- Select checkboxes for fields added in the data source in the task editor window;
- By default, when editing a replication package, Skyvia displays only previously selected objects. If you want to add a new object to be replicated, click the Show All link;
- Select checkboxes for all the objects you want to add to the package.
After this, the tables must be re-created in order to have columns for the new fields and not to have them for deleted fields. If you haven’t added any new objects to the package, select the Create Tables and Drop Tables checkboxes in order to re-create the tables. You don’t need to clear the Drop Tables checkbox after this even if the Incremental Updates checkbox is selected. In the latter case, the Create Tables and Drop Tables options are applied only for the first replication, or when the LastSyncTime parameter is reset.
However, if you added any new objects to be replicated, you cannot simply select the Drop Tables checkbox. In this case the replication package will try dropping tables for the new objects too, and since there are no such tables in the target database, it will fail. So you may either manually drop all the tables for the package not selecting the Drop Tables checkbox or select this checkbox and create the missing tables with the names of the new cloud objects in the database manually.
Whenever you edit a replication package with the Incremental Updates checkbox selected and modify anything except the package schedule, Skyvia asks you whether to reset the LastSyncTime parameter. If you choose to reset it, Skyvia reloads all the data from the source objects during the first replication after package editing. The Create Tables and Drop Tables options are also applied (if selected) even when the Incremental Updates checkbox is selected too.
If you choose not to reset LastSyncTime, the next replication is performed as usual. Tables will not be dropped/created. However, if you modify the package so that the target database must be modified too for it (for example, added more objects/fields to replication), you will need to modify the structure of your tables manually.