Configuring Replication
Replication integration is a specific kind of an integration, which helps you simply and easily replicate cloud data to relational databases or cloud data warehouses on an ongoing basis with the minimum necessary configuration. Replication offers dynamic, continuous, incremental delivery of high volumes of data with very low latency.
Creating a Replication integration
To create a replication integration, click +NEW in the top menu and select Replication in the Integration column. When the replication details page opens, perform the following steps:
- Specify a source connection you replicate data from and a target connection you replicate data to;
- Specify the necessary Options — Incremental Updates, Update Schema, Create Tables, Drop Tables, Create Foreign Keys;
- Select objects from the source list;
- Optionally schedule replication for automatic execution.
Creating Source and Target Connection
If you have already created source and target connections, simply select them from the drop-down list under Connection.
If you haven’t created the source or target connection yet, click +New connection at the bottom of the drop-down list, select the connector in the opened Connection window and specify the connection parameters. To create different connections, you need to enter different sets of parameters. See the Connectors section for more details.
Selecting Replication Options
-
Incremental Updates. If you select this checkbox, Skyvia does not perform a full replication (copying of all the data) each time replication is executed. Instead it performs a full replication only the first time the integration is run. During subsequent replications, Skyvia detects data that were changed in your cloud app since the last integration execution and then applies these changes to your database. It deletes records that were deleted in the cloud app since the last execution, updates records that were updated and inserts the newly inserted records. Such replication is known as incremental replication.
Incremental updates are not supported for some objects in certain cloud apps. To learn more, go to Incremental Replication and Schema Updates.
-
Update Schema. If you select this checkbox, Skyvia checks target schema and compares it with the schema it would build for the objects and fields, selected in the replication integration. If there are differences, Skyvia applies certain differences to the target database without re-creating all the tables. Note that not all differences are applied. See details in Incremental Replication and Schema Updates.
-
Create Tables. If you select this checkbox, system will try to create source tables in the target database. If the Incremental Updates checkbox is selected, Skyvia will try to create source tables in the target database only for the first replication execution (full replication). You can find more information below.
If the table with the same name already exists in a database, Skyvia will check the structure of that table. If it has the same structure, Skyvia will upsert data from replication object to the selected table. If the table has different structure, you will receive an error. However, if you select Drop Tables checkbox, that table will be deleted and new table with the same name and corresponding structure will be created.
-
Drop Tables. If you select this checkbox, Skyvia will try to drop source tables in the target database before creating them. If the Incremental Updates checkbox is selected, Skyvia will try to drop source tables in the target database only for the first execution (full replication). This checkbox is enabled only if the Create Tables checkbox is selected. You can find more information below.
-
Create Foreign Keys. If you select this checkbox, system will create the foreign keys in the target database according to relations between the replicated objects in the source CRM. This checkbox is enabled only if the Create Tables checkbox is selected. Note that Skyvia does not create foreign keys for many-to-many relations.
When you create a replication integration, Incremental Updates, Create Tables and Create Foreign Keys checkboxes are selected by default. Clear them if needed.
Incremental Updates in More Details
When the Incremental Updates checkbox is not selected, the Create Tables and Drop Tables options are applied every time replication is run (if the corresponding checkboxes are selected).
When the Incremental Updates checkbox is selected, the Create Tables and Drop Tables options are applied only when full replication is performed:
- for the first run after the integration creation;
- for the first run after resetting the LastSyncTime integration parameter.
The LastSyncTime parameter is the parameter that shows sync time since the last replication. When replication is executed first time, Skyvia sets this parameter to the current time, and the next integration execution will load the changes made since the time specified in this parameter.
If you want to reset the parameter manually, click Parameters in the toolbar on the left. When the parameter editor window opens, click Reset value to reload all the data.
Additional Options
Additional replication options control how target tables and columns are named. They are useful in a number of cases, for example, if you need to replicate multiple instances of a cloud app to the same database, but to different tables, or when you need to replicate data to preexisting tables with custom names.
- Direct Id Check. This checkbox determines how Skyvia checks whether a replicated record is already present in the target database. By default (if this check box is not selected) Skyvia queries all the records from the target table to its cache and performs this check against its cache. However, this may be inefficient in case if a current replication run replicates only a few records, but there are a lot of records already in the target table. In this case, select this checkbox, and Skyvia will perform this check directly against the target database. Note that this checkbox is only available in packages, using the new data integration runtime (with Use new runtime check box selected).
- Schema. This setting is available only if target is SQL Server. It allows specifying SQL Server schema to load data to. By default, the data are loaded to the default schema of the user, used in the target SQL Server connection. Usually, it is the dbo schema.
- Case. Determines how case is changed when converting a cloud object name to the target database table name. Note that some databases may not support uppercase characters in table names or can be configured to always use lowercase table names. It can take the following values:
- Unchanged — case is unchanged, database table names will have the same case as the original cloud object names.
- Lower — all letters of a cloud object name are converted to lower case.
- Upper — all letters of a cloud object name are converted to upper case.
- Capitalized — the first letter of a cloud object name and the first letter after each underscore or space are converted to upper case, other letters are converted to lowercase. For example, for the FAQ__ka object, a Faq__Ka table is created.
- FirstLetterUppercase — The result name of database table starts with the capital letter, other letters are converted to lower case. For example, for the FAQ__ka object, a Faq__ka table is created.
- Prefix. Adds the specified prefix to the result table name.
- Sanitize Table/Column Names. Enables name sanitization. It removes all non-alphanumeric characters including spaces from the source object name and replacing them with underscores.
- Remove Underscores This checkbox determines whether to remove underscores from the target table name.
-
Track Target Naming Changes. This checkbox determines whether to track the changes of target naming settings and apply them to an existing database next time the integration runs. If the Incremental Updates checkbox is selected, the naming changes are applied using ALTER statements. Otherwise, if Create Tables and Drop Tables checkboxes are selected and naming settings change, Skyvia attempts to drop tables with old names and create tables with new names.
If the Track Target Naming Changes checkbox is not selected, and you change naming settings for an existing replication integration with the database already created, you need to manually apply the naming changes to the database before the next replication run; otherwise, the next run will result in an error, because database tables with changed names would not be found. If this checkbox is selected, Skyvia tries to apply the changes automatically.
Note that if this checkbox is selected, Skyvia creates an additional table __skyvia_objects in the target database to store information on naming settings and changes in order to know which naming changes were already applied, and which it needs to apply during the next replication.
Target Naming settings are ignored for replication tasks, in which you have specified a custom target table name manually. The Schema setting (for replication to SQL Server) is applied in this case.
Selecting Data to Replicate
After you have chosen both connections on the left, you can select source objects arranged in an alphabetic order in the table on the right. You can select either some or all of the objects to replicate them to target. The checkbox in the header under the Select Objects title allows selecting all the displayed objects.
Filtering and Selecting Objects
For better visual perception and easier search, Skyvia offers additional filtering settings for objects:
-
Filtering using Show selected only link. When you apply this filter, system displays only the objects you have selected. To display all objects again, click Show all link.
-
Filtering by object name. When you enter part of the object name (for example “account”) in the field above the table, system automatically displays all objects containing “account” name. If you want to hide objects with this name, select Hide from the drop-down list. Depending on your needs, you may apply multiple filters at the same time and save the applied filters. To do it, enter a part of object name in the objects if name contains field, switch from Show to Hide if necessary, and click Save filter. The filter is displayed. After this you may add more filters in the same way.
Selecting Fields
After you have selected objects, you can optionally exclude some of their fields, which you do not want to replicate. For this, click the corresponding Edit icon next to the object. When the task editor window opens, clear only those fields of the object, which you do not want to replicate and leave selected only those ones intended for replication. Note that if you want to use Incremental Updates, the object key fields and such additional fields as CreatedDate, UpdatedDate, DateCreated, DateModified, LastModifiedDate, Modified Time, Created Time, etc. must be included. Here is the list of such fields for different cloud sources.
In our screen, we show fields of the Account object — both selected and cleared.
When replicating data Amazon Redshift, replication task editor provides more settings, allowing you to configure the target Redshift table and column parameters. See more details in the Editing Replication Task for Amazon Redshift topic.
Filtering Object Data
Additionally you can filter source data for replication by applying certain conditions to object fields in the task editor window. Click +Condition and set conditions you want to be met during replication. In our example, we want to replicate only accounts of the Customer-Direct type, and we specify it in the filtering area highlighted in blue. To learn more about filters, go to Filter Settings topic.
Setting Custom Table Names
You can also specify a custom table name in the task editor window. Just enter it to the Target name box. In this case, the Target Naming settings are not applied for this task. If you want to return to the generated names, click the button in the Target name box.
You must use only alphanumeric characters, digits, and underscore.
Replicating Data from a Database
You can choose your database as Source in Replication. This adds one extra step to the replication configuration - configuring the Ingestion mode. Ingestion mode defines how to track changes in Source during the incremental replicaiton. There are three available Ingestion modes: New, New and modified, and Log-Based.
Currently Skyvia supports replication from SQL Server and MySQL. We work on supporting more databases and they will be available in the near future.
Ingestion Modes
You select the Ingestion mode for the whole replication integration on the main page of your replication integration.
You can change the Ingestion mode separately for each replication task in the task settings.
Ingestion Mode: Default
The tasks with the Default Ingestion mode selected use the ingestion mode, selected for the whole replication integration.
Ingestion Mode: New
The New mode allows you to track only new records in Source and incrementally replicate them to Target. To configure a New mode select a DATETIME column that can be used to track a creation date, or an auto-increment number column such as Id. Skyvia will automatically provide you with the selection of columns with the suitable column types.
Ingestion Mode: New and Modified
The New and modified mode allows you to track new and modified records in Source and incrementally replicate them to Target. To configure a New and modified mode select a DATETIME or autoincremental number column that can be used to track record modifications from the Modified column dropdown. You can use the same column in the Created column dropdown to track new records unless this column is NULLABLE and is not updated at the time of row creation. In this case, select a separate NOT NULLABLE DATETIME or autoincremental number column from the Created column dropdown. Skyvia automatically provides you with the list of columns that can be used as Created and Modified columns.
Note that New and modified mode does not track deleted records.
Ingestion Mode: Log-Based
This mode uses database change-tracking and logging tools to detect new, modified, and deleted records between replication runs. The technics of tracking changes can differ from database to database. To learn how to enable change tracking for your database visit the connectors topic.
Log-Based mode is currently available for SQL Server only.