Replication Overview

Replication is used to create a copy of cloud application data in a relational database and keep it up-to-date. When performing replication, you define application objects and their fields to copy and configure replication options. Unlike import, replication does not support loading data of different structure in source and in target and using any custom mapping; however, it is much simpler to configure. Replication also does not require a database with a prepared schema — it can create database tables for cloud data automatically.

Replication can be used when you need to backup cloud application data to a relational database, archiving historical data that is no more a subject to change and is not needed in the cloud application. It can be useful in different data integration scenarios, or just for performing data analysis and reporting with powerful data analysis and reporting tools developed for relational databases.

Replication Source and Target

Skyvia’s replication loads data from a cloud application to a database or data warehouse service. So a replication source can only be a cloud application, like Salesforce or Mailchimp, etc. And a target can be a cloud data warehouse service or a database server, either cloud-based, like Amazon RDS or Google Cloud SQL, or on-premise one.

replication package

Note that if you want to use an on-premise database server, it must be accessible from the Internet. You can find some advice on configuring your database server in order to access it from Skyvia in the How to Configure Local Database Server to Access It from Skyvia topic.

For cloud data warehouses, Skyvia uses warehouse-specific optimized data uploading. It uploads data to a data warehouse-specific storage as CSV files and then commands the warehouse to import them. Thus, in order to perform replication to a cloud data warehouse, you need to specify not just mandatory connection parameters, but also the optional storage service-related parameters.

Automatic Schema Creation

Skyvia does not require already created database tables for data replication. It can automatically create the necessary database tables with the same structure as the replicated cloud objects. You can configure table name generation to replicate multiple instances of the same cloud app to the same database, but to different tables.

Tables creation is optional. Skyvia can use existing ones, but in this case their structure —– columns, data types —– must be the same as the structure of the cloud objects.

Skyvia can also create foreign keys in the database, corresponding to the relations between cloud objects. This can be convenient, but for some data sources it is not recommended. Foreign key constraints are enforced in databases, but may not be enforced in some cloud sources, where child records may reference an already deleted, not existing parent record. If you receive foreign key violation errors during replication, try repeating a full replication with re-creating tables, but without creating foreign keys.

Foreign keys are not created for polymorphic relations between the cloud object (relations when the same field of a child object may reference different parent objects).

Target Table Naming

By default, Skyvia uses cloud object names as table names. However, different target databases may have stricter limitations on table names than the replicated cloud source. Skyvia takes care about such cases and sanitizes table names. It replaces characters, not allowed in table names in the target database, with underscores. If the cloud object name is too long, and target database does not allow that long names, Skyvia truncates such names, and adds hash to the end to ensure that the truncated name is still unique.

Additionally, Skyvia allows you customizing table names. It allows both changing names, based on the names of cloud objects — changing case, adding prefix, etc. and specifying completely custom table names. Customizing table names can be useful in case if you have a pre-existing schema with custom table names or if when some other software expects specific table names, not coinciding with cloud object names.

To generate simpler, more consistent and readable table and column names, Skyvia provides additional name sanitization option. It allows removing all non-alphanumeric characters except underscores for any target database, regardless of what this database support and what it doesn’t.

Replication also allows specifying target SQL Server schema. For other databases, the schema is specified at the connection level.

Keeping Database Up-to-date

Skyvia can keep the database up-to-date with the cloud application. For this, you can schedule your replication for automatic execution. There are two ways to keep the database up-to-date:

  1. Performing a full replication each time with dropping and re-creating tables.
  2. Performing a full replication only the first time, to load legacy data, and then performing incremental replication.

Partial Replication

You can replicate all or only part of the data with Skyvia. You can select the objects to replicate data from when configuring replication. For each of the selected objects, you may exclude any fields from replication. For example, if you don’t want to replicate personal information, like emails, names, etc. to the database, you may simply exclude these fields.

However, please note that you need to have at least the following fields selected in order to use incremental replication:

  • object key fields
  • fields storing record creation and modification time
  • fields, determining whether a record was deleted (for data sources and objects that have such fields)

In addition to excluding fields, you may also apply filters for each replicated object separately and replicate only records, matching certain conditions. This can be combined with both full and incremental replication — incremental replication in such case will load only records, matching the filter AND created/modified since the previous replication run.

Data Hashing

You can hash data in string or binary fields during replication. This allows you to anonymize sensitive data. See Data Hashing for more details.

Metadata Changes

Note that replication doesn’t detect metadata changes in the source. If metadata is changed in your cloud data source, replication will continue to run knowing nothing about these changes. If a new field is added in a replicated object, the replication will continue working, but the new field won’t be replicated. If a replicated field or a replicated object is deleted or renamed in the data source, the replication will start failing until you perform the necessary changes to it.

However, Skyvia supports applying schema changes to the target database without re-creating tables and reloading already replicated data. See the Incremental Replication and Schema Updates topic for more information. You can find the instruction on what you should do when source metadata changes in its Metadata Change Detection paragraph.

Replication Tutorials

Skyvia provides the following tutorials on data replication: