How to Configure Cloud Data Synchronization with Empty Database

This tutorial shows how to make a copy of cloud data in an empty database and configure a bi-directional synchronization between this copy and the original cloud application. For this tutorial, we select Salesforce cloud source as an example and its Account and Contact objects.

In Skyvia, a Replication Package is used to create a copy of cloud data in a database. It can create tables for data automatically and then keep the database up-to-date performing one-way synchronization periodically from the cloud source. If such synchronization in one direction is enough for you, you can take a look on our Replication Tutorials.

Synchronization Packages, on the other hand, perform bi-directional data synchronization. However, they are not intended for creating an exact data copy and cannot create tables for the data automatically. They are intended for synchronization of existing tables, and they support synchronization of data having different structure.

So, if you have just an empty database and want to create a copy of cloud data in it and then sync changes between this copy and original cloud source in both direction, you have two options. One is to manually create tables in the database, which is not always convenient. Another way is to use a replication package for creating the necessary database tables, then manually modify these tables so that they meet Skyvia’s synchronization requirements, and finally, configure a synchronization package.

Synchronization Requirements to Consider

Note that Skyvia has certain requirements to consider for synchronization. Skyvia supports synchronization not for every object in every cloud source. It requires that these cloud objects have fields, storing the record creation and modification time. You can find the information about these limitations in the corresponding Cloud Sources topics. Some cloud sources, like Salesforce, support synchronization for most objects, and some — only for a few objects.

As for databases, Skyvia requires synchronized tables to have auto-generated primary keys. This is the reason why we should manually modify database tables after creating them via a replication package, because replication package creates primary keys that are not auto-generated.

Connections

For our package we need to create a connection to our cloud source and database. You can find the information about connecting Skyvia to various cloud sources in the Cloud Sources section. In this tutorial, we create a Salesforce connection. To create a connection to Salesforce, perform the following steps:

  1. Click +NEW in the top menu.
  2. Click the Connection button in the menu on the left.
  3. In the opened Select Connector page, select Salesforce. To quickly find it, you can either use the Type to filter box or filter connectors by categories using the All list (for Salesforce, select the CRM category).
  4. The default name of new connection is Untitled. Just click it to rename the connection, for example, to Salesforce1.
  5. From the Environment drop-down list, select the Salesforce environment type to import data to. Since this is just a sample walkthrough, the Sandbox environment is recommended.

    Connection Editor window

  6. From the Authentication drop-down list, select the authentication method for connecting to Salesforce. If you don’t mind storing your Salesforce credentials on Skyvia server, select User Name & Password. If you prefer not to store your credentials, select OAuth 2.0.
  7. If you have selected User Name & Password, on the previous step, specify your Salesforce account e-mail, password, and security token. Otherwise, if you have selected OAuth 2.0 authentication, click the Log In with Salesforce button and login via the Salesforce website on the opened page. The result OAuth token will be stored in the connection data. Your Salesforce credentials will not be stored on our website.

    Log In window

  8. Click the Save button to save the connection.

    Connection Editor window 2

As for connecting to databases, see the information about database connections in the Databases section. Please note that Skyvia does not support synchronization for cloud data warehouse services — Azure Synapse Analytics, Amazon Redshift, and Google BigQuery.

Replication Package

To create a replication package, perform the following actions:

  1. Click +NEW in the top menu.
  2. In the Integration column, click Replication. The replication package details page will open.
  3. Rename your package by clicking and editing the package name. The default package name is Untitled. Please note, if you omit this step, the package name will remain Untitled in the list of created packages.
  4. Under Source, in the Connection drop-down list, select the Salesforce connection.

    Source Connection

  5. Under Target, in the Connection drop-down list, select the corresponding connection.
  6. Important! Clear the Create Foreign Keys checkbox. If you need a bi-directional synchronization, you should not have foreign keys created by replication. If you don’t clear this checkbox, you will need to drop the foreign keys manually later.
  7. In the grid under Select Objects, select checkboxes for objects you want to synchronize. In our example, these are Account and Contact objects.

    Select Objects table

  8. The first run of the synchronization package will re-load all the records from Salesforce to the target database anyway. So it’s not necessary that replication package actually load records. We need it only to create the tables.

    If you don’t want to spend time on additional configuration, you may omit the steps and allow replication to load data. In this case, it may take quite some time for replication to load all the data, and you will need to delete data from the database tables before running the synchronization anyway. Besides, these extra records are added to your subscription counter.

    But if you don’t want to unnecessarily load data, you can avoid this by configuring filters for replication:

    1. Click Edit icon next to the selected object. The task editor window will open.
    2. In the replication task editor window, click +Condition.
    3. In the added boxes, specify a filter condition that is never true for any record. In the first box, select a field; in the second — comparison operator, and in the third — specify some value for comparison.

      For example, for Salesforce I can use a condition Id equals 0. It cannot be true in Salesforce for any record. Thus, all the records will be filtered out by this filter, and the replication package will create a table, but won’t load any records.

      Filter Settings

    4. Repeat the steps a-с for all the objects you want to synchronize.
  9. Click the Create button to create the package.
  10. Run the package by clicking Run.

Editing Database Tables

After replication creates the tables, we need to edit them. We need to delete their primary key constraint and add a new autogenerated one. You can either use some visual database tools to make these changes or run SQL scripts against your database, for example, with Skyvia Query.

If you want to use a database tool, you need to perform the following actions:

  1. If you didn’t add filters so that no data were replicated, truncate your database tables, which were created after the replication.
  2. Delete the tables’ primary key constraints.
  3. You may optionally delete their primary key columns as well. Their values are automatically generated on the cloud side and cannot be synchronized when loading data from the database to the cloud source. They will only be synced when loading data from the cloud source to database.
  4. Add a new autogenerated column for the primary key. For example, you may call it sync_id:
    • In PostgreSQL you may use the serial data type to make a column autogenerated.
    • In MySQL you may use a BIGINT data type with the AUTOINCREMENT option.
    • For SQL Server, you may use a BIGINT data type with the IDENTITY option.
    • For Oracle 12c and higher, you may use a NUMBER data type with the GENERATED ALWAYS AS IDENTITY option.
    • For lower Oracle versions, you will need to create a sequence and a trigger to assign values from this sequence to the primary key column when inserting records.

Alternatively, you may do the same, using DDL scripts.

  1. Click +NEW in the top menu.
  2. In the Query column, click SQL.
  3. Click Select connection on the left and select the connection to your database, where you replicate data, in the list.
  4. Enter your script to the query editor box. Here we provide scripts with Account and Contact objects as an example. You can create your own scripts based on this example by replacing Account and Contact with your object names:

    • For SQL Server:

      1
      2
      3
      4
      5
      6
      7
      
      ALTER TABLE Contact  DROP CONSTRAINT PK_Contact;
      
      ALTER TABLE Contact  ADD Sync_id BIGINT IDENTITY PRIMARY KEY;
      
      ALTER TABLE Account DROP CONSTRAINT PK_Account;
      
      ALTER TABLE Account ADD Sync_id BIGINT IDENTITY PRIMARY KEY;
      
    • For MySQL:

      1
      2
      3
      4
      5
      6
      7
      
      ALTER TABLE contact DROP PRIMARY KEY;
      
      ALTER TABLE contact ADD sync_id BIGINT AUTO_INCREMENT PRIMARY KEY;
      
      ALTER TABLE account DROP PRIMARY KEY;
      
      ALTER TABLE account ADD sync_id BIGINT AUTO_INCREMENT PRIMARY KEY;
      
    • For PostgreSQL:

      1
      2
      3
      4
      5
      6
      7
      
      ALTER TABLE "Contact"  DROP CONSTRAINT "PK_Contact";
      
      ALTER TABLE "Contact"  ADD Sync_id SERIAL PRIMARY KEY;
      
      ALTER TABLE "Account" DROP CONSTRAINT "PK_Account";
      
      ALTER TABLE "Account" ADD Sync_id SERIAL PRIMARY KEY
      
    • For Oracle (12c or higher):

      1
      2
      3
      4
      5
      6
      7
      
      ALTER TABLE "Contact"  DROP CONSTRAINT "PK_Contact";
      
      ALTER TABLE "Contact"  ADD Sync_id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY;
      
      ALTER TABLE "Account" DROP CONSTRAINT "PK_Account";
      
      ALTER TABLE "Account" ADD Sync_id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY;
      
  5. Click Execute or press F9.

Synchronization Package

To create a synchronization package, perform the following steps:

  1. Click +NEW in the top menu.
  2. In the Integration column, click Synchronization. The sync package details page will open.
  3. Rename your package by clicking and editing the package name. The default package name is Untitled. Please note, if you omit this step, the package name will remain Untitled in the list of created packages.
  4. Now it’s necessary to specify source and target connections. Note that synchronization is bi-directional, and the only difference between the source and target is that source changes have a priority when solving change conflicts (when a record was changed both in source and target between synchronizations). Let Salesforce be the source in our example.
  5. Under Source, in the Connection drop-down list, select the Salesforce connection.

    Source Connection

  6. Under Target, in the Connection drop-down list, select the SQL Azure connection.

    Target Connection

  7. Click the Add new link.
  8. In the Source list, select Account.
  9. In the Target list, select the Account table and click Next step.
  10. Since all the columns in source and target have the same names and types, they are mapped automatically. Click Target to Source in order for columns to be also mapped for the opposite direction.
  11. Click Save.
  12. Repeat steps 7-11 for all other objects that you want to sync.

Scheduling Package Execution

After we have created the synchronization package, we want to keep the data in sync automatically. For this, we will configure the package to run every hour during workdays.

Skyvia uses LastModifiedData and CreatedDate fields to track changes in Salesforce, and it creates its own tracking tables and triggers to track changes in the database.

Note that since we used Salesforce connection as the source, its changes have a priority, and if a record was changed both in Salesforce and in the database, Salesforce changes are applied.

Perform the following actions to set the schedule:

  1. Click Schedule on the left side of the toolbar.
  2. Under Run every, select Week.
  3. Under Days of week, select checkboxes with all the workdays.

    Schedule screenshot 1

  4. Click Occur once at and select Recur every.

    Schedule screenshot 2

  5. Enter “1” (without quotes) into the Recur every box and click the Set time restrictions link.

    Schedule screenshot 3

  6. Enter 09:00 and 18:00 to the corresponding boxes.

    Schedule screenshot 4

  7. Click Now to put the schedule into action immediately or select At a specific time in the Starting list and specify the necessary date and time you want the schedule to be enabled from.

    Schedule screenshot 5

  8. Click Save to schedule package execution.

After this our package will run automatically every hour between 09:00 and 18:00 of every workday.

You can also visit Scheduling Packages to get more detailed information on setting a package schedule.