Snowflake

Snowflake is a scalable cloud-based data platform, which enables data storage, processing and analytic solutions that are faster, more flexible and easier to use. The Snowflake data platform is not built on any existing database technology or “big data” one, instead it combines a completely new SQL query engine with an innovative architecture natively designed for the cloud.

Snowflake-Specific Features and Limitations

Skyvia has the following limitations for Snowflake:

  • Skyvia supports the following types of import operations for Snowflake: INSERT, UPDATE, DELETE. Please note that UPSERT operation is not supported.
  • Replication is supported for Snowflake, but it works ONLY in the Bulk Load mode (the checkbox Use Bulk Import should be selected). To connect to Snowflake, Schema and Files Storage for Bulk parameters are also necessary.
  • Synchronization is not supported for Snowflake.

Establishing Connection

To create a connection to Snowflake, select the authentication type and specify the required connection parameters. Skyvia supports the following authentication types for Snowflake connection:

OAuth Authentication

To connect to Snowflake using OAuth, you have to create an OAuth security integration in Snowflake first and obtain the Client Id and Client Secret.

Creating Snowflake OAuth Security Integration

If you want to use OAuth authentication in Skyvia, create a Snowflake OAuth Security Integration first. To do that, run the following script in Snowflake.

1
2
3
4
5
6
7
8
9
10
USE ROLE ACCOUNTADMIN;
 
CREATE SECURITY INTEGRATION Skyvia
TYPE = OAUTH
ENABLED = TRUE
OAUTH_CLIENT = CUSTOM
OAUTH_CLIENT_TYPE = 'CONFIDENTIAL'
OAUTH_REDIRECT_URI = 'https://app.skyvia.com/oauthcallback/snowflake'
OAUTH_ISSUE_REFRESH_TOKENS = TRUE
OAUTH_REFRESH_TOKEN_VALIDITY = 7776000;

The refresh token validity default value is 7776000 (90 days). If you need to change the refresh token validity value, ask your Snowflake account administrator to send a request to Snowflake Support.

Viewing OAuth Security Integration

To view the created integration, run the following command:

1
2
3
USE ROLE ACCOUNTADMIN;

DESC SECURITY INTEGRATION SKYVIA;

view integtration

See more information about Snowflake security integrations here.

Getting Client Id and Client Secret

To obtain the Client Secret and Client Id run the following command.

1
2
3
USE ROLE ACCOUNTADMIN;

SELECT system$show_oauth_client_secrets('SKYVIA');

Client Secrets

Creating OAuth Connection

To connect to Snowflake using the OAuth 2.0 create the OAuth Security Integration and perform the following steps:

Oauth Connection Editor

  1. Specify the Snowflake account domain.
  2. Enter Client Id.
  3. Enter Client Secret.
  4. Click Sign In with Snowflake and enter your credentials.
  5. Specify the name of the database you want to connect to.

Basic Authentication

To connect to Snowflake using basic authentication, specify the following required connection parameters:

Basic Authorization

  • Domain is a Snowflake account domain.
  • User is a user name to log in with.
  • Password is a password to log in with.
  • Database is a database name.

Key-pair Authentication

To connect to Snowflake using the key pair, enter your Snowflake domain, user name and specify the private key

Getting Credentials for Key-pair Authentication

To obtain the public-private key pair, perform the following steps:

  1. Generate a private key using OpenSSL tool.
  2. Generate the public key for your private key.
  3. Assign the public key to your Snowflake user.

Generating a Private Key

You can use an encrypted or unencrypted private key. To generate an unencrypted private key, run the following command.

1
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out <key_name> -nocrypt

To generate an encrypted private key, run the command below and set an encryption password, the passphrase that you need to provide while connecting to Snowflake using the key pair authentication.

1
openssl genrsa 2048 | openssl pkcs8 -topk8 -v2 des3 -inform PEM -out <key_name>

Replace the <key_name> placeholder values in the commands above with your own key name.

The private key is generated in the PEM format.

1
2
3
-----BEGIN ENCRYPTED PRIVATE KEY-----
MIIFHDBOBgk...
----END ENCRYPTED PRIVATE KEY-----

Save the private key file in a secure location.

Generating a public key

To use key-pair authentication, generate a public key for the private key created earlier. To do that open a terminal window, and on the command line, run the following command.

1
openssl rsa -in <private_key_file> -pubout -out <public_key_file>

Replace the placeholder values in the command above with your own public key file name. If you are generating a public key for an encrypted private key, you will need to provide the encryption password used to create the private key. The public key is generated in the PEM format.

1
2
3
-----BEGIN PUBLIC KEY-----
MIIBIjANBgk...
-----END PUBLIC KEY-----

Save the public key file in a secure location.

Associating Public Key with Snowflake User

To associate the created public key with the Snowflake user, go to Snowflake and run the following command.

1
ALTER USER <your_snowflake_user> SET RSA_PUBLIC_KEY='MIIBIjANBgk...';

Replace the <your_snowflake_user> placeholder with your user name values. Specify the public key value omitting the —–BEGIN PUBLIC KEY—– and —–END PUBLIC KEY—– comments.

Creating Key-pair Connection

To connect to Snowflake using the key-pair authentication, specify the following required connection parameters.

Key-pair connection

  • Domain is a Snowflake account domain.
  • User is a user name to log in with.
  • Private Key is an RSA private key. If you use the encrypted private key, also specify the Passphrase - the password for RSA encrypted private key.

Additional Connection Parameters

If you need to perform import or replication, you will also have to specify some additional parameters described below.

  • Schema is a current schema name in the database. Although this parameter is optional, you need to specify it when using replication or import in the Bulk Load mode.
  • Warehouse is a name of the warehouse used for a database.
  • Role is a role name used to connect.

Advanced Settings

  • Command Timeout — this parameter specifies the wait time before terminating an attempt to execute a command and generating an error. Note that it is the time to wait for any server reply since the command was sent to a server, and it doesn’t include the time necessary to fetch data if the command selects some data.
  • STRING(38) As Guid — this parameter specifies whether to interpret values in the columns of the STRING (38) type as Guid values.
  • Files Storage for Bulk — you need to specify an external temporary file storage service to perform replication or import in the Bulk Load mode in Snowflake. This parameter allows you to select such possible external file storage services as None, Amazon S3 and Azure. Depending on the selected storage service, you need to specify additional parameters, which are mandatory to connect to one or another storage service.
  • Use Bulk Import — this parameter appears when you select Amazon S3 or Azure as an external file storage. For bulk loading, it is necessary to specify an external service for storing temporary files. You can also disable bulk import by clearing the Use Bulk Import checkbox. This parameter affects only import integrations with the INSERT, UPDATE and DELETE operation and with Snowflake as a target.

    By default, such integrations import data in the following way: Skyvia writes data into multiple temporary CSV files, uploads them to Amazon S3 or Azure (depending on which one you’ve selected) and then tells Snowflake to import data from these CSV files. These actions are performed simultaneously, providing the best performance. After the CSV files are imported, they are deleted. However, when data are imported this way, it’s not possible to obtain a per-record error log.

    If you disable bulk import, Skyvia will use INSERT, UPDATE and DELETE statements for importing data. This allows you to obtain a per-record error log but provides far less performance. Thus, disabling bulk import for Snowflake is not recommended and should only be considered if you need to import not that many records and need to have a per-record error log or you don’t want to use external storage for temporary files.

When selecting Amazon S3 Storage from Files Storage for Bulk, you need to specify the following additional parameters:

  • AWS Access Key ID — first part of your Amazon Web Services access key.
  • AWS Secret Key — second part of your Amazon Web Services access key. Read more about AWS access keys here.
  • AWS Security Token — an alternative option to AWS Access Key ID and AWS Secret Key pair. The Amazon Web Services security token is a temporary limited-privilege credential.
  • S3 Region — AWS region, where your S3 storage is hosted.
  • S3 Bucket Name — it is an optional parameter. You may specify the name of your S3 bucket to temporarily load the file with imported or replicated data to. If you leave it empty, Skyvia will create a new bucket when importing or replicating data and delete it after the operation is finished.

When selecting Azure Blob Storage from Files Storage for Bulk, you need to specify the following additional parameters:

  • Azure Storage Account — Azure storage account name.
  • Azure Storage Account Key — your 512-bit storage access key.
  • Azure Storage Endpoints Protocol — this parameter determines the protocol to use (HTTPS or HTTP). You may optionally change the protocol to use from default https to http, but this is not recommended.