Create a Snowflake Data Source

This page details how to register Snowflake data sources using the legacy workflow. To register data sources using the simplified workflow, see this how-to guide.

Requirements

  • CREATE_DATA_SOURCE Immuta permission
  • USAGE Snowflake privilege on the schema and database
  • REFERENCES Snowflake privilege on the tables
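
As a rough illustration of the Snowflake side of these requirements, the sketch below builds GRANT statements that would give a role USAGE on a database and schema and REFERENCES on the tables in that schema. The helper `build_grants` and the role, database, and schema names are hypothetical and are not part of Immuta or Snowflake; adapt the statements to your own privilege model before running them.

```python
def build_grants(role: str, database: str, schema: str) -> list[str]:
    """Return Snowflake GRANT statements for the privileges listed above.

    All identifiers are hypothetical placeholders supplied by the caller.
    """
    qualified_schema = f"{database}.{schema}"
    return [
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT USAGE ON SCHEMA {qualified_schema} TO ROLE {role};",
        f"GRANT REFERENCES ON ALL TABLES IN SCHEMA {qualified_schema} TO ROLE {role};",
    ]


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being run in Snowflake.
    for statement in build_grants("IMMUTA_REGISTRATION_ROLE", "ANALYTICS", "PUBLIC"):
        print(statement)
```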

Enter connection information

  1. Click the plus button in the top left of the Immuta console.
  2. Select New Data Source.
  3. Select the Snowflake tile in the Data Platform section.

  4. Complete these fields in the Connection Information box:

    • Server: hostname or IP address
    • Port: port configured for Snowflake, typically port 443
    • SSL: when enabled, ensures communication between Immuta and the remote database is encrypted
    • Warehouse: Snowflake warehouse that contains the remote database
    • Database: remote database

    Best practice: Use SSL

    Although not required, all connections should use SSL. Additional connection string arguments may also be provided.

    Note: Only Immuta uses the connection you provide, and it injects all policy controls when users query the system. In other words, users always connect through Immuta with policies enforced and have no direct association with this connection.

  5. From the Select Authentication Method dropdown, select Username and Password, Key Pair Authentication, or Snowflake External OAuth:

    Username and password

    1. Complete the following fields:

      • Username: This username will be used to connect to the remote database and retrieve records for this data source.
      • Password: This password will be used with the above username to connect to the remote database.
    2. You can then choose to enter Additional Connection String Options or Upload Certificates to connect to the database.

    Key pair authentication

    1. Complete the Username field. This username will be used to connect to the remote database and retrieve records for this data source.
    2. Opt to enter the private key file password in the Additional Connection String Options. Use the following format: PRIV_KEY_FILE_PWD=<your_pw>
    3. Click Select a File, and upload a Snowflake key pair file.

    Snowflake External OAuth

    File naming convention

    If you are uploading more than one file, ensure the certificate used for the OAuth authentication has the key name "oauth client certificate."

    1. Fill out the Token Endpoint, which is where the generated token is sent. It is also known as aud (Audience) and iss (Issuer).
    2. Fill out the Client ID, which is the subject of the generated token. It is also known as sub (Subject).
    3. To use a certificate, keep the Use Certificate checkbox enabled and complete the steps below. You cannot pass a client secret if you use this method for obtaining the access token.
      1. Opt to fill out the Resource field with a URI of the resource where the requested token will be used.
      2. Enter the x509 Certificate Thumbprint. This identifies the corresponding key to the token and is often abbreviated as x5t or is called sub (Subject).
      3. Upload the PEM Certificate, which is the client certificate that is used to sign the authorization request.
    4. To pass a client secret, uncheck the Use Certificate checkbox and complete the fields below. You cannot use a certificate if you use this method for obtaining the access token.
      1. Scope (string): The scope limits the operations and roles allowed in Snowflake by the access token. See the Snowflake documentation for details about creating scopes for External OAuth.
      2. Client Secret (string): Immuta uses this secret to authenticate with the authorization server when it requests a token.
  6. Click the Test Connection button.

    If the connection is successful, a check mark and successful connection notification will appear and you can proceed. You must be able to connect to this data source using the connection information that you just entered to proceed.
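
For the client secret path in step 5, the Client ID, Client Secret, and Scope fields correspond to the parameters of a standard OAuth 2.0 client credentials request. The sketch below only assembles the form-encoded body that such a request would carry. It does not contact any server, and it does not describe how Immuta builds its own request. The helper name and all values, including the scope string, are hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def client_credentials_body(client_id: str, client_secret: str, scope: str) -> str:
    """Form-encode the parameters of an OAuth 2.0 client credentials grant."""
    return urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope,
        }
    )


if __name__ == "__main__":
    body = client_credentials_body(
        client_id="example-client-id",   # hypothetical value
        client_secret="example-secret",  # hypothetical value
        scope="session:role:ANALYST",    # hypothetical scope value
    )
    # A client would POST this body to the Token Endpoint with the header
    # Content-Type: application/x-www-form-urlencoded.
    print(parse_qs(body))
```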

Considerations

  • Immuta pushes down joins to be processed on the native database when possible. To ensure this happens, make sure the connection information matches between data sources, including host, port, SSL, username, and password. Joins against the same database will perform worse if this information doesn't match.
  • If a client certificate is required to connect to the source database, you can add it in the Upload Certificates section at the bottom of the form.
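
One way to check the join pushdown consideration before registering several data sources is to compare their connection details field by field. The sketch below is a hypothetical helper, not an Immuta API. The `ConnectionInfo` type and its field names are assumptions that mirror the fields listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ConnectionInfo:
    """Hypothetical record of the fields that must match for join pushdown."""
    host: str
    port: int
    ssl: bool
    username: str
    password: str


def mismatched_fields(a: ConnectionInfo, b: ConnectionInfo) -> list[str]:
    """Return the names of the fields whose values differ between a and b."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


if __name__ == "__main__":
    first = ConnectionInfo("example.snowflakecomputing.com", 443, True, "svc_user", "secret")
    second = ConnectionInfo("example.snowflakecomputing.com", 443, False, "svc_user", "secret")
    # Any field reported here would need to be aligned between the two sources.
    print(mismatched_fields(first, second))
```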

Select virtual population

  1. Decide how to virtually populate the data source by selecting Create sources for all tables in this database and monitor for changes or Schema/Table.

  2. Complete the workflow for Create sources for all tables in this database and monitor for changes or Schema/Table selection:

    Create sources for all tables in this database and monitor for changes

    Selecting this option will create and keep in sync all data sources within this database. New schemas will be automatically detected and the corresponding data sources and schema projects will be created.

    Schema/table

    Selecting this option will create and keep in sync all tables within the schemas selected. No new schemas will be detected.

    1. If you choose Schema/Table, click Edit in the table selection box that appears.
    2. By default, all schemas and tables are selected. Select and deselect by clicking the checkbox to the left of the name in the Import Schemas/Tables menu. You can create multiple data sources at one time by selecting an entire schema or multiple tables.
    3. After making your selections, click Apply.

Enter basic information

Provide information about your source to make it discoverable to users.

  1. Enter the SQL Schema Name Format, which will be the SQL name that the data source exists under in Immuta. It must include a schema macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.
  2. Enter the Schema Project Name Format to be the name of the schema project in the Immuta UI. If you enter a name that already exists, the name will automatically be incremented. For example, if the schema project Customer table already exists and you enter that name in this field, the name for this second schema project will automatically become Customer table 2 when you create it.

    1. When selecting Create sources for all tables in this database and monitor for changes, you may personalize this field as you wish, but it must include a schema macro.
    2. When selecting Schema/Table, this field is prepopulated with the recommended project name, and you can edit it freely.
  3. Select the Data Source Name Format, which will be the format of the name of the data source in the Immuta UI.

    <Tablename>

    The data source name will be the name of the remote table, and the case of the data source name will match the case of the macro.

    <Schema><Tablename>

    The data source name will be the name of the remote schema followed by the name of the remote table, and the case of the data source name will match the cases of the macros.

    Custom

    Enter a custom template for the Data Source Name. You may personalize this field as you wish, but it must include a tablename macro. The case of the macro will apply to the data source name (i.e., <Tablename> will result in "Data Source Name," <tablename> will result in "data source name," and <TABLENAME> will result in "DATA SOURCE NAME").

  4. Enter the SQL Table Name Format, which will be the format of the name of the table in Immuta. It must include a table name macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.
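
The naming rules in this section can be sketched in a few lines. The code below is an illustration only, not Immuta's implementation. It assumes the macros are written as `<schema>` and `<tablename>`, that any mixed-case macro behaves like `<Tablename>`, and that the 255-character limit applies to the format string itself. The function names are hypothetical.

```python
import re

# Matches the two macros in any letter case, e.g. <Tablename> or <SCHEMA>.
MACRO = re.compile(r"<(tablename|schema)>", re.IGNORECASE)
# Characters allowed in a SQL name format, plus the lowercase macros.
SQL_FORMAT = re.compile(r"(?:[a-z0-9_]|<tablename>|<schema>)+")


def apply_case(macro_name: str, value: str) -> str:
    """Apply the macro's letter case to the substituted value."""
    if macro_name.isupper():
        return value.upper()
    if macro_name.islower():
        return value.lower()
    return value.title()  # assumption: mixed case means title case


def render_name(template: str, schema: str, table: str) -> str:
    """Expand the macros in a data source name template."""
    values = {"schema": schema, "tablename": table}

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return apply_case(name, values[name.lower()])

    return MACRO.sub(substitute, template)


def is_valid_sql_format(fmt: str, required_macro: str) -> bool:
    """Check length, allowed characters, and presence of the required macro."""
    return (
        len(fmt) <= 255
        and required_macro in fmt
        and SQL_FORMAT.fullmatch(fmt) is not None
    )


if __name__ == "__main__":
    for template in ("<Tablename>", "<tablename>", "<TABLENAME>"):
        print(template, "->", render_name(template, "sales", "data source name"))
    print(is_valid_sql_format("immuta_<tablename>", "<tablename>"))
```

Calling `render_name` with the three macro spellings reproduces the three casings described for the Custom option, and `is_valid_sql_format` rejects formats that contain uppercase letters, omit the required macro, or exceed the length limit.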

Enable or disable schema monitoring

Schema monitoring best practices

Schema monitoring is a powerful tool that ensures tables are all governed by Immuta.

  • Consider using schema monitoring later in your onboarding process, not during your initial setup and configuration when tables are not in a stable state.
  • Consider using Immuta’s API to either run the schema monitoring job when your ETL process adds new tables or to add new tables.
  • Activate the new column added templated global policy to protect potentially sensitive data. This policy nulls new columns until a data owner reviews them, which prevents data leaks from columns that are added without review.

When selecting the Schema/Table option, opt to enable Schema Monitoring by selecting the checkbox in this section.

Note: This step will only appear if all tables within a server have been selected for creation.

Opt to configure advanced settings

Although not required, completing these steps will help maximize the utility of your data source.

Column detection

This setting monitors when remote tables' columns have been changed, updates the corresponding data sources in Immuta, and notifies data owners of these changes.

To enable, select the checkbox in this section.

See Schema projects overview to learn more about column detection.

Event time

An event time column denotes the time associated with records returned from this data source. For example, if your data source contains news articles, the time that the article was published would be an appropriate event time column.

  1. Click the Edit button in the Event Time section.
  2. Select the column(s).
  3. Click Apply.

Selecting an event time column will enable

Latency

This setting impacts how often Immuta checks for new values in a column that is driving row-level redaction policies. For example, if you are redacting rows based on a country column in the data, and you add a new country, it will not be seen by the Immuta policy until this period expires.

  1. Click Edit in the Latency section.
  2. Complete the Set Time field, and then select MINUTES, HOURS, or DAYS from the subsequent dropdown menu.
  3. Click Apply.
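
To make the effect of the latency window concrete, the sketch below models it as a simple expiry check: values seen at the last check are reused until the configured period has elapsed. This is an illustration under that assumption, not Immuta's implementation, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Maps the dropdown choices to timedelta keyword arguments.
UNITS = {"MINUTES": "minutes", "HOURS": "hours", "DAYS": "days"}


def latency(amount: int, unit: str) -> timedelta:
    """Convert a Set Time value and its unit into a timedelta."""
    return timedelta(**{UNITS[unit]: amount})


def needs_refresh(last_checked: datetime, now: datetime, window: timedelta) -> bool:
    """Return True once the latency window since the last check has elapsed."""
    return now - last_checked >= window


if __name__ == "__main__":
    window = latency(1, "HOURS")
    last_checked = datetime(2024, 1, 1, 0, 0)
    # A country added 30 minutes after the last check is not yet visible.
    print(needs_refresh(last_checked, last_checked + timedelta(minutes=30), window))
    # After the full hour, the column values would be checked again.
    print(needs_refresh(last_checked, last_checked + timedelta(hours=1), window))
```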

Sensitive data discovery

Data Owners can disable Sensitive Data Discovery for their data sources in this section.

  1. Click Edit in this section.
  2. Select Enabled or Disabled in the window that appears, and then click Apply.

Data source tags

Adding tags to your data source allows users to search for the data source by tag and allows governors to apply global policies to it. Note that if schema monitoring is enabled, any tags added now will also be added to the tables that are detected. Tags can also be added after you create your data source from the data source details page on the overview tab or the data dictionary tab.

  1. Click the Edit button in the Data Source Tags section.
  2. Begin typing in the Search by Tag Name box to select your tag, and then click Add.

Create the data source

Click Create to register your data source.