Enable Native Sensitive Data Discovery (SDD)

Only available with Snowflake integrations.

Required Immuta permission: APPLICATION_ADMIN and GOVERNOR

  1. Work with your Immuta representative to turn on native SDD for Snowflake (public preview) and Classification (private preview) in your Immuta instance. Specify the instance on which you would like SDD turned on.
  2. Once it has been turned on, navigate to the App Settings page and click Sensitive Data Discovery in the navigation menu.
  3. Select the Enable Sensitive Data Discovery (SDD) checkbox to enable SDD.
  4. Click Save and then click Confirm to apply your changes. Note that the Immuta instance will restart.
  5. Run SDD for a select group of data sources using one of the following options:

    1. Make the following request using a JSON file as the payload.

      curl \
          --request POST \
          --header "Content-Type: application/json" \
          --header "Authorization: Bearer dea464c07bd07300095caa8" \
          --data @example-payload.json \
          https://your-immuta-url.immuta.com/sdd/run
      

      In the JSON file, specify the data sources you want to run SDD on.

      {
      "sources":["Example Data Source Name", "Example Data Source 2 Name"]
      }
      
    2. Make the following request specifying the data sources in the request.

      curl \
          --request 'POST' \
          'https://your-immuta-url.immuta.com/sdd/run' \
          --header 'Content-Type: application/json' \
          --header 'Authorization: 438a3096966c4a5188b3b468cedb213e' \
          --data '{"sources":["Example Data Source Name", "Example Data Source 2 Name"]}'
      

    A successful request returns status code 200 and a body containing the number of jobs created by the request:

    {
        "jobCount": 2
    }
    
  6. Navigate to the Data Source Overview page of one of the data sources you listed in the payload.

  7. Click the Data Dictionary tab.
  8. Assess whether the Discovered tags applied are accurate.
  9. If they are, repeat steps 6-8 for the other data sources listed in the payload. If the tags are not accurate, you will need to tune SDD. Tuning SDD requires some up-front work, but it automates this process going forward. Follow the steps below to tune SDD:

    1. Create a new identification framework.
    2. Configure the resulting tags. Note that if you are using native SDD, the attributes minConfidence and sampleSize are not supported.
    3. Test the new identification framework with the data source. From here, either return to substep 2 and reconfigure the tags and confidence, or, if you are happy with the results, proceed to the next substep.
    4. Configure SDD to run your new framework on all data sources.
    5. Repeat steps 6-8 to assess the rest of the data sources.
    6. Once all data source tags seem appropriate, proceed with the rest of onboarding.
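
The payload used in step 5 can also be generated from a list of data source names instead of being written by hand. The following is a minimal sketch, not an official Immuta tool. The function names json_escape, build_sdd_payload, and run_sdd and the environment variables IMMUTA_URL and IMMUTA_AUTH are assumptions made for illustration; only the /sdd/run endpoint and the shape of the sources payload come from the requests above.

```shell
#!/bin/sh
# Sketch only: function and variable names are hypothetical, not part of Immuta.

# Escape backslashes and double quotes so a name is safe inside a JSON string.
# Control characters are not handled; this assumes ordinary data source names.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print a {"sources":[...]} payload for the data source names given as arguments.
build_sdd_payload() {
    payload='{"sources":['
    sep=''
    for name in "$@"; do
        payload="${payload}${sep}\"$(json_escape "$name")\""
        sep=','
    done
    printf '%s]}\n' "$payload"
}

# POST the payload to /sdd/run. Assumes IMMUTA_URL holds the base URL and
# IMMUTA_AUTH holds the full Authorization header value your instance expects.
# --fail makes curl exit with a non-zero status on an HTTP error response.
run_sdd() {
    build_sdd_payload "$@" | curl --fail --silent --show-error \
        --request POST \
        --header "Content-Type: application/json" \
        --header "Authorization: ${IMMUTA_AUTH}" \
        --data @- \
        "${IMMUTA_URL}/sdd/run"
}
```

With both variables exported, calling run_sdd "Example Data Source Name" "Example Data Source 2 Name" would send the same request as the options in step 5. Starting with a small group of data sources keeps the tag review in steps 6-8 manageable.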

Run SDD for all data sources

Required Immuta permission: AUDIT

Make the following request using the Immuta API to run SDD for all data sources, specifying all as true:

curl \
    --request 'POST' \
    'https://your-immuta-url.immuta.com/sdd/run' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: 438a3096966c4a5188b3b468cedb213e' \
    --data '{"all": true}'

A successful request returns status code 200 and a body containing the number of jobs created by the request:

{
    "jobCount": 12
}
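
When this request is scripted, it can help to pull jobCount out of the response so the script can report or check how many jobs were created. The following is a rough sketch under stated assumptions: extract_job_count and run_sdd_all are hypothetical helper names, IMMUTA_URL and IMMUTA_AUTH are assumed environment variables, and the response is assumed to have the shape shown above.

```shell
#!/bin/sh
# Sketch only: function and variable names are hypothetical, not part of Immuta.

# Read an /sdd/run response body on stdin and print the jobCount value.
# Prints nothing if the body has no numeric jobCount field.
extract_job_count() {
    sed -n 's/.*"jobCount"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Run SDD for all data sources and print the number of jobs created.
# Assumes IMMUTA_URL holds the base URL and IMMUTA_AUTH holds the full
# Authorization header value your instance expects.
run_sdd_all() {
    curl --fail --silent --show-error \
        --request POST \
        --header "Content-Type: application/json" \
        --header "Authorization: ${IMMUTA_AUTH}" \
        --data '{"all": true}' \
        "${IMMUTA_URL}/sdd/run" | extract_job_count
}
```

The sed pattern is a lightweight stand-in for a JSON parser; if a tool such as jq is available in your environment, it would be a more robust choice for reading the response.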