Data Source Health Checks

When an Immuta data source is created, background jobs use the connection information provided to compute health checks. Which checks run depends on the type of data source and how it was configured. The data source health checks include the following:

- blob crawl status: indicates whether the blob was successfully crawled.
- column detection status: indicates whether the job that determines if a column was added to or removed from the remote table registered as an Immuta data source ran successfully.
- external catalog link status: indicates whether the external catalog was successfully linked to the data source.
- fingerprint generation status: indicates whether the data source fingerprint was successfully generated.
- framework classification status: indicates whether classification was successfully run on the data source to determine the sensitivity of the data source.
- global policy applied status: indicates whether global policies were successfully applied to the data source.
- high cardinality calculation status: indicates whether the data source's high cardinality column was successfully calculated.
- native SQL sync status (for Snowflake data sources): indicates whether Snowflake governance policies were successfully synced.
- native SQL view creation status (for Redshift data sources): indicates whether native views were properly created for Redshift tables registered in Immuta.
- row count status: indicates whether the number of rows in the data source was successfully calculated.
- schema detection status: indicates whether the job that determines if a remote table was added to or removed from the schema ran successfully.
- sensitive data discovery status: indicates whether sensitive data discovery was successfully run on the data source.

After these jobs complete, the health status of each check is updated to one of four values: passed, skipped, unknown, or failed.
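The four possible outcomes can be modeled as a small enumeration when post-processing health information in your own tooling. The following is a minimal sketch only: the `HealthStatus` enum, the `summarize` and `failed_checks` helpers, and the dictionary shape are hypothetical names chosen for illustration and are not part of the Immuta API.

```python
from collections import Counter
from enum import Enum


class HealthStatus(Enum):
    """The four outcomes a health check can report (hypothetical model)."""
    PASSED = "passed"
    SKIPPED = "skipped"
    UNKNOWN = "unknown"
    FAILED = "failed"


def summarize(checks: dict[str, HealthStatus]) -> dict[str, int]:
    """Count how many checks ended in each status, including zero counts."""
    counts = Counter(status for status in checks.values())
    return {status.value: counts.get(status, 0) for status in HealthStatus}


def failed_checks(checks: dict[str, HealthStatus]) -> list[str]:
    """Return the names of checks that failed, sorted alphabetically."""
    return sorted(
        name for name, status in checks.items() if status is HealthStatus.FAILED
    )


# Example input: check names mapped to their status (illustrative values only).
example = {
    "row count": HealthStatus.PASSED,
    "fingerprint generation": HealthStatus.FAILED,
    "blob crawl": HealthStatus.SKIPPED,
}
print(summarize(example))
print(failed_checks(example))
```

Keeping "skipped" and "unknown" distinct from "failed" in the summary avoids treating a check that never ran as a problem with the data source.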

These background jobs can be disabled during data source creation by adding a specific tag that prevents automatic table statistics. A system administrator can set this prevent-statistics tag on the app settings page. However, with automatic table statistics disabled, the following policies are unavailable until the data source owner manually generates the fingerprint:

- Masking with format preserving masking
- Masking with k-anonymization
- Masking using randomized response

Unhealthy Databricks data sources

A Databricks data source may fail its row count query, and therefore appear unhealthy, if the query runs against a cluster that has the Databricks Query Watchdog enabled.

Limitations

Health checks are not run on data sources with more than 1600 columns, but these data sources still appear as healthy. For such data sources, the health check cannot be run either automatically or manually.
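Because a wide data source appears healthy without having been checked, it can be useful to flag such data sources separately in your own inventory. The sketch below assumes you already have the column count for each data source; the constant and function names are hypothetical and do not correspond to any Immuta endpoint or setting.

```python
# Column limit taken from the limitation described above.
MAX_COLUMNS_FOR_HEALTH_CHECKS = 1600


def health_checks_apply(column_count: int) -> bool:
    """Return True if a data source with this many columns is within the limit."""
    return column_count <= MAX_COLUMNS_FOR_HEALTH_CHECKS


def unverified_sources(column_counts: dict[str, int]) -> list[str]:
    """List data sources whose healthy status is not backed by a health check."""
    return sorted(
        name
        for name, count in column_counts.items()
        if not health_checks_apply(count)
    )


# Illustrative inventory: data source name mapped to its column count.
inventory = {"wide_events": 2000, "customers": 12}
print(unverified_sources(inventory))
```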