A guide to securing sensitive data in Snowflake

This article explains how Snowflake supports Row-Level Security (fine-grained access control—FGAC). It describes a simple but powerful security architecture successfully deployed by some of the largest enterprises using Snowflake globally. It also explains how row-level access control fits into an overall Role-Based Access Control (RBAC) architecture and how this powerful but flexible solution can provide complete control over data access.

What is Row Level Security?

The diagram below illustrates the primary challenge: ensuring data is only visible to specific individuals who need to know. If the user (or, more likely, their role) is not allowed to view the data, it's simply not visible. Even the DBAs, account admins, and table owners cannot access the data unless authorized.

While RBAC secures access to Tables, row level access is used to control access to sub-sets of the data. In the above diagram users are authorised to view data for one or more REGIONS, and the System Administrator has no access to the data at all.

Ideal Row Level Security Features

The ideal features of a Row Access Security system include:

Central Management: The ability to define a single, centrally managed set of rules or policies controlling sensitive data access.
Easy Deployment: Ideally, we'd like to define a policy once and deploy it against as many tables, schemas, or databases as needed.
Simplified Change Management: With centrally defined policies, we'd like to change the data access rules without having to reapply them again.
Segregation of Duties: This includes the ability of a central administrator to decide which data needs to be protected independently of the data owner.
Integration with RBAC: The solution must integrate with the overall role-based access control (RBAC) architecture without adding significant complexity.

How does Snowflake implement Row Level Security?

Snowflake achieves all of the above using Row Access Policies. A row access policy is a small, centrally defined procedure that returns a Boolean value (TRUE or FALSE) depending on whether the user can view the specific row. The steps to defining row-level security include:

Decide which table or view needs to be secured. For example, the SALES table, which has access controlled by REGION.
Create an ENTITLEMENTS table that records the roles allowed to view the data for each REGION.
Create a Row Access Policy to implement the rule. This will typically query the ENTITLEMENTS table and return TRUE if the executing user has the relevant sensitive data role.
Deploy the Row Access Policy against the SALES table.

The diagram below illustrates the key components of the solution.

The components include:

Sales Table - This holds the data we need to secure.
Entitlements Table - Which records for each REGION which roles are allowed to view the data.
Row Access Policy - Which enforces the security access rules.
Sensitive Data Roles - These are granted to users to integrate the solution into the overall RBAC architecture.

Effectively, the ENTITLEMENTS table maps which ROLES are allowed to view data for each REGION, and the Row Access Policy implements the rule. Once a Row Access Policy is applied to the SALES table, nobody will have access unless they are granted the appropriate Sensitive Data Role, and even then, only to the data they are authorized to view.

Deploying a Snowflake Row Access Policy?

Assuming we need to secure access to SALES data by REGION, let’s assume we have a simple table as follows:

We need to create an ENTITLEMENTS table to map the REGION code to the name of a suitable ROLE used to control access to the data. The SQL below illustrates the code to build and populate this table.

create or replace table entitlements (
    domain               varchar
,   region               varchar
,   accessible_to_role   varchar);

insert into entitlements values 
('SALES', 'NORTH', 'SALES_NORTH'),
('SALES', 'SOUTH', 'SALES_SOUTH'),
('SALES', 'EAST',  'SALES_EAST'),
('SALES', 'WEST',  'SALES_WEST'),
('SALES', 'NORTH', 'SALES_ALL_REGIONS'),
('SALES', 'SOUTH', 'SALES_ALL_REGIONS'),
('SALES', 'EAST',  'SALES_ALL_REGIONS'),
('SALES', 'WEST',  'SALES_ALL_REGIONS');

Although initially, we only needed to control access to SALES data, the design allowed us to extend the solution to control access to any data on the system. We need to enter values for a different DOMAIN and add one or more columns to record which values are used to map to the access role.

When we query the ENTITLEMENTS table, we see the following results, which show that each REGION is mapped to the appropriate ROLE.

After creating the SALES and ENTITLEMENTS tables, we need to create the Snowflake Row Access Policy to restrict who can view data. The SQL below shows an example of the code needed.

create or replace row access policy sales_by_region as (sales_region varchar) returns boolean ->
exists (
          select 1 
          from  entitlements e
          where e.domain = 'SALES'
          and   e.region = sales_region
          and   contains(current_available_roles(),e.accessible_to_role));

This simple routine returns either TRUE or FALSE to indicate whether the user can view the data for the specific row. It performs a fast lookup of the ENTITLEMENTS table for the given REGION and checks whether the user has the appropriate sensitive data role.

It's important to note that we use the CURRENT_AVAILABLE_ROLES() function to check whether the current user has been granted the appropriate role. This need not be their current_role(), but has to be given to any role this user has been provided.

The final step is to apply the Row Access Policy to any table or view that includes sensitive data. The SQL below shows the method.

alter table sales add row access policy sales_by_region on (region);

Testing Row Access Security

To test the security, we granted our user the appropriate Sensitive Data ROLEs and executed the following simple SQL.

select *
from sales;

The results returned below show only the authorised data for the WEST region is visible

.However, when we also granted the role NORTH_SALES, the query returned the following:

Integrating the solution with RBAC

Role-based access Control (RBAC) can be extremely challenging to design and maintain, and it's important to avoid adding to an already complex problem. However, because of the way we implemented row-level security, it's remarkably easy to deploy.

The diagram below illustrates a simple RBAC architecture that allows the SALES team and their MANAGERS the ability to read data from the SALES schema.

When we deploy row-based access control, we need to deploy a sequence of ROLES to control access to the sensitive data. The diagram below illustrates a potential solution whereby the roles are either granted directly to individual users or (in the case of the management team) to the PROD_SALES_MANAGER role.

Notice that the Sensitive Data Roles don't need to be directly granted access to the underlying data. This means we keep the RBAC solution utterly separate from handling sensitive data, which hugely simplifies the solution and allows us to separate these two challenges.

Furthermore, we can also deliver automatic data masking for specific columns (for example, holding Personally Identifying Data (PII), described in the article, What is Snowflake Dynamic Data Masking?

Conclusion

As enterprises store increasing volumes of potentially sensitive data in Snowflake, we must control access. I would argue, it's even more important to ensure the solution is simple, otherwise you'll never manage to fully understand it, let alone prove the system is secure and only grants data access to those who need it.

Using the techniques described here, we can deliver a highly scalable but simple solution to controlling access to sensitive data. This builds upon the RBAC framework described in the sequence of articles, including Introducing Role-Based Access Control and What is Snowflake Dynamic Data Masking?

Using these articles as a guideline, you should be able to implement Snowflake's best practices for both RBAC and handling sensitive data. Just keep it simple.

Securing Data with Row Level Security in Snowflake