Azure HSM: Navigating Compliance for FSI
Introduction
As a technology enthusiast, I recently had the opportunity to dive deep into Azure Managed Hardware Security Modules (HSMs) for a financial services (FSI) customer. These powerful cryptographic guardians play a pivotal role in helping Non-Banking Financial Companies (NBFCs) meet stringent compliance requirements, such as FIPS 140-2 Level 3, set by the Reserve Bank of India (RBI). In this blog post, I’ll cover some best practices for implementing Azure Managed HSM, explore its practical applications, and guide you through its operational aspects.
What is Managed HSM? Why it's important for NBFCs
A managed HSM is a single-tenant, highly available, and FIPS 140-2 Level 3 validated hardware security module. Imagine a secure vault within Azure, purpose-built to protect your most sensitive secrets. Whether you’re safeguarding financial transactions, securing healthcare data, or ensuring the integrity of critical applications, Managed HSM has your back.
Managed HSM establishes a cryptographic boundary for key material using a unique security domain. This ensures that Microsoft cannot access your keys within the HSM. You retain full ownership and control of your cryptographic keys, which are isolated within the HSM to prevent unauthorized access.
The security domain is an encrypted blob file unique to each managed HSM instance. It contains critical artifacts such as:
the HSM backup, user credentials, the signing key, and the data encryption key.
Advantages of Managed HSM over Azure Key Vault (AKV):
Granular Access Control:
Per-key permissions enable fine-grained control over access.
Local RBAC model ensures designated HSM cluster administrators have full control.
Private Endpoints:
Securely connect to Managed HSM from your application using private endpoints.
Ensures data privacy by avoiding public internet access.
FIPS 140-2 Level 3 Validated HSMs:
Managed HSMs use Marvell Liquid Security HSM adapters.
Complies with stringent security standards.
Integrated Monitoring and Audit:
Fully integrated with Azure Monitor.
Provides complete logs of all activity.
Use Azure Log Analytics for analytics and alerts.
Data Residency:
Managed HSM ensures data doesn’t leave the region where the HSM instance is deployed.
Centralized Key Management:
Manage critical keys across your organization in one place while following the principle of least privilege.
Operational Excellence: Making HSM Private for secure access.
Access to a managed HSM is controlled through two interfaces:
Management plane: On the management plane, you manage the HSM itself. Operations in this plane include creating and deleting managed HSMs and retrieving managed HSM properties.
Data plane: On the data plane, you work with the data stored in a managed HSM, which is essentially the keys generated on the HSM or imported into it from another key manager.
Authorization
Two levels of permission are required to work with a managed HSM.
Azure RBAC: Governs all management plane operations on the HSM, including create/delete, backup/restore, networking, and managing the security domain.
Local RBAC: Role assignments at this level are scoped either to all keys or to a single key.
All key operations, i.e. create/retrieve/delete, are granted through local RBAC.
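To make the two local RBAC scopes concrete, here is a minimal sketch of a helper around `az keyvault role assignment create`. The function name `assign_hsm_role` and any example values are my own placeholders, not something taken from a real environment.

```shell
# Grant a Managed HSM local RBAC role at a given scope.
# Usage: assign_hsm_role <hsm-name> <assignee> <role> <scope>
#   scope "/keys"         -> all keys in the HSM
#   scope "/keys/<name>"  -> one specific key
assign_hsm_role() {
  local hsm_name="$1" assignee="$2" role="$3" scope="$4"

  az keyvault role assignment create \
    --hsm-name "$hsm_name" \
    --assignee "$assignee" \
    --role "$role" \
    --scope "$scope"
}
```

For example, `assign_hsm_role primary-hsm user@example.com "Managed HSM Crypto User" /keys/storage-cmk` would limit the assignment to a single (hypothetical) key, while passing `/keys` as the scope covers all keys.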
Networking
In the Networking section of the HSM, the options are either 'Allow All network' or 'Private Endpoints with allow trusted services.'
As a general rule of thumb, always prefer a private connection over a public one.
All Networks: The HSM is exposed over a public endpoint and is accessible over the internet by default.
Private Endpoint: The HSM is exposed only on your specific VNet, and only resources in that VNet have access to the HSM data plane.
The steps to create a private endpoint for an HSM are similar to those for any other Azure resource: specify the VNet and subnet for the PE, where your application/service has line of sight to the HSM PE. One thing to note is that a private endpoint restricts access to the HSM data plane from the Azure ARM interfaces, i.e. Portal/CLI. You would need an Azure VM in the same VNet, or in a peered VNet, to access and manage HSM keys from the Portal/CLI.
Error observed when accessing the HSM via the portal once PE is enabled for the HSM.
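The private endpoint setup described above can also be scripted. The following is a rough sketch, assuming the VNet and subnet already exist; the function name, the `-pe` and `-plink` name suffixes, and all argument values are placeholders I chose for illustration.

```shell
# Lock down a Managed HSM to private access and create its private endpoint.
# Usage: create_hsm_private_endpoint <subscription-id> <resource-group> <hsm-name> <vnet> <subnet> <location>
create_hsm_private_endpoint() {
  local sub_id="$1" rg="$2" hsm_name="$3" vnet="$4" subnet="$5" location="$6"
  local hsm_id="/subscriptions/${sub_id}/resourceGroups/${rg}/providers/Microsoft.KeyVault/managedHSMs/${hsm_name}"

  # Turn on the HSM firewall so public network access is denied by default.
  az keyvault update-hsm \
    --hsm-name "$hsm_name" \
    --resource-group "$rg" \
    --default-action deny

  # Create the private endpoint in the subnet that the application can reach.
  az network private-endpoint create \
    --name "${hsm_name}-pe" \
    --resource-group "$rg" \
    --vnet-name "$vnet" \
    --subnet "$subnet" \
    --private-connection-resource-id "$hsm_id" \
    --group-id managedhsm \
    --connection-name "${hsm_name}-plink" \
    --location "$location"
}
```

Name resolution is a separate step: clients in the VNet typically also need a private DNS zone for `privatelink.managedhsm.azure.net` linked to the VNet so the HSM hostname resolves to the private IP.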
Operational Excellence: Encryption with managed HSM
Based on your requirements, use HSM keys to encrypt data at rest on Azure services such as Blob Storage, PostgreSQL, MySQL, etc.
Let's take the example of encrypting an existing Blob Storage account with a customer-managed key (CMK) on the HSM. We already have the HSM configured with a private endpoint and the required RBAC role, "Managed HSM Crypto User."
As noted, once we enable PE on the HSM, we can't access its data plane, which means all key operations are restricted from the portal/CLI, including local RBAC management.
When trying to encrypt the storage account using the CMK option, after the HSM is selected, an error related to connectivity with the HSM data plane appears on the Storage account blade.
Note: "Allow Microsoft trusted services" is also enabled along with PE on the HSM.
I tested a couple more services, i.e. MySQL/PGSQL, and saw a similar error, which suggests this is common to all Azure data services that support encryption with CMK on HSM.
The primary reason for this error is that enabling private networking on the HSM locks down network access and only allows access via the private network, i.e. VNet-connected devices, to maximize security.
Even though the "Allow Microsoft trusted services" option is enabled, setting up encryption on Blob Storage fails because the requests to the HSM APIs go via the end user's browser, not via the Storage service IP range.
When we access the HSM interface from the Azure portal, the user's browser interacts with the managed HSM API. Even when configuring encryption for other services via the portal, the user's browser acts as the client that interacts with the ARM API for the HSM.
Here is a screenshot of the error when setting up encryption for a new Blob Storage account.
If an admin wants to perform operations on a private managed HSM, they need an Azure VM that has line-of-sight connectivity to the HSM private endpoint.
In this screenshot, users are able to configure encryption on Blob Storage via the Azure portal from an Azure VM.
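The same CMK configuration can be done from the CLI on that VM instead of the portal. This is a minimal sketch, assuming the storage account's managed identity already holds a suitable local RBAC role on the key (this post uses "Managed HSM Crypto User"); the function name and argument values are placeholders.

```shell
# Point a storage account's encryption at a key held in a Managed HSM.
# Usage: configure_storage_cmk <resource-group> <storage-account> <hsm-name> <key-name>
configure_storage_cmk() {
  local rg="$1" storage_account="$2" hsm_name="$3" key_name="$4"

  # Managed HSM data plane URI follows the <name>.managedhsm.azure.net pattern.
  local hsm_uri="https://${hsm_name}.managedhsm.azure.net"

  # No key version is pinned here, so the account follows the latest key version.
  az storage account update \
    --name "$storage_account" \
    --resource-group "$rg" \
    --encryption-key-source Microsoft.Keyvault \
    --encryption-key-vault "$hsm_uri" \
    --encryption-key-name "$key_name"
}
```

Whether this succeeds in your environment still depends on the network and RBAC settings discussed above, so treat it as a starting point rather than a guaranteed workaround.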
Resiliency: Disaster Recovery with HSM
HSM offers a multi-region feature that replicates data from the primary instance to a secondary instance.
Once an HSM replica is enabled in a secondary region, the pair functions as Active-Passive behind a backend Traffic Manager endpoint.
However, the replica instance isn't visible to users in their subscription; it functions in the backend as an extension of the primary instance.
Failover of the HSM is managed by Azure in case of a service outage affecting the primary instance.
The secondary instance is not visible in the portal/CLI interfaces, yet many data services such as Postgres and MySQL require the keystore/HSM to be available locally in the DR region.
Recommendation: deploy another HSM instance rather than a replica if you have configured the HSM with many native Azure DB services.
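For reference, if you do decide to use the multi-region replica described above, it can be added from the CLI. This is a small sketch, assuming your Azure CLI version includes the `az keyvault region` command group; the function names and region value are placeholders.

```shell
# Add a replica region to an existing Managed HSM.
# Usage: add_hsm_region <hsm-name> <region>
add_hsm_region() {
  local hsm_name="$1" region="$2"
  az keyvault region add --hsm-name "$hsm_name" --region "$region"
}

# List the regions the HSM is currently deployed to.
# Usage: list_hsm_regions <hsm-name>
list_hsm_regions() {
  local hsm_name="$1"
  az keyvault region list --hsm-name "$hsm_name" --output table
}
```

A call such as `add_hsm_region primary-hsm southindia` followed by `list_hsm_regions primary-hsm` should show whether the secondary region has been provisioned.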
Disaster Recovery: Backup Restore on Managed HSM
The easiest way is to set up another HSM instance in the DR region and perform a complete backup and restore. The HSM stores its backup on Blob Storage, so a security principal, i.e. a managed identity, needs access to the storage account with sufficient RBAC.
Note: you will need the security domain of the primary HSM when restoring the backup.
Create a managed identity and assign it the 'Storage Blob Data Contributor' RBAC role on the SA.
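The identity and role assignment step can be sketched as follows. The function name and arguments are placeholders, and the storage account is passed in as its full resource ID.

```shell
# Create a user-assigned managed identity and let it write blobs to the backup storage account.
# Usage: create_backup_identity <resource-group> <identity-name> <storage-account-resource-id>
create_backup_identity() {
  local rg="$1" identity_name="$2" storage_account_id="$3"
  local principal_id

  az identity create --resource-group "$rg" --name "$identity_name"

  # Look up the identity's principal (object) ID for the role assignment.
  principal_id=$(az identity show \
    --resource-group "$rg" \
    --name "$identity_name" \
    --query principalId \
    --output tsv)

  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Contributor" \
    --scope "$storage_account_id"
}
```

Role assignments can take a short while to propagate, so a backup attempted immediately afterwards may fail with an authorization error and succeed on retry.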
Associate the managed identity with your primary HSM to give it backup write permission on the SA (storage account).
az keyvault update-hsm --hsm-name primary-hsm --mi-user-assigned "/subscriptions/subid/resourcegroups/rgname/providers/Microsoft.ManagedIdentity/userAssignedIdentities/manageidentityname"
Once the MI is associated with the HSM, create a container on the SA for your HSM backups and trigger a backup using the Az CLI (backup/restore options are not available in the Azure portal).
az keyvault backup start --use-managed-identity true --hsm-name primary-hsm --storage-account-name hsmbackupsaname --blob-container-name container1 --subscription Subs-guid
Look for the success message in the response to the backup command. Now create another vanilla HSM instance in the DR region, but don't activate it.
Normally, after creating an HSM instance, we initialize it and download the new HSM's security domain, as mentioned at the start. However, since we're executing a DR procedure, we will instead enable security domain recovery mode on this HSM.
az keyvault security-domain init-recovery --hsm-name secondary-hsm --sd-exchange-key hsmrecoveryfilename
Collect/download the security domain of the primary HSM along with its 2/3 keys, based on the quorum configuration.
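With the exchange key from `init-recovery`, the primary HSM's security domain file, and a quorum of private keys in hand, the security domain can be uploaded to the DR HSM. This is a minimal sketch; the function name and all file names are placeholders.

```shell
# Upload the primary HSM's security domain to the DR HSM that is in recovery mode.
# Usage: upload_security_domain <dr-hsm-name> <exchange-key-file> <security-domain-file> <wrapping-key>...
upload_security_domain() {
  local hsm_name="$1" exchange_key="$2" sd_file="$3"
  shift 3  # the remaining arguments are the quorum of private key files

  az keyvault security-domain upload \
    --hsm-name "$hsm_name" \
    --sd-exchange-key "$exchange_key" \
    --sd-file "$sd_file" \
    --sd-wrapping-keys "$@"
}
```

With a quorum of two, you would pass two private key files as the trailing arguments, for example `upload_security_domain <dr-hsm> hsmrecoveryfilename primary-sd.json key1.key key2.key`.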
Before triggering the restore, make sure the secondary HSM has access to the Blob Storage where the backup is stored.
Now initiate the restore of the backup on the secondary HSM. I received an error at this point, which indicates that before any backup can be restored to an HSM, a backup of the target HSM must have been triggered within the last 30 minutes.
Once that backup completed, we reinitiated the restore on the secondary HSM, and this time it completed successfully.
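The backup-then-restore sequence above can be sketched as one helper. It assumes the DR HSM already has a managed identity with access to the storage account; the function name and arguments are placeholders, and the backup folder is the folder name reported by the primary HSM's backup.

```shell
# Take a fresh backup of the target HSM, then restore the primary's backup onto it.
# Usage: restore_hsm_backup <target-hsm> <storage-account> <container> <backup-folder>
restore_hsm_backup() {
  local target_hsm="$1" storage_account="$2" container="$3" backup_folder="$4"

  # The restore is only attempted if the backup of the target HSM succeeds.
  az keyvault backup start \
    --hsm-name "$target_hsm" \
    --storage-account-name "$storage_account" \
    --blob-container-name "$container" \
    --use-managed-identity true \
  && az keyvault restore start \
    --hsm-name "$target_hsm" \
    --storage-account-name "$storage_account" \
    --blob-container-name "$container" \
    --backup-folder "$backup_folder" \
    --use-managed-identity true
}
```

Chaining the two commands with `&&` keeps the restore inside the time window that the error message refers to and avoids starting a restore when the preceding backup failed.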
I have tried to cover the important areas around implementing HSM that came up during discussions with FSI customers. Feel free to share your thoughts or questions in the comments, and for more details on the HSM solution, refer to the official documentation.
Written by
Osama Shaikh
I have been working as an App/Infra Solution Architect with Microsoft for 5 years, helping a diverse set of customers across verticals, i.e. BFSI, ITES, and Digital Native, in their journey towards the cloud.