Introduction to Azure Storage

Ashwin
10 min read

Azure Storage is Microsoft's highly available, secure, and massively scalable cloud storage solution designed to cater to modern data storage needs. It is a storage platform that can be used for various data objects in the cloud. These data objects can be accessed globally over HTTP or HTTPS, making it a versatile choice for developers and businesses. Additionally, Azure Storage provides tools like the Azure Portal, Azure PowerShell, Azure CLI, and Azure Storage Explorer for easy interaction and management. Understanding the various components and options of Azure Storage is essential for building real-world cloud solutions.

Azure Storage: Core Services

Azure Storage provides a suite of managed, scalable, and secure solutions tailored to different data types and use cases:

1. Blob Storage

  • Use: Object storage for unstructured data such as media files, backups, logs, and big data analytics.

  • Features: Offers multiple tiers (Hot, Cool, Cold, Archive) for cost optimization, high durability, geo-replication, encryption, and integrated access control.

2. File Storage (Azure Files)

  • Use: Managed file shares via SMB or NFS, ideal for replacing on-premises file shares or enabling collaborative access.

3. Table Storage

  • Use: NoSQL, schema-less storage for structured/semi-structured data like catalog items, logs, or IoT data.

4. Queue Storage

  • Use: Message queuing for decoupling application components and enabling asynchronous processing.

5. Disk Storage

  • Use: Persistent, block-level storage for Azure VMs—manages high-performance virtual disks (standard SSD/HDD, premium SSDs, Ultra).
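As a sketch of the decoupling idea behind Queue Storage (service 4 above), here's the same pattern using Python's standard library as a local stand-in for the service; the message names are made up for illustration:

```python
import queue
import threading

# A stand-in for an Azure Storage queue: the producer enqueues work and
# moves on; a worker thread processes messages asynchronously.
messages = queue.Queue()
results = []

def worker():
    while True:
        msg = messages.get()
        if msg is None:              # sentinel: stop the worker
            break
        results.append(msg.upper())  # pretend "processing"
        messages.task_done()

t = threading.Thread(target=worker)
t.start()

# The front-end component only enqueues; it never waits on processing.
for order in ["order-1", "order-2", "order-3"]:
    messages.put(order)

messages.put(None)
t.join()
print(results)  # ['ORDER-1', 'ORDER-2', 'ORDER-3']
```

With Azure Queue Storage the queue lives in the cloud, so the producer and consumer can run in entirely separate processes, VMs, or services, but the decoupling benefit is the same.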

Azure Blob Storage supports three primary blob types, each designed for specific use cases:

  • Block Blobs

    Optimised for streaming and storing cloud-native applications' data. A single block blob can hold up to about 190.7 TiB (roughly 4.75 TiB with the older 100 MiB block size), making them ideal for text or binary data such as documents, media files, or application installers. The data here is largely static and rarely changes.

  • Append Blobs

    These are similar to block blobs but optimised for append operations. They're perfect for scenarios where data is mostly static but grows through frequent incremental additions, such as logging data from virtual machines.

  • Page Blobs

    Designed for frequent random read/write operations. They serve as the OS and data disks for Azure Virtual Machines and can be up to 8 TiB in size.
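As a rough illustration, the three blob types map to write patterns like so (a hypothetical helper for clarity, not part of any Azure SDK):

```python
def choose_blob_type(write_pattern: str) -> str:
    """Map a workload's write pattern to the matching blob type.

    The mapping follows the descriptions above; the pattern names
    are illustrative labels, not Azure terminology.
    """
    mapping = {
        "write-once": "Block Blob",    # documents, media, installers
        "append-only": "Append Blob",  # VM logs, audit trails
        "random-io": "Page Blob",      # VM OS/data disks
    }
    return mapping[write_pattern]

print(choose_blob_type("append-only"))  # Append Blob
```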

Blob Access Tiers

Azure Blob Storage provides different access tiers to optimise storage costs based on data access patterns:

  • Hot Tier: For frequently accessed data.

  • Cool Tier: For infrequently accessed data that will remain stored for at least 30 days.

  • Cold Tier: For infrequently accessed data that will remain stored for at least 90 days.

  • Archive Tier: For data that will remain untouched for extended periods and can tolerate retrieval latencies of several hours (standard-priority rehydration can take up to 15 hours).

It's essential to choose the right access tier based on your data's lifecycle to optimise costs. For instance, data that's frequently accessed initially but less so over time can start in the hot tier and transition to the cool or archive tiers as its access pattern changes. These moves between tiers can be automated with blob lifecycle management policies.
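The tiering logic described above can be sketched as a simple decision rule (the thresholds follow the minimum storage durations listed earlier; the 180-day archive cutoff and the function name are illustrative, not an Azure API):

```python
def suggest_tier(days_since_last_access: int) -> str:
    """Suggest a blob access tier from how recently the data was accessed.

    Thresholds mirror the minimum storage durations above: 30 days for
    Cool and 90 for Cold; Archive is for data expected to stay dormant
    (180 days here is an illustrative cutoff).
    """
    if days_since_last_access < 30:
        return "Hot"
    if days_since_last_access < 90:
        return "Cool"
    if days_since_last_access < 180:
        return "Cold"
    return "Archive"

print(suggest_tier(7))    # Hot
print(suggest_tier(45))   # Cool
print(suggest_tier(365))  # Archive
```

In practice you wouldn't run logic like this yourself; you'd encode the same thresholds in a lifecycle management policy and let the platform apply them, as the next section shows.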

Lifecycle Management

Azure Blob Storage offers lifecycle management policies that automate tasks like transitioning blobs to cooler storage tiers and deleting blobs at the end of their lifecycle. By defining rules in these policies, you can move blobs between the hot, cool, cold, and archive tiers, or even delete blobs that are past a specified age. This is particularly useful for optimising costs and ensuring that data is stored in the most cost-effective manner based on its access patterns and age. All without manual intervention.

Lifecycle Management Policy Definition

A lifecycle management policy is defined using a JSON document that contains a collection of rules. Each rule within the policy specifies the conditions under which certain actions (like transitioning to a different tier or deleting the blob) should be taken.

Here's a sample policy below:
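A policy matching the rule described in the explanation that follows might look like this (the rule name and the samplecontainer/ prefix come from the original example; treat this as a sketch rather than a drop-in policy):

```json
{
  "rules": [
    {
      "enabled": true,
      "name": "cloudville-sample-storage-lifecycle-rule",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToCold": { "daysAfterModificationGreaterThan": 60 },
            "tierToArchive": {
              "daysAfterModificationGreaterThan": 180,
              "daysAfterLastTierChangeGreaterThan": 7
            },
            "delete": { "daysAfterModificationGreaterThan": 1825 }
          }
        },
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "samplecontainer/" ]
        }
      }
    }
  ]
}
```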

Explanation

This sample policy defines an Azure Blob Storage lifecycle management rule named cloudville-sample-storage-lifecycle-rule. It helps optimise storage costs by automatically transitioning or deleting blobs based on their age or activity:

  • After 30 days, blobs are moved to the Cool tier.

  • After 60 days, they are moved to the Cold tier.

  • After 180 days of no modification and at least 7 days since the last tier change, blobs move to the Archive tier.

  • After 5 years (1825 days), the blobs are automatically deleted.

This rule applies only to block blobs within the samplecontainer/ path. Such automation ensures that infrequently accessed data is stored cost-effectively while maintaining data lifecycle hygiene.

NOTE: “The platform runs the lifecycle policy once a day. Once you configure or edit a policy, it can take up to 24 hours for changes to go into effect. Once the policy is in effect, it could take up to 24 hours for some actions to run. Therefore, the policy actions may take up to 48 hours to complete.

If you disable a policy, then no new policy runs will be scheduled, but if a run is already in progress, that run will continue until it completes.”

Uses of Blob Storage

Azure Blob Storage is not just for storing data; it has numerous other uses, including:

  1. Static Website Hosting: Azure Blob Storage can be used to host static websites. This is particularly useful for sites that don't require server-side processing. The static website feature serves static content directly from the blob container without the need for a separate web server.

  2. Content Delivery Networks (CDN): Blob Storage can be integrated with Azure CDN to deliver large amounts of content to users with high bandwidth and low latency. This is especially useful for delivering multimedia content, software patches, and other large files to a global audience.

  3. Data Archiving and Backup: Given its scalability and cost-effectiveness, Blob Storage is an ideal solution for archiving data that doesn't need to be accessed frequently. With the lifecycle management feature, older data can be automatically moved to cooler storage tiers, optimising costs.

  4. Big Data Analytics: Blob Storage can store large datasets that can be processed using Azure HDInsight or other big data processing solutions.

  5. Media Streaming: Store and stream audio and video content directly from Blob Storage.

  6. Disaster Recovery: Use Blob Storage as a backup solution to ensure data availability in case of system failures or other unforeseen events.

Additional Specialized Storage Solutions

  • Azure Data Lake Storage (Gen2): Built atop Blob storage; optimized for analytics, big data, and AI workloads.

  • Azure NetApp Files: Enterprise-grade, high-performance file shares supporting SMB and NFS for demanding workloads.

  • Azure Elastic SAN: Cloud-native SAN offering high-performance, scalable block storage aimed at enterprise applications.

  • Azure Container Storage (Preview): Persistent storage tailored for container workloads (e.g., AKS).

  • Azure Data Box: Physical devices for bulk offline transfer of data to Azure, especially for slow or restricted networks.

Why Choose Azure Storage?

Azure Storage stands out for offering a purpose-built service for nearly every storage need:

| Context/Need | Recommended Azure Storage Service |
| --- | --- |
| Unstructured files, media, and backups | Blob Storage |
| Cloud file shares (SMB/NFS) | Azure Files |
| NoSQL store for simple structured data | Table Storage |
| Messaging & async workflows | Queue Storage |
| Persistent VM disks | Azure Disk Storage |
| Big data analytics & data lake needs | Data Lake Storage Gen2 |
| Enterprise file use & high throughput | Azure NetApp Files |
| High-performance SAN requirements | Azure Elastic SAN |
| Container workloads needing volumes | Azure Container Storage |
| Bulk data migrations | Azure Data Box |

Beyond the choice of service, a storage account has several other significant properties:

  1. Redundancy

    This determines how many copies of your data are kept and where they are stored. The options range from locally redundant storage (LRS), where three copies are stored within a single datacenter, to geo-zone-redundant storage (GZRS), where copies are spread across availability zones and replicated to a secondary region. Redundancy ensures that your data remains available even in the face of failures. You can consult the Microsoft docs to decide on the redundancy option that suits your purpose.

  2. Types (of Storage Accounts)

    Azure Storage offers a variety of storage account types to cater to different needs. Each type supports distinct features and has its own pricing model. Here's a breakdown:

    • Standard general-purpose v2 (GPv2)

      This is the most versatile storage account type, supporting Blob Storage (including Data Lake Storage Gen2), Queue Storage, Table Storage, and Azure Files. It offers a range of redundancy options, including locally redundant storage (LRS), geo-redundant storage (GRS), and more. It's recommended for most scenarios using Azure Storage.

    • Premium block blobs

      This type is optimised for block blobs and append blobs. It's ideal for scenarios with high transaction rates or those that require consistently low storage latency.

    • Premium file shares

      Exclusively for Azure Files, this account type supports both Server Message Block (SMB) and NFS file shares. It's recommended for enterprise or high-performance scale applications.

    • Premium page blobs

      This account type is designed specifically for page blobs.

    • Legacy storage accounts

      While Microsoft doesn't recommend these for most new deployments, they are still supported. The legacy types include the standard general-purpose v1 and Blob Storage accounts. The general-purpose v1 accounts might be suitable for specific scenarios, such as applications requiring the Azure classic deployment model or those that are transaction-intensive.

  3. Performance Tiers

    Azure Storage offers two primary performance tiers:

    • Standard

      Suitable for storing infrequently accessed data. It's backed by magnetic drives (HDDs) and offers a cost-effective solution for data that doesn't require high I/O performance. It is the recommended option for most scenarios. All the redundancy options are available when you choose the standard performance tier for your storage account.

    • Premium

      Backed by solid-state drives (SSDs) and designed for high-performance and low-latency workloads. It's ideal for scenarios where quick data access (low latency) is crucial, such as databases or VM disks. When you choose the premium performance tier for your storage account, you will also have to select what “premium account type” you want the storage account to be. Furthermore, the redundancy options will be limited to a choice between LRS and ZRS.

  4. Access Tiers

    Azure Storage provides different access tiers to help users store their blob data in the most cost-effective manner based on their usage patterns:

    • Hot tier

      Optimised for storing data that is accessed frequently. It's ideal for data that's in active use or expected to be accessed (read or written) frequently.

    • Cool tier

      Suitable for data that is infrequently accessed and stored for at least 30 days. Examples include short-term backup and older media content not viewed frequently.

    • Archive tier

      Designed for data that will remain in a dormant state for extended periods and can tolerate retrieval latencies. It's the most cost-effective for long-term storage but has higher data retrieval costs.

Each of these tiers has its own pricing model, and transitioning between them can result in cost savings based on the data's lifecycle.

Use Cases

Given the diverse services within Azure Storage, it caters to a wide range of scenarios. Below is a description of what each storage service is ideal for:

  • Azure Files: Perfect for "lift and shift" cloud migrations where applications already use native file system APIs. It can replace on-premises file servers or NAS devices and can store tools accessible from multiple VMs.

  • Azure Blobs: Ideal for applications that need to support streaming or random access. It's also suitable for building enterprise data lakes on Azure for big data analytics.

  • Azure Elastic SAN: Best for large-scale storage interoperable with various compute resources like SQL, MariaDB, Azure VMs, and Azure Kubernetes Service.

  • Azure Disks: Suitable for "lift and shift" applications that use native file system APIs and for storing data that doesn't need external access.

  • Azure Queues: Useful for decoupling application components and facilitating asynchronous communication between them.

  • Azure Tables: Ideal for storing flexible datasets like user data for web applications, address books, or other metadata.
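To make the "flexible datasets" point concrete, a Table Storage entity is essentially a property bag in which only PartitionKey and RowKey are required; sketched here as plain Python dicts (all property values are made up):

```python
# A Table Storage entity is schema-less: only PartitionKey and RowKey
# (which together uniquely identify the entity) are required; the
# remaining properties can vary from entity to entity.
contact = {
    "PartitionKey": "address-book",  # groups related entities
    "RowKey": "ashwin",              # unique within the partition
    "Email": "user@example.com",
    "City": "Lagos",
}

log_entry = {
    "PartitionKey": "2024-01",       # a different "shape" in the same table
    "RowKey": "evt-0001",
    "Level": "INFO",
}

# Both entities can live in the same table despite different properties.
assert {"PartitionKey", "RowKey"} <= contact.keys()
assert {"PartitionKey", "RowKey"} <= log_entry.keys()
print("both entities are valid")
```

Queries are most efficient when they filter on PartitionKey (and ideally RowKey), so choosing those two keys well is the main design decision when using Table Storage.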

The Microsoft docs include sample scenarios for Azure Storage services, as well as a guide to help you find the Azure storage tools or products you need.


Written by

Ashwin

I'm a DevOps magician, conjuring automation spells and banishing manual headaches. With Jenkins, Docker, and Kubernetes in my toolkit, I turn deployment chaos into a comedy show. Let's sprinkle some DevOps magic and watch the sparks fly!