Deploying Azure Data Factory, Azure Databricks, Azure Data Lake Storage & Azure SQL Database using Terraform

KUNAL DAS

Reach me at: https://heylink.me/kunaldas

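This post shares Terraform code to deploy Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), Azure Databricks (ADB), and the supporting resources they need: a virtual network, subnets, a network security group, and an Azure SQL server and database.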

Let’s start with the resource group that will hold all the required resources.

Resource Group:

data "azurerm_client_config" "Current" {}
resource "azurerm_resource_group" "RG" {
  name     = var.ResourceGroup.Name
  location = var.ResourceGroup.Location
}
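
The resource group above reads its name and location from var.ResourceGroup, which is not shown in this post. A minimal sketch of the supporting configuration might look like the following. The provider version constraints and the default values are assumptions for illustration, not values from the original setup.

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0" # assumed version constraint
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0" # assumed; needed for the generated SQL admin password
    }
  }
}

# The azurerm provider requires a features block, even if it is empty.
provider "azurerm" {
  features {}
}

# Object variable matching the var.ResourceGroup.Name / .Location references.
variable "ResourceGroup" {
  description = "Name and location of the resource group."
  type = object({
    Name     = string
    Location = string
  })
  default = {
    Name     = "rg-data-platform" # hypothetical
    Location = "westeurope"       # hypothetical
  }
}
```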

Note that the resources below read the resource group name and location from this resource, as azurerm_resource_group.RG.name and azurerm_resource_group.RG.location.

Azure Data Factory:

resource "azurerm_data_factory" "DataFactory" {
  name                = "DataFactory Name"
  location            = azurerm_resource_group.RG.location
  resource_group_name = azurerm_resource_group.RG.name

  identity {
    type = "SystemAssigned"
  }
}
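
The identity block gives the factory a system-assigned managed identity. One common use for it, sketched here as an assumption rather than something the original setup includes, is granting the factory access to the Data Lake storage account defined further down. The resource name DataFactoryToDataLake and the choice of role are illustrative.

```hcl
# Hypothetical addition: let the Data Factory identity read and write blobs in the Data Lake account.
resource "azurerm_role_assignment" "DataFactoryToDataLake" {
  scope                = azurerm_storage_account.DataLake.id
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = azurerm_data_factory.DataFactory.identity[0].principal_id
}
```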

Azure Databricks:

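resource "azurerm_databricks_workspace" "Databricks" {
  location                    = azurerm_resource_group.RG.location
  name                        = "Databricks Name"
  resource_group_name         = azurerm_resource_group.RG.name
  managed_resource_group_name = "Databricks Managed Resource Group"
  sku                         = "Databricks Sku" # "standard", "premium", or "trial"

  # Deploy the workspace into our own VNet, with no public IPs on the cluster nodes.
  custom_parameters {
    no_public_ip        = true
    virtual_network_id  = azurerm_virtual_network.DatabricksVnet.id
    public_subnet_name  = azurerm_subnet.DatabricksSubnetPublic.name
    private_subnet_name = azurerm_subnet.DatabricksSubnetPrivate.name

    # Recent azurerm provider versions require these when virtual_network_id is set.
    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.public.id
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
  }

  # Make sure both subnets have their NSG attached before the workspace is created.
  depends_on = [
    azurerm_subnet_network_security_group_association.public,
    azurerm_subnet_network_security_group_association.private
  ]
}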

Virtual network:

resource "azurerm_virtual_network" "DatabricksVnet" {
  name                = "VNET NAME"
  resource_group_name = azurerm_resource_group.RG.name
  location            = azurerm_resource_group.RG.location
  address_space       = ["VNET CIDR"]
}

Network security group for Azure Databricks:

resource "azurerm_network_security_group" "DatabricksNSG" {
  name                = "VirtualNetwork NSG Name"
  resource_group_name = azurerm_resource_group.RG.name
  location            = azurerm_resource_group.RG.location
}

Public subnet for Databricks:

resource "azurerm_subnet" "DatabricksSubnetPublic" {
  name                 = "VirtualNetwork PublicSubnet Name"
  resource_group_name  = azurerm_resource_group.RG.name
  virtual_network_name = azurerm_virtual_network.DatabricksVnet.name
  address_prefixes     = ["VirtualNetwork PublicSubnet CIDR"]
  service_endpoints    = ["Microsoft.Storage"]

  delegation {
    name = "Microsoft.Databricks.workspaces"
    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
      ]
    }
  }
}

Private subnet for Databricks:

resource "azurerm_subnet" "DatabricksSubnetPrivate" {
  name                 = "VirtualNetwork PrivateSubnet Name"
  resource_group_name  = azurerm_resource_group.RG.name
  virtual_network_name = azurerm_virtual_network.DatabricksVnet.name
  address_prefixes     = ["VirtualNetwork PrivateSubnet CIDR"]

  delegation {
    name = "Microsoft.Databricks.workspaces"
    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
      ]
    }
  }
}
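
The VNet and subnet blocks above use placeholder CIDR strings. As a sketch of how the two subnet ranges could be derived from a single VNet range with Terraform's built-in cidrsubnet function, consider the following. The 10.0.0.0/16 address space and the local names are assumptions for illustration.

```hcl
locals {
  # Assumed address space for the VNet; replace with your own range.
  DatabricksVnetCidr = "10.0.0.0/16"

  # cidrsubnet(prefix, newbits, netnum): adding 8 bits to a /16 yields /24 subnets.
  DatabricksPublicSubnetCidr  = cidrsubnet(local.DatabricksVnetCidr, 8, 0) # "10.0.0.0/24"
  DatabricksPrivateSubnetCidr = cidrsubnet(local.DatabricksVnetCidr, 8, 1) # "10.0.1.0/24"
}
```

These locals could then replace the placeholders, for example address_space = [local.DatabricksVnetCidr] on the VNet and address_prefixes = [local.DatabricksPublicSubnetCidr] on the public subnet.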

Network security group association for the public subnet:

resource "azurerm_subnet_network_security_group_association" "public" {
  subnet_id                 = azurerm_subnet.DatabricksSubnetPublic.id
  network_security_group_id = azurerm_network_security_group.DatabricksNSG.id
}

Network security group association for the private subnet:

resource "azurerm_subnet_network_security_group_association" "private" {
  subnet_id                 = azurerm_subnet.DatabricksSubnetPrivate.id
  network_security_group_id = azurerm_network_security_group.DatabricksNSG.id
}

Now that the network configuration is done, let’s move on to creating the Data Lake Storage account.

Data Lake storage account:

resource "azurerm_storage_account" "DataLake" {
  name                     = "DataLake Name"
  resource_group_name      = azurerm_resource_group.RG.name
  location                 = azurerm_resource_group.RG.location
  account_tier             = "DataLake Tier"
  account_replication_type = "DataLake Replication"
  is_hns_enabled           = true
  min_tls_version          = "DataLake TLSVersion"

  network_rules {
    # bypass       = ["AzureServices"]
    default_action = "Allow"
  }
}
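
The quoted values in the storage account block are placeholders. A sketch of what concrete values might look like is shown below. The account name is hypothetical, and the tier, replication, and TLS settings are one reasonable combination rather than the author's actual choices.

```hcl
locals {
  DataLake = {
    Name        = "examplelake001" # hypothetical; must be globally unique, 3-24 lowercase letters and digits
    Tier        = "Standard"       # account_tier
    Replication = "LRS"            # account_replication_type
    TLSVersion  = "TLS1_2"         # min_tls_version
  }
}
```

The storage account could then reference them as, for example, account_tier = local.DataLake.Tier.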

Storage account container:

resource "azurerm_storage_container" "DataLakeContainer" {
  # for_each needs a set or map, not a plain string; list your container names here.
  for_each              = toset(["DataLake Container"])
  name                  = each.key
  storage_account_name  = azurerm_storage_account.DataLake.name
  container_access_type = "private"
}
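
Because the container resource uses for_each, it creates one container per element of the collection it is given. A sketch of feeding it from a variable follows. The variable name DataLakeContainers, its default values, and the output are assumptions added for illustration.

```hcl
variable "DataLakeContainers" {
  description = "Names of the containers to create in the Data Lake account."
  type        = set(string)
  default     = ["raw", "curated"] # hypothetical names
}

# With for_each = var.DataLakeContainers on the container resource,
# each.key is one container name per resource instance.
output "DataLakeContainerNames" {
  value = [for c in azurerm_storage_container.DataLakeContainer : c.name]
}
```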

Now, let us create the SQL-related resources.

SQL admin password:

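resource "random_string" "SQLAdminPassword" {
  # Must be at least the sum of the minimums below (6); Azure SQL also requires 8 or more characters.
  length      = 16
  special     = true
  min_upper   = 2
  min_numeric = 2
  min_special = 2
}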
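
Since the password is generated, it only exists in the Terraform state unless it is exposed somewhere. One way to retrieve it, sketched here with an assumed output name, is a sensitive output:

```hcl
output "SQLAdminPassword" {
  description = "Generated administrator password for the SQL server."
  value       = random_string.SQLAdminPassword.result
  sensitive   = true # hides the value in plan and apply output
}
```

After an apply, terraform output -raw SQLAdminPassword prints the value. The random provider also offers a random_password resource, which takes the same arguments and marks its result as sensitive by default.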

SQL Server:

resource "azurerm_mssql_server" "SQLServer" {
  name                         = "SQLServer Name"
  resource_group_name          = azurerm_resource_group.RG.name
  location                     = azurerm_resource_group.RG.location
  version                      = "SQLServer Version"
  administrator_login          = "SQLServer AdministratorLogin"
  administrator_login_password = random_string.SQLAdminPassword.result
  minimum_tls_version          = "SQLServer TLS Version"
}

SQL Database:

resource "azurerm_mssql_database" "SQLDatabase" {
  name           = "SQLDatabase Name"
  server_id      = azurerm_mssql_server.SQLServer.id
  collation      = "SQL_collation"
  max_size_gb    = "SQLDatabase MaxSizeGB"     # placeholder; expects a number
  sku_name       = "SQLDatabase SKU"
  zone_redundant = "SQLDatabase ZoneRedundant" # placeholder; expects true or false
}
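
The SQL server and database blocks also use placeholder strings. A sketch of concrete values, with types matching what the arguments expect, might look like this. All values are assumptions for illustration, not the author's settings.

```hcl
locals {
  SQLServer = {
    Version    = "12.0" # version
    TLSVersion = "1.2"  # minimum_tls_version
  }

  SQLDatabase = {
    Collation     = "SQL_Latin1_General_CP1_CI_AS"
    MaxSizeGB     = 2     # number, not a quoted string
    SKU           = "S0"  # sku_name
    ZoneRedundant = false # boolean
  }
}
```

The database could then use, for example, max_size_gb = local.SQLDatabase.MaxSizeGB and zone_redundant = local.SQLDatabase.ZoneRedundant.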

Taken together, these snippets form a complete, part-by-part configuration for a working ADF and ADB setup. Feel free to reach out if anything needs clarification!
