CI/CD for Azure Data Factory: Selective Deployment & Full Deployment Using Azure DevOps YAML Pipelines (Dev → Test → Prod)

By Shubham Sahu
Tags: Azure Data Factory, CI/CD, Azure DevOps, Data Engineering, YAML
🚀 Introduction
Modern data engineering teams need a fast, reliable, and repeatable way to deploy Azure Data Factory (ADF) pipelines across multiple environments (dev, test, prod). Manual deployments are error-prone and slow. With the right CI/CD setup, we can deliver versioned, auditable, and environment-aware ADF pipelines using YAML pipelines in Azure DevOps.
In this article, we’ll walk through a CI/CD approach using:
- ADF JSON files managed in Git
- YAML pipelines for build and release
- Environment-specific deployment filtering via CSV files
- The `SQLPlayer.DataFactoryTools` DevOps extension
📊 Architecture Overview
The workflow combines Git integration with Azure DevOps YAML pipelines. Developers commit ADF JSON files to Git branches; once merged into the mainline, the build pipeline validates and packages the artifacts, and the release pipeline deploys them across environments (Dev, Test, and Prod) using the SQLPlayer task. The `adf_publish` branch stores the published JSON artifacts, keeping it in sync with the live factory.
📂 Folder Structure
We can use a clean Git repo layout for maintainability:
```
adf-deployment-repo/
├── LinkedServices/
│   ├── AzureBlobStorageLS.json
│   └── AzureSqlDatabaseLS.json
├── Datasets/
│   ├── OrdersDataset.json
│   └── ProductsDataset.json
├── Pipelines/
│   ├── pipeline_TransformOrders.json
│   ├── pipeline_LoadToSQL.json
│   └── pipeline_NotifyComplete.json
├── deployment/
│   ├── config-dev.csv
│   ├── config-test.csv
│   └── config-prod.csv
├── build-dataFactory.yaml
└── release-dataFactory.yaml
```
📄 CSV Configuration Example
The config file defines exactly which ADF objects to deploy. Each row lists the object type, the object name, and a deploy flag, with optional environment-specific property overrides appended.

config-test.csv:

```csv
Pipeline;pipeline_TransformOrders;True
Pipeline;pipeline_LoadToSQL;True
Dataset;OrdersDataset;True
LinkedService;AzureBlobStorageLS;True;StorageConnectionString=DefaultEndpointsProtocol=https;AccountName=testblob;AccountKey=***
```
🎓 Step 1: Build Pipeline (Validation + Packaging)
Create a pipeline named `build-dataFactory.yaml`:
```yaml
trigger: none

pool:
  vmImage: ubuntu-latest

jobs:
- job: BuildADF
  steps:
  - task: SQLPlayer.DataFactoryTools.BuildADF.BuildADFTask@1
    displayName: 'Validate ADF JSON Files'
    inputs:
      DataFactoryCodePath: '$(Build.SourcesDirectory)'

  - task: CopyFiles@2
    displayName: 'Copy Files to Artifact'
    inputs:
      Contents: '**/*'
      TargetFolder: '$(Build.ArtifactStagingDirectory)'

  - task: PublishBuildArtifacts@1
    inputs:
      PathtoPublish: '$(Build.ArtifactStagingDirectory)'
      ArtifactName: 'drop'
```
This pipeline validates ADF objects and publishes them as artifacts for release.
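Since `trigger: none` is set, this build only runs when queued manually or from another pipeline. If merges should build automatically, a branch trigger matching the promotion flow described later could be added instead — a minimal sketch:

```yaml
trigger:
  branches:
    include:
    - develop
    - release/test
    - main
```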
🚪 Step 2: Release Pipeline (Selective Deployment)
Create a pipeline named `release-dataFactory.yaml` that uses the CSV files to control what gets deployed per environment:
```yaml
parameters:
- name: environment
  type: string
  default: dev

variables:
- group: adf-${{ parameters.environment }}
- name: configFile
  value: config-${{ parameters.environment }}.csv

stages:
- stage: DeployADF
  displayName: Deploy to ${{ parameters.environment }}
  jobs:
  - job: Deploy
    pool:
      vmImage: ubuntu-latest
    steps:
    - download: current
      artifact: drop

    - task: SQLPlayer.DataFactoryTools.PublishADF.PublishADFTask@1
      displayName: 'Deploy ADF to ${{ parameters.environment }}'
      inputs:
        azureSubscription: '$(serviceConnectionName)'
        ResourceGroupName: '$(resourceGroupName)'
        DataFactoryName: '$(dataFactoryName)'
        DataFactoryCodePath: '$(Pipeline.Workspace)/drop/drop'
        Location: '$(region)'
        StageType: FilePath
        StageConfigFile: '$(Pipeline.Workspace)/drop/drop/deployment/$(configFile)'
        DeleteNotInSource: false
        CreateNewInstance: false
        IncrementalDeployment: false
        FilteringType: Inline
        FilterText: |
          -managedPrivateEndpoint*
          -IntegrationRuntim*
        TriggerStartMethod: KeepPreviousState
```
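One thing to note: `download: current` only resolves artifacts published by the same pipeline run. If the release runs as a pipeline separate from the build, the build artifact can be wired in as a pipeline resource instead — a minimal sketch, assuming the build pipeline is registered as `build-dataFactory` (the alias `adfBuild` is arbitrary):

```yaml
resources:
  pipelines:
  - pipeline: adfBuild          # local alias, referenced by the download step
    source: build-dataFactory   # name of the build pipeline in Azure DevOps

steps:
- download: adfBuild            # lands under $(Pipeline.Workspace)/adfBuild
  artifact: drop
```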
📦 Full Deployment (All Files)
To perform a full deployment (deploying every object in the repo rather than a selected subset), omit the config CSV file and use `StageType: Path` instead of `FilePath`.
YAML Example for Full Deployment:
```yaml
- task: SQLPlayer.DataFactoryTools.PublishADF.PublishADFTask@1
  displayName: 'Full Deployment to ${{ parameters.environment }}'
  inputs:
    azureSubscription: '$(serviceConnectionName)'
    ResourceGroupName: '$(resourceGroupName)'
    DataFactoryName: '$(dataFactoryName)'
    DataFactoryCodePath: '$(Pipeline.Workspace)/drop/drop'
    Location: '$(region)'
    StageType: Path
    DeleteNotInSource: false
    CreateNewInstance: false
    IncrementalDeployment: false
    FilteringType: Inline
    FilterText: |
      -managedPrivateEndpoint*
      -IntegrationRuntim*
    TriggerStartMethod: KeepPreviousState
```
✅ This method will deploy everything found in the artifact folder.
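Either variant can be queued per environment by passing the runtime parameter. For example, with the Azure DevOps CLI (assuming the release pipeline is registered as `release-dataFactory`):

```bash
# Queue the release pipeline against the test environment
az pipelines run --name release-dataFactory --parameters environment=test
```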
✅ Optional: Incremental Deployment Mode
You can enable incremental deployment to improve performance in large environments by avoiding redeployment of unchanged objects.
YAML Sample:
```yaml
- task: SQLPlayer.DataFactoryTools.PublishADF.PublishADFTask@1
  displayName: 'Incremental Deploy ADF to ${{ parameters.environment }}'
  inputs:
    azureSubscription: '$(serviceConnectionName)'
    ResourceGroupName: '$(resourceGroupName)'
    DataFactoryName: '$(dataFactoryName)'
    DataFactoryCodePath: '$(Pipeline.Workspace)/drop/drop'
    Location: '$(region)'
    StageType: FilePath
    StageConfigFile: '$(Pipeline.Workspace)/drop/drop/deployment/$(configFile)'
    DeleteNotInSource: false
    CreateNewInstance: false
    IncrementalDeployment: true
    IncrementalDeploymentStorageUri: 'https://<yourstorage>.blob.core.windows.net/adf-deployment-state'
    FilteringType: Inline
    FilterText: |
      -managedPrivateEndpoint*
      -IntegrationRuntim*
    TriggerStartMethod: KeepPreviousState
```
How It Works:
- Uses blob storage to store a deployment-state JSON file
- Compares the current objects against the previous deployment
- Deploys only the objects that changed

Requirements:
- The storage container must exist and allow write access to your pipeline identity
- Ensure no one edits ADF manually in the Azure Portal (hash-mismatch risk)
- Remove the deployment-state file to force a full redeploy (see the CLI example below)
🔧 Assign Role to the Specific Container Only (More Secure)
Limiting access to only the container used for incremental deployment (e.g., `adf-deployment-state`) increases security and enforces the principle of least privilege.
How:
1. Go to the Storage Account in the Azure Portal
2. Select Containers and click `adf-deployment-state`
3. Click Access Control (IAM) for this container
4. Click + Add role assignment
5. Configure the following:
   - Role: Storage Blob Data Contributor
   - Assign to: your DevOps service principal (the one used in the pipeline)
   - Scope: this container only
Your pipeline can now access only the `adf-deployment-state` container, ensuring minimal access is granted while other containers in the same storage account remain isolated.
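The same assignment can be scripted rather than clicked through. A sketch using the Azure CLI, with placeholder subscription, resource group, and principal values:

```bash
# Grant the pipeline's service principal write access to a single container
az role assignment create \
  --assignee "<service-principal-object-id>" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<yourstorage>/blobServices/default/containers/adf-deployment-state"
```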
File created in blob storage: `adf-prod.adftools_deployment_state.json`
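To force a full redeploy, delete this state blob before the next run; for example, with the Azure CLI (assuming the storage account and container used above):

```bash
# With the state file gone, the next deployment redeploys every object
az storage blob delete \
  --account-name "<yourstorage>" \
  --container-name adf-deployment-state \
  --name adf-prod.adftools_deployment_state.json \
  --auth-mode login
```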
🌌 Promotion Flow: Dev → Test → Prod
1. A developer commits ADF JSON to Git (feature branch)
2. A PR is created and merged to `develop` (triggers build & deploy to dev)
3. Code is promoted to `release/test` via PR, triggering deploy to test
4. After approval, code is merged to `main`, triggering deployment to prod
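One way to wire branches to environments is a compile-time mapping from the source branch to a variable that stage conditions or templates can consume — a minimal sketch (the variable name `targetEnvironment` is illustrative):

```yaml
variables:
  ${{ if eq(variables['Build.SourceBranch'], 'refs/heads/main') }}:
    targetEnvironment: prod
  ${{ elseif eq(variables['Build.SourceBranch'], 'refs/heads/release/test') }}:
    targetEnvironment: test
  ${{ else }}:
    targetEnvironment: dev
```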
🔧 Pro Tips
- Use `TriggerStartMethod: KeepPreviousState` to avoid auto-starting triggers
- Store secrets in Azure DevOps variable groups or Key Vault
- Filter out IRs, managed VNets, and private endpoints
- Only list stable objects in production config files
- Use `IncrementalDeployment: false` for full control; use `true` only if the state file is managed properly
🌟 Benefits of This Method
✅ Clean, traceable deployments via Git
✅ Environment-specific filtering
✅ No ARM templates or complicated parameters
✅ Compatible with DataOps principles (audit, rollback, security)
🧩 Best Practices for ADF Deployment in a DataOps Project
✅ Use a clean Git structure (LinkedServices/, Pipelines/, Datasets/, deployment/)
✅ Maintain separate `config-<env>.csv` files for environment-level filtering
✅ Store secrets in Azure DevOps variable groups or Azure Key Vault
✅ Always validate ADF JSONs in the build before attempting a release
✅ Use `FilteringType` in YAML to exclude IRs, VNets, and private endpoints
✅ Avoid `IncrementalDeployment: true` unless Blob state tracking is configured correctly
✅ Provide `IncrementalDeploymentStorageUri` when enabling incremental mode
✅ Implement manual approval gates before prod deployment (see the sketch after this list)
✅ Track all changes via Git PRs and tag releases
✅ Promote from Dev → Test → Prod using separate configs
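For the approval gates, a common pattern is a deployment job bound to an Azure DevOps environment that has approvals and checks configured in the portal — a minimal sketch, assuming an environment named `adf-prod` exists:

```yaml
stages:
- stage: DeployProd
  jobs:
  - deployment: DeployADF
    environment: adf-prod       # approvals/checks are configured on this environment
    pool:
      vmImage: ubuntu-latest
    strategy:
      runOnce:
        deploy:
          steps:
          - download: current
            artifact: drop
          # ...PublishADFTask step as shown in Step 2
```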
📖 Summary
This method of deploying ADF using Azure DevOps YAML pipelines + config files gives you:
- Full control over what goes to each environment
- Secure, repeatable, automated CI/CD
- Fast onboarding for data engineers
Author: Shubham Sahu
Azure DevOps Lead | DevOps Enthusiast | DataOps Advocate