Azure Data Factory: "Join" 2 or more CSV Files and Convert to JSON Format

Table of contents
- Step 1: Inspecting the CSV Files in the Data Lake: Your First Step to Data Optimization
- Step 2: Configuring the Data Flow Sources: Pointing Them at the Customer CSV Files, Then Adding a Join Transformation
- Step 3: Joining on Customer ID, the Common Field, with an Inner Join
- Step 4: Choosing the Sink: a JSON Dataset Pointing at the Data Lake
- Step 5: Integrating the Data Flow into a Pipeline: Directing JSON Output to ADLS's Join_example Folder
- Step 6: Pipeline Execution Success: Ensuring Smooth Data Transfer
- Step 7: Data Flow Success: Confirming Effective Data Transformation
- Step 8: Verifying the JSON File in the Data Lake

Step 1: Inspecting the CSV Files in the Data Lake: Your First Step to Data Optimization
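
Before building anything, open the storage browser and confirm the customer CSV files are sitting in the data lake with the schema you expect. The same check can be scripted; below is a minimal sketch using the azure-storage-file-datalake SDK, where the storage account, container, and folder names are placeholders rather than values from this walkthrough.

```python
# Minimal sketch: list the input CSVs with the azure-storage-file-datalake SDK.
# The storage account, container, and folder names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
container = service.get_file_system_client("raw")

# Print each file's path and size so the inputs can be eyeballed.
for path in container.get_paths(path="customers"):
    print(path.name, path.content_length)
```
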
Step 2: Configuring the Data Flow Sources: Pointing Them at the Customer CSV Files, Then Adding a Join Transformation
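
Inside the data flow, each CSV gets its own source transformation backed by a delimited-text dataset, and a Join transformation is added downstream of the two sources. As a rough local analogue of the two sources (file and column names here are illustrative assumptions):

```python
# Local stand-in for the two Data Flow source transformations.
# File and column names are illustrative assumptions, not values from the post.
import pandas as pd

customers = pd.read_csv("Customer1.csv")         # e.g. CustomerID, Name, ...
customer_details = pd.read_csv("Customer2.csv")  # e.g. CustomerID, City, ...

# Peek at both sources, much as the data flow's data-preview tab would.
print(customers.head())
print(customer_details.head())
```
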
Step 3: Joining on Customer ID, the Common Field, with an Inner Join
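
Customer ID is the key both files share, so the Join transformation is configured on that column with the inner join type, keeping only customers that appear in both sources. A pandas sketch of the same operation, assuming a CustomerID column in both files:

```python
# Pandas equivalent of the Join transformation: inner join on the shared key.
# "CustomerID" is an assumed column name for the common Customer ID field.
import pandas as pd

customers = pd.read_csv("Customer1.csv")
customer_details = pd.read_csv("Customer2.csv")

joined = customers.merge(customer_details, on="CustomerID", how="inner")
print(joined.head())  # only customers present in both files survive
```
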
Step 4: Choosing the Sink: a JSON Dataset Pointing at the Data Lake
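
Because the sink dataset is a JSON dataset on the data lake, the joined rows are written out as JSON rather than CSV. Depending on the sink's file-pattern setting, ADF emits either an array of objects or one document per line; the pandas analogue of the line-per-document shape looks like this (the sample rows are made up for illustration):

```python
# Local analogue of the JSON sink. The sample rows are made up.
import pandas as pd

joined = pd.DataFrame(
    {"CustomerID": [1, 2], "Name": ["Asha", "Ravi"], "City": ["Pune", "Delhi"]}
)

# orient="records" with lines=True gives one JSON document per row,
# matching ADF's set-of-objects file pattern.
joined.to_json("joined_customers.json", orient="records", lines=True)
```
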
Step 5: Integrating the Data Flow into a Pipeline: Directing JSON Output to ADLS's Join_example Folder
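
With the data flow saved, it is dropped into a pipeline through an Execute Data Flow activity whose sink writes into the Join_example folder. The pipeline can be triggered from the portal, or programmatically as in this sketch with the azure-mgmt-datafactory SDK (every resource name below is a placeholder):

```python
# Sketch: start the pipeline with the azure-mgmt-datafactory SDK.
# Subscription, resource group, factory, and pipeline names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

run = client.pipelines.create_run(
    resource_group_name="<resource-group>",
    factory_name="<data-factory>",
    pipeline_name="JoinCsvToJsonPipeline",  # hypothetical pipeline name
)
print("Started pipeline run:", run.run_id)
```
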
Step 6: Pipeline Execution Success: Ensuring Smooth Data Transfer
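
A successful run shows up with a Succeeded status on the Monitor tab. The same check can be scripted by polling the run until it leaves the queued and in-progress states (resource names and the run ID are placeholders):

```python
# Sketch: poll the pipeline run until it reaches a terminal status.
# Resource names and the run ID are placeholders.
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

while True:
    run = client.pipeline_runs.get("<resource-group>", "<data-factory>", "<run-id>")
    if run.status not in ("Queued", "InProgress"):
        break
    time.sleep(15)

print("Pipeline finished with status:", run.status)  # expecting "Succeeded"
```
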
Step 7: Data Flow Success: Confirming Effective Data Transformation
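
A green pipeline run is not quite the whole story; it is worth confirming that the Execute Data Flow activity itself succeeded. The Monitor tab breaks results down per activity, and the SDK exposes the same query (all identifiers below are placeholders):

```python
# Sketch: list per-activity results for the run to confirm the
# Execute Data Flow activity succeeded. All identifiers are placeholders.
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

now = datetime.now(timezone.utc)
result = client.activity_runs.query_by_pipeline_run(
    "<resource-group>",
    "<data-factory>",
    "<run-id>",
    RunFilterParameters(
        last_updated_after=now - timedelta(days=1),
        last_updated_before=now,
    ),
)
for activity in result.value:
    print(activity.activity_name, activity.activity_type, activity.status)
```
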
Step 8: Verifying the JSON File in the Data Lake
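
Finally, browse to the Join_example folder in the data lake and confirm the JSON output file is there with a plausible size. A scripted version of that check, with placeholder account and container names:

```python
# Sketch: confirm the JSON output landed in the Join_example folder.
# Account and container names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
container = service.get_file_system_client("output")

for path in container.get_paths(path="Join_example"):
    print(path.name, path.content_length)

# Read back the first few hundred bytes to eyeball the JSON rows.
first = next(iter(container.get_paths(path="Join_example")))
sample = container.get_file_client(first.name).download_file().readall()
print(sample[:500].decode("utf-8"))
```
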
Written by

Arpit Tyagi
Experienced Data Engineer passionate about building and optimizing data infrastructure to fuel powerful insights and decision-making. With a deep understanding of data pipelines, ETL processes, and cloud platforms, I specialize in transforming raw data into clean, structured datasets that empower analytics and machine learning applications. My expertise includes designing scalable architectures, managing large datasets, and ensuring data quality across the entire lifecycle. I thrive on solving complex data challenges using modern tools and technologies like Azure, Tableau, Alteryx, and Spark. Through this blog, I aim to share best practices, tutorials, and industry insights to help fellow data engineers and enthusiasts master the art of building data-driven solutions.