Who's Calling?

Peer Grønnerup

During Microsoft Fabric project implementations, I’m frequently asked a deceptively simple question: “Under which identity is this running?” It turns out, the answer isn’t always straightforward - and to be honest, it’s a topic I’ve also found quite complex at times.

Just because a schedule was created by you doesn’t necessarily mean the entire job triggered by that schedule runs in your user context - or for that matter, in the context of the identity who created the item. And with the introduction of Service Principal support, things haven’t exactly become clearer. In fact, it often adds an extra layer of complexity to the already tricky landscape of execution context in Fabric.

In this post, I want to share some of the insights I’ve gathered - especially when working with data pipelines that trigger child notebooks and other downstream activities. We’ll look at how identities are used across different components, what you need to be aware of, and how to avoid common pitfalls. Or in short: Who’s calling? 📞

Finally, I’ll touch on a known bug in the Fabric API and the SemPy library that affects notebook execution in Service Principal contexts, a setup that’s becoming increasingly common in enterprise-grade, multi-environment data platforms.

Test Setup: Simulating Real-World Scheduling Scenarios

To explore how execution context behaves in Microsoft Fabric, I created a simple but representative setup. Using the Fabric CLI, I triggered on-demand executions of Fabric items: data pipelines that call child notebooks, as well as notebooks run directly.

This setup allows us to control exactly who initiates the run - be it a user or a Service Principal - and observe how that identity flows (or doesn’t) through the various components.

Key components of the setup:

  • A Data Pipeline with multiple activities (e.g., Invoke Notebook and Invoke Data Pipeline)

  • A Notebook which prints identity information, runtime properties, and other relevant details

  • A parent Notebook which executes a child notebook (such as the one above)

  • Fabric CLI-triggered job runs using both user identity and Service Principal

This approach mimics many enterprise deployment scenarios, especially in multi-environment setups.
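
As an illustration of the trigger step, here is a hedged Python sketch that starts the same on-demand runs via the Fabric Job Scheduler REST API rather than the CLI. The workspace ID, item ID, and Service Principal details are placeholders, and the azure-identity package is assumed to be available.

import requests
from azure.identity import ClientSecretCredential, InteractiveBrowserCredential

WORKSPACE_ID = "<workspace-guid>"          # placeholder
ITEM_ID = "<pipeline-or-notebook-guid>"    # placeholder
JOB_TYPE = "Pipeline"                      # use "RunNotebook" when the item is a notebook

def get_token(as_service_principal: bool) -> str:
    # Sign in interactively as a user, or authenticate as a Service Principal.
    if as_service_principal:
        credential = ClientSecretCredential("<tenant-id>", "<client-id>", "<client-secret>")
    else:
        credential = InteractiveBrowserCredential()
    return credential.get_token("https://api.fabric.microsoft.com/.default").token

def run_on_demand(as_service_principal: bool) -> None:
    # Job Scheduler API: run an item job on demand under the chosen identity.
    url = (f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
           f"/items/{ITEM_ID}/jobs/instances?jobType={JOB_TYPE}")
    response = requests.post(url, headers={"Authorization": f"Bearer {get_token(as_service_principal)}"})
    response.raise_for_status()
    # The API answers 202 Accepted; the Location header points to the new job instance.
    print(response.status_code, response.headers.get("Location"))

run_on_demand(as_service_principal=True)   # trigger as the Service Principal
run_on_demand(as_service_principal=False)  # trigger as the signed-in user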

Execution Scenarios: What Identity Is Actually Used?

Regardless of whether a job is triggered by a user or a Service Principal, the same core logic applies when it comes to execution context in Microsoft Fabric. However, what happens next depends heavily on the type of item being executed and how it is invoked.

Let’s break it down…

Top-Level Execution: Who Triggers the Job?

When a pipeline or notebook is triggered - either manually, via schedule, or through a CLI/API call - the top-level item (the pipeline or notebook itself) is executed in the context of the identity that triggered it.

That could be:

  • A user account (e.g., developer in dev/test)

  • A service principal (e.g., a scheduled run in production)

So far, so good. But once you go deeper, into child components and downstream activities, the picture becomes more complicated.


Notebook Execution from Notebooks

When one notebook triggers another (e.g., using notebookutils.notebook.run()), the child notebook always inherits the execution context of the parent notebook.

If a notebook is triggered by a Service Principal, all downstream notebooks will run under the same Service Principal.

If a user triggers the parent notebook, all child notebooks will run under that user’s identity.

This behavior is consistent and predictable across environments.
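
A minimal sketch of that test, assuming a child notebook named nb_print_identity exists in the same workspace (the name is purely illustrative, and notebookutils is pre-imported in Fabric notebooks):

# Parent notebook: run the child with a 600-second timeout.
# The child inherits the parent's execution context (user or Service Principal).
result = notebookutils.notebook.run("nb_print_identity", 600)
print(result)

# Child notebook (nb_print_identity): print the runtime properties it sees.
ctx = notebookutils.runtime.context
print("Workspace:", ctx.get("currentWorkspaceName"))
print("Full runtime context:", ctx)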


Data Pipelines: A More Complex Story

With Data Pipelines, execution context is activity-specific. Here’s what governs it:

🔹 Activities that use connections

Examples: Copy Data, Invoke Pipeline (preview), Azure Databricks, Semantic model refresh, Web etc.
These activities run under the identity associated with the connection object used.

🔹 Activities that do not use connections

Examples: Notebook, Invoke Pipeline (Legacy activity), Dataflow, Spark Job Definition etc.
These activities run under the identity of the user or service principal who last modified the pipeline. This is the identity shown as "Last Modified By" in the Data Pipeline settings.

⚠️ Yes, that means if you last edited a pipeline in dev as yourself, but deploy it in test using a service principal, the execution identity in test will be the service principal - even if the original intent was to run it as a user.

Real-Life Example: A Lakehouse Medallion Architecture

Let’s ground this in a practical scenario - a common Lakehouse Data Platform with a 3-layer medallion architecture:

  1. A controller pipeline kicks off the process.

  2. It calls child pipelines that ingest raw data into the bronze layer.

  3. Then it triggers a notebook that processes bronze into silver.

  4. Another notebook handles transformations into gold (curated data).

  5. Finally, the pipeline refreshes a semantic model as the last step.

Here’s how execution context breaks down:

  • Activities using connections (e.g., Copy Data or Semantic model refresh) run under the connection identity.

  • Notebooks in the pipeline (with no connection) run as the last modified identity of the pipeline - which could be a user or service principal.

  • If a child pipeline triggers a notebook, the same logic applies: the last modified identity of that pipeline determines the execution context of its notebook.

So yes, it’s entirely possible that a single run involves:

  • Data ingestion as one identity (connection)

  • Silver transformation as another (pipeline author)

  • Gold orchestration as yet another (child pipeline modifier)


Feeling Lost? You’re Not Alone

If you’re scratching your head, you’re not alone. The behavior is by design, but it does mean we need to be deliberate about how we:

  • Modify items

  • Manage dependencies downstream

  • Set up connections

  • Deploy across environments

Most importantly: how things run in development may not reflect how they run in test or production - especially if you use a service principal for automated deployments.

That’s why understanding execution context is critical for ensuring consistent behavior across environments in enterprise-grade solutions.

Known Bug: When Notebooks Fail Under a Service Principal

While building enterprise-ready Fabric solutions, it’s increasingly common to run notebooks using Service Principals. However, there's a known bug that can cause unexpected failures when doing so.

What’s the Problem?

Running a notebook under a Service Principal can break certain functions and environment references, especially those related to runtime context and authentication. The issue appears to stem from the scope or limitations of the Service Principal's token, and Microsoft has acknowledged it as a bug. The Fabric product team is actively working on a fix.

What Fails?

Here’s a list of some of the functions and methods that return None or throw errors when executed in a notebook under a Service Principal. Note that mssparkutils is being deprecated in favor of notebookutils; the mssparkutils calls are included here only to illustrate the issue:

  • mssparkutils.env.getWorkspaceName()

  • mssparkutils.env.getUserName()

  • notebookutils.runtime.context.get('currentWorkspaceName')

  • fabric.resolve_workspace_id()

  • fabric.resolve_workspace_name()

  • Any SemPy FabricRestClient operations

  • Manual API calls using tokens from notebookutils.mssparkutils.credentials.getToken("https://api.fabric.microsoft.com")

⚠️ Importing sempy.fabric Under a Service Principal

When executing a notebook in the context of a Service Principal, simply importing sempy.fabric will result in the following exception:

Exception: Fetch cluster details returns 401:b''
## Not In PBI Synapse Platform ##

This error occurs because SemPy attempts to fetch cluster and workspace metadata using the execution identity’s token - which, as mentioned earlier, lacks proper context or scope when it belongs to a Service Principal.

In short, any method that fetches workspace name or user name - or relies on the executing identity’s token for SemPy or REST API calls - is likely to fail or return None.
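
If you want to see this for yourself, a small diagnostic cell along these lines can be run under both identities; it simply wraps the calls listed above so that failures are printed instead of aborting the run (a sketch for a Fabric notebook, where notebookutils is pre-imported):

def probe(label, fn):
    # Print the call's result, or the exception it raises.
    try:
        print(f"{label}: {fn()!r}")
    except Exception as exc:
        print(f"{label}: FAILED ({exc})")

probe("runtime context workspace name",
      lambda: notebookutils.runtime.context.get("currentWorkspaceName"))
probe("mssparkutils workspace name",
      lambda: notebookutils.mssparkutils.env.getWorkspaceName())
probe("mssparkutils user name",
      lambda: notebookutils.mssparkutils.env.getUserName())
probe("Fabric API token",
      lambda: notebookutils.mssparkutils.credentials.getToken("https://api.fabric.microsoft.com")[:15] + "...")

def sempy_workspace_name():
    # Importing sempy.fabric can itself raise under a Service Principal.
    import sempy.fabric as fabric
    return fabric.resolve_workspace_name()

probe("sempy resolve_workspace_name", sempy_workspace_name)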

What Still Works?

Surprisingly, not everything is broken. Here are some functions that still work under a Service Principal:

  • spark.conf.get('trident.workspace.id') – this gives you the workspace ID reliably

  • sempy.fabric.get_workspace_id() – still functional, even though importing sempy.fabric will throw an exception as shown above.

  • notebookutils.credentials.getSecret(...) – useful for pulling secrets like client credentials from a Key Vault

Using these, you can still manually generate a token and pass it into your REST requests - or even inject a custom token_provider into the SemPy FabricRestClient.
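
Here is a sketch of that manual token approach. The Key Vault URL and secret names are placeholders, and the code assumes it runs in a Fabric notebook where notebookutils and spark are already defined.

import requests

# Pull the Service Principal's credentials from Key Vault - this still works under a SP.
KEYVAULT_URL = "https://<your-key-vault>.vault.azure.net/"
tenant_id = notebookutils.credentials.getSecret(KEYVAULT_URL, "fabric-sp-tenant-id")
client_id = notebookutils.credentials.getSecret(KEYVAULT_URL, "fabric-sp-client-id")
client_secret = notebookutils.credentials.getSecret(KEYVAULT_URL, "fabric-sp-client-secret")

# Client-credentials flow against Entra ID for a Fabric API token.
token_response = requests.post(
    f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
    data={
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://api.fabric.microsoft.com/.default",
    },
)
token_response.raise_for_status()
access_token = token_response.json()["access_token"]

# The workspace ID is still available via the Spark config, as noted above.
workspace_id = spark.conf.get("trident.workspace.id")

# Plain REST call with the manually acquired token.
items = requests.get(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items",
    headers={"Authorization": f"Bearer {access_token}"},
)
print(items.status_code, len(items.json().get("value", [])))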

Workarounds

If you hit this issue, here are some paths forward:

  • Avoid relying on runtime context methods when running under a Service Principal

  • Use a manual token approach: fetch your own token using credentials from Key Vault and use that in REST requests

  • Where possible, shift context resolution logic out of notebooks and into deployment orchestration or pipeline steps

  • Watch for updates: Microsoft is aware of the issue and a fix is on the way

Why This Bug Matters for CI/CD and Execution Context

This issue ties directly back to the core topic of this blog post - execution context in Microsoft Fabric. Remember that when a notebook is triggered by a Data Pipeline, its execution identity depends on who last modified the data pipeline.

In modern CI/CD workflows - whether you're using Azure DevOps Pipelines, GitHub Actions, or any other automation platform - you’re most likely deploying with a Service Principal. That means after every deployment, the "Last Modified By" identity on your Data Pipelines becomes the Service Principal.

This wouldn’t be an issue if notebooks worked reliably under Service Principal identity. But as we've seen above, notebooks run into serious limitations when executed in that context - missing environment properties, failed API calls, and broken logic in dynamic configurations.

A Practical Workaround: Let a Web Activity Re-Assign Ownership

Here’s one way to get around it:
Use a Web activity in a Fabric Pipeline - configured with an OAuth2 connection for a specific user - to update the description of your Data Pipelines post-deployment.

Why this works:

  • A Web activity executes in the context of the connection identity

  • Updating the pipeline’s description (even just reapplying the same description) is enough to change the "Last Modified By" property

  • As a result, all notebooks executed by those pipelines will now run in the context of the user tied to the OAuth2 connection, not the Service Principal

This allows you to:

  • Deploy pipelines automatically with a Service Principal

  • Then post-process them to re-assign their execution identity to a user, for scenarios where notebook behavior matters

This approach also allows you to apply filters to target only specific Data Pipelines, updating the Last Modified By property selectively. This way, you can still support notebook execution under a Service Principal where needed.
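
For reference, the call the Web activity issues is the Fabric "Update Item" endpoint; a hedged Python equivalent is sketched below (the IDs and token are placeholders). In the pipeline itself the request is sent under the OAuth2 connection's user, which is exactly what resets the Last Modified By property.

import requests

WORKSPACE_ID = "<workspace-guid>"        # placeholder
PIPELINE_ID = "<data-pipeline-guid>"     # placeholder
user_token = "<token for the user behind the OAuth2 connection>"  # placeholder

# PATCHing the item - even with an unchanged description - updates Last Modified By.
response = requests.patch(
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}/items/{PIPELINE_ID}",
    headers={"Authorization": f"Bearer {user_token}"},
    json={"description": "Re-applied post-deployment to reset execution identity"},
)
print(response.status_code)  # 200 indicates the item was updated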

Pipeline Template: Available on GitHub

You can see a visual of this post-deployment ownership adjustment pipeline below.

I’ve also published the pipeline definition on my GitHub, including a short description of how to use the two parameters: View on GitHub

Note: All activities in the definition are currently disabled by default, so you can safely copy-paste it into your own Fabric Data Pipeline JSON definition and adjust the connection settings, pipeline selection logic, etc. as needed.


Written by

Peer Grønnerup

Principal Architect | Microsoft Fabric Expert | Data & AI Enthusiast

With over 15 years of experience in Data and BI, I specialize in Microsoft Fabric, helping organizations build scalable data platforms with cutting-edge technologies. As a Principal Architect at twoday, I focus on automating data workflows, optimizing CI/CD pipelines, and leveraging Fabric REST APIs to drive efficiency and innovation. I share my insights and knowledge through my blog, Peer Insights, where I explore how to leverage Microsoft Fabric REST APIs to automate platform management, manage CI/CD pipelines, and kickstart Fabric journeys.