Quick Note: Fabric Spark Resource Profiles

Sandeep Pawar

Fabric Spark Resource Profiles were announced at FabCon 2025. They make it very easy to apply Spark configurations based on job needs and patterns. I love them and plan to use them. This blog isn't a deep dive into profiles; I just want to share a quick note about something you should know.

By default, all new workspaces in Fabric will use the writeHeavy profile, which is optimized for frequent writes, like those needed for ingestion. I like and support this change. Workspaces created before FabCon use Spark configurations optimized for Power BI read scenarios, meaning faster reads by Direct Lake. This Power BI-focused setup often caused challenges with jobs taking longer in the landing and staging layers, where Delta tables aren't usually used for Power BI reports. The new default and the profiles solve this issue without requiring any action from users.
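If you want to confirm which profile a given session is using, you can read it from the Spark configuration. A minimal sketch; I'm assuming the spark.fabric.resourceProfile property name from the resource profiles documentation, so verify it against your runtime:

# Read the resource profile for the current Spark session.
# Property name assumed from the Fabric resource profiles docs;
# returns None if the property is not set on this runtime.
print(spark.conf.get("spark.fabric.resourceProfile", None))  # e.g. 'writeHeavy'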

However, I want to point out that because new workspaces use different Spark configurations, you might notice changes in job durations and in how data is laid out in the Delta tables, unless you explicitly defined the Spark configurations in the notebook or used an Environment.

For example, if I have an existing Dev workspace created before FabCon and I create a new feature workspace to optimize the Spark jobs, both using the default runtime, I will see different durations and performance because the configurations differ. If the job is write-heavy, notebooks in the feature workspace may finish faster than in the Dev workspace. Below, I created a feature workspace and compared the configs between the Dev workspace and the feature workspace:

from pyspark import SparkConf
import pandas as pd

## Sandeep Pawar | fabric.guru
## Retrieves the Spark configs for the runtime used in the notebook

# Defaults as loaded by the Spark runtime
conf = SparkConf()
conf_list = conf.getAll()
configs_df = pd.DataFrame(conf_list, columns=['Config', 'Default'])

# Effective value in the current session; fall back to the default
# when a config is not set at session scope
configs_df['Runtime_Value'] = [
    spark.conf.get(config, default)
    for config, default in zip(configs_df['Config'], configs_df['Default'])
]
display(configs_df)
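To actually diff the two workspaces, one approach (a sketch, assuming you run the snippet above in each workspace and export the result to files you can read from one place; the file names below are hypothetical) is to outer-join the two outputs on the config name and keep only the rows that differ:

import pandas as pd

# Hypothetical exports of the snippet above, one per workspace
dev = pd.read_csv("dev_workspace_configs.csv")
feature = pd.read_csv("feature_workspace_configs.csv")

# Outer-join on the config name and keep only the differing rows
merged = dev.merge(feature, on="Config", how="outer",
                   suffixes=("_dev", "_feature"))
diffs = merged[merged["Runtime_Value_dev"] != merged["Runtime_Value_feature"]]
print(diffs.to_string(index=False))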

💡
Notice that by default VORDER is now set to false (I like this). If you are using the tables for Direct Lake, be sure to set it to true if your tests show that it improves query performance.
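If your tests do show a benefit, you can turn V-Order back on at session scope. A sketch assuming the spark.sql.parquet.vorder.enabled property; the exact property name can differ across runtime versions, so confirm it for yours:

# Enable V-Order writes for this session only
# (property name assumed; verify for your runtime version)
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")
print(spark.conf.get("spark.sql.parquet.vorder.enabled"))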

If you want to make sure all workspaces use the same configs, either use Environments (best practice, in my opinion) or use %%configure -f to define them at session scope, as shown below (note that this adds to the session start-up time). If you created new workspaces around FabCon (Apr 1-3, 2025), I recommend checking the configs to make sure they match your patterns and requirements. (Note: the List Workspaces / Get Workspace APIs do not return the workspace creation date; if anyone knows how to get it, please let me know. Update: Gil Raviv and Frank Preusker shared that the Workspace Activity API should show the workspace creation date, thank you! I have a blog on using the Activity API.)
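For reference, a session-scope configuration cell might look like this (a sketch; the properties shown are assumptions for illustration, substitute whatever your workload needs):

%%configure -f
{
    "conf": {
        "spark.sql.parquet.vorder.enabled": "true",
        "spark.sql.shuffle.partitions": "200"
    }
}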

