⚠️ How I Diagnosed and Fixed a Silent NumPy Upgrade That Broke My Databricks Pipelines

Background
Recently, one of our critical ADF production pipelines failed unexpectedly with the following error from the Databricks notebook activity:
"Could not reach driver of cluster 0619-120457-7gl1w8rb."
The notebook had been working fine earlier in the day. Upon investigation, I found that this started happening after the cluster auto-terminated and restarted, and suddenly many notebooks attached to the same cluster were failing on import statements related to pandas, pyarrow, and scikit-learn.
Symptom
Inside notebook logs and standard output, I saw this common error:
A module that was compiled using NumPy 1.x cannot be run in NumPy 2.2.6 as it may crash.
...
ImportError: PyArrow >= 4.0.0 must be installed; however, it was not found.
At this point, it was clear there was a version mismatch between NumPy, PyArrow, and Pandas—breaking the runtime.
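A quick way to confirm this kind of mismatch, before touching any cluster settings, is a small diagnostic cell along the lines of the sketch below. This is only an illustration of the check, not the exact cell from the incident:

import importlib
import numpy as np

print("numpy:", np.__version__)

# Try the libraries that were failing; the NumPy 1.x/2.x ABI error surfaces here.
for name in ("pyarrow", "pandas", "sklearn"):
    try:
        mod = importlib.import_module(name)
        print(name, getattr(mod, "__version__", "unknown"), "import OK")
    except Exception as exc:
        print(name, "FAILED to import:", repr(exc))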
Investigation Strategy
Since Databricks installs all libraries listed in the cluster Library UI each time the cluster restarts, I suspected one of these libraries silently installed or upgraded NumPy. Here's how I investigated:
Cloned the cluster.
Opened a new notebook and ran:
%pip show numpy
Result: numpy 2.2.6 was installed!
Uninstalled each custom Python library one by one via the UI and re-tested until I found the culprit.
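The one-by-one approach works, but you can often narrow things down faster by snapshotting the versions of the usual suspects on the cloned cluster and comparing them with a healthy environment. A minimal sketch; the package list here is just an example:

from importlib.metadata import version, PackageNotFoundError

# Packages whose versions I wanted to compare against a known-good cluster.
suspects = ["numpy", "pandas", "pyarrow", "scikit-learn", "btyd"]

for pkg in suspects:
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")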
Culprit: btyd
I found that adding the btyd library (without version pinning) was implicitly upgrading NumPy to version 2.2.6. This version is incompatible with most packages compiled for NumPy 1.x, including pyarrow, which is crucial for Spark DataFrame I/O in Databricks.
This led to:
Broken notebook sessions
Inability to import pandas or pyarrow
ADF pipeline failures
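To double-check that it really was btyd pulling NumPy forward, you can also read a package's declared requirements straight from its installed metadata instead of guessing. A minimal sketch; I'm assuming btyd ships a numpy requirement without an upper bound, so verify against the actual output:

from importlib.metadata import requires

# Print the dependency declarations recorded in btyd's package metadata.
for req in requires("btyd") or []:
    print(req)

# A requirement like "numpy" or "numpy>=1.20" with no upper bound lets pip
# resolve to NumPy 2.x whenever the library is installed on a fresh cluster.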
Solution
To resolve the issue without removing btyd (which is still needed), I took the following steps:
Identified numpy==1.26.0 as a stable release from the 1.x line, i.e. from before the NumPy 2.x break.
In a notebook:
%pip install numpy==1.26.0
Restarted the kernel:
dbutils.library.restartPython()
Verified version:
import numpy as np
print(np.__version__)  # 1.26.0
Finally, I added numpy==1.26.0 as a separate entry at the end of the cluster's library list via the Databricks Libraries UI.
This ensures that Databricks installs all other libraries first (including btyd), and then overrides NumPy 2.x with the compatible 1.26.0 version.
Post-Resolution Actions
I tested a few notebooks and confirmed that PyArrow, Pandas, and other dependent libraries worked (a sample smoke test is sketched after this list).
I checked scheduled ADF pipelines that ran during the error window and confirmed some of them failed with the driver error.
These need to be re-run manually now that the cluster is fixed.
We must monitor all downstream pipelines using the cluster to catch any lingering failures.
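For those notebook checks, a small smoke test that pushes data through the Arrow path is enough to prove the fix end to end. A minimal sketch for a Databricks notebook cell; the sample data is made up:

import numpy as np
import pandas as pd
import pyarrow as pa

print("numpy:", np.__version__, "| pandas:", pd.__version__, "| pyarrow:", pa.__version__)

# Round-trip a Spark DataFrame through pandas; toPandas() uses Arrow when
# Arrow optimization is enabled (the Databricks default), which is exactly
# the path the NumPy 2.x upgrade broke.
sdf = spark.createDataFrame(pd.DataFrame({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]}))
print(sdf.toPandas())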
Key Lessons
Always pin versions of libraries when working in shared production clusters (see the baseline sketch after this list).
Not all PyPI packages are responsible with dependency declarations—btyd is a perfect example.
Avoid installing packages like NumPy blindly, especially on top of environments with native bindings (like Spark and Arrow).
Clone the cluster before testing conflicting packages—this saved me from breaking other workloads.
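One cheap way to make the version-pinning lesson stick is to capture a pinned baseline of the cluster environment and diff it after every restart, so a silent upgrade shows up immediately. A minimal sketch; the DBFS path is just an example:

from importlib.metadata import distributions

# Snapshot every installed package as a pinned requirement so the list can be
# committed or diffed against the cluster after its next restart.
pins = sorted(f"{d.metadata['Name']}=={d.version}" for d in distributions())

baseline_path = "/dbfs/tmp/cluster_env_baseline.txt"  # example path
with open(baseline_path, "w") as f:
    f.write("\n".join(pins))

print(f"captured {len(pins)} pinned packages to {baseline_path}")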
Have you experienced similar dependency chaos in Databricks or any other managed environment? Share your experience in the comments!