How I Recovered a Broken MariaDB Replica in Kubernetes (Bitnami + K8s)

Recently, I ran into a frustrating issue with our MariaDB setup in Kubernetes using the Bitnami Helm chart. One of the read replicas (the secondary node) was stuck in a restart loop with exit code 139, which usually means a segmentation fault. Meanwhile, the primary node was unaffected and kept working fine.

πŸ” The Situation

We had a Bitnami MariaDB setup running in a Kubernetes cluster with a primary-secondary replication topology. Everything had been running fine for months until one day, I noticed that the secondary pod was crash-looping with this:

Exit Code: 139 (Segmentation fault)

Not super helpful.
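
In case it's useful, here's roughly how to confirm where that exit code comes from. The pod name and label are what a default Bitnami install would give you; yours may differ.

# Watch the secondary crash-loop (Bitnami chart label assumed)
kubectl get pods -l app.kubernetes.io/name=mariadb

# Inspect the last terminated state of the container (pod name assumed)
kubectl describe pod mariadb-secondary-0 | grep -A 5 "Last State"

# Logs from the previous (crashed) container instance
kubectl logs mariadb-secondary-0 --previous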

πŸ› οΈ What I Tried (That Didn't Work)

Since the primary was healthy, I figured I could just rebuild the secondary.

  1. I mounted the secondary PVC (it was still referenced, but I had already cleared its data earlier in a panic).

  2. Tried doing a mysqldump from the primary and restoring into the secondary.

    • Problem: The dump was way too big and took too long.
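
For reference, the dump-and-restore attempt looked roughly like this (host names and credentials are placeholders, not the actual values from our setup):

# Dump everything from the primary
mysqldump -h mariadb-primary -u root -p --all-databases --single-transaction > /tmp/full-dump.sql

# Restore into the (empty) secondary
mysql -h mariadb-secondary -u root -p < /tmp/full-dump.sql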

βœ… What Finally Worked

Here’s the step-by-step of what actually worked to recover the replica:

1. πŸ”‘ Identify Master Status on the Primary

Logged into the primary MariaDB instance and captured the binlog position:

SHOW MASTER STATUS;

This gave me:

File: mysql-bin.000123
Position: 456789
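
If you're going through kubectl, something along these lines works. The secret name, secret key, and pod name are assumptions based on a default Bitnami install.

# Pull the root password from the chart's secret (secret name/key assumed)
ROOT_PW=$(kubectl get secret mariadb -o jsonpath='{.data.mariadb-root-password}' | base64 -d)

# Read the binlog coordinates from the primary (pod name assumed)
kubectl exec mariadb-primary-0 -- mysql -u root -p"$ROOT_PW" -e "SHOW MASTER STATUS\G"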

2. 🧊 Lock Tables on the Primary

FLUSH TABLES WITH READ LOCK;

Leave this session open; it holds the read lock. It's also worth re-running SHOW MASTER STATUS once the lock is in place, since writes can move the binlog position between steps 1 and 2.
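
One Kubernetes-specific gotcha: the lock only lives as long as the client session, so if the kubectl exec session that took it drops, the lock is released. The safest way I know is to keep a dedicated interactive session open in a separate terminal, roughly like this (pod name assumed):

# Open an interactive session on the primary and keep this terminal open
kubectl exec -it mariadb-primary-0 -- mysql -u root -p

# Inside that session, take the lock and do not exit until the copy is done
FLUSH TABLES WITH READ LOCK;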

3. 🚚 Use rsync to Copy the Data

With both primary and secondary PVCs mounted in a helper pod, I ran:

rsync -a /mnt/primary-data/ /mnt/secondary-data/

βœ… This is faster than mysqldump and preserves internal replication metadata.
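
If it helps, here's a minimal sketch of what such a helper pod can look like. The StatefulSet, PVC, and image names are assumptions based on a default Bitnami install, so adjust them to your release; with ReadWriteOnce volumes the helper pod may also need to land on the same node as the primary.

# Free up the secondary's PVC first (StatefulSet name assumed)
kubectl scale statefulset mariadb-secondary --replicas=0

# Throwaway pod that mounts both PVCs (PVC names assumed)
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mariadb-copy-helper
spec:
  containers:
    - name: copy
      image: alpine:3.19
      command: ["sleep", "3600"]   # keep the pod alive long enough to run the copy
      volumeMounts:
        - name: primary-data
          mountPath: /mnt/primary-data
        - name: secondary-data
          mountPath: /mnt/secondary-data
  volumes:
    - name: primary-data
      persistentVolumeClaim:
        claimName: data-mariadb-primary-0
    - name: secondary-data
      persistentVolumeClaim:
        claimName: data-mariadb-secondary-0
EOF

# Install rsync inside the helper and run the copy (add --delete if the target isn't empty)
kubectl exec -it mariadb-copy-helper -- sh -c 'apk add --no-cache rsync && rsync -a /mnt/primary-data/ /mnt/secondary-data/'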

4. πŸ”„ Configure the Replica

Once the copy completed, I configured the replica to start replication:

CHANGE MASTER TO
  MASTER_HOST='primary-hostname',
  MASTER_USER='replicator',
  MASTER_PASSWORD='repl-password',
  MASTER_LOG_FILE='mysql-bin.000123',
  MASTER_LOG_POS=456789;

START SLAVE;
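
In kubectl terms, that means cleaning up the helper, bringing the secondary back up, and running those statements inside it. The names below match the helper-pod sketch from step 3 and are assumptions; the replication user and password are whatever you configured in the chart.

# Remove the helper pod and bring the secondary back
kubectl delete pod mariadb-copy-helper
kubectl scale statefulset mariadb-secondary --replicas=1

# Open a mysql session on the secondary and run the CHANGE MASTER TO / START SLAVE above
kubectl exec -it mariadb-secondary-0 -- mysql -u root -p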

5. πŸ”“ Unlock the Primary

Go back to the terminal holding the read lock and run:

UNLOCK TABLES;

6. βœ… Confirm Replication is Working

On the secondary:

SHOW SLAVE STATUS\G

You should see:

Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0
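
A quick non-interactive way to check the same fields (reusing the ROOT_PW variable from the step 1 sketch; pod name assumed):

kubectl exec mariadb-secondary-0 -- mysql -u root -p"$ROOT_PW" -e "SHOW SLAVE STATUS\G" \
  | grep -E 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'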

πŸ’‘ Key Takeaways

  • Segfault (code 139) can be a symptom of a broken MariaDB datadir, especially after PVC loss

  • rsync is faster and more robust than mysqldump for full-state syncing

  • Lock the primary when syncing to avoid inconsistent data

  • Don't forget to note the binlog file and position before syncing

  • Bitnami MariaDB replicas work beautifully if the initial state is consistent


Hopefully, this helps someone avoid a full day of trial and error!

Let me know if you've had similar horror stories in Kubernetes + databases!
