How to Minimize Downtime Caused by Table Locks During PostgreSQL Schema Changes?


🚨 Ever faced an unexpected downtime from a seemingly harmless database deployment?
Imagine you're adding a new column to an existing table (e.g., users
) in PostgreSQL. You decided to skip specifying a default value, thinking it would be a metadata-only change. So no rows need updating, and the operation should finish in the blink of an eye. Easy-peasy, right?
But then… the query runs for an unexpectedly long time, blocking other queries, and before you know it, your application grinds to a halt.😨
🔍 Why does a metadata change take longer time and disrupt everything?
PostgreSQL DDL statements, like ALTER TABLE
, acquire an ACCESS EXCLUSIVE
lock on the target table. This lock is the most restrictive, conflicting with all other lock types.🔒
Here’s the chain of chaos:
1️⃣ A long-running query (like a SELECT
) already holds an ACCESS SHARE
lock on the users
table.
2️⃣ Your ALTER TABLE
query tries to acquire an ACCESS EXCLUSIVE
lock and has to wait. The DDL statement can simply wait patiently until it's able to acquire its ACCESS EXCLUSIVE
lock, right? Here’s the problem.
3️⃣ While waiting, all other queries needing locks on the table queue up behind it—even simple SELECT
queries! 🚦
If that long-running query drags on, the entire table becomes inaccessible for reads and writes until the ALTER TABLE
completes, causing widespread downtime.
🛠️ How to prevent this?
Here’s lock_timeout
to the rescue!🦸♂️ It defines how long a statement will wait to acquire a lock before giving up.
Here’s an example:
SET lock_timeout = '5s'; -- Set lock timeout to 5 seconds
ALTER TABLE users ADD COLUMN last_login TIMESTAMP;
RESET lock_timeout; -- Restore to default
With lock_timeout
set ⏱️ -
If the operation can’t acquire its lock within specified time, it gracefully exits. Adding retries with exponential backoff can help handle these transient conflicts.✅
This ensures other queries like
SELECT
andUPDATE
can run without being stuck in the queue for an extended period.💡
📌 The takeaway
Setting a reasonable lock_timeout
for potentially locking operations is a simple yet effective way to mitigate the impact of DDL changes on your database's availability. 🕒✨
Next time you’re about to make such database change, don’t forget to safeguard your app’s availability with this simple trick! 🚀
Subscribe to my newsletter
Read articles from Arindam Ghosh directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

Arindam Ghosh
Arindam Ghosh
Database reliability engineer 👨💻 with expertise in designing, implementing, and maintaining highly available and scalable database systems 💻 Enjoy automating various database tasks ⚙️ and implementing database DevOps 🚀 🙏