pt-table-sync

CosmicoppaiCosmicoppai
2 min read

I recently had a situation when a lot of rows from one of our replicas was missing.
One of the solutions that came in mind is to use the snapshot of the source and reconfigure it as replica. While this would have worked, it seemed like an extreme approach for what might be a localized issue.

After digging into the logs after a while, the problem seems to be only isolated to a single table. So, I tried pt-table-sync.

What is pt-table-sync?

pt-table-sync is a command-line utility that helps synchronize data across different MySQL servers. It can be used in various scenarios, such as:

  • Synchronizing data between a source and its replica

  • Resolving inconsistencies between two source servers

  • Comparing and fixing differences between any two MySQL data sources

The synchronization command looks like this:

pt-table-sync --execute h=x.x.x.x,D=database,t=table,u=user,p=pass h=localhost,u=user,p=pass

Here’s what each parameter does:

  • h=x.x.x.x → The host of the source database (where data is correct)

  • D=database → The database name

  • t=table → The table name that needs synchronization

  • u=user / p=passMySQL credentials for authentication

  • h=localhost → The replica server where data needs to be fixed

pt-table-sync generally sync data from the replica to the source, i.e. from the 2nd data source to the first as changes to the replica are usually the source of the problems in the first place. i.e. the changes made on the source will be replicated down the replica via the normal replication process.

How pt-table-sync Works

The synchronization process in pt-table-sync consists of three main operations:

  • UPDATE →For stale data NOOP operation is triggered on source, that only affects the replica.

  • DELETE → DELETE statements on the source for rows that don't exist there but exist in replica.

  • INSERT → For missing data, that exist on source but not on replica. It’s retriggered so it can pass via binary logs

By default, pt-table-sync generates and executes these statements to bring the replica in sync with the source. If you want to preview these operations before executing them, you can use:

pt-table-sync --print h=x.x.x.x,D=database,t=table,u=user,p=pass h=localhost,u=user,p=pass

This prints out the SQL statements that would be executed, allowing you to review changes before applying them.

0
Subscribe to my newsletter

Read articles from Cosmicoppai directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Cosmicoppai
Cosmicoppai