Change Data Capture in SQL Server

mortylenmortylen
9 min read

Change Data Capture (CDC) is a technology in SQL Server designed to capture changes in database tables. It enables efficient tracking of INSERT, UPDATE, and DELETE operations and is often used in ETL processes, replication, or change backups.
CDC is utilized where it is important to know the history of changes in the database, or where data synchronization or replication between systems is required.
Unlike triggers, CDC works asynchronously – it does not capture changes at the moment they occur but reads the database transaction log at regular intervals and stores detected changes in special system tables. These tables contain detailed information about changes, including original and new values, and are part of the monitored database itself, making them easily queryable using SQL.

Change Data Capture functionality is not available in the SQL Server Express edition. It is supported only in higher editions such as Standard, Enterprise, or Developer.

Main Components

The main components of CDC in SQL Server include:

  • Transaction Log: Serves as the source of information about changes made in database tables.

  • Capture Job: A SQL Agent job that regularly reads the transaction log and extracts changes.

  • Cleanup Job: A SQL Agent job that removes older records from CDC tables based on retention settings.

  • CDC Capture Tables: System tables where captured changes are stored, including operation type, original and new values.

Workflow

  1. Change in monitored table
    The user performs an INSERT, UPDATE, or DELETE operation on a monitored table.

  2. Recording in transaction log
    SQL Server records this change in the database’s transaction log.

  3. CDC Capture Job reads transaction log
    SQL Server Agent regularly (typically every few seconds) runs the CDC capture job, which reads new change records for all monitored tables.

  4. Writing changes to CDC capture tables
    The job extracts information about the changes (original and new values, operation type, timestamp) and stores them in system CDC tables.

  5. Cleanup Job deletes old records
    Based on the configured retention period (e.g., 3 days), the CDC cleanup job runs to remove old records from CDC tables.

  6. User reads changes
    The user can query special views to retrieve changes for a specified time period and process them further.

User modifies table (INSERT/UPDATE/DELETE)
       ↓
Transaction Log (records changes)
       ↓
CDC Capture Job (reads changes from log)
       ↓
CDC Capture Tables (e.g. cdc.dbo_Employees_CT)
       ↓
CDC Cleanup Job (deletes old records)

Setting Up and Testing CDC

Now that we understand what the technology is about, let's proceed with a simple test.
We will create a test database and a table within it, where we will perform changes and verify their capture using CDC.

Create Database and Table

Let's create a test database, for example TestCDC, and within it a table called Employees.

-- Create Database
CREATE DATABASE TestCDC;
GO

-- Create Table
USE TestCDC;
GO

CREATE TABLE Employees (
    ID INT PRIMARY KEY,
    FullName NVARCHAR(100),
    Position NVARCHAR(100),
    Rating DECIMAL(10,2)
);
GO

Enable Change Data Capture

Now that we have something to test on, let's enable Change Data Capture on our new database and table.

-- Enable CDC for Database
USE TestCDC;
GO

EXEC sys.sp_cdc_enable_db;
GO

-- Enable CDC for Table
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'Employees',
    @role_name = NULL,
    @supports_net_changes = 0;
GO
  • @source_schema: Name of the schema where the tracked table is located.

  • @source_name: Name of the table we want to track.

  • @role_name: Name of the database role that will have access to CDC data. If set to NULL, anyone with access to the database will be able to access the CDC data.

  • @supports_net_changes: If set to 1, enables the "net change" mode – an aggregated view of the changes (only the latest version of each row). If set to 0, all changes are recorded in detail.

Test Change Data Capture

We have everything set up for our testing. CDC is enabled for both the database and the tracked table.
Let's test it by inserting, updating, and deleting records in the table.

USE TestCDC;
GO

-- Insert data
INSERT INTO Employees (ID, FullName, Position, Rating)
VALUES (1, 'John Novak', 'Analytic', 18.00),
       (2, 'Peter Burn', 'Programmer', 20.00);
GO

-- Update data
UPDATE Employees
SET Rating = 23.00, Position = 'Gardener'
WHERE ID = 2;
GO

-- Delete data
DELETE FROM Employees
WHERE ID = 1;
GO

In our test database TestCDC, we should now see several new system tables that contain the captured changes:

  • cdc.captured_columns: A list of columns that are being tracked by CDC.

  • cdc.change_tables: Metadata about all CDC instances in the database, i.e., which tables are being tracked.

  • cdc.dbo_Employees_CT: Change data (INSERT/UPDATE/DELETE) for the specific Employees table.

  • cdc.ddl_history: History of DDL changes on tracked tables (e.g., ALTER TABLE), records structural modifications to the tables.

  • cdc.index_columns: Information about indexes on tracked tables, mainly primary keys used for identifying changes.

  • cdc.lsn_time_mapping: Mapping of LSN (Log Sequence Number) to the actual change time (datetime).

  • dbo.systranschemas: Internal helper table used to identify schemas for CDC.

Let's try a few simple queries and look at their results.

Viewing Captured Changes

-- Change data
SELECT * FROM cdc.dbo_Employees_CT

-- Added timestamp
SELECT 
  *, 
  sys.fn_cdc_map_lsn_to_time(__$start_lsn) AS CaptureTimeStamp
FROM cdc.dbo_Employees_CT;

-- A slightly cleaner view
SELECT 
    __$start_lsn,
    sys.fn_cdc_map_lsn_to_time(__$start_lsn) AS CaptureTimeStamp,
    __$operation,
    CASE __$operation
        WHEN 1 THEN 'DELETE'
        WHEN 2 THEN 'INSERT'
        WHEN 3 THEN 'UPDATE (old)'
        WHEN 4 THEN 'UPDATE (new)'
    END AS OperationType,
    ID,
    FullName,
    Position,
    Rating
FROM cdc.dbo_Employees_CT
ORDER BY __$start_lsn;

The value of the __$operation field indicates the type of change:

  • 1 = delete (DELETE)

  • 2 = insert (INSERT)

  • 3 = old row (previous state during an UPDATE)

  • 4 = new row (new state during an UPDATE)

Disabling Change Data Capture

If change tracking is no longer needed, it's a good idea to disable CDC.
This saves database performance and disk space.

USE TestCDC;
GO

-- Stops tracking changes for a single table
EXEC sys.sp_cdc_disable_table 
    @source_schema = N'dbo', 
    @source_name = N'Employees', 
    @capture_instance = N'dbo_Employees';


-- Disables CDC for the entire database
EXEC sys.sp_cdc_disable_db;

A Bit About Configuration

We have CDC set up and change tracking is working quite well. But what about maintenance and performance?
Change Data Capture automatically deletes old records every few days.
It relies on two main SQL Server Agent Jobs:

  • Capture Job: Reads the transaction log and copies changes into the CDC tables.

  • Cleanup Job: Regularly deletes old records from the CDC tables based on the retention settings.

These jobs can be configured according to our preferences.

Capture Job – Settings:

  • maxtrans: Maximum number of transactions processed in a single batch.

  • maxscans: Maximum number of log reads before a short pause.

  • continuous: 1 = job runs continuously, 0 = runs once (useful for testing).

  • pollinginterval: Number of seconds between log scans (applies only if continuous = 1).

-- Example: sets the log polling interval to every 5 seconds.
EXEC sys.sp_cdc_change_job 
    @job_type = N'capture',
    @pollinginterval = 5;

Cleanup Job – Settings:

  • retention: Number of minutes after which records in the CDC .CT tables are removed.

  • threshold: Number of records deleted in a single batch (performance control).

-- Example: sets cleanup of old data to run on a daily cycle.
EXEC sys.sp_cdc_change_job
    @job_type = N'cleanup',
    @retention = 1440; --one day in minute

Restarting CDC Jobs After Configuration Changes

When changing job settings, it is necessary to restart these services.

-- Restart Capture Job.
EXEC sys.sp_cdc_stop_job @job_type = N'capture';
EXEC sys.sp_cdc_start_job @job_type = N'capture';

-- Restart Cleanup Job.
EXEC sys.sp_cdc_stop_job @job_type = N'cleanup';
EXEC sys.sp_cdc_start_job @job_type = N'cleanup';

Monitoring the Size of CDC Tables

In a production environment, it is advisable to monitor the behavior of different configurations and their impact on database size and resource usage. For example, monitoring the size:

SELECT 
    t.name AS TableName,
    SUM(ps.used_page_count) * 8 / 1024 AS SizeMB
FROM sys.dm_db_partition_stats ps
JOIN sys.tables t ON t.object_id = ps.object_id
WHERE t.schema_id = SCHEMA_ID('cdc')
GROUP BY t.name
ORDER BY SizeMB DESC;

For Larger Tables and Frequent Changes, Try:

  • For larger tables and frequent changes, try reducing the @retention from the default 3 days to less, for example to 1 day (1440 minutes).

  • Depending on CPU performance, try increasing the @threshold for more efficient cleanup.

  • For frequent changes, you can reduce the @pollinginterval so the capture job reacts faster and processes changes more quickly.

The status of individual CDC jobs can also be monitored using sys.sp_cdc_help_jobs. This is a system stored procedure in SQL Server that provides information about the Change Data Capture agent jobs in the database.
It helps to find out which CDC agent jobs are configured, whether they are running correctly, their status, and more.

-- List and status of CDC Jobs.
EXEC sys.sp_cdc_help_jobs;

CDC Job Status Output by SQL Server Version

The output may vary depending on the SQL Server version. Versions 2022 and newer report the job status directly. Older versions (2012 - 2019) only show the job settings.

SQL Server Version 2022 and Newer:

  • job_type: capture or cleanup.

  • job_id: GUID identifier of the SQL Agent Job.

  • enabled: 1 = job is enabled, 0 = job is disabled.

  • status: Job status (numeric value, e.g., 1 = running).

  • last_run_date: Last run date of the job (YYYYMMDD).

  • last_run_time: Last run time of the job (HHMMSS).

  • last_run_outcome: Result of the last run (0 = failure, 1 = success).

  • message: Text message (if any).

SQL Server Version 2012 – 2019:

  • job_id: Unique job identifier (GUID).

  • job_type: Job type: capture or cleanup.

  • job_name: Name of the job in SQL Server Agent.

  • maxtrans: (capture only) Maximum number of transactions processed in one batch.

  • maxscans: (capture only) Maximum number of log scans per cycle.

  • continuous: 1 = job runs continuously (CDC capture job), 0 = no.

  • pollinginterval: Time (in seconds) between log scans if continuous = 1.

  • retention: (cleanup only) Number of minutes records are retained in CT tables.

  • threshold: (cleanup only) Number of records cleaned in one batch.

If you want to check the status on an older server, you can use the following query:

SELECT 
    j.name AS JobName,
    ja.start_execution_date AS LastStart,
    ja.stop_execution_date AS LastStop,
    ja.run_requested_date,
    ja.run_requested_source,
    h.run_status, -- 1 = Success, 0 = Failed
    h.message
FROM msdb.dbo.sysjobs j
LEFT JOIN msdb.dbo.sysjobactivity ja ON j.job_id = ja.job_id
LEFT JOIN msdb.dbo.sysjobhistory h ON j.job_id = h.job_id
WHERE j.name LIKE 'cdc.%';

Note

Before using CDC, it is recommended to set the recovery model to FULL.
The recovery model has a crucial impact on how CDC works because CDC does not read directly from the tracked table, but from the transaction log.
It is important how long the records are retained in the log — if the recovery model is set to SIMPLE, the log records might be deleted before CDC can process them.

Checking the Recovery Model:

SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'YUOR-DATABASE-NAME';

Changing the Recovery Model:

ALTER DATABASE YOUR-DATABASE-NAME SET RECOVERY FULL;

Things to Watch Out For

  • Unbacked log: The log grows indefinitely.

  • CDC job not running: No new records, risk of data loss.

  • Large INSERT/UPDATE: Can cause temporary performance degradation.

  • Too long retention: Unnecessarily large .CT tables.

  • CDC enabled on a table without a PK: Difficult to track changed rows.

Conclusion

Change Data Capture is a very powerful tool for recording changes in SQL Server, minimizing performance impact, eliminating the need for custom triggers, and ideal for data warehousing, replication, and monitoring. A properly configured CDC system allows you to track and analyze all changes in data without complex logic or manual intervention.


If you found this useful, consider supporting me:

Buy Me a Coffee


👉 My github profile GitHub

👉 My blog page Hashnode

📷 Cover photo by Raghavendra Saralaya

0
Subscribe to my newsletter

Read articles from mortylen directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

mortylen
mortylen

By day, I work on enterprise and industrial systems; by night, I experiment with modern technologies. I’m committed to lifelong learning and continuously exploring new ideas.