Change data capture (CDC) provides efficient, distributed, row-level changefeeds into a configurable sink for downstream processing such as reporting, caching, or full-text indexing.
What is change data capture?
While CockroachDB is an excellent system of record, it also needs to coexist with other systems. For example, you might want to keep your data mirrored in full-text indexes, analytics engines, or big data pipelines.
The main feature of CDC is the changefeed, which targets an allowlist of tables; the rows of those tables are called the "watched rows". There are two implementations of changefeeds:
Core changefeeds | Enterprise changefeeds |
---|---|
Useful for prototyping or quick testing. | Recommended for production use. |
Available in all products. | Available in CockroachDB Dedicated or with an Enterprise license in CockroachDB Self-Hosted or CockroachDB Serverless (beta). |
Streams indefinitely until underlying SQL connection is closed. | Maintains connection to configured sink. |
Create with EXPERIMENTAL CHANGEFEED FOR. | Create with CREATE CHANGEFEED. |
Watches one or multiple tables in a comma-separated list. Emits every change to a "watched" row as a record. | Watches one or multiple tables in a comma-separated list. Emits every change to a "watched" row as a record in a configurable format (JSON or Avro) to a configurable sink (e.g., Kafka). |
CREATE changefeed and cancel by closing the connection. | Manage changefeed with CREATE, PAUSE, RESUME, ALTER, and CANCEL, as well as monitor and debug. |
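
The two columns of the table can be illustrated with a minimal sketch. The table names `orders` and `customers`, the Kafka address `localhost:9092`, and the job ID are hypothetical placeholders, and the `updated` and `resolved` options are shown only as examples of changefeed options, not as a recommended configuration.

```sql
-- Core changefeed: streams rows over the current SQL connection
-- until that connection is closed. Table names are hypothetical.
EXPERIMENTAL CHANGEFEED FOR orders, customers;

-- Enterprise changefeed: emits JSON records to a configured sink.
-- The Kafka address is an assumed local broker.
CREATE CHANGEFEED FOR TABLE orders, customers
  INTO 'kafka://localhost:9092'
  WITH format = json, updated, resolved;

-- Enterprise changefeeds run as jobs, so they are managed by job ID.
-- 123456789 is a placeholder; take the real ID from SHOW CHANGEFEED JOBS.
SHOW CHANGEFEED JOBS;
PAUSE JOB 123456789;
RESUME JOB 123456789;
CANCEL JOB 123456789;
```

Because an Enterprise changefeed runs as a job rather than on the client connection, it keeps emitting to the sink after the SQL session that created it ends, which is one reason the table recommends it for production use.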
See Ordering Guarantees for details on CockroachDB's at-least-once delivery guarantee, as well as an explanation of how rows are emitted.
Known limitations
- Changefeeds cannot be backed up or restored. Tracking GitHub Issue
- Changefeed target options are limited to tables. Tracking GitHub Issue
- Using a cloud storage sink only works with JSON and emits newline-delimited JSON files. Tracking GitHub Issue
- Webhook sinks only support HTTPS. Use the insecure_tls_skip_verify parameter when testing to disable certificate verification; however, this still requires HTTPS and certificates. Tracking GitHub Issue
- Webhook sinks and Google Cloud Pub/Sub sinks only have support for emitting JSON. Tracking GitHub Issue
- There is no concurrency configurability for webhook sinks. Tracking GitHub Issue
- Changefeeds will emit NULL values for VIRTUAL computed columns and not the column's computed value. Tracking GitHub Issue
- Using the split_column_families and resolved options on the same changefeed will cause an error when using the following sinks: Kafka and Google Cloud Pub/Sub. Instead, use the individual FAMILY keyword to specify column families when creating a changefeed. Tracking GitHub Issue
- There is no configuration for unordered messages for Google Cloud Pub/Sub sinks. You must specify the region parameter in the URI to maintain ordering guarantees. Tracking GitHub Issue
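
Several of the workarounds named in this list can be sketched as follows. Everything specific here is an assumption for illustration: the table `orders`, the column families `f_main` and `f_audit`, the host names, the Pub/Sub project and topic, and the `AUTH=implicit` setting. Check the sink URI parameters against the documentation for your version before relying on them.

```sql
-- Column families with resolved timestamps on Kafka: name each family
-- with FAMILY instead of combining split_column_families and resolved.
-- Table and family names are hypothetical.
CREATE CHANGEFEED FOR TABLE orders FAMILY f_main, TABLE orders FAMILY f_audit
  INTO 'kafka://localhost:9092'
  WITH resolved;

-- Google Cloud Pub/Sub: include the region parameter in the URI to
-- maintain ordering guarantees. Project, topic, and auth are assumed.
CREATE CHANGEFEED FOR TABLE orders
  INTO 'gcpubsub://my-project?region=us-east1&topic_name=orders-changes&AUTH=implicit';

-- Webhook sink while testing: the scheme is still HTTPS, but certificate
-- verification is skipped. The endpoint is a hypothetical local listener.
CREATE CHANGEFEED FOR TABLE orders
  INTO 'webhook-https://localhost:3000/changefeed?insecure_tls_skip_verify=true';
```

The webhook example is meant for testing only; as the limitation above notes, the endpoint must still serve HTTPS with a certificate even when verification is disabled.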