# Stream Configuration Guide

## Overview
A stream in DBConvert Streams represents a configured data pipeline that moves data between source and target databases. This guide covers all configuration options and settings available for stream setup.
## Basic Configuration

Here's a complete example of a stream configuration:
```json
{
  "id": "config_2sZyLN37tM65K585HTeph462N58",
  "name": "MySQL_to_PostgreSQL_stream",
  "source": "conn_2sZyDMeIbdaXnN9OL7vYzzAiZU5",
  "target": "conn_2sZyBc83hypbiTMrNTotKPyjwFC",
  "mode": "convert",
  "dataBundleSize": 500,
  "reportingIntervals": {
    "source": 3,
    "target": 3
  },
  "createStructure": true,
  "tables": [
    {
      "name": "users",
      "query": "SELECT * FROM users WHERE active = true"
    }
  ],
  "limits": {
    "numberOfEvents": 1000000,
    "elapsedTime": 3600
  }
}
```
## Core Settings

| Setting | Type | Description | Default |
|---|---|---|---|
| `id` | string | Stream identifier (auto-generated) | auto |
| `name` | string | Stream name | required |
| `source` | string | Source connection ID | required |
| `target` | string | Target connection ID | required |
| `mode` | string | `"convert"` or `"cdc"` | required |
| `createStructure` | boolean | Create tables in target | `true` |
| `dataBundleSize` | number | Records per batch (10-1000) | 500 |
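The required fields and value ranges above can be checked before a configuration is submitted. A minimal sketch in Python (the validator itself is illustrative and not part of the product; only the field names and constraints come from this guide):

```python
def validate_stream_config(cfg: dict) -> list[str]:
    """Return a list of problems found in a stream configuration dict."""
    problems = []
    # name, source, target, and mode are required; id is auto-generated
    for field in ("name", "source", "target", "mode"):
        if field not in cfg:
            problems.append(f"missing required field: {field}")
    if "mode" in cfg and cfg["mode"] not in ("convert", "cdc"):
        problems.append('mode must be "convert" or "cdc"')
    # dataBundleSize defaults to 500 and must stay within 10-1000
    if not 10 <= cfg.get("dataBundleSize", 500) <= 1000:
        problems.append("dataBundleSize must be between 10 and 1000")
    return problems

print(validate_stream_config(
    {"name": "s1", "source": "conn_a", "target": "conn_b", "mode": "convert"}
))  # []
```

Running such a check locally surfaces typos before the stream is created.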
## Mode Selection

### Convert Mode
- One-time data transfer
- Ideal for migrations or initial loads
- Supports custom SQL queries
- Full table structure creation
### CDC Mode (Change Data Capture)
- Real-time change replication
- Captures INSERT, UPDATE, DELETE operations
- Continuous synchronization
- Requires specific database configuration
## Table Configuration

### Basic Table Selection

```json
{
  "tables": [
    {
      "name": "users"
    },
    {
      "name": "orders"
    }
  ]
}
```
### Table Parameters

| Parameter | Mode | Description | Example |
|---|---|---|---|
| `name` | Both | Name of the table to replicate | `"users"` |
| `query` | Convert | Custom SQL query to filter or transform data during conversion; only used in convert mode | `"SELECT * FROM users WHERE active = true"` |
| `operations` | CDC | Array of operations to track in CDC mode: `["insert"]`, `["update"]`, `["delete"]`, or any combination | `["insert", "update"]` |
### Query Parameter (Convert Mode)

The `query` parameter allows you to:
- Filter source data using WHERE clauses
- Transform data using SQL functions
- Join multiple tables
- Order results using ORDER BY
- Limit result set size using LIMIT
Example queries:

```json
{
  "tables": [
    {
      "name": "users",
      "query": "SELECT * FROM users WHERE status = 'active'"
    },
    {
      "name": "orders",
      "query": "SELECT o.*, c.name as customer_name FROM orders o JOIN customers c ON o.customer_id = c.id"
    }
  ]
}
```
### Operations Parameter (CDC Mode)

The `operations` parameter specifies which database operations to track in CDC mode:

- `insert`: Track new record insertions
- `update`: Track record modifications
- `delete`: Track record deletions

Example configurations:
```json
{
  "tables": [
    {
      "name": "audit_log",
      "operations": ["insert"] // Only track new records
    },
    {
      "name": "inventory",
      "operations": ["update", "delete"] // Track modifications and deletions
    },
    {
      "name": "orders",
      "operations": ["insert", "update", "delete"] // Track all changes
    }
  ]
}
```
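An invalid operation name in a table entry is easy to mistype, so a small local check can catch it early. An illustrative Python sketch (not part of the product API; the valid values come from this guide):

```python
VALID_OPERATIONS = {"insert", "update", "delete"}

def check_operations(tables: list[dict]) -> list[str]:
    """Report any table whose operations list contains unknown values."""
    problems = []
    for table in tables:
        unknown = set(table.get("operations", [])) - VALID_OPERATIONS
        if unknown:
            problems.append(f"{table['name']}: unknown operations {sorted(unknown)}")
    return problems

tables = [
    {"name": "audit_log", "operations": ["insert"]},
    {"name": "orders", "operations": ["insert", "upsert"]},  # "upsert" is a typo
]
print(check_operations(tables))  # ["orders: unknown operations ['upsert']"]
```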
### Best Practices

- In Convert mode, use the `query` parameter to minimize data transfer by filtering at the source
- In CDC mode, only track necessary operations to reduce overhead
- Consider table usage patterns when selecting operations to track
## Performance Settings

### Data Bundle Size

```json
{
  "dataBundleSize": 500 // Default value
}
```
- Larger (closer to 1000): Better for simple tables
- Smaller: Better for complex tables or binary data
- Adjust based on:
- Table complexity
- Record size
- Available memory
- Network capacity
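As a rough rule of thumb when weighing these factors (an illustrative back-of-envelope estimate, not a statement about the product's internals), the memory held by one in-flight batch grows linearly with both bundle size and average record size:

```python
def batch_memory_bytes(data_bundle_size: int, avg_record_bytes: int) -> int:
    """Approximate memory held by a single batch of records."""
    return data_bundle_size * avg_record_bytes

# With the default bundle size of 500 and 2 KiB average records,
# one batch holds roughly 1 MiB:
print(batch_memory_bytes(500, 2048))  # 1024000
```

Tables with wide rows or binary columns push this figure up quickly, which is why smaller bundles suit them better.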
### Progress Reporting

```json
{
  "reportingIntervals": {
    "source": 3, // Report source progress every 3 seconds
    "target": 3  // Report target progress every 3 seconds
  }
}
```
### Stream Limits

```json
{
  "limits": {
    "numberOfEvents": 1000000, // Stop after processing 1M events
    "elapsedTime": 3600        // Stop after 1 hour (in seconds)
  }
}
```

The `limits` configuration allows you to set boundaries for stream execution:

- `numberOfEvents`: Maximum number of events to process before stopping
- `elapsedTime`: Maximum duration in seconds before stopping the stream
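The two limits act independently: the stream stops as soon as either is reached. A sketch of that rule (illustrative Python; the actual service enforces limits server-side):

```python
def should_stop(events_processed: int, elapsed_seconds: float, limits: dict) -> bool:
    """Stop once either limit is reached; a missing limit means unbounded."""
    return (
        events_processed >= limits.get("numberOfEvents", float("inf"))
        or elapsed_seconds >= limits.get("elapsedTime", float("inf"))
    )

limits = {"numberOfEvents": 1_000_000, "elapsedTime": 3600}
print(should_stop(1_000_000, 120, limits))  # True: event limit reached first
print(should_stop(500, 120, limits))        # False: neither limit reached
```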
> **Note:** If the `tables` field is omitted from the stream configuration, all tables from the source database will be converted.
## Stream Control Operations
Available through API or Dashboard:
- Start: Begin data transfer
- Pause: Temporarily halt transfer
- Resume: Continue paused transfer
- Stop: End transfer process
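The four operations above form a simple lifecycle, which can be sketched as a transition table (illustrative only; the state names and the exact set of legal transitions are assumptions, not the API's actual values):

```python
# Assumed legal state transitions for the control operations listed above.
TRANSITIONS = {
    ("created", "start"): "running",
    ("running", "pause"): "paused",
    ("paused", "resume"): "running",
    ("running", "stop"): "stopped",
    ("paused", "stop"): "stopped",
}

def apply_control(state: str, action: str) -> str:
    """Return the next state, or raise if the operation is not legal now."""
    if (state, action) not in TRANSITIONS:
        raise ValueError(f"cannot {action} a stream that is {state}")
    return TRANSITIONS[(state, action)]

print(apply_control("created", "start"))  # running
print(apply_control("running", "pause"))  # paused
```

Modeling the lifecycle this way makes it explicit that, for example, a stopped stream cannot be resumed.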
## Best Practices

### Initial Setup
- Start with default settings
- Test with small data subset
- Gradually optimize based on results
### Performance Optimization

**Bundle Size:**
- Start with default (500)
- Monitor system resources
- Adjust based on performance
**Progress Reporting:**
- More frequent: Better visibility
- Less frequent: Lower overhead
- Balance based on needs
**Table Structure:**
- Use `createStructure: true` for initial setup
- Consider `noCreateIndexes` for large initial loads
- Create indexes after the data load for better performance
## Error Handling
The stream configuration includes automatic error handling:
- Retries for temporary connection issues
- Transaction rollback on errors
- Detailed error reporting in logs