Stream Configuration Guide

Overview

A stream in DBConvert Streams represents a configured data pipeline that moves data between source and target databases. This guide covers all configuration options and settings available for stream setup.

Basic Configuration

Here's a complete example of a stream configuration:

```json
{
    "id": "config_2sZyLN37tM65K585HTeph462N58",
    "name": "MySQL_to_PostgreSQL_stream",
    "source": "conn_2sZyDMeIbdaXnN9OL7vYzzAiZU5",
    "target": "conn_2sZyBc83hypbiTMrNTotKPyjwFC",
    "mode": "convert",
    "dataBundleSize": 500,
    "reportingIntervals": {
        "source": 3,
        "target": 3
    },
    "createStructure": true,
    "tables": [
        {
            "name": "users",
            "query": "SELECT * FROM users WHERE active = true"
        }
    ],
    "limits": {
        "numberOfEvents": 1000000,
        "elapsedTime": 3600
    }
}
```

Core Settings

| Setting | Type | Description | Default |
|---------|------|-------------|---------|
| id | string | Stream identifier (auto-generated) | auto |
| name | string | Stream name | required |
| source | string | Source connection ID | required |
| target | string | Target connection ID | required |
| mode | string | "convert" or "cdc" | required |
| createStructure | boolean | Create tables in target | true |
| dataBundleSize | number | Records per batch (10-1000) | 500 |

Mode Selection

Convert Mode

  • One-time data transfer
  • Ideal for migrations or initial loads
  • Supports custom SQL queries
  • Full table structure creation

CDC Mode (Change Data Capture)

  • Real-time change replication
  • Captures INSERT, UPDATE, DELETE operations
  • Continuous synchronization
  • Requires specific database configuration (see the example below)
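
For comparison with the convert-mode example above, here is a minimal CDC-mode configuration. It is a sketch built only from fields documented in this guide; the connection IDs and table name are placeholders.

```json
{
    "name": "MySQL_to_PostgreSQL_cdc",
    "source": "conn_source_placeholder",
    "target": "conn_target_placeholder",
    "mode": "cdc",
    "createStructure": true,
    "tables": [
        {
            "name": "orders",
            "operations": ["insert", "update", "delete"]
        }
    ]
}
```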

Table Configuration

Basic Table Selection

```json
{
    "tables": [
        {
            "name": "users"
        },
        {
            "name": "orders"
        }
    ]
}
```

Table Parameters

| Parameter | Mode | Description | Example |
|-----------|------|-------------|---------|
| name | Both | Name of the table to replicate | "users" |
| query | Convert | Custom SQL query to filter or transform data during conversion. Only used in convert mode. | "SELECT * FROM users WHERE active = true" |
| operations | CDC | Array of operations to track in CDC mode. Possible values: ["insert"], ["update"], ["delete"], or any combination. | ["insert", "update"] |

Query Parameter (Convert Mode)

The query parameter allows you to:

  • Filter source data using WHERE clauses
  • Transform data using SQL functions
  • Join multiple tables
  • Order results using ORDER BY
  • Limit result set size using LIMIT

Example queries:

```json
{
    "tables": [
        {
            "name": "users",
            "query": "SELECT * FROM users WHERE status = 'active'"
        },
        {
            "name": "orders",
            "query": "SELECT o.*, c.name as customer_name FROM orders o JOIN customers c ON o.customer_id = c.id"
        }
    ]
}
```

Operations Parameter (CDC Mode)

The operations parameter specifies which database operations to track in CDC mode:

  • insert: Track new record insertions
  • update: Track record modifications
  • delete: Track record deletions

Example configurations:

```json
{
    "tables": [
        {
            "name": "audit_log",
            "operations": ["insert"]  // Only track new records
        },
        {
            "name": "inventory",
            "operations": ["update", "delete"]  // Track modifications and deletions
        },
        {
            "name": "orders",
            "operations": ["insert", "update", "delete"]  // Track all changes
        }
    ]
}
```

Best Practices

  • In Convert mode, use the query parameter to minimize data transfer by filtering at the source
  • In CDC mode, only track necessary operations to reduce overhead
  • Consider table usage patterns when selecting operations to track

Performance Settings

Data Bundle Size

```json
{
    "dataBundleSize": 500  // Default value
}
```

  • Larger values (closer to 1000): Better for simple tables
  • Smaller values: Better for complex tables or binary data (see the sketch below)
  • Adjust based on:
    • Table complexity
    • Record size
    • Available memory
    • Network capacity
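
As an illustration of these trade-offs, the sketch below lowers the bundle size for a table that carries large binary columns; the table name and value are illustrative, not recommendations.

```json
{
    "dataBundleSize": 100,  // smaller batches for wide rows with binary data
    "tables": [
        {
            "name": "document_attachments"  // placeholder: a table with BLOB columns
        }
    ]
}
```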

Progress Reporting

```json
{
    "reportingIntervals": {
        "source": 3,  // Report source progress every 3 seconds
        "target": 3   // Report target progress every 3 seconds
    }
}
```

Stream Limits

```json
{
    "limits": {
        "numberOfEvents": 1000000,    // Stop after processing 1M events
        "elapsedTime": 3600          // Stop after 1 hour (in seconds)
    }
}
```

The limits configuration allows you to set boundaries for stream execution:

  • numberOfEvents: Maximum number of events to process before stopping
  • elapsedTime: Maximum duration in seconds before stopping the stream

Note

If the "tables" field is omitted in the stream configuration, all tables from the source database will be converted.
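
For example, a configuration without a "tables" field copies every table from the source; this sketch uses placeholder connection IDs.

```json
{
    "name": "full_database_copy",
    "source": "conn_source_placeholder",
    "target": "conn_target_placeholder",
    "mode": "convert",
    "createStructure": true
}
```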

Stream Control Operations

Available through the API or the Dashboard:

  • Start: Begin data transfer
  • Pause: Temporarily halt transfer
  • Resume: Continue paused transfer
  • Stop: End transfer process

Best Practices

Initial Setup

  1. Start with default settings
  2. Test with a small data subset (see the sketch below)
  3. Gradually optimize based on results
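
One way to run such a test, using only options covered in this guide, is to sample the source table and cap the run with limits; the values below are illustrative placeholders.

```json
{
    "mode": "convert",
    "dataBundleSize": 500,          // keep the default for the first run
    "tables": [
        {
            "name": "users",
            "query": "SELECT * FROM users LIMIT 1000"  // small sample of the table
        }
    ],
    "limits": {
        "numberOfEvents": 10000,    // stop early while testing
        "elapsedTime": 600          // 10-minute safety cap (seconds)
    }
}
```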

Performance Optimization

  1. Bundle Size:

    • Start with default (500)
    • Monitor system resources
    • Adjust based on performance
  2. Progress Reporting:

    • More frequent: Better visibility
    • Less frequent: Lower overhead
    • Balance based on needs
  3. Table Structure:

    • Use createStructure: true for initial setup
    • Consider noCreateIndexes for large initial loads (see the sketch after this list)
    • Create indexes after data load for better performance
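
The exact placement of the noCreateIndexes option is not shown in this guide; the sketch below assumes it is a top-level boolean alongside createStructure. Check the settings reference for the authoritative shape.

```json
{
    "mode": "convert",
    "createStructure": true,
    "noCreateIndexes": true,  // assumption: skip index creation during the initial load
    "tables": [
        {
            "name": "large_fact_table"  // placeholder: a large table loaded in bulk
        }
    ]
}
```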

Error Handling

Streams include automatic error handling:

  • Retries for temporary connection issues
  • Transaction rollback on errors
  • Detailed error reporting in logs
