System Architecture

Overview

DBConvert Streams is a distributed data processing platform designed for database migration and real-time replication. The architecture emphasizes scalability, reliability, and efficient handling of both real-time change data capture (CDC) and bulk data transfers.

General Structure

DBConvert Streams follows a modern client-server architecture consisting of:

Backend Components

The backend is implemented as a set of lightweight Go binaries that provide the core functionality:

1. API Server:

  • Exposes RESTful endpoints for stream and connection management
  • Handles user authentication and authorization (requires an API key, obtained at https://streams.dbconvert.com/account)
  • Manages stream configurations and lifecycle

2. Source Reader:

  • Connects to and reads from source databases
  • Implements both CDC and conversion mode logic
  • Publishes data to NATS messaging system
  • Reports progress metrics

3. Target Writer:

  • Consumes data from NATS
  • Writes to target databases
  • Handles schema creation and data type mapping
  • Can be scaled horizontally by running multiple instances

Each binary is designed to be:

  • Lightweight (under 30 MB each)
  • Configurable through command-line flags and environment variables
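The flag-plus-environment configuration style can be sketched as below; the flag name `-nats` and the `NATS_ADDR` variable are hypothetical examples, not the binaries' documented options:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// resolve returns the command-line flag value if set, otherwise falls back
// to an environment variable, otherwise to a built-in default.
// (Illustrative only — the real binaries define their own names.)
func resolve(flagVal, envKey, def string) string {
	if flagVal != "" {
		return flagVal
	}
	if v := os.Getenv(envKey); v != "" {
		return v
	}
	return def
}

func main() {
	natsAddr := flag.String("nats", "", "NATS server address")
	flag.Parse()
	addr := resolve(*natsAddr, "NATS_ADDR", "nats://127.0.0.1:4222")
	fmt.Println("connecting to", addr)
}
```

Giving flags precedence over environment variables is a common Go convention and matches how containerized and bare-metal deployments typically override defaults.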

Frontend Application

The frontend is a web application that provides:

  • Intuitive dashboard for stream management
  • Real-time monitoring and statistics
  • Database connection management
  • Stream configuration wizards
  • System health monitoring

Technical Characteristics:

  • Browser-based access
  • Responsive design for different screen sizes
  • Live log streaming via WebSocket connections
  • Integration with the backend API

Key UI Components:

  • Connection management interface
  • Stream configuration wizard
  • Real-time monitoring dashboard
  • Usage statistics and quota tracking
  • System status indicators

This architecture allows for flexible deployment options while maintaining a clear separation of concerns between the data processing logic in the backend and the user interface in the frontend.

Core Components

The platform consists of three main components working in concert:

1. API Server

Serves as the central management interface by:

Authentication and Authorization

Managing user authentication and authorization through API keys (obtained from https://streams.dbconvert.com/account)

Database Connection Management

  • Creating and configuring connections
  • Testing connection validity
  • Updating connection parameters
  • Removing connections

Data Stream Management

  • Creating and configuring streams
  • Starting and stopping stream operations
  • Monitoring stream status
  • Tracking transfer progress

Monitoring

  • Real-time status information
  • Performance metrics
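As a sketch of how a client might talk to the API Server, the snippet below builds an authenticated HTTP request. The base URL, the `/api/v1/streams` path, and the `X-API-Key` header name are assumptions for illustration, not the documented API surface:

```go
package main

import (
	"fmt"
	"net/http"
)

// newAPIRequest builds a request against the API Server with the API key
// attached. Endpoint path and header name are illustrative assumptions.
func newAPIRequest(base, path, apiKey string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, base+path, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-API-Key", apiKey)
	return req, nil
}

func main() {
	req, err := newAPIRequest("http://localhost:8020", "/api/v1/streams", "my-key")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```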

2. Source Reader

  • Reads data from source databases using specialized adapters
  • Supports two distinct reading modes:
    • CDC Mode: Captures changes from transaction logs (WAL/binlog)
    • Conversion Mode: Performs direct table reads with intelligent chunking
  • Manages connection pooling and implements retry logic
  • Publishes data to NATS messaging system
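The chunked reading in conversion mode can be sketched as a function that splits a result set into fixed-size bundles; this is a simplification (the real reader sizes chunks based on the source tables), and each bundle would then be published to NATS:

```go
package main

import "fmt"

// chunkRows splits a result set into fixed-size bundles — a simplified
// stand-in for conversion-mode chunking.
func chunkRows(rows []string, size int) [][]string {
	if size <= 0 {
		size = 1
	}
	var bundles [][]string
	for len(rows) > size {
		bundles = append(bundles, rows[:size])
		rows = rows[size:]
	}
	if len(rows) > 0 {
		bundles = append(bundles, rows)
	}
	return bundles
}

func main() {
	rows := []string{"r1", "r2", "r3", "r4", "r5"}
	for i, b := range chunkRows(rows, 2) {
		fmt.Printf("bundle %d: %v\n", i, b) // each bundle is one NATS message
	}
}
```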

3. Target Writer

  • Consumes data from NATS and writes to target databases
  • Handles schema creation and automatic type mapping between different databases
  • Supports horizontal scaling for improved performance
  • Implements transaction management and consistency checks
  • Can run multiple instances in parallel for better throughput
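The automatic type mapping mentioned above can be illustrated with a tiny lookup function; the MySQL-to-PostgreSQL pairs below are a small illustrative subset, not the product's actual mapping table:

```go
package main

import "fmt"

// mapColumnType sketches MySQL-to-PostgreSQL type mapping.
// The entries are illustrative, not exhaustive.
func mapColumnType(mysqlType string) string {
	m := map[string]string{
		"TINYINT":  "SMALLINT",
		"DATETIME": "TIMESTAMP",
		"DOUBLE":   "DOUBLE PRECISION",
	}
	if pg, ok := m[mysqlType]; ok {
		return pg
	}
	return mysqlType // pass through types that match directly
}

func main() {
	fmt.Println(mapColumnType("DATETIME")) // TIMESTAMP
}
```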

Infrastructure Components

NATS Integration

  • Functions as the backbone for inter-component communication
  • Provides reliable message streaming between Source Readers and Target Writers
  • Supports persistence and replay of messages for fault tolerance
  • Enables horizontal scaling through distributed messaging
  • Creates dedicated streams for reliable data transfer operations

HashiCorp Vault Integration

  • Securely stores and manages sensitive information:
    • Database credentials
    • SSL certificates
    • Connection parameters
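Reading a secret over Vault's HTTP API looks roughly like the sketch below. `X-Vault-Token` is Vault's standard auth header and `/v1/secret/data/...` is the KV v2 read path; the specific secret path used here is a hypothetical example:

```go
package main

import (
	"fmt"
	"net/http"
)

// vaultReadRequest builds a KV v2 read request against Vault's HTTP API.
// The secret path is a hypothetical example.
func vaultReadRequest(vaultAddr, secretPath, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, vaultAddr+"/v1/"+secretPath, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-Vault-Token", token)
	return req, nil
}

func main() {
	req, _ := vaultReadRequest("http://127.0.0.1:8200", "secret/data/connections/mysql-prod", "dev-token")
	fmt.Println(req.URL.String())
}
```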

HashiCorp Consul Integration

  • Handles service discovery and registration
  • Provides health checking and monitoring
  • Manages distributed configuration
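Service registration with a health check uses Consul's `PUT /v1/agent/service/register` endpoint; the sketch below builds such a body (the service name and check URL are hypothetical examples):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Registration mirrors the JSON body accepted by Consul's
// /v1/agent/service/register endpoint (a minimal subset of its fields).
type Registration struct {
	Name  string `json:"Name"`
	Port  int    `json:"Port"`
	Check struct {
		HTTP     string `json:"HTTP"`
		Interval string `json:"Interval"`
	} `json:"Check"`
}

// registrationBody returns the JSON payload for registering a service
// with an HTTP health check polled every 10 seconds.
func registrationBody(name string, port int, checkURL string) string {
	r := Registration{Name: name, Port: port}
	r.Check.HTTP = checkURL
	r.Check.Interval = "10s"
	b, _ := json.Marshal(r)
	return string(b)
}

func main() {
	// Hypothetical registration for a Source Reader instance.
	fmt.Println(registrationBody("source-reader", 8021, "http://127.0.0.1:8021/health"))
}
```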

Data Flow Architecture

1. Initialization Phase

  1. Reader retrieves meta-structures (tables/indexes) from source database
  2. Meta-structures are sent to NATS for creating corresponding structures on target
  3. Target Writer creates structures on target database
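Step 3 can be pictured as turning a received meta-structure into DDL on the target. The sketch below handles only columns; the real writer also deals with indexes, constraints, and dialect differences:

```go
package main

import (
	"fmt"
	"strings"
)

// Column is a minimal stand-in for one entry of a table meta-structure.
type Column struct {
	Name string
	Type string
}

// createTableDDL renders a CREATE TABLE statement from a meta-structure
// (simplified sketch).
func createTableDDL(table string, cols []Column) string {
	defs := make([]string, len(cols))
	for i, c := range cols {
		defs[i] = c.Name + " " + c.Type
	}
	return fmt.Sprintf("CREATE TABLE %s (%s)", table, strings.Join(defs, ", "))
}

func main() {
	fmt.Println(createTableDDL("orders", []Column{{"id", "BIGINT"}, {"total", "NUMERIC(10,2)"}}))
}
```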

2. Data Transfer Phase

  1. Source Reader retrieves data in configurable bundles
  2. Data is published to NATS streams
  3. Target Writers consume and write data in parallel
  4. Progress metrics from Reader and Writers are collected and reported
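Step 3 above — writers consuming in parallel — can be sketched with goroutines draining a shared channel of bundles, the way multiple Target Writer instances drain a NATS stream:

```go
package main

import (
	"fmt"
	"sync"
)

// writeInParallel starts the given number of writer goroutines, each
// consuming bundles from a shared channel, and returns the total row
// count "written" (a stand-in for writing to the target database).
func writeInParallel(bundles <-chan []string, writers int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	total := 0
	for w := 0; w < writers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for b := range bundles {
				mu.Lock()
				total += len(b)
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

func main() {
	ch := make(chan []string, 3)
	ch <- []string{"r1", "r2"}
	ch <- []string{"r3"}
	ch <- []string{"r4", "r5"}
	close(ch)
	fmt.Println("rows written:", writeInParallel(ch, 2)) // rows written: 5
}
```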

3. Monitoring and Control

  • Real-time progress tracking and metrics collection
  • Support for horizontal scaling of Target Writers
  • Reporting of stream status and progress
  • Pause and resume operations
  • Comprehensive logging

Data Flow Diagram

Security Architecture

Authentication and Authorization

  • API key validation for service access
  • SSL/TLS encryption for data in transit

Database Security

  • Secure credential management through Vault
  • Support for SSL/TLS database connections
  • Client certificate management
  • Encrypted storage of sensitive configuration

Deployment Options

Primary Deployment Methods

  • Docker container deployment (recommended)

    • Complete docker-compose configurations
    • Infrastructure included (NATS, Consul, Vault)
    • Simplified management and updates
  • Linux binary deployment

    • Supported on AMD64 and ARM64
    • Systemd service management
    • Direct host installation

Cloud Platform Deployment

DBConvert Streams can run on any cloud platform that supports Docker containers or Linux servers:

  • Amazon EC2
  • Google Cloud Platform
  • Microsoft Azure
  • DigitalOcean
  • Any platform supporting Docker/Linux

On-premises Installation

  • Docker-based deployment
  • Binary installation on Linux servers
  • Full infrastructure stack included

Limitations and Constraints

  • MySQL CDC requires binary logging enabled
  • PostgreSQL CDC requires logical replication configuration
  • Specific database version requirements:
    • PostgreSQL 10 or higher
    • MySQL 8.0 or higher
  • API key required for all operations (obtain an API key at https://streams.dbconvert.com/account)
  • Proper database permissions needed for CDC/replication
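The version requirements above can be checked with a small helper before creating a connection; this is a sketch of the check itself, not part of the product:

```go
package main

import "fmt"

// meetsMinimum reports whether a (major, minor) server version satisfies
// the documented minimums: PostgreSQL 10+ and MySQL 8.0+.
func meetsMinimum(db string, major, minor int) bool {
	switch db {
	case "postgres":
		return major >= 10
	case "mysql":
		return major > 8 || (major == 8 && minor >= 0)
	}
	return false
}

func main() {
	fmt.Println(meetsMinimum("postgres", 9, 6)) // false
	fmt.Println(meetsMinimum("mysql", 8, 0))    // true
}
```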

This architecture provides the foundation for DBConvert Streams' reliable data transfer capabilities while maintaining flexibility for different deployment scenarios and use cases.
