System Architecture
Overview
DBConvert Streams is a distributed data processing platform designed for database migration and real-time replication. The architecture emphasizes scalability, reliability, and efficient handling of both real-time change data capture (CDC) and bulk data transfers.
General Structure
DBConvert Streams follows a modern client-server architecture consisting of:
Backend Components
The backend is implemented as a set of lightweight Go binaries that provide the core functionality:
1. API Server:
- Exposes RESTful endpoints for stream and connection management
- Handles user authentication and authorization (requires an API key obtained from https://streams.dbconvert.com/account)
- Manages stream configurations and lifecycle
2. Source Reader:
- Connects to and reads from source databases
- Implements both CDC and conversion mode logic
- Publishes data to NATS messaging system
- Reports progress metrics
3. Target Writer:
- Consumes data from NATS
- Writes to target databases
- Handles schema creation and data type mapping
- Can be scaled horizontally by running multiple instances
Each binary is designed to be:
- Lightweight (up to 30MB)
- Configurable through command-line flags and environment variables
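Each binary typically follows the common Go pattern of reading defaults from environment variables and letting command-line flags override them. The sketch below illustrates that pattern only; the flag and variable names (`nats-url`, `NATS_URL`, `log-level`) are placeholders, not the actual options shipped with DBConvert Streams.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// envOrDefault returns the value of an environment variable,
// falling back to def when the variable is unset.
func envOrDefault(key, def string) string {
	if v, ok := os.LookupEnv(key); ok {
		return v
	}
	return def
}

func main() {
	// Hypothetical settings shown only to illustrate the pattern:
	// a command-line flag overrides an environment variable,
	// which overrides a built-in default.
	natsURL := flag.String("nats-url", envOrDefault("NATS_URL", "nats://localhost:4222"),
		"address of the NATS server")
	logLevel := flag.String("log-level", envOrDefault("LOG_LEVEL", "info"),
		"logging verbosity")
	flag.Parse()

	fmt.Printf("starting with NATS at %s, log level %s\n", *natsURL, *logLevel)
}
```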
Frontend Application
The frontend is a web application that provides:
- Intuitive dashboard for stream management
- Real-time monitoring and statistics
- Database connection management
- Stream configuration wizards
- System health monitoring
Technical Characteristics:
- Browser-based access
- Responsive design for different screen sizes
- Live log streaming via WebSocket connections
- Integration with the backend API
Key UI Components:
- Connection management interface
- Stream configuration wizard
- Real-time monitoring dashboard
- Usage statistics and quota tracking
- System status indicators
This architecture allows for flexible deployment options while maintaining a clear separation of concerns between the data processing logic in the backend and the user interface in the frontend.
Core Components
The platform consists of three main components working in concert:
1. API Server
Serves as the central management interface, responsible for:
Authentication and Authorization
- Managing user authentication and authorization through API keys (obtained from https://streams.dbconvert.com/account)
Database Connection Management
- Creating and configuring connections
- Testing connection validity
- Updating connection parameters
- Removing connections
Data Stream Management
- Creating and configuring streams
- Starting and stopping stream operations
- Monitoring stream status
- Tracking transfer progress
Monitoring
- Real-time status information
- Performance metrics
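As a rough illustration of how a client could talk to the API Server, the Go sketch below lists connections over HTTP using an API key. The base URL, endpoint path, header scheme, and environment variable name are assumptions made for the example, not the documented API surface.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
)

func main() {
	// Hypothetical base URL and endpoint path, used purely for illustration;
	// consult the API reference for the real routes.
	apiBase := "http://localhost:8020/api/v1"
	apiKey := os.Getenv("DBS_API_KEY") // placeholder variable name

	req, err := http.NewRequest(http.MethodGet, apiBase+"/connections", nil)
	if err != nil {
		log.Fatal(err)
	}
	// The API key is assumed to be passed as a bearer token here;
	// the actual header or scheme may differ.
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```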
2. Source Reader
- Reads data from source databases using specialized adapters
- Supports two distinct reading modes:
  - CDC Mode: Captures changes from transaction logs (WAL/binlog)
  - Conversion Mode: Performs direct table reads with intelligent chunking
- Manages connection pooling and implements retry logic
- Publishes data to NATS messaging system
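To make the transport step concrete, here is a minimal sketch, using the nats.go client, of how a reader process could publish a batch of rows to a JetStream subject. The subject name and payload shape are invented for illustration and do not reflect DBConvert Streams' internal message format.

```go
package main

import (
	"encoding/json"
	"log"

	"github.com/nats-io/nats.go"
)

// rowBundle is a hypothetical payload shape: a batch of rows read
// from one source table, serialized for transport over NATS.
type rowBundle struct {
	Table string           `json:"table"`
	Rows  []map[string]any `json:"rows"`
}

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	bundle := rowBundle{
		Table: "public.orders",
		Rows:  []map[string]any{{"id": 1, "total": 42.5}},
	}
	payload, _ := json.Marshal(bundle)

	// Subject name is illustrative; the real stream/subject layout
	// is managed internally by DBConvert Streams.
	if _, err := js.Publish("streams.data.orders", payload); err != nil {
		log.Fatal(err)
	}
}
```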
3. Target Writer
- Consumes data from NATS and writes to target databases
- Handles schema creation and automatic type mapping between different databases
- Supports horizontal scaling for improved performance
- Implements transaction management and consistency checks
- Can run multiple instances in parallel for better throughput
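Continuing that sketch, a writer process might consume those bundles through a durable JetStream pull consumer and acknowledge each message only after its database transaction commits; sharing one durable name is what lets several writer instances split the work. Connection strings, subjects, and the durable name are again placeholders.

```go
package main

import (
	"database/sql"
	"encoding/json"
	"log"
	"time"

	_ "github.com/lib/pq" // assumed Postgres target for this sketch
	"github.com/nats-io/nats.go"
)

type rowBundle struct {
	Table string           `json:"table"`
	Rows  []map[string]any `json:"rows"`
}

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	db, err := sql.Open("postgres", "postgres://user:pass@localhost/target?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Durable pull consumer: multiple writer instances can share the
	// same durable name so NATS distributes bundles between them.
	sub, err := js.PullSubscribe("streams.data.>", "target-writer")
	if err != nil {
		log.Fatal(err)
	}

	for {
		msgs, err := sub.Fetch(10, nats.MaxWait(2*time.Second))
		if err != nil {
			continue // timeout: no new bundles yet
		}
		for _, msg := range msgs {
			var bundle rowBundle
			if err := json.Unmarshal(msg.Data, &bundle); err != nil {
				log.Println("bad payload:", err)
				msg.Ack()
				continue
			}
			// Write each bundle inside a transaction and acknowledge
			// the message only after a successful commit.
			tx, err := db.Begin()
			if err != nil {
				log.Fatal(err)
			}
			// INSERT statements for bundle.Rows would be generated and
			// executed on tx here; omitted to keep the sketch short.
			if err := tx.Commit(); err != nil {
				log.Println("commit failed, message will be redelivered:", err)
				continue
			}
			msg.Ack()
		}
	}
}
```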
Infrastructure Components
NATS Integration
- Functions as the backbone for inter-component communication
- Provides reliable message streaming between Source Readers and Target Writers
- Supports persistence and replay of messages for fault tolerance
- Enables horizontal scaling through distributed messaging
- Creates dedicated streams for reliable data transfer operations
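The persistence and replay guarantees come from JetStream's file-backed streams. Below is a minimal sketch of creating such a stream with the nats.go client, assuming an invented stream name and subject filter.

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// A file-backed JetStream stream keeps published messages on disk,
	// so a writer that restarts can replay anything it has not yet
	// acknowledged. Name and subjects are illustrative only.
	_, err = js.AddStream(&nats.StreamConfig{
		Name:     "STREAMS_DATA",
		Subjects: []string{"streams.data.>"},
		Storage:  nats.FileStorage,
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("stream ready")
}
```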
HashiCorp Vault Integration
- Securely stores and manages sensitive information:
  - Database credentials
  - SSL certificates
  - Connection parameters
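For illustration, reading a stored credential with the official vault/api Go client might look like the sketch below; the KV path and field names are assumptions, since DBConvert Streams manages its own Vault layout internally.

```go
package main

import (
	"fmt"
	"log"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	cfg := vault.DefaultConfig() // honours VAULT_ADDR and related env vars
	client, err := vault.NewClient(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical KV v2 path for a stored database credential.
	secret, err := client.Logical().Read("secret/data/connections/source-mysql")
	if err != nil {
		log.Fatal(err)
	}
	if secret == nil {
		log.Fatal("secret not found")
	}

	// KV v2 wraps the payload in a nested "data" field.
	data, ok := secret.Data["data"].(map[string]interface{})
	if !ok {
		log.Fatal("unexpected secret format")
	}
	fmt.Println("username:", data["username"])
}
```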
HashiCorp Consul Integration
- Handles service discovery and registration
- Provides health checking and monitoring
- Manages distributed configuration
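As a sketch of what registration with a health check looks like through the consul/api Go client; the service name, port, and health endpoint below are invented for the example.

```go
package main

import (
	"log"

	consul "github.com/hashicorp/consul/api"
)

func main() {
	client, err := consul.NewClient(consul.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Illustrative registration of a component with an HTTP health check;
	// IDs, names, ports, and check endpoints are placeholders.
	reg := &consul.AgentServiceRegistration{
		ID:   "target-writer-1",
		Name: "target-writer",
		Port: 8081,
		Check: &consul.AgentServiceCheck{
			HTTP:     "http://localhost:8081/health",
			Interval: "10s",
			Timeout:  "2s",
		},
	}
	if err := client.Agent().ServiceRegister(reg); err != nil {
		log.Fatal(err)
	}
	log.Println("registered with Consul")
}
```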
Data Flow Architecture
1. Initialization Phase
- Source Reader retrieves meta-structures (tables and indexes) from the source database
- Meta-structures are published to NATS so corresponding structures can be created on the target (an illustrative message shape is sketched after this list)
- Target Writer creates those structures on the target database
2. Data Transfer Phase
- Source Reader retrieves data in configurable bundles
- Data is published to NATS streams
- Target Writers consume and write data in parallel
- Progress metrics from the Source Reader and Target Writers are collected and reported
3. Monitoring and Control
- Real-time progress tracking and metrics collection
- Support for horizontal scaling of Target Writers
- Reporting of stream status and progress
- Pause and resume operations
- Comprehensive logging
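To picture what travels over NATS when structures are created, a meta-structure message might look roughly like the sketch below. The field names and JSON layout are invented for illustration and do not describe the actual wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tableMeta is a hypothetical shape for the meta-structure message sent
// during the initialization phase, carrying enough information for the
// Target Writer to create the table before any data arrives.
type tableMeta struct {
	Schema  string   `json:"schema"`
	Table   string   `json:"table"`
	Columns []column `json:"columns"`
	Indexes []string `json:"indexes"`
}

type column struct {
	Name     string `json:"name"`
	Type     string `json:"type"`
	Nullable bool   `json:"nullable"`
}

func main() {
	meta := tableMeta{
		Schema: "public",
		Table:  "orders",
		Columns: []column{
			{Name: "id", Type: "bigint", Nullable: false},
			{Name: "total", Type: "numeric(10,2)", Nullable: true},
		},
		Indexes: []string{"orders_pkey"},
	}
	payload, _ := json.MarshalIndent(meta, "", "  ")
	fmt.Println(string(payload))
}
```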
Data Flow Diagram
Security Architecture
Authentication and Authorization
- API key validation for service access
- SSL/TLS encryption for data in transit
Database Security
- Secure credential management through Vault
- Support for SSL/TLS database connections
- Client certificate management
- Encrypted storage of sensitive configuration
Deployment Options
Primary Deployment Methods
Docker container deployment (recommended)
- Complete docker-compose configurations
- Infrastructure included (NATS, Consul, Vault)
- Simplified management and updates
Linux binary deployment
- Supported on AMD64 and ARM64
- Systemd service management
- Direct host installation
Cloud Platform Deployment
DBConvert Streams can run on any cloud platform that supports Docker containers or Linux servers:
- Amazon EC2
- Google Cloud Platform
- Microsoft Azure
- DigitalOcean
- Any platform supporting Docker/Linux
On-premises Installation
- Docker-based deployment
- Binary installation on Linux servers
- Full infrastructure stack included
Performance Features
Optimization Techniques
- Configurable data bundle sizes (10-1000 records)
- Parallel processing of insert operations
- Intelligent chunking for large tables
- Connection pooling and resource management
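One way to picture the chunked reads is keyset pagination: each query pulls the next fixed-size bundle ordered by primary key, as in the sketch below. The table, columns, and connection string are made up for the example, and the real Source Reader's chunking strategy may differ.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // assumed Postgres source for this sketch
)

// readInBundles demonstrates keyset pagination: each query fetches the next
// bundle of rows ordered by primary key, so large tables are read in small,
// predictable chunks instead of one huge result set.
func readInBundles(db *sql.DB, bundleSize int) error {
	lastID := int64(0)
	for {
		rows, err := db.Query(
			`SELECT id, total FROM orders WHERE id > $1 ORDER BY id LIMIT $2`,
			lastID, bundleSize)
		if err != nil {
			return err
		}

		count := 0
		for rows.Next() {
			var id int64
			var total float64
			if err := rows.Scan(&id, &total); err != nil {
				rows.Close()
				return err
			}
			lastID = id
			count++
		}
		rows.Close()
		if err := rows.Err(); err != nil {
			return err
		}

		fmt.Printf("read bundle of %d rows (last id %d)\n", count, lastID)
		if count < bundleSize {
			return nil // reached the end of the table
		}
	}
}

func main() {
	db, err := sql.Open("postgres", "postgres://user:pass@localhost/source?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// A bundle size of 500 stays within the documented 10-1000 record range.
	if err := readInBundles(db, 500); err != nil {
		log.Fatal(err)
	}
}
```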
Limitations and Constraints
- MySQL CDC requires binary logging enabled
- PostgreSQL CDC requires logical replication configuration
- Specific database version requirements:
  - PostgreSQL 10 or higher
  - MySQL 8.0 or higher
- API key required for all operations (obtain one at https://streams.dbconvert.com/account)
- Proper database permissions needed for CDC/replication
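To check the CDC prerequisites on your own servers, a sketch along these lines queries each source for the relevant setting; the connection strings are placeholders.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
	_ "github.com/lib/pq"
)

// checkMySQLBinlog verifies that binary logging is enabled on a MySQL source,
// which is a prerequisite for CDC mode.
func checkMySQLBinlog(dsn string) error {
	db, err := sql.Open("mysql", dsn)
	if err != nil {
		return err
	}
	defer db.Close()

	var name, value string
	if err := db.QueryRow("SHOW VARIABLES LIKE 'log_bin'").Scan(&name, &value); err != nil {
		return err
	}
	if value != "ON" {
		return fmt.Errorf("binary logging is disabled (log_bin=%s)", value)
	}
	return nil
}

// checkPostgresWAL verifies that wal_level is set to 'logical' on a
// PostgreSQL source, which is required for logical replication / CDC.
func checkPostgresWAL(dsn string) error {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return err
	}
	defer db.Close()

	var level string
	if err := db.QueryRow("SHOW wal_level").Scan(&level); err != nil {
		return err
	}
	if level != "logical" {
		return fmt.Errorf("wal_level is %q, expected 'logical'", level)
	}
	return nil
}

func main() {
	// Connection strings are placeholders for the example.
	if err := checkMySQLBinlog("user:pass@tcp(localhost:3306)/source"); err != nil {
		log.Println("MySQL not ready for CDC:", err)
	}
	if err := checkPostgresWAL("postgres://user:pass@localhost/source?sslmode=disable"); err != nil {
		log.Println("PostgreSQL not ready for CDC:", err)
	}
}
```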
This architecture provides the foundation for DBConvert Streams' reliable data transfer capabilities while maintaining flexibility for different deployment scenarios and use cases.