Design Principles
VMProber is built on five design principles that guide its architectural decisions.
1. Reliability First
WAL System
- All probe results are written to a Write-Ahead Log (WAL) before processing
- Ensures no data loss on crashes or restarts
- Supports compression and rotation for efficiency
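As a rough illustration of the write-before-process idea, the sketch below appends each record to a file and syncs it before returning, then replays intact records after a restart. The `WAL` type, the `[length][CRC32][payload]` record layout, and the file name are assumptions made for this example, not VMProber's actual on-disk format. Compression and rotation are left out.

```go
package main

import (
	"bufio"
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
	"io"
	"os"
)

// WAL is a hypothetical append-only log. Each record is stored as
// [4-byte length][4-byte CRC32][payload]; this layout is an assumption.
type WAL struct{ f *os.File }

// OpenWAL opens (or creates) the log file in append mode.
func OpenWAL(path string) (*WAL, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, fmt.Errorf("open wal: %w", err)
	}
	return &WAL{f: f}, nil
}

// Append writes one record and syncs it to disk before returning, so the
// caller starts processing only after the result is durable.
func (w *WAL) Append(payload []byte) error {
	var hdr [8]byte
	binary.LittleEndian.PutUint32(hdr[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(hdr[4:8], crc32.ChecksumIEEE(payload))
	if _, err := w.f.Write(append(hdr[:], payload...)); err != nil {
		return fmt.Errorf("append wal record: %w", err)
	}
	if err := w.f.Sync(); err != nil {
		return fmt.Errorf("sync wal: %w", err)
	}
	return nil
}

// Close closes the underlying file.
func (w *WAL) Close() error { return w.f.Close() }

// Replay returns every intact record in order. It stops quietly at the first
// truncated or corrupt record, which is what a crash mid-write leaves behind.
func Replay(path string) ([][]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open wal for replay: %w", err)
	}
	defer f.Close()

	r := bufio.NewReader(f)
	var records [][]byte
	for {
		var hdr [8]byte
		if _, err := io.ReadFull(r, hdr[:]); err != nil {
			if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
				return records, nil
			}
			return records, fmt.Errorf("read wal header: %w", err)
		}
		payload := make([]byte, binary.LittleEndian.Uint32(hdr[0:4]))
		if _, err := io.ReadFull(r, payload); err != nil {
			return records, nil // treated as a truncated tail
		}
		if crc32.ChecksumIEEE(payload) != binary.LittleEndian.Uint32(hdr[4:8]) {
			return records, nil // corrupt record: stop replay here
		}
		records = append(records, payload)
	}
}

func main() {
	path := "vmprober-example.wal" // hypothetical file name
	defer os.Remove(path)

	w, err := OpenWAL(path)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, rec := range []string{`{"target":"a","ok":true}`, `{"target":"b","ok":false}`} {
		if err := w.Append([]byte(rec)); err != nil {
			fmt.Println(err)
			return
		}
	}
	w.Close()

	records, err := Replay(path)
	fmt.Println("replayed records:", len(records), "error:", err)
}
```

Syncing on every append favors durability over throughput; a real implementation might group several records per sync.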
Retry Logic
- Exponential backoff for failed operations
- Configurable retry attempts and delays
- Circuit breaker pattern to prevent cascading failures
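The retry behaviour described above can be sketched roughly as follows. `RetryConfig`, `backoffDelay`, and `Retry` are hypothetical names chosen for this example; the delay doubles after each failed attempt and is capped at `MaxDelay`. The circuit breaker is not shown.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// RetryConfig is a hypothetical shape for the configurable attempts and delays.
type RetryConfig struct {
	MaxAttempts int
	BaseDelay   time.Duration
	MaxDelay    time.Duration
}

// backoffDelay returns BaseDelay doubled `attempt` times (attempt is 0-based),
// capped at MaxDelay.
func backoffDelay(cfg RetryConfig, attempt int) time.Duration {
	d := cfg.BaseDelay
	for i := 0; i < attempt && d < cfg.MaxDelay; i++ {
		d *= 2
	}
	if d > cfg.MaxDelay {
		return cfg.MaxDelay
	}
	return d
}

// Retry calls op until it succeeds, the attempts are exhausted, or ctx is done.
func Retry(ctx context.Context, cfg RetryConfig, op func() error) error {
	attempts := cfg.MaxAttempts
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for attempt := 0; attempt < attempts; attempt++ {
		if lastErr = op(); lastErr == nil {
			return nil
		}
		if attempt == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-time.After(backoffDelay(cfg, attempt)):
		case <-ctx.Done():
			return fmt.Errorf("retry cancelled: %w", ctx.Err())
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
}

func main() {
	cfg := RetryConfig{MaxAttempts: 4, BaseDelay: 10 * time.Millisecond, MaxDelay: 50 * time.Millisecond}
	calls := 0
	err := Retry(context.Background(), cfg, func() error {
		calls++
		if calls < 3 {
			return errors.New("transient failure")
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```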
Graceful Shutdown
- Proper cleanup of resources
- Finishes in-flight operations
- Prioritized shutdown order
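One possible shape for a prioritized shutdown is sketched below. The `Component` type, the priority values, and the order shown (scheduler, then workers, then WAL) are assumptions for illustration; the source does not specify the actual order.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"time"
)

// Component is a hypothetical unit that can be stopped. Lower Priority
// values are stopped first.
type Component struct {
	Name     string
	Priority int
	Stop     func(ctx context.Context) error
}

// shutdownOrder returns a copy of cs sorted by ascending priority.
func shutdownOrder(cs []Component) []Component {
	ordered := append([]Component(nil), cs...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Priority < ordered[j].Priority
	})
	return ordered
}

// Shutdown stops every component in priority order. It keeps going after a
// failure so later components still get a chance to clean up.
func Shutdown(ctx context.Context, cs []Component) error {
	var errs []error
	for _, c := range shutdownOrder(cs) {
		if err := c.Stop(ctx); err != nil {
			errs = append(errs, fmt.Errorf("stop %s: %w", c.Name, err))
		}
	}
	return errors.Join(errs...)
}

func main() {
	stop := func(name string) func(context.Context) error {
		return func(ctx context.Context) error {
			fmt.Println("stopping", name)
			return ctx.Err()
		}
	}
	components := []Component{
		{Name: "wal", Priority: 30, Stop: stop("wal")},
		{Name: "scheduler", Priority: 10, Stop: stop("scheduler")},
		{Name: "workers", Priority: 20, Stop: stop("workers")},
	}

	// In a service this would typically run after a termination signal;
	// here the deadline alone bounds how long shutdown may take.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := Shutdown(ctx, components); err != nil {
		fmt.Println("shutdown finished with errors:", err)
	}
}
```

Stopping the producers of work first lets in-flight operations drain before the components they write to are closed.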
2. Performance
Efficient Scheduling
- Jitter to distribute load
- Rate limiting per host and globally
- Worker pool with bounded concurrency
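A worker pool with bounded concurrency could look roughly like this. `RunBounded` is a hypothetical helper, and the probe function is a stand-in; jitter and rate limiting are omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// RunBounded calls fn once per target using at most `workers` goroutines and
// returns the results in the same order as the input.
func RunBounded(targets []string, workers int, fn func(string) string) []string {
	if workers < 1 {
		workers = 1
	}
	results := make([]string, len(targets))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is delivered to exactly one worker, so writes to
			// results[i] never race with each other.
			for i := range jobs {
				results[i] = fn(targets[i])
			}
		}()
	}

	for i := range targets {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	targets := []string{"host-a", "host-b", "host-c", "host-d"}
	results := RunBounded(targets, 2, func(t string) string {
		return "probed " + t // stand-in for a real probe
	})
	fmt.Println(results)
}
```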
Batch Processing
- Batch metrics export for efficiency
- Buffer management for optimal throughput
- Connection pooling for HTTP clients
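Batching of metrics before export might be sketched as below. `Metric` and `Batcher` are hypothetical types; the sketch is meant for use from a single goroutine and flushes only on size, whereas a real exporter would probably also flush on a timer.

```go
package main

import "fmt"

// Metric is a hypothetical metric sample.
type Metric struct {
	Name  string
	Value float64
}

// Batcher buffers metrics and hands them to export in groups of maxSize.
// It is not safe for concurrent use.
type Batcher struct {
	maxSize int
	buf     []Metric
	export  func([]Metric)
}

// NewBatcher creates a Batcher that exports once maxSize metrics are buffered.
func NewBatcher(maxSize int, export func([]Metric)) *Batcher {
	if maxSize < 1 {
		maxSize = 1
	}
	return &Batcher{maxSize: maxSize, buf: make([]Metric, 0, maxSize), export: export}
}

// Add buffers one metric and flushes when the buffer is full.
func (b *Batcher) Add(m Metric) {
	b.buf = append(b.buf, m)
	if len(b.buf) >= b.maxSize {
		b.Flush()
	}
}

// Flush exports whatever is buffered. The batch is copied so the exporter can
// keep it while the internal buffer is reused.
func (b *Batcher) Flush() {
	if len(b.buf) == 0 {
		return
	}
	batch := make([]Metric, len(b.buf))
	copy(batch, b.buf)
	b.buf = b.buf[:0]
	b.export(batch)
}

func main() {
	b := NewBatcher(3, func(batch []Metric) {
		fmt.Println("exporting", len(batch), "metrics")
	})
	for i := 0; i < 7; i++ {
		b.Add(Metric{Name: "probe_latency_seconds", Value: float64(i)})
	}
	b.Flush() // export the partial final batch
}
```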
Resource Management
- Memory-efficient data structures
- Proper goroutine lifecycle management
- CPU usage monitoring and limits
3. Observability
Comprehensive Metrics
- Prometheus-compatible metrics
- Detailed probe statistics
- System resource metrics
Structured Logging
- JSON format for machine parsing
- Contextual information in logs
- Configurable log levels
Health Checks
- Health endpoint for liveness
- Readiness endpoint that signals when startup has completed
- Component-level health status
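The three kinds of health reporting could be wired together along these lines. The `/healthz` and `/readyz` paths, the `Health` type, and the rule that readiness requires every component to be healthy are all assumptions for this sketch; VMProber's real endpoints may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// Health is a hypothetical tracker for readiness and per-component status.
type Health struct {
	mu         sync.RWMutex
	ready      bool
	components map[string]string // component name -> problem; "" means healthy
}

func NewHealth() *Health { return &Health{components: make(map[string]string)} }

// SetReady marks whether startup has completed.
func (h *Health) SetReady(ready bool) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.ready = ready
}

// SetComponent records a component's status; an empty problem means healthy.
func (h *Health) SetComponent(name, problem string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.components[name] = problem
}

// Snapshot reports overall readiness and a copy of the component statuses.
func (h *Health) Snapshot() (bool, map[string]string) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	statuses := make(map[string]string, len(h.components))
	allOK := true
	for name, problem := range h.components {
		if problem == "" {
			statuses[name] = "ok"
		} else {
			statuses[name] = problem
			allOK = false
		}
	}
	return h.ready && allOK, statuses
}

// Handler serves liveness on /healthz and readiness on /readyz.
func (h *Health) Handler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK) // the process is up and serving
	})
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		ready, statuses := h.Snapshot()
		w.Header().Set("Content-Type", "application/json")
		if !ready {
			w.WriteHeader(http.StatusServiceUnavailable)
		}
		json.NewEncoder(w).Encode(statuses)
	})
	return mux
}

func main() {
	h := NewHealth()
	h.SetReady(true)
	h.SetComponent("wal", "")
	h.SetComponent("scheduler", "not started")

	// Exercise the handler in-process instead of opening a listening socket.
	rec := httptest.NewRecorder()
	h.Handler().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```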
4. Extensibility
Pluggable Architecture
- Interface-based design
- Easy to add new probe types
- Configurable components
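To illustrate how an interface-based design makes new probe types easy to add, the sketch below defines a probe interface and a small registry of factories. `Prober`, `Result`, `Register`, `NewProber`, and `noopProber` are hypothetical names; VMProber's actual interfaces are not shown in this document.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Result is a hypothetical outcome of a single probe.
type Result struct {
	Success bool
	Latency time.Duration
}

// Prober is a hypothetical interface that every probe type would implement.
type Prober interface {
	Name() string
	Probe(ctx context.Context, target string) (Result, error)
}

// Factory builds a new Prober instance.
type Factory func() Prober

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register makes a probe type available under name.
func Register(name string, f Factory) error {
	mu.Lock()
	defer mu.Unlock()
	if _, exists := factories[name]; exists {
		return fmt.Errorf("prober %q already registered", name)
	}
	factories[name] = f
	return nil
}

// NewProber looks up a registered probe type and instantiates it.
func NewProber(name string) (Prober, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown prober type %q", name)
	}
	return f(), nil
}

// noopProber is a placeholder probe type that always succeeds.
type noopProber struct{}

func (noopProber) Name() string { return "noop" }

func (noopProber) Probe(ctx context.Context, target string) (Result, error) {
	if err := ctx.Err(); err != nil {
		return Result{}, err
	}
	return Result{Success: true}, nil
}

func main() {
	if err := Register("noop", func() Prober { return noopProber{} }); err != nil {
		fmt.Println(err)
		return
	}
	p, err := NewProber("noop")
	if err != nil {
		fmt.Println(err)
		return
	}
	res, err := p.Probe(context.Background(), "example.internal:80")
	fmt.Println(p.Name(), res.Success, err)
}
```

With this shape, adding a probe type means implementing two methods and registering a factory; the scheduler and workers depend only on the interface.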
Hot Reload
- Configuration reload without restart
- Dynamic target discovery
- Runtime configuration updates
Modular Design
- Clear separation of concerns
- Minimal dependencies between modules
- Well-defined interfaces
5. Production Readiness
Security
- TLS support for HTTP server
- Authentication for push endpoints
- Input validation and sanitization
Scalability
- Horizontal scaling support
- Efficient resource usage
- Performance under load
Maintainability
- Clear code structure
- Comprehensive documentation
- Testing at all levels
Implementation Patterns
Interface-Based Design
All major components use interfaces for flexibility and testability.
Context Propagation
A context is passed through every call path to carry cancellation signals and timeouts.
Error Handling
Errors are handled consistently: each layer wraps the underlying error and adds context about the failed operation.
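A minimal sketch of error wrapping with context, using the standard `errors` and `fmt` packages. `ProbeError`, `ErrTargetUnreachable`, and the helper functions are hypothetical names for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTargetUnreachable is a hypothetical sentinel error callers can test for.
var ErrTargetUnreachable = errors.New("target unreachable")

// ProbeError is a hypothetical error type carrying the failed operation and target.
type ProbeError struct {
	Op     string
	Target string
	Err    error
}

func (e *ProbeError) Error() string { return fmt.Sprintf("%s %s: %v", e.Op, e.Target, e.Err) }

// Unwrap lets errors.Is and errors.As see the underlying cause.
func (e *ProbeError) Unwrap() error { return e.Err }

// probe simulates a failing low-level operation.
func probe(target string) error {
	return &ProbeError{Op: "dial", Target: target, Err: ErrTargetUnreachable}
}

// runCycle wraps the lower-level error with %w so the chain stays inspectable.
func runCycle(target string) error {
	if err := probe(target); err != nil {
		return fmt.Errorf("probe cycle: %w", err)
	}
	return nil
}

// isUnreachable reports whether err was caused by an unreachable target.
func isUnreachable(err error) bool { return errors.Is(err, ErrTargetUnreachable) }

// failedTarget extracts the target from a wrapped ProbeError, if present.
func failedTarget(err error) (string, bool) {
	var pe *ProbeError
	if errors.As(err, &pe) {
		return pe.Target, true
	}
	return "", false
}

func main() {
	err := runCycle("10.0.0.1:80")
	fmt.Println(err)
	fmt.Println("unreachable:", isUnreachable(err))
	if target, ok := failedTarget(err); ok {
		fmt.Println("failed target:", target)
	}
}
```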
Configuration Management
Configuration is centralized, validated before use, and can be hot-reloaded.
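Validation combined with hot reload might be sketched as follows: a new configuration is validated first and only then swapped in atomically, so readers never observe an invalid or half-applied configuration. The `Config` fields and the `Store` type are assumptions for this example; it relies on `atomic.Pointer` (Go 1.19+) and `errors.Join` (Go 1.20+).

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// Config is a hypothetical subset of the prober configuration.
type Config struct {
	Interval time.Duration
	Workers  int
	Targets  []string
}

// Validate reports every problem found, not just the first.
func (c Config) Validate() error {
	var errs []error
	if c.Interval <= 0 {
		errs = append(errs, errors.New("interval must be positive"))
	}
	if c.Workers < 1 {
		errs = append(errs, errors.New("workers must be at least 1"))
	}
	if len(c.Targets) == 0 {
		errs = append(errs, errors.New("at least one target is required"))
	}
	return errors.Join(errs...)
}

// Store holds the active configuration and swaps it atomically on reload.
type Store struct {
	current atomic.Pointer[Config]
}

// Current returns the active configuration, or nil if none has been loaded.
func (s *Store) Current() *Config { return s.current.Load() }

// Reload validates next and, only if it is valid, makes it the active config.
func (s *Store) Reload(next Config) error {
	if err := next.Validate(); err != nil {
		return fmt.Errorf("reload rejected: %w", err)
	}
	s.current.Store(&next)
	return nil
}

func main() {
	store := &Store{}
	err := store.Reload(Config{Interval: 30 * time.Second, Workers: 4, Targets: []string{"host-a"}})
	fmt.Println("first load error:", err)

	// An invalid update is rejected and the previous config stays active.
	err = store.Reload(Config{Workers: 0})
	fmt.Println("bad reload error:", err)
	fmt.Println("active workers:", store.Current().Workers)
}
```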
Next Steps
- Component Architecture - How components are designed
- Data Flow - How data moves through the system