Welcome to ObservML Documentation
ObservML is a comprehensive machine learning framework designed for process monitoring, anomaly detection, fault isolation, and time series analysis. Built with a plugin-based architecture, it provides easy-to-use microservices with integrated MLOps, tracking, and deployment capabilities.
What is ObservML?
ObservML (Observable Machine Learning) is a modular framework that focuses on industrial process monitoring and anomaly detection. The project evolved from MMLW (Modular Machine Learning Workflows) to provide a more robust, scalable, and production-ready solution for machine learning in industrial environments.
Key Features
- ExperimentHub Architecture: Central management system for multiple experiments
- Plugin System: Extensible architecture with MLOps, DataStream, and custom plugins
- Configuration-Driven: YAML-based configuration for easy setup and deployment
- REST API: FastAPI-based API with automatic documentation
- Real-time Processing: RabbitMQ integration for streaming data and predictions
- MLOps Integration: Built-in MLflow support for experiment tracking and model registry
Prerequisites
Before getting started with ObservML, ensure you have the following:
System Requirements
- Hardware: At least 16 GB RAM (32 GB+ recommended for production) and 32-64 GB of free disk space
- Operating System: Linux, macOS, or Windows with WSL2
- Python: Version 3.11 or higher
- Docker: Docker Engine and Docker Compose (for containerized deployment)
- Git: For cloning the repository
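The requirements above can be checked with a short script. This is a minimal sketch using only the Python standard library; `check_prerequisites` is an illustrative name, not part of ObservML. It only verifies the Python version and that `docker` and `git` are on `PATH`. It does not check RAM, disk space, or Docker Compose.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # from the requirement "Python: Version 3.11 or higher"
REQUIRED_TOOLS = ("docker", "git")  # executables expected on PATH


def check_prerequisites(version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK."""
    version = tuple(version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found, "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required"
        )
    for tool in REQUIRED_TOOLS:
        # `which` returns None when the executable cannot be found on PATH
        if which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("All checked prerequisites are present.")
```

The `version` and `which` parameters exist only so the check can be exercised without changing the machine it runs on.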
Development Tools (Optional)
- IDE: VSCode, PyCharm, or any preferred IDE
- Poetry: For dependency management (recommended)
- Make: For using Makefile commands (optional)
Quick Installation
1. Clone the Repository
git clone https://github.com/adamipkovich/observml.git
cd observml
2. Choose Installation Method
Option A: Poetry (Recommended)
# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies
poetry install
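# Activate virtual environment
# (Poetry 2.x no longer bundles `poetry shell`; install poetry-plugin-shell or prefix commands with `poetry run`)
poetry shell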
Option B: pip
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
3. Start Infrastructure Services
# Start MLflow and RabbitMQ using Docker
docker-compose up -d mlflow rabbitmq
4. Configure ObservML
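# Copy the example configuration to hub_config.yaml
# (substitute the example file's actual name; skip this step if hub_config.yaml already exists)
cp <example-config>.yaml hub_config.yaml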
# Edit configuration as needed
nano hub_config.yaml
5. Start the API Server
python ExperimentHubAPI.py
Once everything is running, the services are available at:
- API Server: http://localhost:8010
- API Documentation: http://localhost:8010/docs
- MLflow UI: http://localhost:5000
- RabbitMQ Management: http://localhost:15672 (guest/guest)
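To confirm that the services actually came up, a small probe can request each of the URLs listed above. This is a sketch using only the standard library; the function names are illustrative, and the ports are the ones listed here, which may differ if your `hub_config.yaml` or Docker setup changes them.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# URLs as listed above; adjust the host or ports if your deployment differs.
SERVICES = {
    "API Server": "http://localhost:8010",
    "API Documentation": "http://localhost:8010/docs",
    "MLflow UI": "http://localhost:5000",
    "RabbitMQ Management": "http://localhost:15672",
}


def is_up(url, timeout=2.0):
    """Return True if anything answers HTTP at `url`, even with an error status."""
    try:
        with urlopen(url, timeout=timeout):
            return True
    except HTTPError:
        return True   # the server replied, just with an error status code
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...


def report(services=SERVICES, probe=is_up):
    """Map each service name to 'up' or 'down'."""
    return {name: ("up" if probe(url) else "down") for name, url in services.items()}


if __name__ == "__main__":
    for name, status in report().items():
        print(f"{name:<22} {status}")
```

An "up" result only means that something responded on that port; it does not verify that the service is configured correctly.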
Architecture Overview
ObservML is built around several core components:
ExperimentHub
The central component that manages all experiments, plugins, and configurations. It provides:
- Experiment lifecycle management (create, train, predict, save, load)
- Plugin coordination and health monitoring
- Configuration management
- API endpoint handling
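The lifecycle operations named above (create, train, predict, save, load) can be illustrated with a toy registry. This is a hypothetical sketch, not ObservML's actual ExperimentHub class: `ToyHub` and `ToyExperiment` are invented names, the "model" is just a learned mean, and saving returns a JSON string instead of writing to MLflow.

```python
import json
from statistics import fmean


class ToyExperiment:
    """Hypothetical experiment: it 'learns' the mean of its training data."""

    def __init__(self, name, mean=None):
        self.name = name
        self.mean = mean  # None until train() has been called

    def train(self, values):
        self.mean = fmean(values)

    def predict(self, value):
        if self.mean is None:
            raise RuntimeError(f"experiment '{self.name}' is not trained")
        return value - self.mean  # deviation from the learned mean


class ToyHub:
    """Hypothetical hub: owns experiments by name and routes calls to them."""

    def __init__(self):
        self.experiments = {}

    def create(self, name):
        if name in self.experiments:
            raise ValueError(f"experiment '{name}' already exists")
        self.experiments[name] = ToyExperiment(name)
        return self.experiments[name]

    def train(self, name, values):
        self.experiments[name].train(values)

    def predict(self, name, value):
        return self.experiments[name].predict(value)

    def save(self, name):
        # Serialize to a JSON string; a real hub would persist to a model store.
        exp = self.experiments[name]
        return json.dumps({"name": exp.name, "mean": exp.mean})

    def load(self, payload):
        exp = ToyExperiment(**json.loads(payload))
        self.experiments[exp.name] = exp
        return exp
```

For example, after `hub.create("pump")` and `hub.train("pump", [1.0, 2.0, 3.0])`, the string returned by `hub.save("pump")` can be passed to `load` on a second hub, which can then serve predictions for `"pump"` without retraining.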
Plugin System
ObservML uses a plugin architecture for extensibility:
- MLOps Plugin: Handles experiment tracking and model registry (MLflow)
- DataStream Plugin: Manages real-time data streaming (RabbitMQ)
- Custom Plugins: Extensible system for adding new functionality
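A plugin architecture of this kind usually rests on a shared interface plus a registry that starts plugins and reports their health. The sketch below shows that pattern in general terms. All names (`Plugin`, `InMemoryTracker`, `PluginRegistry`) are assumptions made for illustration; ObservML's real base classes and method names may differ.

```python
from abc import ABC, abstractmethod


class Plugin(ABC):
    """Hypothetical plugin interface; not ObservML's actual base class."""

    name: str

    @abstractmethod
    def start(self):
        """Acquire whatever resources the plugin needs."""

    @abstractmethod
    def healthy(self):
        """Return True if the plugin is ready to serve requests."""


class InMemoryTracker(Plugin):
    """Stands in for an MLOps plugin by recording metrics in a list."""

    name = "mlops"

    def __init__(self):
        self.started = False
        self.metrics = []

    def start(self):
        self.started = True

    def healthy(self):
        return self.started

    def log_metric(self, key, value):
        self.metrics.append((key, value))


class PluginRegistry:
    """Keeps one plugin per name and aggregates health checks."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        if plugin.name in self._plugins:
            raise ValueError(f"plugin '{plugin.name}' is already registered")
        self._plugins[plugin.name] = plugin

    def start_all(self):
        for plugin in self._plugins.values():
            plugin.start()

    def health(self):
        return {name: plugin.healthy() for name, plugin in self._plugins.items()}
```

A custom plugin in this sketch is any subclass that implements `start` and `healthy`; the registry never needs to know what the plugin does internally.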
Experiment Types
Modular experiment implementations for different use cases:
- Time Series Analysis: Forecasting and anomaly detection in time series data
- Fault Detection: Anomaly detection in sensor data
- Fault Isolation: Classification and root cause analysis
- Process Mining: Business process analysis and optimization
Basic Workflow
ObservML follows a simple workflow for machine learning experiments:
graph TD
A[Configure System] --> B[Create Experiment]
B --> C[Send Data via RabbitMQ]
C --> D[Train Model]
D --> E[Save to MLflow]
E --> F[Make Predictions]
F --> G[Visualize Results]
G --> H[Monitor Performance]
H --> I{Retrain Needed?}
I -->|Yes| D
I -->|No| F
Step-by-Step Process
- Configuration: Set up plugins and experiment types in hub_config.yaml
- Data Preparation: Send training data through RabbitMQ queues
- Experiment Creation: Create and configure experiments via API
- Training: Train models with automatic MLflow tracking
- Prediction: Make real-time predictions on streaming data
- Monitoring: Track model performance and trigger retraining when needed
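The train, predict, monitor, and retrain loop described in these steps can be sketched in a few lines. This is an illustration under simplifying assumptions, not ObservML's training or monitoring logic: the "model" is a mean and standard deviation, a value is flagged when its z-score exceeds a limit, and retraining is triggered when most of a recent window is flagged. The function names, window size, and thresholds are all hypothetical.

```python
from statistics import fmean, pstdev


def train(values):
    """'Train' a baseline model: the mean and standard deviation of the data."""
    return {"mean": fmean(values), "std": pstdev(values) or 1.0}


def is_anomaly(model, value, z_limit=3.0):
    """Flag values more than `z_limit` standard deviations from the mean."""
    return abs(value - model["mean"]) / model["std"] > z_limit


def run(stream, training_data, window=5, min_flagged=4):
    """Predict on each value; retrain when most of the recent window is flagged."""
    model = train(training_data)
    recent, flags, retrains = [], [], 0
    for value in stream:
        flag = is_anomaly(model, value)
        flags.append(flag)
        recent = (recent + [(value, flag)])[-window:]  # keep the last `window` items
        flagged = sum(1 for _, f in recent if f)
        if len(recent) == window and flagged >= min_flagged:
            # The process has probably shifted: refit on the recent window.
            model = train([v for v, _ in recent])
            recent, retrains = [], retrains + 1
    return flags, retrains
```

Refitting on the raw recent window is a deliberate simplification; a production monitor would typically distinguish a genuine process shift from a burst of faults before replacing the model.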
Communication Architecture
ObservML uses a microservices architecture with message-based communication:
graph LR
A[Data Source] --> B[RabbitMQ]
B --> C[ExperimentHub]
C --> D[MLflow]
C --> E[Model Storage]
F[API Client] --> G[FastAPI]
G --> C
C --> H[Predictions]
H --> I[Visualization]
Key Benefits
- Decoupled Architecture: Services can be scaled independently
- Asynchronous Processing: Non-blocking operations for better performance
- Fault Tolerance: Message queues provide reliability and retry mechanisms
- Scalability: Easy to add more workers or services as needed
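The decoupling described above can be demonstrated without a broker. In the sketch below, the standard library's `queue.Queue` stands in for a RabbitMQ queue, a thread plays the role of the hub, and a fixed threshold plays the role of a model; all names are illustrative. Unlike RabbitMQ, an in-process queue offers no persistence, acknowledgements, or retries, so this shows only the producer/consumer separation.

```python
import queue
import threading

STOP = object()  # sentinel telling the consumer to shut down


def producer(q, readings):
    """Plays the data source: publish readings without waiting for results."""
    for reading in readings:
        q.put(reading)
    q.put(STOP)


def consumer(q, results, limit=100.0):
    """Plays the hub: consume messages and emit a prediction for each."""
    while True:
        message = q.get()
        if message is STOP:
            break
        results.append((message, message > limit))  # (reading, is_fault)


def run_pipeline(readings):
    q = queue.Queue()  # in-process stand-in for a RabbitMQ queue
    results = []
    worker = threading.Thread(target=consumer, args=(q, results))
    worker.start()
    producer(q, readings)  # returns as soon as everything is enqueued
    worker.join()          # wait for the consumer to drain the queue
    return results
```

Because the producer only talks to the queue, the consumer can be replaced, slowed down, or run as several workers without changing the producer.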
Getting Help
Documentation Structure
This documentation is organized into several sections:
- Local Development: Setting up a development environment
- Deployment: Production deployment with Docker
- Client Usage: Using the API and client libraries
- API Reference: Complete API documentation
- REST API Reference: Comprehensive REST API guide with framework architecture
- Configuration: Configuration options and examples
- Plugin System: Extending ObservML with plugins
- Models: Available models and algorithms
- Experiments: Experiment types and configurations
Support Channels
- GitHub Issues: Report bugs and request features
- GitHub Discussions: Ask questions and share ideas
- Documentation: Comprehensive guides and API reference
- Examples: Sample configurations and use cases
Contributing
ObservML is an open-source project and welcomes contributions:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
Next Steps
Now that you have ObservML installed, you can:
- Set up local development for experimentation
- Deploy to production using Docker
- Learn the API for programmatic access
- Explore examples to understand common use cases
- Configure experiments for your specific needs
Project History
ObservML evolved from the MMLW (Modular Machine Learning Workflows) project, which was part of research project 2020-1.1.2-PIACI-KFI-2020-00062. The project has been redesigned with:
- Modern plugin architecture
- Improved scalability and performance
- Better separation of concerns
- Enhanced configuration management
- Production-ready deployment options
The focus remains on providing accessible machine learning tools for industrial process monitoring and anomaly detection, but with a more robust and extensible foundation.