Welcome to ObservML Documentation

ObservML is a comprehensive machine learning framework designed for process monitoring, anomaly detection, fault isolation, and time series analysis. Built with a plugin-based architecture, it provides easy-to-use microservices with integrated MLOps, tracking, and deployment capabilities.

What is ObservML?

ObservML (Observable Machine Learning) is a modular framework that focuses on industrial process monitoring and anomaly detection. The project evolved from MMLW (Modular Machine Learning Workflows) to provide a more robust, scalable, and production-ready solution for machine learning in industrial environments.

Key Features

ExperimentHub Architecture: Central management system for multiple experiments
Plugin System: Extensible architecture with MLOps, DataStream, and custom plugins
Configuration-Driven: YAML-based configuration for easy setup and deployment
REST API: FastAPI-based API with automatic documentation
Real-time Processing: RabbitMQ integration for streaming data and predictions
MLOps Integration: Built-in MLflow support for experiment tracking and model registry

Prerequisites

Before getting started with ObservML, ensure you have the following:

System Requirements

Computer: At least 16GB RAM (32GB+ recommended for production), 32-64GB free disk space
Operating System: Linux, macOS, or Windows with WSL2
Python: Version 3.11 or higher
Docker: Docker Engine and Docker Compose (for containerized deployment)
Git: For cloning the repository

Development Tools (Optional)

IDE: VSCode, PyCharm, or any preferred IDE
Poetry: For dependency management (recommended)
Make: For using Makefile commands (optional)

Quick Installation

1. Clone the Repository

git clone https://github.com/adamipkovich/observml.git
cd observml

2. Choose Installation Method

Option A: Poetry (Recommended)

# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -

# Install dependencies
poetry install

# Activate virtual environment
poetry shell

Option B: pip

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

3. Start Infrastructure Services

# Start MLflow and RabbitMQ using Docker
docker-compose up -d mlflow rabbitmq

4. Configure ObservML

# Copy example configuration
cp hub_config.yaml hub_config.yaml

# Edit configuration as needed
nano hub_config.yaml

5. Start the API Server

python ExperimentHubAPI.py

The API will be available at: - API Server: http://localhost:8010 - API Documentation: http://localhost:8010/docs - MLflow UI: http://localhost:5000 - RabbitMQ Management: http://localhost:15672 (guest/guest)

Architecture Overview

ObservML is built around several core components:

ExperimentHub

The central component that manages all experiments, plugins, and configurations. It provides: - Experiment lifecycle management (create, train, predict, save, load) - Plugin coordination and health monitoring - Configuration management - API endpoint handling

Plugin System

ObservML uses a plugin architecture for extensibility:

MLOps Plugin: Handles experiment tracking and model registry (MLflow)
DataStream Plugin: Manages real-time data streaming (RabbitMQ)
Custom Plugins: Extensible system for adding new functionality

Experiment Types

Modular experiment implementations for different use cases: - Time Series Analysis: Forecasting and anomaly detection in time series data - Fault Detection: Anomaly detection in sensor data - Fault Isolation: Classification and root cause analysis - Process Mining: Business process analysis and optimization

Basic Workflow

ObservML follows a simple workflow for machine learning experiments:

graph TD
    A[Configure System] --> B[Create Experiment]
    B --> C[Send Data via RabbitMQ]
    C --> D[Train Model]
    D --> E[Save to MLflow]
    E --> F[Make Predictions]
    F --> G[Visualize Results]
    G --> H[Monitor Performance]
    H --> I{Retrain Needed?}
    I -->|Yes| D
    I -->|No| F

Step-by-Step Process

Configuration: Set up plugins and experiment types in hub_config.yaml
Data Preparation: Send training data through RabbitMQ queues
Experiment Creation: Create and configure experiments via API
Training: Train models with automatic MLflow tracking
Prediction: Make real-time predictions on streaming data
Monitoring: Track model performance and trigger retraining when needed

Communication Architecture

ObservML uses a microservices architecture with message-based communication:

graph LR
    A[Data Source] --> B[RabbitMQ]
    B --> C[ExperimentHub]
    C --> D[MLflow]
    C --> E[Model Storage]
    F[API Client] --> G[FastAPI]
    G --> C
    C --> H[Predictions]
    H --> I[Visualization]

Key Benefits

Decoupled Architecture: Services can be scaled independently
Asynchronous Processing: Non-blocking operations for better performance
Fault Tolerance: Message queues provide reliability and retry mechanisms
Scalability: Easy to add more workers or services as needed

Getting Help

Documentation Structure

This documentation is organized into several sections:

Local Development: Setting up a development environment
Deployment: Production deployment with Docker
Client Usage: Using the API and client libraries
API Reference: Complete API documentation
REST API Reference: Comprehensive REST API guide with framework architecture
Configuration: Configuration options and examples
Plugin System: Extending ObservML with plugins
Models: Available models and algorithms
Experiments: Experiment types and configurations

Support Channels

GitHub Issues: Report bugs and request features
GitHub Discussions: Ask questions and share ideas
Documentation: Comprehensive guides and API reference
Examples: Sample configurations and use cases

Contributing

ObservML is an open-source project and welcomes contributions:

Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Submit a pull request

Next Steps

Now that you have ObservML installed, you can:

Set up local development for experimentation
Deploy to production using Docker
Learn the API for programmatic access
Explore examples to understand common use cases
Configure experiments for your specific needs

Project History

ObservML evolved from the MMLW (Modular Machine Learning Workflows) project, which was part of research project 2020-1.1.2-PIACI-KFI-2020-00062. The project has been redesigned with:

Modern plugin architecture
Improved scalability and performance
Better separation of concerns
Enhanced configuration management
Production-ready deployment options

The focus remains on providing accessible machine learning tools for industrial process monitoring and anomaly detection, but with a more robust and extensible foundation.