Ear-segmentation-ai

System Architecture

Overview

The Ear Segmentation AI system uses a layered, modular architecture that separates concerns and keeps the code easy to extend and maintain. The layers, from user-facing to foundational, are:

┌─────────────────────────────────────────────────────────────┐
│                          CLI Layer                          │
│                    (Typer + Rich CLI)                       │
├─────────────────────────────────────────────────────────────┤
│                          API Layer                          │
│              (ImageProcessor, VideoProcessor)               │
├─────────────────────────────────────────────────────────────┤
│                         Core Layer                          │
│           (ModelManager, EarPredictor, Config)              │
├─────────────────────────────────────────────────────────────┤
│                    Processing Pipeline                      │
│     (Preprocessing → Inference → Postprocessing)           │
├─────────────────────────────────────────────────────────────┤
│                      Infrastructure                         │
│           (Logging, Exceptions, Validators)                │
└─────────────────────────────────────────────────────────────┘

Core Components

1. Model Management (core/model.py)

2. Prediction Engine (core/predictor.py)

3. Configuration (core/config.py)

4. API Layer (api/)

5. CLI Interface (cli/)

Data Flow

Image Processing Pipeline

Input Image → Validation → Preprocessing → Model Inference → Post-processing → Results
     ↓            ↓              ↓                ↓                 ↓             ↓
    Load      Check size    Resize/Norm     U-Net Forward       Threshold      Mask +
   Image       & format     to 480×320          Pass           Binary Mask    Metadata
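
The stages above can be sketched end to end. This is a minimal illustration, not the project's actual implementation: the function names (validate, preprocess, postprocess, run_pipeline), the 0.5 threshold, the nearest-neighbour resize, and the stand-in fake_model are all assumptions, and 480×320 is read here as width × height. Plain Python lists stand in for image arrays to keep the sketch dependency-free.

```python
from typing import Callable

TARGET_W, TARGET_H = 480, 320  # assumption: "480x320" read as width x height

Pixel = tuple[int, int, int]   # (R, G, B), each 0-255
Image = list[list[Pixel]]      # rows of pixels
Plane = list[list[float]]      # rows of per-pixel floats


def validate(image: Image) -> None:
    """Check size and format: non-empty, rectangular, 3-channel 8-bit pixels."""
    if not image or not image[0]:
        raise ValueError("image is empty")
    width = len(image[0])
    for row in image:
        if len(row) != width:
            raise ValueError("image rows have different lengths")
        for pixel in row:
            if len(pixel) != 3 or not all(0 <= c <= 255 for c in pixel):
                raise ValueError(f"invalid pixel: {pixel!r}")


def preprocess(image: Image) -> list[list[tuple[float, ...]]]:
    """Nearest-neighbour resize to the target size, then scale to [0, 1]."""
    h, w = len(image), len(image[0])
    rows = [y * h // TARGET_H for y in range(TARGET_H)]
    cols = [x * w // TARGET_W for x in range(TARGET_W)]
    return [[tuple(c / 255.0 for c in image[r][col]) for col in cols] for r in rows]


def fake_model(x: list[list[tuple[float, ...]]]) -> Plane:
    """Hypothetical stand-in for the U-Net forward pass: brightness as probability."""
    return [[sum(pixel) / 3.0 for pixel in row] for row in x]


def postprocess(probs: Plane, threshold: float = 0.5) -> list[list[int]]:
    """Threshold the probability map into a binary mask (threshold is an assumption)."""
    return [[1 if p > threshold else 0 for p in row] for row in probs]


def run_pipeline(image: Image, model: Callable = fake_model):
    """Validation, preprocessing, inference, post-processing; returns mask + metadata."""
    validate(image)
    probs = model(preprocess(image))
    mask = postprocess(probs)
    metadata = {
        "input_size": (len(image[0]), len(image)),   # (width, height)
        "mask_size": (len(mask[0]), len(mask)),
        "ear_pixels": sum(map(sum, mask)),
    }
    return mask, metadata
```

Passing the model in as a callable keeps the surrounding stages testable without loading real weights; a real forward pass would replace fake_model.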

Video Processing Pipeline

Video Input → Frame Extraction → Batch Processing → Frame Assembly → Output Video
     ↓               ↓                  ↓                 ↓               ↓
    Open        Read frames       Process each      Add overlays     Write with
   Stream       sequentially       with model       if requested        codec
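
The video flow above can be sketched as a streaming loop. This is a sketch under stated assumptions rather than the project's code: process_video, batched, and the segment, overlay, and write callables are hypothetical names, the batch size of 8 is arbitrary, and writing the bare mask when no overlay is requested is a guess. In practice the frame source and the writer would wrap a video decoder and encoder; here any iterable and any callable will do.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, Optional


def batched(frames: Iterable[Any], size: int) -> Iterator[list]:
    """Group a sequential frame stream into lists of at most `size` frames."""
    it = iter(frames)
    while batch := list(islice(it, size)):
        yield batch


def process_video(
    frames: Iterable[Any],
    segment: Callable[[Any], Any],
    write: Callable[[Any], None],
    overlay: Optional[Callable[[Any, Any], Any]] = None,
    batch_size: int = 8,
) -> int:
    """Stream frames through the model and hand each result to `write`.

    Returns the number of frames written. Frame order is preserved.
    """
    written = 0
    for batch in batched(frames, batch_size):
        # Process each frame in the batch with the model.
        masks = [segment(frame) for frame in batch]
        for frame, mask in zip(batch, masks):
            # Add the overlay only if one was requested (assumption:
            # otherwise the mask itself is written).
            out = overlay(frame, mask) if overlay else mask
            write(out)
            written += 1
    return written
```

Because frames are pulled lazily and written as soon as each batch finishes, memory use is bounded by the batch size rather than the length of the video.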

Model Architecture

U-Net with ResNet18 Encoder

Input/Output Specifications

Design Patterns

1. Singleton Pattern

Used in ModelManager to ensure that only a single model instance exists:

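class ModelManager:
    _instance = None  # the one shared ModelManager
    _model = None     # model slot, shared through the single instance

    def __new__(cls):
        # Create the instance on the first call; return the same one afterwards.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance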
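
To illustrate the effect, the following self-contained sketch repeats the class and adds lazy, thread-safe model loading. The get_model method, its loader argument, and the lock are assumptions made for illustration; the source shows only the __new__ logic.

```python
import threading
from typing import Any, Callable


class ModelManager:
    _instance = None
    _model = None
    _lock = threading.Lock()  # assumption: guards creation and loading across threads

    def __new__(cls):
        if cls._instance is None:
            with cls._lock:
                # Re-check inside the lock so two threads cannot both create it.
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance

    def get_model(self, loader: Callable[[], Any]) -> Any:
        """Hypothetical helper: load the model on first use, then reuse it."""
        if ModelManager._model is None:
            with ModelManager._lock:
                if ModelManager._model is None:
                    ModelManager._model = loader()
        return ModelManager._model


# Usage: every construction yields the same object, and the loader runs once.
load_calls = []


def stub_loader():
    load_calls.append(1)
    return {"weights": "stub"}  # placeholder for real model weights


first = ModelManager()
second = ModelManager()
model_a = first.get_model(stub_loader)
model_b = second.get_model(stub_loader)
```

Here first and second are the same object, and stub_loader is called only once, so an expensive model load is not repeated.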

2. Factory Pattern

Processor creation based on input type:

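def create_processor(input_type: str) -> BaseProcessor:
    """Return the processor that matches the given input type."""
    if input_type == "image":
        return ImageProcessor()
    if input_type == "video":
        return VideoProcessor()
    # Fail loudly instead of silently returning None for unknown types.
    raise ValueError(f"Unsupported input type: {input_type!r}")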

3. Strategy Pattern

Different processing strategies handle the various input types while maintaining a consistent interface.
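
A minimal sketch of the idea, assuming a shared process method and a lookup by file extension. ProcessingStrategy, ImageStrategy, VideoStrategy, STRATEGIES, and segment are hypothetical names, and the returned dictionaries are placeholders for real results.

```python
from pathlib import Path
from typing import Protocol


class ProcessingStrategy(Protocol):
    """The interface every strategy shares."""

    def process(self, source: str) -> dict: ...


class ImageStrategy:
    def process(self, source: str) -> dict:
        return {"kind": "image", "source": source}


class VideoStrategy:
    def process(self, source: str) -> dict:
        return {"kind": "video", "source": source}


# Hypothetical extension-to-strategy table.
STRATEGIES: dict[str, ProcessingStrategy] = {
    ".jpg": ImageStrategy(),
    ".png": ImageStrategy(),
    ".mp4": VideoStrategy(),
    ".avi": VideoStrategy(),
}


def segment(source: str) -> dict:
    """Pick a strategy from the file extension and run it."""
    suffix = Path(source).suffix.lower()
    try:
        strategy = STRATEGIES[suffix]
    except KeyError:
        raise ValueError(f"unsupported input type: {suffix!r}") from None
    # The caller never needs to know which concrete strategy was chosen.
    return strategy.process(source)
```

Callers depend only on process, so supporting a new input type means adding one class and one table entry.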

Extension Points

1. Adding New Models

2. Custom Preprocessing

3. New Output Formats

Performance Considerations

1. Memory Management

2. GPU Optimization

3. Caching Strategy

Security Considerations

1. Input Validation

2. Model Integrity

3. Output Sanitization