The Ear Segmentation AI system is built with a modular architecture that separates concerns and allows for easy extension and maintenance.
┌─────────────────────────────────────────────────────────────┐
│                          CLI Layer                           │
│                      (Typer + Rich CLI)                      │
├─────────────────────────────────────────────────────────────┤
│                          API Layer                           │
│               (ImageProcessor, VideoProcessor)               │
├─────────────────────────────────────────────────────────────┤
│                          Core Layer                          │
│             (ModelManager, EarPredictor, Config)             │
├─────────────────────────────────────────────────────────────┤
│                     Processing Pipeline                      │
│        (Preprocessing → Inference → Postprocessing)          │
├─────────────────────────────────────────────────────────────┤
│                       Infrastructure                         │
│              (Logging, Exceptions, Validators)               │
└─────────────────────────────────────────────────────────────┘
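To make the layering concrete, here is a minimal sketch of how a request might travel top-down through these layers. The class names come from the diagram above; the constructors and method names are assumptions, not the project's actual signatures.

class EarPredictor:                        # Core layer: wraps model loading and inference
    def predict(self, image):
        ...                                # U-Net forward pass would happen here

class ImageProcessor:                      # API layer: orchestrates the processing pipeline
    def __init__(self) -> None:
        self.predictor = EarPredictor()

    def process(self, path: str):
        ...                                # validate -> preprocess -> predict -> postprocess

def cli_process_image(path: str) -> None:  # CLI layer: thin wrapper that delegates to the API
    result = ImageProcessor().process(path)
    print(result)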
core/model.py
core/predictor.py
core/config.py
api/
cli/
CLI commands:
version: Show version info
process-image: Process images
process-camera: Real-time camera
benchmark: Performance testing
Image processing flow:

Input Image → Validation → Preprocessing → Model Inference → Post-processing → Results
     ↓            ↓              ↓                ↓                 ↓             ↓
   Load      Check size     Resize/Norm     U-Net Forward       Threshold      Mask +
   Image      & format      to 480×320          Pass           Binary Mask    Metadata
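A hedged sketch of how this flow could look in code, assuming an OpenCV-based loader and a predictor object exposing a predict method (both assumptions); the 480×320 size comes from the diagram, and whether it is width×height or height×width is not specified here:

import cv2
import numpy as np

def run_image_pipeline(path: str, predictor) -> dict:
    # Load image
    image = cv2.imread(path)
    # Validation: check the file could be read
    if image is None:
        raise ValueError(f"Could not load image: {path}")

    # Preprocessing: resize to the model input size and scale to [0, 1]
    resized = cv2.resize(image, (480, 320))
    normalized = resized.astype(np.float32) / 255.0

    # Model inference: U-Net forward pass behind the predictor
    probabilities = predictor.predict(normalized)

    # Post-processing: threshold probabilities into a binary mask
    mask = (probabilities > 0.5).astype(np.uint8)

    # Results: binary mask plus metadata
    return {"mask": mask, "source_shape": image.shape}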
Video processing flow:

Video Input → Frame Extraction → Batch Processing → Frame Assembly → Output Video
     ↓               ↓                  ↓                 ↓               ↓
   Open         Read frames       Process each      Add overlays     Write with
  Stream       sequentially        with model       if requested        codec
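A minimal sketch of the same idea for video, assuming OpenCV for frame I/O and the same hypothetical predictor interface:

import cv2

def run_video_pipeline(input_path: str, output_path: str, predictor, overlay: bool = True) -> None:
    # Open stream
    capture = cv2.VideoCapture(input_path)
    fps = capture.get(cv2.CAP_PROP_FPS)
    width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # Write with codec
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    writer = cv2.VideoWriter(output_path, fourcc, fps, (width, height))

    # Read frames sequentially and process each with the model
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        mask = predictor.predict(frame)        # assumed predictor interface
        if overlay:
            frame[mask > 0] = (0, 255, 0)      # paint the predicted region green (illustrative overlay)
        writer.write(frame)

    capture.release()
    writer.release()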
Singleton pattern, used in ModelManager to ensure a single model instance:
class ModelManager:
    _instance = None
    _model = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance
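Because __new__ always hands back the cached instance, every caller shares one manager (and therefore one loaded model):

manager_a = ModelManager()
manager_b = ModelManager()
assert manager_a is manager_b   # both names refer to the same shared instance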
Factory pattern, used to create the right processor for each input type:
def create_processor(input_type: str) -> BaseProcessor:
    if input_type == "image":
        return ImageProcessor()
    elif input_type == "video":
        return VideoProcessor()
    # Fail loudly for unsupported types instead of silently returning None
    raise ValueError(f"Unsupported input type: {input_type}")
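Callers only name the input type; the factory decides which concrete processor to build:

image_processor = create_processor("image")   # ImageProcessor
video_processor = create_processor("video")   # VideoProcessor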
Strategy pattern: different processing strategies handle different input types while exposing a consistent interface.
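A minimal sketch of what that shared interface could look like; BaseProcessor, ImageProcessor, and VideoProcessor are named above, but the abstract process method shown here is an assumed signature:

from abc import ABC, abstractmethod

class BaseProcessor(ABC):
    """Common interface every processing strategy implements."""

    @abstractmethod
    def process(self, source):
        """Run the full pipeline on the given input and return results."""

class ImageProcessor(BaseProcessor):
    def process(self, source):
        ...  # single-image strategy: validate, preprocess, infer, post-process

class VideoProcessor(BaseProcessor):
    def process(self, source):
        ...  # video strategy: extract frames, process each, assemble output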
preprocessing/transforms.py