A state-of-the-art ear segmentation library powered by deep learning. Detect and segment human ears in images and video streams with high accuracy and real-time performance.
# Using pip
pip install earsegmentationai
# Using poetry (recommended)
poetry add earsegmentationai
For detailed installation instructions, see Installation Guide.
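To verify the installation, you can import the package from a Python shell. This is a minimal sanity check; the `__version__` attribute is assumed here (common convention, not confirmed by the docs), so `getattr` is used defensively:

```python
# Quick sanity check that the package imports correctly
import earsegmentationai

print(getattr(earsegmentationai, "__version__", "installed"))
```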
from earsegmentationai import ImageProcessor
# Initialize processor
processor = ImageProcessor(device="cpu") # or "cuda:0" for GPU
# Process single image
result = processor.process("path/to/image.jpg")
print(f"Ear detected: {result.has_ear}")
print(f"Ear area: {result.ear_percentage:.2f}% of image")
# Process with visualization
result = processor.process(
    "path/to/image.jpg",
    return_visualization=True
)
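If you want to persist the visualization, a minimal sketch like the following works, assuming the result exposes the rendered overlay as a NumPy BGR image under a `visualization` attribute (the attribute name is an assumption, not confirmed API):

```python
import cv2

# Persist the overlay image (the `visualization` attribute name is an
# assumption; adjust to the library's actual result field)
vis = getattr(result, "visualization", None)
if result.has_ear and vis is not None:
    cv2.imwrite("ear_overlay.png", vis)
```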
# Process single image
earsegmentationai process-image path/to/image.jpg --save-viz
# Process directory
earsegmentationai process-image path/to/images/ -o output/
# Real-time webcam
earsegmentationai webcam --device cuda:0
# Process video
earsegmentationai process-video path/to/video.mp4 -o output.avi
from earsegmentationai import ImageProcessor
processor = ImageProcessor(device="cuda:0")
# Process multiple images
results = processor.process([
    "image1.jpg",
    "image2.jpg",
    "image3.jpg"
])
print(f"Detection rate: {results.detection_rate:.1f}%")
print(f"Average ear area: {results.average_ear_area:.0f} pixels")
from earsegmentationai import VideoProcessor
processor = VideoProcessor(
    device="cuda:0",
    skip_frames=2,     # Process every 3rd frame
    smooth_masks=True  # Temporal smoothing
)
# Process video file
stats = processor.process(
    "video.mp4",
    output_path="output.mp4",
    display=True
)
print(f"FPS: {stats['average_fps']:.1f}")
print(f"Detection rate: {stats['detection_rate']:.1f}%")
from earsegmentationai import ImageProcessor, Config
# Create custom configuration
config = Config(
    model={"architecture": "FPN", "encoder_name": "resnet50"},
    processing={"input_size": (640, 480), "batch_size": 8}
)
processor = ImageProcessor(config=config, threshold=0.7)
| Parameter | Default | Description |
|---|---|---|
| `architecture` | `"Unet"` | Model architecture (`Unet`, `FPN`, `PSPNet`, `DeepLabV3`, `DeepLabV3Plus`) |
| `encoder_name` | `"resnet18"` | Encoder backbone |
| `input_size` | `(480, 320)` | Input image size (width, height) |
| `threshold` | `0.5` | Binary mask threshold |
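The `threshold` value is applied to the model's per-pixel probabilities to produce the binary mask. Conceptually, the binarization step looks like this minimal NumPy sketch (illustrative only, not the library's internal code):

```python
import numpy as np

# Toy probability map standing in for the model's per-pixel output in [0, 1]
probs = np.array([[0.1, 0.6],
                  [0.8, 0.4]])

threshold = 0.5
mask = (probs > threshold).astype(np.uint8)  # 1 = ear pixel, 0 = background
print(mask)  # [[0 1]
             #  [1 0]]
```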
| Parameter | Default | Description |
|---|---|---|
| `device` | `"cpu"` | Processing device (`cpu`, `cuda:0`) |
| `batch_size` | `1` | Batch size for processing |
| `skip_frames` | `0` | Frame skipping for video (0 = process all) |
| `smooth_masks` | `True` | Enable temporal smoothing for video |
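`skip_frames=N` means N frames are skipped between processed frames, i.e. every (N+1)-th frame is processed. The selection logic is equivalent to this sketch (illustrative only, not the library's internal code):

```python
# Illustrative frame-selection logic for skip_frames
skip_frames = 2
for frame_idx in range(10):
    if frame_idx % (skip_frames + 1) == 0:
        print(f"process frame {frame_idx}")  # frames 0, 3, 6, 9
    else:
        print(f"skip frame {frame_idx}")
```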
The library uses a modular architecture with clear separation of concerns:
earsegmentationai/
├── core/            # Core model and prediction logic
├── preprocessing/   # Image preprocessing and validation
├── postprocessing/  # Visualization and export utilities
├── api/             # High-level Python API
├── cli/             # Command-line interface
└── utils/           # Logging, exceptions, and helpers
# Run all tests
make test
# Run with coverage
make test-cov
# Run specific test suite
poetry run pytest tests/unit/test_transforms.py
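New tests follow the same plain-pytest layout. A hypothetical example illustrating the style (the file name and assertions are placeholders, not taken from the actual test suite):

```python
# tests/unit/test_masks.py -- hypothetical sketch illustrating the test layout
import numpy as np

def test_binarization_respects_threshold():
    probs = np.array([0.2, 0.5, 0.9])
    mask = (probs > 0.5).astype(np.uint8)
    assert mask.tolist() == [0, 0, 1]
```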
We welcome contributions! Please see our Contributing Guide for details.
# Setup development environment
make install-dev
# Run linting and formatting
make format
make lint
# Run pre-commit hooks
make pre-commit
| Device | Image Size | FPS | Memory |
|---|---|---|---|
| CPU (i7-9700K) | 480×320 | 15 | 200 MB |
| GPU (RTX 3080) | 480×320 | 120 | 400 MB |
| GPU (RTX 3080) | 1920×1080 | 45 | 800 MB |
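To reproduce numbers like these on your own hardware, a minimal timing sketch over the public `process` call (the image path and iteration count are placeholders; the warm-up call matters for GPU timings, and per-call disk I/O is included in the measurement):

```python
import time

from earsegmentationai import ImageProcessor

processor = ImageProcessor(device="cpu")  # or "cuda:0"
processor.process("path/to/image.jpg")    # warm-up (model load, CUDA init)

n = 50
start = time.perf_counter()
for _ in range(n):
    processor.process("path/to/image.jpg")
elapsed = time.perf_counter() - start
print(f"{n / elapsed:.1f} FPS")
```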
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by the Ear Segmentation AI Team