Combining models

Contents

Combining models#

Annolid lets you mix and match tools depending on your data and the question you’re asking. A common pattern is:

Use a foundation model to reduce manual labeling.
Use a tracker to propagate labels through time.
Use a task-specific model (e.g., YOLO pose) if you need extra signals.
Export everything to CSV and combine analyses offline.

Practical combinations that work well#

AI polygons + tracking#

Use AI polygons (SAM-family) to outline each instance on a single frame.
Track forward with Cutie / EfficientTAM-style VOS backends.
Review/correct, then export a tracking CSV.

Text prompt + refinement#

Use text prompts (Grounding DINO → SAM) to quickly bootstrap object masks when manual clicks are slow.
Convert masks to polygons and continue with your usual tracking workflow.

Segmentation + keypoint tracking#

Use mask tracking (polygons) for identity and body region.
Use DINO-based keypoint tracking (see book/tutorials/DINOv3_keypoint_tracking.md) when you need body-part trajectories.

Custom YOLO + Annolid review#

Train a YOLO segmentation/pose model on your own labels.
Run inference (including real-time use-cases), then review and correct results in Annolid.

Export strategy#

To combine outputs cleanly across tools, pick one “source of truth” format:

LabelMe JSON (best for review/editing in Annolid)
CSV (best for analysis in Python/R)
COCO / YOLO (best for training external models)