# Combining models
Annolid lets you mix and match tools depending on your data and the question you're asking. A common pattern is:

1. Use a foundation model to reduce manual labeling.
2. Use a tracker to propagate labels through time.
3. Use a task-specific model (e.g., YOLO pose) if you need extra signals.
4. Export everything to CSV and combine the analyses offline (see the sketch below).
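As a minimal sketch of the "combine offline" step, the snippet below computes per-instance distance traveled from an exported tracking CSV. The file name and column names (`frame_number`, `instance_name`, `cx`, `cy`) are assumptions; rename them to match your own export.

```python
# A minimal sketch of an offline analysis on an exported tracking CSV.
# The file name and column names below are placeholders for your export.
import numpy as np
import pandas as pd

df = pd.read_csv("video1_tracking.csv").sort_values(["instance_name", "frame_number"])

# Per-frame centroid displacement of each instance, then total distance traveled.
df["dx"] = df.groupby("instance_name")["cx"].diff()
df["dy"] = df.groupby("instance_name")["cy"].diff()
df["step"] = np.hypot(df["dx"], df["dy"])
print(df.groupby("instance_name")["step"].sum())
```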
## Practical combinations that work well
### AI polygons + tracking

- Use AI polygons (SAM-family) to outline each instance on a single frame.
- Track forward with Cutie / EfficientTAM-style VOS backends.
- Review and correct, then export a tracking CSV (see the review sketch below).
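To focus the review pass, one simple check is to list the frames where an instance is missing from the exported CSV, which often marks where the tracker lost it. This is a sketch under the same column-name assumptions as above, not an Annolid feature.

```python
# A minimal sketch for finding frames worth reviewing after VOS tracking:
# report gaps where an instance has no row in the exported tracking CSV.
import pandas as pd

df = pd.read_csv("video1_tracking.csv")  # placeholder file name
all_frames = set(range(int(df["frame_number"].min()), int(df["frame_number"].max()) + 1))

for name, group in df.groupby("instance_name"):
    missing = sorted(all_frames - set(group["frame_number"]))
    if missing:
        print(f"{name}: missing in {len(missing)} frames, first gap at frame {missing[0]}")
```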
### Text prompt + refinement

- Use text prompts (Grounding DINO → SAM) to quickly bootstrap object masks when manual clicks are slow.
- Convert the masks to polygons and continue with your usual tracking workflow (see the sketch below).
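The conversion step can be done with OpenCV contours. The sketch below assumes you already have a binary mask saved as a NumPy array (the path, label, and image name are placeholders) and writes the largest contour as a LabelMe-style polygon shape that Annolid can open.

```python
# A minimal sketch: binary mask -> polygon -> LabelMe-style JSON shape.
import json
import cv2
import numpy as np

mask = (np.load("frame_000001_mouse_mask.npy") > 0).astype(np.uint8)  # placeholder path

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
largest = max(contours, key=cv2.contourArea)
points = largest.squeeze(1).astype(float).tolist()  # [[x, y], ...]

shape = {"label": "mouse", "points": points, "group_id": None,
         "shape_type": "polygon", "flags": {}}
record = {"version": "5.0.1", "flags": {}, "shapes": [shape],
          "imagePath": "frame_000001.png", "imageData": None,
          "imageHeight": int(mask.shape[0]), "imageWidth": int(mask.shape[1])}

with open("frame_000001.json", "w") as f:
    json.dump(record, f, indent=2)
```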
### Segmentation + keypoint tracking

- Use mask tracking (polygons) for identity and body region.
- Use DINO-based keypoint tracking (see book/tutorials/DINOv3_keypoint_tracking.md) when you need body-part trajectories; a sketch for combining the two exports follows below.
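Once both exports exist, they can be joined offline so each keypoint row also carries the mask-based identity. File and column names here are assumptions; adjust them to match your exports.

```python
# A minimal sketch of combining a mask-tracking CSV (identity / body region)
# with a keypoint-tracking CSV (body-part trajectories). Names are placeholders.
import pandas as pd

masks = pd.read_csv("video1_mask_tracking.csv")    # frame_number, instance_name, cx, cy, ...
kpts = pd.read_csv("video1_dino_keypoints.csv")    # frame_number, instance_name, keypoint, x, y

# One row per frame/instance/keypoint, carrying both the mask centroid and the keypoint.
merged = kpts.merge(masks, on=["frame_number", "instance_name"], how="left",
                    suffixes=("_kpt", "_mask"))
merged.to_csv("video1_masks_plus_keypoints.csv", index=False)
```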
### Custom YOLO + Annolid review

- Train a YOLO segmentation/pose model on your own labels.
- Run inference (including real-time use cases), then review and correct the results in Annolid (see the sketch below).
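One common way to do this is with the Ultralytics API. The sketch below fine-tunes a segmentation checkpoint on labels exported in YOLO format and then runs inference on a video; the paths, checkpoint, and hyperparameters are placeholders, not recommended settings.

```python
# A minimal sketch using the Ultralytics API; paths and hyperparameters are placeholders.
from ultralytics import YOLO

# Fine-tune a segmentation checkpoint on labels exported from Annolid in YOLO format.
model = YOLO("yolov8n-seg.pt")
model.train(data="datasets/my_project/data.yaml", epochs=100, imgsz=640)

# Run inference on a video; the saved predictions can then be reviewed and
# corrected in Annolid.
results = model.predict(source="videos/session_01.mp4", save=True, conf=0.25)
```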
## Export strategy

To combine outputs cleanly across tools, pick one "source of truth" format:

- LabelMe JSON (best for review/editing in Annolid)
- CSV (best for analysis in Python/R)
- COCO / YOLO (best for training external models)
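If LabelMe JSON is your source of truth, a small script can flatten a folder of per-frame JSON files into one CSV for analysis. This is a sketch under the assumption that the annotations live in a folder named `annotations/`; the output columns are illustrative.

```python
# A minimal sketch: flatten per-frame LabelMe JSON files into one analysis CSV.
import glob
import json
import pandas as pd

rows = []
for path in sorted(glob.glob("annotations/*.json")):  # placeholder folder
    with open(path) as f:
        data = json.load(f)
    for shape in data.get("shapes", []):
        xs = [p[0] for p in shape["points"]]
        ys = [p[1] for p in shape["points"]]
        rows.append({"file": path, "label": shape["label"],
                     "shape_type": shape["shape_type"],
                     "cx": sum(xs) / len(xs), "cy": sum(ys) / len(ys)})

pd.DataFrame(rows).to_csv("annotations_flat.csv", index=False)
```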