Tracking Correction with SAM3 Agent and Annolid Bot¶
Use this workflow when a saved Annolid tracking NDJSON needs correction after missed detections, short occlusions, or likely identity switches.
The correction tool works on LabelMe-style Annolid NDJSON files. It can:
- fill empty frame records from a second tracking NDJSON,
- run the SAM3 agent first and use its output as the correction source,
- repair short temporal gaps and likely ID switches from the existing NDJSON,
- preserve manual annotations and unrelated shapes by default.
When to Use This¶
Use tracking correction when:
- an animal or object disappears for a few frames and then reappears,
- the tracker writes empty
shapesfor some frames, - identities switch after occlusion or close interaction,
- you want to merge better SAM3 agent results into an existing annotation file.
Do not treat this as a replacement for review. Long occlusions, crossings, and visually identical instances still require manual validation.
Inputs and Outputs¶
Required input:
ndjson_path: the target Annolid NDJSON file to correct.
Optional inputs:
source_ndjson_path: a second NDJSON file with better shapes for some frames.video_path: source video path, used when asking the tool to run SAM3 first.agent_prompt: text prompt for SAM3, such asmouseorfly.output_ndjson_path: where to write the corrected NDJSON.
Recommended output practice:
- Write to a new
output_ndjson_pathfirst. - Open the corrected file in Annolid and review the changed frames.
- Replace the original only after the corrected file is verified.
If output_ndjson_path is omitted, the tool writes in place.
Correction Modes¶
Mode 1: Fill Empty Frames from a Source NDJSON¶
Use this when you already have a second NDJSON from SAM3, Cutie, or another tracking pass.
Example Annolid Bot prompt:
Correct /data/session1/session1_annotations.ndjson using
/data/session1/sam3_predictions.ndjson.
Write /data/session1/session1_corrected.ndjson.
Only fill frames that currently have empty shapes.
Equivalent structured tool fields:
gui_correct_tracking_ndjson(
ndjson_path="/data/session1/session1_annotations.ndjson",
source_ndjson_path="/data/session1/sam3_predictions.ndjson",
output_ndjson_path="/data/session1/session1_corrected.ndjson",
replace_only_empty_shapes=true
)
Default behavior is conservative:
- frames with existing shapes are skipped,
- empty target frames are filled from the source,
- frame metadata such as
frame,frame_number,imagePath, width, and height is preserved when possible.
Mode 2: Run SAM3 Agent, Then Correct the NDJSON¶
Use this when you want Annolid Bot to run SAM3 agent tracking and merge the result into your target NDJSON.
Example prompt:
Run SAM3 Agent on /data/session1/video.mp4 for prompt "mouse",
then correct /data/session1/session1_annotations.ndjson.
Write /data/session1/session1_corrected.ndjson.
Only fill empty frames.
Equivalent structured tool fields:
gui_correct_tracking_ndjson(
ndjson_path="/data/session1/session1_annotations.ndjson",
video_path="/data/session1/video.mp4",
agent_prompt="mouse",
run_sam3_agent=true,
output_ndjson_path="/data/session1/session1_corrected.ndjson",
window_size=5,
stride=5
)
Use a dry SAM3 workflow first if you are still choosing output folders or tracking settings. See SAM3 Agent Video Tracking.
Mode 3: Repair Occlusions and ID Switches¶
Use temporal repair when the target NDJSON itself has enough stable frames to infer instance continuity.
Example prompt:
Repair occlusions and ID switches in
/data/session1/session1_annotations.ndjson after frame 500.
There should be 4 instances. Fill gaps up to 20 frames and use max match
distance 80 pixels. Write /data/session1/session1_temporal_repair.ndjson.
Equivalent structured tool fields:
gui_correct_tracking_ndjson(
ndjson_path="/data/session1/session1_annotations.ndjson",
output_ndjson_path="/data/session1/session1_temporal_repair.ndjson",
temporal_repair=true,
start_frame=500,
expected_instance_count=4,
max_gap_frames=20,
max_match_distance=80
)
Temporal repair uses centroid continuity. It chooses reference instances from
the first usable frame at or after start_frame, then:
- matches later shapes to the nearest previous track within
max_match_distance, - restores the canonical label, group ID, and track flags after likely ID switches,
- fills short missing tracks by interpolation when a later recovery exists,
- carries the previous shape when no later recovery exists inside the gap limit,
- marks generated or corrected shapes with correction flags.
If duplicate labels do not have usable track IDs, the repair assigns stable
synthetic IDs such as track_1 and track_2 in shape flags.
Important Options¶
| Option | Default | Use |
|---|---|---|
replace_only_empty_shapes |
true |
Only target records with empty shapes are corrected. Set to false when you intentionally want non-empty frames considered for update or temporal repair. |
replace_all_shapes |
false |
Source shapes are merged while preserving unrelated/manual target shapes. Set to true only when the source should replace every shape in a corrected frame. |
allow_append_new_frames |
false |
When true, source frames that do not exist in the target NDJSON are appended. Leave this off if the target file should define the authoritative frame set. |
temporal_repair |
false |
Enables occlusion filling and ID-switch correction. |
start_frame |
0 |
First frame to use for temporal repair. Set this to a stable frame after the expected instances are visible. |
expected_instance_count |
unset | Number of instances expected after start_frame. This helps the repair choose a stable reference frame and detect missing tracks. |
max_gap_frames |
5 |
Maximum gap length to fill. Keep this small for conservative correction. |
max_match_distance |
80 |
Maximum centroid distance, in pixels, for assigning a current shape to a previous track. Lower values reduce accidental identity changes; higher values tolerate faster movement. |
Reading the Result Summary¶
Annolid Bot returns a JSON summary. Important fields include:
output_ndjson_path: corrected file path.target_records: valid target records read.source_records: valid source records read.candidate_source_frames: source frames that had frame metadata.replaced_frames: target frames corrected from source.appended_frames: new frames appended from source.skipped_non_empty_frames: frames skipped because they already had shapes.target_invalid_lines: malformed target lines preserved but not parsed.source_invalid_lines: malformed source lines ignored.temporal_reference_instances: tracks used as temporal references.missing_shapes_filled: shapes created for short occlusion gaps.id_switches_corrected: shapes whose identity metadata was corrected.
If candidate_source_frames is 0, the source NDJSON does not contain usable
frame or frame_number metadata.
Review Checklist¶
After correction:
- Open the corrected NDJSON in Annolid.
- Jump to frames around each occlusion or interaction.
- Check corrected shapes with
annolid_correctionflags. - Verify that the instance count is correct after
start_frame. - Confirm that manual annotations or behavior labels were not unintentionally replaced.
Troubleshooting¶
| Problem | What to check |
|---|---|
| No source NDJSON was resolved. | Provide source_ndjson_path, or set run_sam3_agent=true with both video_path and agent_prompt, or set temporal_repair=true to repair the target NDJSON from itself. |
| The tool reports no candidate source frames. | The source records are missing frame or frame_number, or the file is not an Annolid tracking NDJSON. |
| Too many frames were skipped. | The default only fills empty frames. If you need to correct non-empty frames, set replace_only_empty_shapes=false and review the output carefully. |
| ID switches remain after repair. | Start from a more stable start_frame, lower or raise max_match_distance based on object speed, and verify that expected_instance_count matches the visible instances. |
| False ID corrections appear after crossings. | Reduce max_match_distance, reduce max_gap_frames, or split the repair into shorter ranges around stable intervals. Centroid-based repair is intentionally conservative but cannot prove identity through long ambiguous crossings. |