Agent and Automation¶
Use this section when you are configuring Annolid Bot, connecting external tools, or running repeatable workflows through the agent stack.
Start Here¶
- Agent CLI: Run Annolid-native CLI flows through the typed `annolid_run` path.
- MCP: Connect external tools and resources through Model Context Protocol servers.
- Codex and ACP: Bridge Annolid with Codex-style workflows and ACP-compatible runtime paths.
- Calendar: Schedule and coordinate tasks with Google Calendar-aware agent flows.
- Workspace and Secrets: Configure Google Workspace, local secret storage, and channel-safe credentials.
- Memory and Security: Manage retrieval-backed memory and harden agent behavior before scaling up.
Recommended Sequence¶
- Start with Annolid Agent and `annolid-run` for safe CLI execution.
- Configure MCP if you need external tools or browser/file bridges.
- Set up Google Workspace and Agent Secrets before enabling integrations.
- Review Agent Security and the Memory Subsystem before turning on broader automation.
What Lives Here¶
- typed agent tool execution,
- bot-assisted model discovery, training help, and background fine-tuning runs,
- MCP connectivity,
- Codex and ACP integration notes,
- calendar and workspace integrations,
- secret handling,
- memory-backed agent behavior,
- security and operational guardrails.
Bot Training Workflows¶
Annolid Bot now exposes typed training tools for model discovery and launch:
- `annolid_dataset_inspect` inspects a dataset folder, summarizes raw LabelMe annotations, detects external formats such as DeepLabCut, COCO, and YOLO folders, distinguishes between saved trainable specs and inferred dataset layouts, and recommends the next prep/training step.
- `annolid_dataset_prepare` prepares a dataset folder for training by generating a LabelMe spec with train/val/test splits, inferring and writing a reusable COCO spec, importing DeepLabCut training data into LabelMe plus an Annolid index, or exporting a YOLO dataset from raw LabelMe or COCO annotations.
- `annolid_train_models` lists trainable model families, aliases, and task hints.
- `annolid_train_help` returns plugin-specific training help such as `dino_kpseg` flags.
- `annolid_train_start` launches long-running `annolid-run train ...` jobs in the background and returns a managed shell `session_id` for follow-up polling through `exec_process`. It can now accept `dataset_folder`, auto-resolve saved dataset configs, and, for DinoKPSEG, stage an inferred COCO spec into the workspace cache when the folder is structurally valid but `coco_spec.yaml` has not been written yet.
This is intended for workflows such as:
- DINOv3-based keypoint segmentation fine-tuning with `dino_kpseg`
- Ultralytics YOLO pose fine-tuning with the `pose` task preset
- Ultralytics YOLO segmentation or detection runs with the matching task preset
Typical dataset-to-training flow:
- Inspect the folder with `annolid_dataset_inspect`
- If the folder only has inferred COCO structure or raw LabelMe annotations, run `annolid_dataset_prepare`
- Start training with `annolid_train_start` and pass `dataset_folder`
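The dataset-to-training flow above can be sketched as a sequence of tool-call payloads. Only the tool names, `dataset_folder`, and `mode="coco_spec"` come from this page; the folder path and the `model` argument are illustrative assumptions, not the actual tool schema.

```python
# Hypothetical tool-call payloads for the inspect -> prepare -> train flow.
# The folder path and the "model" argument are made up for illustration.
flow = [
    {"tool": "annolid_dataset_inspect",
     "args": {"dataset_folder": "/data/mouse_pose"}},
    {"tool": "annolid_dataset_prepare",
     "args": {"dataset_folder": "/data/mouse_pose", "mode": "coco_spec"}},
    {"tool": "annolid_train_start",
     "args": {"dataset_folder": "/data/mouse_pose", "model": "dino_kpseg"}},
]
```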
For external pose datasets, the first prep step can now stay inside the bot workflow. In particular, a DeepLabCut project with `labeled-data/**/CollectedData_*.csv` can be converted through `annolid_dataset_prepare(mode="deeplabcut_import")`, which writes LabelMe JSON sidecars, an optional pose schema, and an Annolid label index that can then feed the rest of the dataset/training pipeline. COCO folders can be turned into `coco_spec.yaml` with `mode="coco_spec"` or materialized into YOLO `data.yaml` datasets with `mode="coco_to_yolo"` when the target model family expects Ultralytics layout. If the target model is `dino_kpseg`, the bot can also stage an inferred COCO spec automatically under the workspace cache during `annolid_train_start`.
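A minimal sketch of the kind of layout detection described above, matching DeepLabCut's `labeled-data/**/CollectedData_*.csv` convention against a plain list of relative paths. This is an illustration under simplified assumptions; the real `annolid_dataset_inspect` is more thorough than a few filename checks.

```python
import fnmatch

def detect_layout(paths):
    """Classify a dataset folder from its relative file paths (illustrative only)."""
    # DeepLabCut projects keep per-video label CSVs under labeled-data/.
    if any(fnmatch.fnmatch(p, "labeled-data/*/CollectedData_*.csv") for p in paths):
        return "deeplabcut"
    # A saved trainable spec takes precedence over raw annotation files.
    if any(p.endswith("coco_spec.yaml") for p in paths):
        return "coco_spec"
    # Ultralytics-style YOLO datasets are described by a data.yaml file.
    if any(p.endswith("data.yaml") for p in paths):
        return "yolo"
    # Bare JSON sidecars are treated as raw LabelMe annotations here.
    if any(p.endswith(".json") for p in paths):
        return "labelme"
    return "unknown"
```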
The training launcher prefers the workspace `.venv` interpreter when present, so bot-initiated runs use the same dependency environment recommended for local validation.
Bot Evaluation Reports¶
Annolid Bot also exposes a typed evaluation reporting tool:
- `annolid_eval_start` launches supported evaluation jobs such as DinoKPSEG evaluation or YOLO validation and returns a managed shell session
- `annolid_eval_report` reads saved evaluation artifacts such as DinoKPSEG eval JSON, YOLO `results.csv` and `predictions.json`, or behavior-classifier `metrics.json`
    - it normalizes core metrics into a paper-style summary table
    - it can write JSON, Markdown, CSV, and LaTeX report files when a report directory is provided
Typical bot evaluation flow:
- Start the eval job with `annolid_eval_start`
- Poll the background session with `exec_process`
- Turn the produced artifacts into a paper-ready summary with `annolid_eval_report`
Supported eval launch families today:
- `dino_kpseg`
- `yolo`
- `behavior_classifier`
The report output is designed to support common ML reporting practice:
- explicit dataset/model/split metadata
- primary test metrics in a compact table
- confidence intervals when the source metrics support them
- explicit quality checks that flag missing CI coverage, weak sample size, or run-stability gaps
- artifact inventory for confusion matrices, curves, and raw metric files
- reproducibility notes so the bot reports from saved artifacts instead of handwritten numbers
For `yolo`, prefer launching evaluation with `save_json=true` so Ultralytics writes `predictions.json`. When the run directory also lets the report tool resolve a COCO-style annotation file from `args.yaml` and the dataset YAML, `annolid_eval_report` can add deterministic bootstrap confidence intervals for mAP@50 and mAP@50-95 instead of leaving those cells as NA.
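The deterministic bootstrap mentioned above can be illustrated with a small sketch: resample per-image scores with a fixed seed and take percentile bounds, so the same artifacts always yield the same interval. This is a generic percentile bootstrap, not Annolid's exact procedure, and the per-image mAP@50 values are synthetic.

```python
import random

def bootstrap_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Deterministic percentile-bootstrap CI for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed -> reproducible interval
    means = sorted(
        sum(rng.choices(values, k=len(values))) / len(values)
        for _ in range(n_boot)
    )
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic per-image mAP@50 scores, for illustration only:
scores = [0.91, 0.88, 0.95, 0.79, 0.85, 0.90, 0.93, 0.87]
lo, hi = bootstrap_ci(scores)
```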
For `behavior_classifier`, the eval launcher can now also write confusion-matrix and precision-recall curve figures directly during evaluation via `--plot-dir`, so the resulting run is closer to paper-ready without a separate plotting step.
Research Paper Swarm Launcher¶
The GUI now includes AI & Models → Draft Research Paper with Swarm…, which opens Annolid Bot with a prefilled prompt that invokes `draft_paper_swarm`.
- If a PDF is open, the launcher passes the current PDF title and path into the prompt so the swarm can ground the draft in the active paper.
- The prompt asks the bot to search literature first when needed, then draft a structured paper with outline, sections, and citations.
- This is a thin GUI entry point over the same swarm-backed paper drafting workflow used by the agent tools.
Citation Integrity Workflow¶
Annolid Bot and the Citation Manager now support citation verification reports for research-pipeline quality checks.
- `gui_save_citation` supports `verify_after_save=true` to emit a per-entry integrity result and report artifact.
- `gui_verify_citations` verifies an existing `.bib` file in batch and returns `verified`/`suspicious`/`hallucinated`/`skipped` counts plus an aggregate integrity score.
- Direct command example: `verify citations from refs.bib limit 200`
- The Citation Manager dialog now includes:
    - "Verify after save" for context saves
    - "Verify .bib" for batch verification of the selected BibTeX file
Report artifacts are written under `.annolid_cache/citation_verification/` next to the selected `.bib` file.
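One way to read the aggregate integrity score is as the verified share of checked entries. The formula below is an assumption for illustration only, not the tool's documented definition; skipped entries are excluded from the denominator.

```python
def integrity_score(verified, suspicious, hallucinated):
    """Hypothetical aggregate: fraction of checked (non-skipped) entries verified."""
    checked = verified + suspicious + hallucinated
    return verified / checked if checked else 0.0
```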
annolid_eval_report can optionally enforce citation-quality gates when generating paper-ready model reports:
- `citation_gate=true` enables citation checks
- `citation_report_path` points to a batch verification report JSON (or auto-discovery can be used)
- `citation_hallucinated_max` sets a hard fail threshold
- `citation_suspicious_rate_warn` and `citation_integrity_min_warn` set warning thresholds
- `citation_gate_required=true` upgrades missing citation reports from warn to fail
Novelty Preflight¶
For paper-drafting workflows, use `annolid_novelty_check` before writing claims.
- It scores lexical overlap between your proposed idea and related-work summaries.
- It reports coverage quality (`low`/`medium`/`high`) for the provided literature context.
- It returns a recommendation:
    - `proceed` when overlap is low and coverage is acceptable
    - `differentiate` when overlap is moderate or literature coverage is weak
    - `abort` when overlap crosses the configured high-risk threshold
You can provide related work inline (`related_work`) or via `related_work_json_path`.
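A lexical-overlap check of the sort described above can be sketched as token-set overlap between the idea and each related-work summary. The Jaccard measure and the `high_risk`/`low` thresholds here are illustrative assumptions, not the tool's actual scoring.

```python
def novelty_recommendation(idea, related_summaries, high_risk=0.6, low=0.3):
    """Toy lexical-overlap check: token Jaccard of the idea vs. related work."""
    idea_tokens = set(idea.lower().split())
    best = 0.0
    for summary in related_summaries:
        tokens = set(summary.lower().split())
        # Jaccard overlap between the idea and this summary.
        overlap = len(idea_tokens & tokens) / max(len(idea_tokens | tokens), 1)
        best = max(best, overlap)
    if best >= high_risk:
        return "abort"
    # Empty literature context counts as weak coverage here.
    if best >= low or not related_summaries:
        return "differentiate"
    return "proceed"
```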
Unified Paper-Run Report¶
To consolidate paper-drafting signals in one artifact, use `annolid_paper_run_report`.
- It keeps `annolid_eval_report` unchanged and composes a new additive report.
- It merges:
    - the model evaluation table from `annolid_eval_report`,
    - the citation verification batch summary,
    - the novelty preflight summary,
    - warnings and a reproducibility checklist.
- Inputs can be passed as an in-memory eval report object or as `eval_report_json_path`, plus optional `citation_report_path` and `novelty_report_path`.
- It returns a unified Markdown+JSON payload and can optionally write files when `report_dir` is provided with `allow_mutation=true`.
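The additive merge can be sketched as wrapping the untouched eval report together with the optional summaries; the field names in this sketch are assumptions, not the tool's actual payload schema.

```python
def compose_paper_run_report(eval_report, citation_summary=None, novelty_summary=None):
    """Illustrative additive merge: never mutates the eval report, only wraps it."""
    report = {"eval": eval_report, "warnings": []}
    if citation_summary is not None:
        report["citation_summary"] = citation_summary
    else:
        report["warnings"].append("no citation verification summary")
    if novelty_summary is not None:
        report["novelty_summary"] = novelty_summary
    else:
        report["warnings"].append("no novelty preflight summary")
    return report
```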
Paper-Ready Quality Gates¶
For export-time paper readiness, `annolid_paper_run_report` supports configurable gates:
- `paper_ready_gate=true` enables gating checks.
- `citation_integrity_floor` sets the minimum allowed citation integrity score (`0.0` to `1.0`).
- `novelty_coverage_floor` sets the minimum allowed novelty coverage score (`idea_token_coverage`, `0.0` to `1.0`).
- `require_citation_summary` and `require_novelty_summary` control whether missing summaries are treated as blocking failures.
When `paper_ready_gate=true` and a gate check fails, file export to `report_dir` is blocked and the tool returns `ok=false` with the assembled report payload for inspection.
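The gate logic above can be sketched as a pure check over the merged report. The floor defaults match the parameter names listed on this page, but the report field names and return shape are illustrative assumptions.

```python
def paper_ready_gate(report, citation_integrity_floor=0.9, novelty_coverage_floor=0.5,
                     require_citation_summary=True, require_novelty_summary=True):
    """Illustrative gate: returns (ok, failures) without writing any files."""
    failures = []
    citation = report.get("citation_summary")
    novelty = report.get("novelty_summary")
    if citation is None:
        if require_citation_summary:
            failures.append("missing citation summary")
    elif citation["integrity_score"] < citation_integrity_floor:
        failures.append("citation integrity below floor")
    if novelty is None:
        if require_novelty_summary:
            failures.append("missing novelty summary")
    elif novelty["idea_token_coverage"] < novelty_coverage_floor:
        failures.append("novelty coverage below floor")
    return (not failures, failures)
```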