Self-Improving Documentation QA: Link and Example Validation
Create an agent that validates documentation examples by executing them in a sandbox, checks links, and verifies that version references are correct. The agent should propose fixes with supporting evidence.
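A minimal sketch of the example-execution step, assuming the docs are Markdown with fenced Python examples. The names `extract_examples` and `run_example` are hypothetical, and a child process with a timeout stands in for a real sandbox.

```python
import re
import subprocess
import sys

FENCE = "`" * 3  # built at runtime so this sketch can itself sit inside a fence
EXAMPLE_RE = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_examples(markdown: str) -> list:
    """Return the source of every fenced Python example in a docs page."""
    return [m.group(1) for m in EXAMPLE_RE.finditer(markdown)]


def run_example(code: str, timeout_s: float = 5.0) -> dict:
    """Run one example in a child interpreter and keep its output as evidence.

    Assumption: a child process with a timeout is NOT real isolation; a
    production agent would wrap this call in a container or VM.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],
            capture_output=True, text=True, timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "evidence": "timed out after %.1fs" % timeout_s}
    evidence = (proc.stderr or proc.stdout).strip()
    return {"ok": proc.returncode == 0, "evidence": evidence}
```

The `evidence` field carries the captured output or traceback, so a proposed fix can quote exactly what failed.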
Safe Self-Improvement for Authentication/Authorization Code
Create a strict policy for edits to auth code: mandatory human review, extra tests, formal checks, and staged rollout. Include “never auto-merge auth changes.”
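One way the policy could be expressed as a merge gate, sketched under assumptions: the path patterns in `AUTH_PATTERNS`, the `Change` fields, and the decision strings are all hypothetical. The point it illustrates is that no branch ever returns an auto-merge result for an auth change.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

# Hypothetical patterns marking auth code; a real policy would list its own.
AUTH_PATTERNS = ("auth/*", "*/auth/*", "*/authz/*")


@dataclass(frozen=True)
class Change:
    paths: tuple
    human_approvals: int
    extra_tests_passed: bool
    formal_checks_passed: bool


def touches_auth(change: Change) -> bool:
    return any(fnmatch(p, pat) for p in change.paths for pat in AUTH_PATTERNS)


def merge_decision(change: Change) -> str:
    if not touches_auth(change):
        return "auto-merge-eligible"
    missing = []
    if change.human_approvals < 1:
        missing.append("human review")
    if not change.extra_tests_passed:
        missing.append("extra tests")
    if not change.formal_checks_passed:
        missing.append("formal checks")
    if missing:
        return "blocked: " + ", ".join(missing)
    # Even a fully approved auth change is merged by a person and rolled out in stages.
    return "manual-merge-staged-rollout"
```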
Design an agent that tunes security scanners (SAST rules, allowlists) based on confirmed findings and false positives. Require approvals for any rule weakening.
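The approval requirement depends on being able to tell when a rule change is a weakening. A rough sketch follows, assuming each rule is a dict with `enabled`, `severity`, and `allowlist` keys; that schema is an illustration, not any particular scanner's format.

```python
# Hypothetical severity ordering used to detect downgrades.
SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def weakening_reasons(old: dict, new: dict) -> list:
    """List the ways `new` is weaker than `old`; an empty list means no weakening."""
    reasons = []
    if old["enabled"] and not new["enabled"]:
        reasons.append("rule disabled")
    if SEVERITY_RANK[new["severity"]] < SEVERITY_RANK[old["severity"]]:
        reasons.append("severity lowered")
    added = set(new["allowlist"]) - set(old["allowlist"])
    if added:
        reasons.append("allowlist expanded: " + ", ".join(sorted(added)))
    return reasons


def can_apply(old: dict, new: dict, approvals: int) -> bool:
    """Tightening changes apply freely; any weakening needs at least one approval."""
    return not weakening_reasons(old, new) or approvals >= 1
```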
Design a multi-repo agent that edits only the repos it is authorized for, respects CODEOWNERS, and applies per-repo policies. Include rules for coordinating cross-repo dependency changes.
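The authorization check and the dependency coordination rule might be combined as below. This is a sketch under assumptions: the repo names, the `AUTHORIZED` set, and the `DEPENDS_ON` map are hypothetical, and CODEOWNERS and per-repo policy checks are left out for brevity.

```python
from graphlib import TopologicalSorter

# Hypothetical configuration: repos the agent may edit, and what each depends on.
AUTHORIZED = {"org/lib", "org/app"}
DEPENDS_ON = {"org/app": {"org/lib"}, "org/lib": set()}


def plan_edits(requested: list) -> list:
    """Reject unauthorized repos, then order edits so dependencies land first."""
    unauthorized = sorted(set(requested) - AUTHORIZED)
    if unauthorized:
        raise PermissionError("not authorized for: " + ", ".join(unauthorized))
    # static_order() yields each repo only after all of its dependencies.
    order = TopologicalSorter(DEPENDS_ON).static_order()
    return [repo for repo in order if repo in requested]
```

Failing the whole plan on a single unauthorized repo, rather than silently dropping it, keeps a cross-repo change from being applied partially.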
Self-Improving Agent Evals: Add New Tests From Failures
Create a loop where production failures and near-misses become new eval tests. The agent should propose test additions with minimal reproductions and acceptance criteria.
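A minimal sketch of turning an incident into a proposed eval case, assuming incidents arrive as dicts with `kind`, `minimal_input`, and `expected_behavior` keys (a hypothetical schema). Proposals start in a `proposed` state so a person accepts them before they join the suite.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvalProposal:
    case_id: str
    source: str                 # "failure" or "near-miss"
    reproduction: str           # minimal input that triggers the problem
    acceptance_criteria: tuple  # conditions a passing run must satisfy
    status: str = "proposed"    # never added to the suite without review


def propose_eval(incident: dict, existing_ids: set) -> Optional[EvalProposal]:
    """Build a proposal, or return None if this reproduction is already covered."""
    repro = incident["minimal_input"].strip()
    # Hashing the reproduction gives a stable id for de-duplication.
    case_id = hashlib.sha256(repro.encode("utf-8")).hexdigest()[:12]
    if case_id in existing_ids:
        return None
    return EvalProposal(
        case_id=case_id,
        source=incident["kind"],
        reproduction=repro,
        acceptance_criteria=(incident["expected_behavior"],),
    )
```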
Design a process to adjust safety filters based on measured false positive/negative rates. Require evaluation sets, human review, and rollback if harm risk rises.
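The measurement and rollback rule can be sketched as follows, assuming a labeled evaluation set where `True` means harmful and a filter output where `True` means blocked. The function names, the decision strings, and the `max_fn_increase` tolerance are hypothetical.

```python
def rates(labels: list, flagged: list) -> tuple:
    """Return (false_positive_rate, false_negative_rate) on a labeled eval set."""
    fp = sum(1 for y, f in zip(labels, flagged) if f and not y)
    fn = sum(1 for y, f in zip(labels, flagged) if y and not f)
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def decide(current: tuple, candidate: tuple, max_fn_increase: float = 0.0) -> str:
    """Compare a candidate filter against the current one."""
    cur_fpr, cur_fnr = current
    cand_fpr, cand_fnr = candidate
    # More missed harmful items means higher harm risk: roll back.
    if cand_fnr > cur_fnr + max_fn_increase:
        return "rollback"
    # Fewer false positives without more misses still goes to a human first.
    if cand_fpr < cur_fpr:
        return "send-to-human-review"
    return "keep-current"
```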
Agent Governance: Approvals, Logs, and Periodic Audits
Design governance for self-improving systems: approval rules, quarterly audits, access reviews, and incident drills. Include “who can change the agent” controls.
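As an illustration of the "who can change the agent" control and an auditable log, here is a sketch that assumes a fixed editor list and a two-person rule. The `AGENT_EDITORS` names and the entry fields are hypothetical; the hash chain is one common way to make tampering detectable during an audit.

```python
import hashlib
import json

AGENT_EDITORS = {"alice", "bob"}  # hypothetical list of people allowed to change the agent
GENESIS = "0" * 64


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, actor: str, action: str, approved_by: str) -> None:
    """Record a change only if the actor is an editor and a second editor approved."""
    if actor not in AGENT_EDITORS:
        raise PermissionError(actor + " may not change the agent")
    if approved_by == actor or approved_by not in AGENT_EDITORS:
        raise PermissionError("change needs approval from a second editor")
    prev = log[-1]["hash"] if log else GENESIS
    body = {"actor": actor, "action": action, "approved_by": approved_by, "prev": prev}
    log.append({**body, "hash": _digest(body)})


def verify(log: list) -> bool:
    """Audit check: every entry must hash correctly and link to its predecessor."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```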