/autoresearch
Autoresearch
Self-improving skill optimization through scored experiments
Overview
Runs automated experiment loops against any skill's SKILL.md, testing hypotheses, scoring outputs with pluggable judges, and committing improvements or reverting regressions. Based on Karpathy's autoresearch pattern, adapted for prompt optimization.
What It Does
- Forms testable hypotheses from failure modes and applies surgical edits to SKILL.md files
- Generates skill outputs against a test corpus and scores with anti-slop, structural, and custom judges
- Commits improvements that pass threshold and reverts regressions automatically via git
- Detects plateaus through consecutive revert counting and rotates across hypothesis categories
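The loop described in the bullets above can be sketched as follows. This is a minimal illustration, not the skill's real implementation: the hypothesis-edit callables, judge signature `(skill_text, case) -> float`, and the `max_reverts` plateau threshold are all assumptions for the sketch.

```python
def score_skill(skill_text, corpus, judges):
    """Average every judge's score over every test case."""
    scores = [judge(skill_text, case) for case in corpus for judge in judges]
    return sum(scores) / len(scores)

def autoresearch(skill_text, hypotheses, corpus, judges, max_reverts=3):
    """Apply each hypothesis edit; keep it only if the score improves.

    Commits improvements, reverts regressions, and stops early when too
    many consecutive reverts signal a plateau (all names hypothetical).
    """
    best = score_skill(skill_text, corpus, judges)
    consecutive_reverts = 0
    log = []
    for edit in hypotheses:
        candidate = edit(skill_text)            # surgical edit to the skill text
        score = score_skill(candidate, corpus, judges)
        if score > best:                        # improvement: commit the edit
            skill_text, best = candidate, score
            consecutive_reverts = 0
        else:                                   # regression: revert the edit
            consecutive_reverts += 1
        log.append(score)
        if consecutive_reverts >= max_reverts:  # plateau detected
            break
    return skill_text, best, log
```

In the real skill the commit/revert step is backed by git rather than in-memory state, but the control flow is the same: only edits that raise the judged score survive.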
Inputs
- Skill name
- Test corpus
- Judge configuration
- Iteration count
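A judge configuration might look like the sketch below. The field names (`threshold`, `weight`, `rubric`, and so on) are illustrative guesses, not the skill's documented schema.

```python
# Hypothetical judge configuration; every key name here is an assumption.
judge_config = {
    "skill": "produce-content",
    "iterations": 5,
    "threshold": 0.5,  # minimum score delta required to commit an edit
    "judges": [
        {"name": "anti_slop", "weight": 1.0},
        {"name": "structural", "weight": 1.0},
        {"name": "custom_rubric", "weight": 0.5, "rubric": "rubric.md"},
    ],
    "corpus": ["tests/case_01.md", "tests/case_02.md"],
}
```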
Outputs
- Modified SKILL.md (on experiment branch)
- Experiment log with per-iteration scores
Example
Run /autoresearch produce-content --iterations 5. The loop identifies that the skill scores low on structural completeness, adds a required-sections checklist to SKILL.md, re-scores, and commits the +0.8 improvement. Two other hypotheses regress and are reverted.
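The commit-or-revert step in this example can be sketched with plain git commands. The helper name and arguments are assumptions for illustration; only standard `git add`, `git commit`, and `git checkout -- <path>` invocations are used.

```python
import subprocess

def commit_or_revert(repo, path, improved, message):
    """Commit the edited file if the score improved, else restore it (sketch)."""
    if improved:
        subprocess.run(["git", "-C", repo, "add", path], check=True)
        subprocess.run(["git", "-C", repo, "commit", "-q", "-m", message], check=True)
    else:
        # Discard the working-tree edit, restoring the last committed version.
        subprocess.run(["git", "-C", repo, "checkout", "--", path], check=True)
```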
Related Skills & Workflows
- Eval Loop (skill): Trace quality gaps to root causes, fix iteratively
- Prompt Creator (skill): Systematic design for production-quality AI prompts
- Codex Review (skill): Cross-model adversarial review via OpenAI Codex
- Content Calendar Builder (workflow): Quarterly content calendar grounded in keyword data and ICP pain points
- Event Marketing Playbook (workflow): Pre/during/post event execution plan with follow-up sequences
- SEO Content Brief (workflow): Keyword-grounded content briefs with SERP analysis in 6 minutes
Ready to use /autoresearch?
This skill ships with every Knowledge OS installation. Set up your system in 90 minutes.
Built and maintained by Victor Sowers at STEEPWORKS