/autoresearch
Autoresearch
Self-improving skill optimization through scored experiments
Overview
Runs automated experiment loops against any skill's SKILL.md, testing hypotheses, scoring outputs with pluggable judges, and committing improvements or reverting regressions. Based on Karpathy's autoresearch pattern, adapted for prompt optimization.
What It Does
- Forms testable hypotheses from failure modes and applies surgical edits to SKILL.md files
- Generates skill outputs against a test corpus and scores with anti-slop, structural, and custom judges
- Commits improvements that pass threshold and reverts regressions automatically via git
- Detects plateaus through consecutive revert counting and rotates across hypothesis categories
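The loop described in the bullets above can be sketched as follows. This is a minimal illustration, not the skill's real implementation: the hypothesis-edit callables, judge signature `(skill_text, case) -> float`, and the `max_reverts` plateau threshold are all assumptions for the sketch.

```python
def score_skill(skill_text, corpus, judges):
    """Average every judge's score over every test case."""
    scores = [judge(skill_text, case) for case in corpus for judge in judges]
    return sum(scores) / len(scores)

def autoresearch(skill_text, hypotheses, corpus, judges, max_reverts=3):
    """Apply each hypothesis edit; keep it only if the score improves.

    Commits improvements, reverts regressions, and stops early when too
    many consecutive reverts signal a plateau (all names hypothetical).
    """
    best = score_skill(skill_text, corpus, judges)
    consecutive_reverts = 0
    log = []
    for edit in hypotheses:
        candidate = edit(skill_text)            # surgical edit to the skill text
        score = score_skill(candidate, corpus, judges)
        if score > best:                        # improvement: commit the edit
            skill_text, best = candidate, score
            consecutive_reverts = 0
        else:                                   # regression: revert the edit
            consecutive_reverts += 1
        log.append(score)
        if consecutive_reverts >= max_reverts:  # plateau detected
            break
    return skill_text, best, log
```

In the real skill the commit/revert step is backed by git rather than in-memory state, but the control flow is the same: only edits that raise the judged score survive.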
Inputs
- Skill name
- Test corpus
- Judge configuration
- Iteration count
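A judge configuration might look like the sketch below. The field names (`threshold`, `weight`, `rubric`, and so on) are illustrative guesses, not the skill's documented schema.

```python
# Hypothetical judge configuration; every key name here is an assumption.
judge_config = {
    "skill": "produce-content",
    "iterations": 5,
    "threshold": 0.5,  # minimum score delta required to commit an edit
    "judges": [
        {"name": "anti_slop", "weight": 1.0},
        {"name": "structural", "weight": 1.0},
        {"name": "custom_rubric", "weight": 0.5, "rubric": "rubric.md"},
    ],
    "corpus": ["tests/case_01.md", "tests/case_02.md"],
}
```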
Outputs
- Modified SKILL.md (on experiment branch)
- Experiment log with per-iteration scores
Example
Run /autoresearch produce-content --iterations 5. The loop identifies that the skill scores low on structural completeness, adds a required-sections checklist to SKILL.md, re-scores, and commits the +0.8 improvement. Two other hypotheses regress and are reverted.
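The commit-or-revert step in this example can be sketched with plain git commands. The helper name and arguments are assumptions for illustration; only standard `git add`, `git commit`, and `git checkout -- <path>` invocations are used.

```python
import subprocess

def commit_or_revert(repo, path, improved, message):
    """Commit the edited file if the score improved, else restore it (sketch)."""
    if improved:
        subprocess.run(["git", "-C", repo, "add", path], check=True)
        subprocess.run(["git", "-C", repo, "commit", "-q", "-m", message], check=True)
    else:
        # Discard the working-tree edit, restoring the last committed version.
        subprocess.run(["git", "-C", repo, "checkout", "--", path], check=True)
```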
Related Skills & Workflows
- Eval Loop (skill): Trace quality gaps to root causes, fix iteratively
- Prompt Creator (skill): Systematic design for production-quality AI prompts
- Codex Review (skill): Cross-model adversarial review via OpenAI Codex
- Content Calendar Builder (workflow): Quarterly content calendar grounded in keyword data and ICP pain points
- Event Marketing Playbook (workflow): Pre/during/post event execution plan with follow-up sequences
- SEO Content Brief (workflow): Keyword-grounded content briefs with SERP analysis in 6 minutes
Ready to use /autoresearch?
This skill ships with every Knowledge OS installation. Set up your system in 90 minutes.
Built and maintained by Victor Sowers at STEEPWORKS