These roundups are one of the few places where the agent-reliability papers and the evaluation-methodology papers show up side by side, which I appreciate. The "Natural-Language Agent Harnesses" and "Meta-Harness" pairing is especially interesting to me: one pushes specification up into prose, the other tries to automate the scaffolding that turns prose into a testable loop. The unresolved question for me is whether natural-language harnesses actually reduce the spec surface or just relocate ambiguity from code into English. I'd love to see a future issue pair these with a paper that measures how often harness authors and model outputs disagree on what the spec *meant*, since that gap is where most verification work ends up living in practice.
Strong snapshot of where the field actually is right now
https://knowledgenetworks.substack.com/p/david-cross-explains-ai-native-thinking