Discussion about this post

Jenny Ouyang

The instruction-following paradox here is wild. Agents do exactly what you tell them to do but that doesn't mean they solve the problem faster.

I've noticed this with my own AGENTS.md files. Some agents spend more time checking boxes than figuring out where the actual bug lives, and others try to complete every step of the prescribed agenda without judging whether a step is necessary.

And yes to the finding about LLM-generated files just restating README content. I think I've been guilty of that. Really need to keep it minimal and specific.

Thanks for sharing this!

JP

The anchoring effect surprised me the most out of all of this. Agents that saw uv mentioned in context files used it 160x more, whether it was the right tool or not. I dug into all three failure mechanisms and the practical fixes too: https://sulat.com/p/agents-md-hurting-you
