🤖 AI Agents Weekly: Evaluating AGENTS.md, Perplexity Computer, Nano Banana 2, Doc-to-LoRA, Hermes Agent, Mercury 2, and More
In today’s issue:
AGENTS.md files hurt coding agent performance
Perplexity launches Computer for end-to-end tasks
Google launches Nano Banana 2 for free
Sakana AI ships Doc-to-LoRA and Text-to-LoRA
Notion launches Custom Agents in 3.3
Nous Research releases Hermes Agent open source
GPT-5.3-Codex available for all developers
Mercury 2 ships reasoning diffusion LLM
Qwen 3.5 medium model series drops
Claude Code ships auto-memory across sessions
RoguePilot exposes GitHub Copilot vulnerability
Vercel open-sources Chat SDK for multi-platform bots
And all the top AI dev news, papers, and tools.
Top Stories
Evaluating AGENTS.md: Are Context Files Helpful for Coding Agents?
Researchers from UIUC and Microsoft Research evaluated whether repository-level context files like AGENTS.md actually improve coding agent performance. The counterintuitive finding: context files reduce task success rates compared to providing no context at all, while increasing inference costs by over 20%.
Lower success rates: Agents given either LLM-generated or human-written context files solved fewer SWE-bench tasks than agents given no repository context, challenging the widely adopted practice of writing detailed agent instructions.
Broader but less effective exploration: Context files prompted agents to explore more thoroughly, including more testing and file traversal, but the additional constraints made tasks harder rather than easier.
Minimal is better: The authors recommend that context files describe only minimal requirements rather than comprehensive specifications, as unnecessary constraints actively hurt agent performance.
Practical implications: The findings suggest developers should rethink how they structure AGENTS.md, CLAUDE.md, and similar context files, focusing on essential guardrails rather than exhaustive instructions.

