AI Newsletter

🤖 AI Agents Weekly: Evaluating AGENTS.md, Perplexity Computer, Nano Banana 2, Doc-to-LoRA, Hermes Agent, Mercury 2, and More

Feb 28, 2026

In today’s issue:

  • AGENTS.md files hurt coding agent performance

  • Perplexity launches Computer for end-to-end tasks

  • Google launches Nano Banana 2 for free

  • Sakana AI ships Doc-to-LoRA and Text-to-LoRA

  • Notion launches Custom Agents in 3.3

  • Nous Research releases Hermes Agent open source

  • GPT-5.3-Codex available for all developers

  • Mercury 2 ships reasoning diffusion LLM

  • Qwen 3.5 medium model series drops

  • Claude Code ships auto-memory across sessions

  • RoguePilot exposes GitHub Copilot vulnerability

  • Vercel open-sources Chat SDK for multi-platform bots

And all the top AI dev news, papers, and tools.



Top Stories

Evaluating AGENTS.md: Are Context Files Helpful for Coding Agents?

Researchers from UIUC and Microsoft Research evaluated whether repository-level context files like AGENTS.md actually improve coding agent performance. The counterintuitive finding: context files reduce task success rates compared to providing no context at all, while increasing inference costs by over 20%.

  • Lower success rates: Both LLM-generated and human-written context files caused agents to solve fewer tasks on SWE-bench compared to agents given no repository context, challenging the widely adopted practice of writing detailed agent instructions.

  • Broader but less effective exploration: Context files prompted agents to explore more thoroughly, running more tests and traversing more files, but the additional constraints the files imposed made tasks harder rather than easier.

  • Minimal is better: The authors recommend that context files describe only minimal requirements rather than comprehensive specifications, as unnecessary constraints actively hurt agent performance.

  • Practical implications: The findings suggest developers should rethink how they structure AGENTS.md, CLAUDE.md, and similar context files, focusing on essential guardrails rather than exhaustive instructions.

Paper

© 2026 elvis