🤖AI Agents Weekly: CodeScientist, Nova Act, Awesome MCP Servers, AWS MCP Servers
In today’s issue:
AI2 releases CodeScientist
Letta AI releases open format for stateful agents
OpenHands LM is a new 32B open-source, local coding agent
DAIR.AI shared a new guide on getting started with MCP
AWS releases AWS MCP Servers
Nova Act: Amazon’s browser-native agent model
HuggingFace releases YourBench
Evaluating AI Agents on replicating AI research
Awesome MCP Servers is a curated list of MCP servers
AI dev news and much more
Top Stories
CodeScientist
Researchers at AI2 release CodeScientist, a system that autonomously generates and tests scientific hypotheses via code-based experimentation. It’s among the first to produce validated discoveries with minimal human input. Key ideas:
Code-first scientific agent – CodeScientist reviews research papers and assembles experiments from vetted Python code blocks (e.g., for analysis or simulation), following a five-step pipeline: Ideation → Planning → Code Execution → Reporting → Meta-Analysis (a rough sketch of this loop appears after the list below).
Validated AI discoveries – From 50 AI research papers on agents and virtual environments, CodeScientist proposed 19 findings. Of these, 6 were judged scientifically sound and novel. Examples:
Confidence ≠ Accuracy – LLM self-assessed confidence in simulations often mismatched actual accuracy.
Simpler state = better prediction – Representing simulation state as binary values rather than free text improved model reliability.
Graph memory helps – Agents with graph-structured memory outperformed baselines in a scientific simulation game.
Human-guided autonomy – Full automation is possible, but brief human feedback (e.g., ranking ideas) significantly boosts output quality. Human-in-the-loop interaction improves idea selection and experiment debugging.
Challenges remain – Despite these successes, over half of the generated experiments failed due to code errors rather than flawed science. Peer review is still needed to verify results, and current systems lack deep methodological rigor.
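
To make the five-step pipeline concrete, here is a minimal, hypothetical Python sketch of how such a loop could be wired together. All names (ideate, plan, execute, report, meta_analyze, human_rank) are illustrative placeholders rather than the actual CodeScientist API, and every stage is stubbed instead of calling an LLM or running real experiment code.

```python
# Hypothetical sketch of a CodeScientist-style loop.
# Names and stage bodies are illustrative, not the AI2 implementation.
from dataclasses import dataclass


@dataclass
class Idea:
    hypothesis: str
    rank: int = 0  # optionally set by a human reviewer


@dataclass
class ExperimentResult:
    idea: Idea
    succeeded: bool
    report: str = ""


def ideate(papers: list[str]) -> list[Idea]:
    """Stage 1 (Ideation): propose hypotheses from a corpus of papers (stubbed)."""
    return [Idea(hypothesis=f"Does the technique in '{p}' generalize?") for p in papers]


def human_rank(ideas: list[Idea], keep: int = 3) -> list[Idea]:
    """Optional human-in-the-loop step: rank ideas and keep the best few.

    Here the 'ranking' is just list order; a real system would collect
    reviewer scores before filtering."""
    for i, idea in enumerate(ideas):
        idea.rank = len(ideas) - i
    return sorted(ideas, key=lambda x: -x.rank)[:keep]


def plan(idea: Idea) -> list[str]:
    """Stage 2 (Planning): assemble an experiment from vetted code blocks (stubbed)."""
    return ["load_data", "run_simulation", "analyze_results"]


def execute(steps: list[str], idea: Idea) -> ExperimentResult:
    """Stage 3 (Code Execution): run the generated experiment (stubbed as succeeding)."""
    return ExperimentResult(idea=idea, succeeded=True, report=f"ran {len(steps)} steps")


def report(result: ExperimentResult) -> str:
    """Stage 4 (Reporting): write up the outcome of one experiment."""
    status = "ok" if result.succeeded else "failed"
    return f"[{status}] {result.idea.hypothesis}: {result.report}"


def meta_analyze(reports: list[str]) -> str:
    """Stage 5 (Meta-Analysis): summarize across all experiments."""
    return f"{len(reports)} experiments completed"


def run_pipeline(papers: list[str]) -> str:
    ideas = human_rank(ideate(papers))               # Ideation (+ optional human ranking)
    results = [execute(plan(i), i) for i in ideas]   # Planning -> Code Execution
    return meta_analyze([report(r) for r in results])  # Reporting -> Meta-Analysis


if __name__ == "__main__":
    print(run_pipeline(["paper A", "paper B", "paper C", "paper D"]))
```

The human_rank hook mirrors the human-guided autonomy point above: a brief ranking pass over proposed ideas, before any code is generated or run, is where light human feedback can lift output quality.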

