🤖 AI Agents Weekly: CodeScientist, Nova Act, Awesome MCP Servers, AWS MCP Servers
In today’s issue:
AI2 releases CodeScientist
Letta AI releases open format for stateful agents
OpenHands LM is a new 32B open-source, local coding agent
DAIR.AI shared a new guide on getting started with MCP
AWS releases AWS MCP Servers
Nova Act: Amazon’s browser-native agent model
HuggingFace releases YourBench
Evaluating AI Agents on replicating AI research
Awesome MCP Servers is a curated list of MCP servers
AI dev news and much more
Top Stories
CodeScientist
Researchers at AI2 release CodeScientist, a system that autonomously generates and tests scientific hypotheses via code-based experimentation. It’s among the first to produce validated discoveries with minimal human input. Key ideas:
Code-first scientific agent – CodeScientist reviews research papers and assembles experiments using vetted Python code blocks (e.g., for analysis, simulation). It follows a five-step pipeline: Ideation → Planning → Code Execution → Reporting → Meta-Analysis.
Validated AI discoveries – From 50 AI research papers on agents and virtual environments, CodeScientist proposed 19 findings. Of these, 6 were judged scientifically sound and novel. Examples:
Confidence ≠ Accuracy – An LLM's self-assessed confidence in its simulation predictions often did not match its actual accuracy.
Simpler state = better prediction – Representing state as binary values rather than free text improved model reliability.
Graph memory helps – Agents with graph-structured memory outperformed baselines in a scientific simulation game.
Human-guided autonomy – Full automation is possible, but brief human feedback (e.g., ranking ideas) significantly boosts output quality. Human-in-the-loop interaction improves idea selection and experiment debugging.
Challenges remain – Despite successes, over half the generated experiments fail due to code errors, not scientific flaws. Peer review is still needed to verify results, and current systems lack deep methodological rigor.
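To make the five-step pipeline above more concrete, here is a minimal Python sketch of how the stages could be chained. It is not CodeScientist's actual code: every name in it (`Experiment`, `ideate`, `run_pipeline`, `rank_ideas`, `flaky_runner`) is hypothetical, and the LLM calls are replaced with placeholder string functions. It illustrates only two points from the summary: an optional human ranking step between ideation and planning, and code failures being recorded separately from scientific outcomes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Experiment:
    idea: str
    plan: str = ""
    result: Optional[str] = None  # output of a successful run
    error: Optional[str] = None   # set when the generated code itself fails


def ideate(papers: list[str]) -> list[str]:
    # Placeholder: a real system would prompt an LLM with the paper text.
    return [f"Test a claim from: {p}" for p in papers]


def plan(idea: str) -> str:
    # Placeholder for turning an idea into an experiment plan.
    return f"Plan for '{idea}'"


def execute(exp: Experiment, run_code: Callable[[str], str]) -> Experiment:
    # Catch code errors so they are logged as engineering failures,
    # not mistaken for negative scientific results.
    try:
        exp.result = run_code(exp.plan)
    except Exception as exc:
        exp.error = f"{type(exc).__name__}: {exc}"
    return exp


def report(exp: Experiment) -> str:
    status = "failed (code error)" if exp.error else "completed"
    return f"{exp.idea}: {status}"


def meta_analyze(experiments: list[Experiment]) -> dict[str, int]:
    failed = sum(1 for e in experiments if e.error)
    return {
        "total": len(experiments),
        "code_failures": failed,
        "completed": len(experiments) - failed,
    }


def run_pipeline(
    papers: list[str],
    run_code: Callable[[str], str],
    rank_ideas: Optional[Callable[[list[str]], list[str]]] = None,
) -> tuple[list[str], dict[str, int]]:
    ideas = ideate(papers)                      # 1. Ideation
    if rank_ideas is not None:                  # optional human-in-the-loop step
        ideas = rank_ideas(ideas)
    experiments = [Experiment(idea=i, plan=plan(i)) for i in ideas]  # 2. Planning
    experiments = [execute(e, run_code) for e in experiments]        # 3. Code Execution
    reports = [report(e) for e in experiments]                       # 4. Reporting
    return reports, meta_analyze(experiments)                        # 5. Meta-Analysis


def flaky_runner(plan_text: str) -> str:
    # Hypothetical runner that simulates a bug in one generated experiment.
    if "paper B" in plan_text:
        raise RuntimeError("simulated bug in generated code")
    return "ok"


reports, summary = run_pipeline(["paper A", "paper B"], flaky_runner, rank_ideas=sorted)
print(reports)
print(summary)
```

Here `rank_ideas=sorted` simply stands in for a person reordering the candidate ideas. Keeping `error` separate from `result` lets the meta-analysis step count experiments that broke because of code errors, which the summary above says is the fate of more than half of them.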