🤖 AI Agents Weekly: Context Engineering 2.0, Kimi K2 Thinking, Windsurf Codemaps, Google File Search, Tool-to-Agent Retrieval
In today’s issue:
Context Engineering 2.0 reframes human-machine communication
Kimi K2 Thinking beats GPT-5 and Claude Sonnet 4.5
Code execution cuts MCP token usage by 98.7%
Denario AI agents conduct end-to-end scientific research
Windsurf launches AI-generated Codemaps
Cursor ships semantic search trained on agent traces
MiniMax-M2 interleaved thinking guide boosts agentic performance
OpenAI releases culture-aware IndQA benchmark
Top AI dev news, tool updates, and more
Top Stories
Context Engineering 2.0
Researchers from SJTU, SII, and GAIR trace the 20+ year evolution of context engineering, reframing it as a fundamental challenge in human-machine communication that spans from primitive computing (Era 1.0) to today’s intelligent agents (Era 2.0) and beyond. The paper provides a systematic definition, historical analysis, and design framework for building context-aware AI systems.
Defines four evolutionary stages based on machine intelligence: Context 1.0 (primitive computing with structured inputs), 2.0 (intelligent agents with natural language), 3.0 (human-level intelligence), and 4.0 (superhuman intelligence), with each stage reducing human-AI interaction cost.
Frames context engineering as entropy reduction: humans must preprocess high-entropy contexts into low-entropy representations that machines can understand, and this gap between human and machine understanding narrows as machine intelligence increases.
Lays out design considerations in three areas: context collection (multimodal sensors, layered storage), management (text/multimodal processing, hierarchical memory, self-baking abstractions), and usage (intra-system sharing, cross-system protocols, proactive inference).
Examines practical implementations in CLI tools (Gemini CLI with GEMINI.md files), deep research agents (Tongyi DeepResearch with periodic summarization), and emerging practices like KV caching optimization and tool design strategies.
Identifies open challenges: limited context collection methods, large-scale storage bottlenecks, processing degradation at scale, system instability with lifelong memory, and the need for a semantic operating system that actively manages context like human cognition.
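To make the layered storage and periodic summarization ideas above more concrete, here is a minimal Python sketch. It is not taken from the paper or from any of the tools mentioned: the `LayeredContext` class, the `window` threshold, and the `truncate_summarizer` stand-in are all hypothetical names chosen for illustration, and a real agent would most likely call a language model where this sketch simply truncates text.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def truncate_summarizer(messages: List[str], max_chars: int = 80) -> str:
    """Placeholder summarizer (an assumption): joins messages and truncates.

    A real agent would call a language model here; this stand-in keeps
    the example runnable without external dependencies.
    """
    joined = " | ".join(messages)
    if len(joined) <= max_chars:
        return joined
    return joined[: max_chars - 3] + "..."


@dataclass
class LayeredContext:
    """Two-layer context store: recent raw turns plus compressed summaries."""

    window: int = 4  # hypothetical limit on raw turns kept before compaction
    summarize: Callable[[List[str]], str] = truncate_summarizer
    short_term: List[str] = field(default_factory=list)
    long_term: List[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        """Store a new turn and compact once the raw window overflows."""
        self.short_term.append(message)
        if len(self.short_term) > self.window:
            self._compact()

    def _compact(self) -> None:
        # Fold the oldest half of the raw turns into one summary entry.
        cut = len(self.short_term) // 2
        old, self.short_term = self.short_term[:cut], self.short_term[cut:]
        self.long_term.append(self.summarize(old))

    def build_prompt(self) -> str:
        # Summaries come first, followed by the verbatim recent turns.
        lines = [f"[summary] {s}" for s in self.long_term] + self.short_term
        return "\n".join(lines)
```

With `window=4`, adding a fifth message triggers compaction: the two oldest turns are folded into a single summary entry while the three most recent stay verbatim. Folding only half of the window is one possible design choice; it keeps some raw history available right after each compaction.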

