Hey there! Welcome to BestBlogs.dev Issue #81.
This week's keyword is Long-running Agents. Demo tasks that complete in minutes are always impressive, but production environments demand something different: agents that can reliably execute complex tasks spanning hours or even days. Cursor and Anthropic have taken divergent paths: multi-agent orchestration versus memory continuity for a single agent. With the foundation model layer relatively quiet over the past two weeks, the industry's attention is shifting from "bigger models" to "more reliable agents."
Here are 10 highlights worth your attention this week:
🤖 Cursor uses a Planner-Worker-Judge multi-agent architecture to handle million-line codebases over multiple days. Anthropic takes a different approach, externalizing Git history and work logs to maintain memory continuity across context windows. Two paths, one goal: making agents reliable for long-running tasks.
📁 LangChain founder Harrison Chase dropped a key insight in his Sequoia interview: when agents run long enough, non-determinism makes "code is truth" obsolete. Trace logs become the new source of truth, and context engineering is shifting from nice-to-have to must-have.
📝 Want agents to handle complex tasks? Learn to write specs first. Addy's blog post introduces an "Always/Ask/Never" constraint system and practical techniques for modularizing tasks to avoid the "instruction curse." Specifications are becoming the core deliverable of the AI era.
🧩 MCP is like USB—a unified protocol. Skills are like apps—specific capabilities. But Baoyu points out a hidden risk: a single MCP service can consume tens of thousands of tokens. The context window explosion makes Skills' progressive disclosure approach increasingly attractive.
💡 Martin Fowler's conversation with his team deserves multiple reads. The core insight: programming isn't about translating requirements into syntax; it's about building systems that handle change. LLMs should be the translation layer, not the architect. Real competitive advantage comes from managing complexity through abstraction.
🛠️ From vibe coding's intuition-driven style to vibe engineering's disciplined approach—this evolution is inevitable. AI compresses accidental complexity in implementation, but essential complexity in business logic still requires domain modeling and spec-driven development.
🖥️ MiniMax Agent Desktop shows what desktop agents can do in practice: auto-organizing 400 ebooks, packaging literary translation SOPs, building Xiaohongshu content pipelines. The real value? Turning personal expertise into reusable digital assets.
⚡ Coze 2.0's Agent Plan lets agents autonomously execute long-running tasks with proactive progress updates. The shift from tool to partner—that's the common direction for agent products.
🎯 Miaoya founder Zhang Yueguang made a sharp observation: Miaoya isn't truly AI-native; it's an AI-enhanced internet product. He argues the paradigm has shifted from "process-driven" to "context-driven", and PMs now need to optimize uncertainty boundaries rather than design deterministic paths.
📈 a16z sees AI as the fourth major platform wave after PC, cloud, and mobile. With AI lowering development barriers, sustainable moats have shifted from code to workflow ownership and closed-loop data accumulation. This really is the golden age for building AI applications.
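For readers who want something concrete on the first highlight, here is a minimal sketch of the "externalized work log" idea: progress is appended to a file on disk, and a fresh session rebuilds a short briefing from that file instead of relying on what was in the previous context window. This is an illustration under stated assumptions, not Anthropic's or Cursor's actual implementation. The file name `progress.jsonl`, the record fields, and the helper names `append_entry`, `load_entries`, and `resume_briefing` are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def append_entry(log_path: Path, task: str, status: str, note: str) -> None:
    """Append one progress record to the log as a single JSON line."""
    record = {"task": task, "status": status, "note": note}
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def load_entries(log_path: Path) -> list:
    """Read every record back; a missing log means no work has been logged."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def resume_briefing(log_path: Path) -> str:
    """Summarize the log so a new session can pick up where the last one stopped."""
    latest = {}
    for entry in load_entries(log_path):
        latest[entry["task"]] = entry  # a later record for a task overrides earlier ones
    done = [t for t, e in latest.items() if e["status"] == "done"]
    pending = [t for t, e in latest.items() if e["status"] != "done"]
    lines = [
        "Completed: " + (", ".join(done) or "none"),
        "Remaining: " + (", ".join(pending) or "none"),
    ]
    for task in pending:
        lines.append(f"- {task}: {latest[task]['note']}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical task names, used only to show the flow.
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "progress.jsonl"
        append_entry(log, "migrate-auth", "in_progress", "tests failing in login flow")
        append_entry(log, "migrate-auth", "done", "all tests pass")
        append_entry(log, "refactor-billing", "in_progress", "split invoice module next")
        print(resume_briefing(log))
```

The append-only format is a deliberate choice in this sketch: nothing is overwritten, so the log doubles as a trace of what happened, and the briefing is always derived from it rather than stored separately.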
Hope this issue sparks some new ideas. Stay curious, and see you next week!