Featured Newsletter

BestBlogs Issue #81: Long-Running Agents

Hey there! Welcome to BestBlogs.dev Issue #81.

This week's keyword is Long-running Agents. Demo tasks that complete in minutes are always impressive, but production environments demand something different: agents that can reliably execute complex tasks spanning hours or even days. Cursor and Anthropic have taken divergent paths: multi-agent orchestration versus memory continuity for a single agent. With the foundation model layer relatively quiet over the past two weeks, the industry's attention is shifting from "bigger models" to "more reliable agents."

Here are 10 highlights worth your attention this week:

🤖 Cursor uses a Planner-Worker-Judge multi-agent architecture to handle million-line codebases over multiple days. Anthropic takes a different approach: externalizing Git history and work logs to maintain memory continuity across context windows. Two paths, one goal: making agents reliable for long-running tasks.

📁 LangChain founder Harrison Chase dropped a key insight in his Sequoia interview: when agents run long enough, non-determinism makes "code is truth" obsolete. Trace logs become the new source of truth, and context engineering is shifting from nice-to-have to must-have.

📝 Want agents to handle complex tasks? Learn to write specs first. Addy's blog post introduces an "Always/Ask First/Never" constraint system and practical techniques for modularizing tasks to avoid the "curse of instructions." Specifications are becoming the core deliverable of the AI era.

🧩 MCP is like USB—a unified protocol. Skills are like apps—specific capabilities. But Baoyu points out a hidden risk: a single MCP service can consume tens of thousands of tokens. The context window explosion makes Skills' progressive disclosure approach increasingly attractive.

💡 Martin Fowler's conversation with his team deserves multiple reads. The core insight: programming isn't about translating requirements into syntax; it's about building systems that handle change. LLMs should be the translation layer, not the architect. Real competitive advantage comes from managing complexity through abstraction.

🛠️ From vibe coding's intuition-driven style to vibe engineering's disciplined approach—this evolution is inevitable. AI compresses accidental complexity in implementation, but essential complexity in business logic still requires domain modeling and spec-driven development.

🖥️ MiniMax Agent Desktop shows what desktop agents can do in practice: auto-organizing 400+ ebooks, packaging literary translation SOPs, building Xiaohongshu content pipelines. The real value? Turning personal expertise into reusable digital assets.

⚡ Coze 2.0's Agent Plan lets agents autonomously execute long-running tasks with proactive progress updates. The shift from tool to partner—that's the common direction for agent products.

🎯 Miaoya founder Zhang Yueguang made a sharp observation: Miaoya isn't truly AI-native; it's an AI-enhanced internet product. He argues the paradigm has shifted from "process-driven" to "context-driven," and PMs now need to optimize uncertainty boundaries rather than design deterministic paths.

📈 a16z sees AI as the fourth major platform wave after PC, cloud, and mobile. With AI lowering development barriers, sustainable moats have shifted from code to workflow ownership and closed-loop data accumulation. This really is the golden age for building AI applications.

Hope this issue sparks some new ideas. Stay curious, and see you next week!

Elevate
addyo.substack.com
01-19
7093 words · 29 min
93
How to write a good spec for AI agents

This article presents a comprehensive framework for writing high-quality specifications (Specs) for AI coding agents, moving beyond "vibe coding" toward rigorous AI-assisted engineering. Key principles include starting with a high-level vision, utilizing a professional PRD-like structure (covering commands, testing, and style), and modularizing tasks to mitigate the "curse of instructions." It introduces the "Always/Ask First/Never" boundary system to maintain control over AI actions and highlights the importance of Plan Mode and the Model Context Protocol (MCP). This is an essential read for developers seeking to build reliable, maintainable systems in collaboration with increasingly powerful AI agents.
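
One way to picture the "Always/Ask First/Never" boundary system is as a lookup that gates each action an agent proposes. The sketch below is illustrative only: the action names, the `Tier` enum, and the `decide` helper are hypothetical and do not come from the article, and treating unlisted actions as "ask first" is an assumption.

```python
from enum import Enum


class Tier(Enum):
    ALWAYS = "always"        # agent may proceed without asking
    ASK_FIRST = "ask_first"  # agent needs human approval first
    NEVER = "never"          # agent must refuse, even if approved


# Hypothetical boundary table; a real spec would list its own actions.
BOUNDARIES = {
    "run_tests": Tier.ALWAYS,
    "add_dependency": Tier.ASK_FIRST,
    "commit_secrets": Tier.NEVER,
}


def decide(action: str, approved: bool = False) -> bool:
    """Return True if the agent may perform the action."""
    # Assumption: anything not listed falls back to ASK_FIRST.
    tier = BOUNDARIES.get(action, Tier.ASK_FIRST)
    if tier is Tier.ALWAYS:
        return True
    if tier is Tier.NEVER:
        return False
    return approved
```

With this table, `decide("add_dependency")` is refused until a human passes `approved=True`, while `decide("commit_secrets", approved=True)` is still refused.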

Sequoia Capital
youtube.com
01-21
8008 words · 33 min
93
Context Engineering Our Way to Long-Horizon AI: LangChain’s Harrison Chase

In this insightful episode, LangChain co-founder Harrison Chase explores the paradigm shift from simple LLM scaffolding to sophisticated "Agent Harnesses." He argues that in the realm of Long Horizon Agents, the traditional "code is truth" mantra is being replaced by "Traces" as the definitive source of truth due to the non-deterministic nature of AI. The discussion delves into the intricacies of Context Engineering, the critical role of file systems, and the rise of "First Draft" applications in coding and research.

宝玉的分享
baoyu.io
01-21
4657 words · 19 min
92
What is the difference between MCP and Skills? A comprehensive guide

This article provides a profound analysis of the two pillars in the AI Agent ecosystem: MCP (Model Context Protocol) and Skills. By using the analogy of MCP as a "USB protocol" and Skills as "applications," the author clarifies their distinct roles. A key highlight is the critical evaluation of MCP's overhead, where tool definitions can consume up to 40% of the context window, leading to high costs and reduced accuracy. In contrast, it introduces Skills' "Progressive Disclosure" and script-based execution as a more token-efficient alternative for local workflows.
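
The token argument for progressive disclosure can be made concrete with a rough cost comparison. This is a minimal sketch under stated assumptions: the `Skill` class, both cost functions, and the word-count token estimate are hypothetical stand-ins, not an actual MCP or Skills API, and the numbers are not figures from the article.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Skill:
    name: str
    summary: str  # short description, always visible to the model
    body: str     # full instructions, loaded only when needed


def estimate_tokens(text: str) -> int:
    """Crude proxy (assumption): one token per whitespace-separated word."""
    return len(text.split())


def eager_cost(skills: List[Skill]) -> int:
    """Every summary and every body is loaded into context up front."""
    return sum(estimate_tokens(s.summary) + estimate_tokens(s.body) for s in skills)


def progressive_cost(skills: List[Skill], used: Iterable[str]) -> int:
    """All summaries are loaded; bodies only for the skills actually used."""
    used_names = set(used)
    cost = sum(estimate_tokens(s.summary) for s in skills)
    cost += sum(estimate_tokens(s.body) for s in skills if s.name in used_names)
    return cost
```

The gap between the two costs grows with the number of installed capabilities that go unused in a given session.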

阿里云开发者
mp.weixin.qq.com
01-22
2459 words · 10 min
92
Stop Writing Prompts Manually! Requirement Clarification + 50+ Professional Framework Auto-Matching for 10x Efficiency!

This article provides an in-depth guide to building a professional "Prompt Optimizer" using Claude Skills and AI coding tools. Addressing common friction points in prompt engineering—such as vague requirements and lack of structured frameworks—the author introduces an automated solution leveraging over 50 global prompt frameworks. It details a complete workflow from defining Skill specifications to developing a Chrome extension. The key takeaway is that in the AI era, the cost of "doing" has plummeted, making "thinking" and task decomposition the primary value-adds.

腾讯云开发者
mp.weixin.qq.com
01-21
13559 words · 55 min
93
Rethinking Software Engineering: Beyond Vibe Coding

This article provides a profound analysis of the paradigm shift in software development—from manual coding to "intent-driven" engineering. It systematically explores the evolution from intuition-based Vibe Coding to the disciplined Vibe Engineering framework. By revisiting classical theories of Essential vs. Accidental Complexity, the author argues that while AI minimizes technical implementation hurdles, the core business logic remains a challenge that requires Domain Modeling and Spec-Driven Development (SDD).

Founder Park
mp.weixin.qq.com
01-20
5746 words · 23 min
92
How to Build Long-running Agents: Cursor and Anthropic Offer Two Distinct Approaches

This article provides a deep dive into the engineering practices of Cursor and Anthropic for building "long-running" AI agents. While Cursor focuses on a hierarchical multi-agent architecture (Planner-Worker-Judge) to handle massive, parallelized coding tasks, Anthropic prioritizes "memory continuity" for single agents by externalizing state through Git history and progress logs. The piece outlines divergent yet effective philosophies for overcoming context window limitations and summarizes common failure modes in complex task execution. It is an essential read for anyone tracking the evolution of AI-driven software development.
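
The "memory continuity" idea can be sketched as a progress log that lives on disk, so a fresh session with an empty context window can pick up where the last one stopped. This is a simplified illustration, not Anthropic's implementation: the JSON layout and the `load_progress`, `record_step`, and `next_step` helpers are hypothetical, and Git history is left out entirely.

```python
import json
from pathlib import Path
from typing import List, Optional


def load_progress(path: Path) -> dict:
    """Read the externalized state, or start fresh if no log exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"done": [], "notes": []}


def record_step(path: Path, step: str, note: str) -> None:
    """Record a finished step with a short note, then persist to disk."""
    state = load_progress(path)
    if step not in state["done"]:
        state["done"].append(step)
        state["notes"].append(note)
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def next_step(path: Path, plan: List[str]) -> Optional[str]:
    """Return the first planned step not yet recorded as done."""
    done = set(load_progress(path)["done"])
    for step in plan:
        if step not in done:
            return step
    return None
```

Because the state is re-read from the file on every call, nothing depends on what the previous session held in memory.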

Martin Fowler
martinfowler.com
01-21
3235 words · 13 min
94

This article captures a profound dialogue between Unmesh, Martin Fowler, and Rebecca Parsons on the essence of programming. The central thesis posits that programming is not merely translating requirements into syntax, but building systems resilient to change. It explores managing cognitive load, the intertwining of "What" (intent) and "How" (implementation), and using TDD to drive design. Furthermore, it critically assesses the role of LLMs, framing them as "translation layers" rather than architects. For developers seeking to maintain their edge in the AI era, understanding how to manage complexity through stable abstractions remains the most critical skill.

Founder Park
mp.weixin.qq.com
01-21
4822 words · 20 min
93
MiniMax Agent New Year Update: Good AI Products Need to Let Tools Adapt to People

This article explores the "Desktop Agent" trend sweeping the tech world in early 2026, focusing on a deep dive into MiniMax Agent Desktop. Through three practical scenarios—organizing 400+ ebooks, encapsulating literary translation SOPs, and building a Xiaohongshu content pipeline—the author demonstrates how AI leaps from "advisor" to "executor" by accessing local context. A key insight is how Expert Agents transform personal expertise into reusable digital assets.

字节跳动技术团队
mp.weixin.qq.com
01-19
2682 words · 11 min
92
Coze 2.0, Taking Agent to the Next Level

This article provides a comprehensive overview of the 2.0 upgrade to Coze, ByteDance's leading AI platform. Coze 2.0 marks a strategic shift from a "reactive tool" to a "proactive partner." Key highlights include Agent Skills, which encapsulate industry best practices into reusable capabilities; Agent Plan, enabling agents to autonomously execute long-term goals and report progress; and Coze Coding, a Vibe Coding platform that simplifies app development.

十字路口Crossing
xiaoyuzhoufm.com
01-18
2684 words · 11 min
92
Paranoia, Ambition, and a Pair of AI Glasses: The Underlying Fuel of Top Product Managers | A Conversation with Li Auto SVP Fan Haoyu

Li Auto's SVP Fan Haoyu delves into the logic behind their first AI hardware, Livis. Diverging from mainstream AR paths, Livis adopts a restrained approach: 36g weight, no-display design, and a self-developed RTOS on a low-power MCU architecture, prioritizing high availability as an "always-on companion." Fan also shares his "Hands in the Mud" philosophy and the "6211" time management framework, discussing how Li Auto expands from its automotive "Island" to an AI-driven "Continent."

Smashing Magazine
smashingmagazine.com
01-22
3913 words · 16 min
92
Beyond Generative: The Rise Of Agentic AI And User-Centric Design

As AI evolves from generative to agentic, UX design is undergoing a paradigm shift from simple usability to trust and accountability. This article explores the core characteristics of AI agents—autonomous reasoning and planning—and proposes a taxonomy of four autonomy levels: Observe-and-Suggest, Plan-and-Propose, Act-with-Confirmation, and Act-Autonomously. Author Victor Yocco provides a practical research playbook for developers and PMs, covering methods like mental-model interviews, agent journey mapping, and simulated misbehavior testing. By establishing metrics like intervention and rollback rates, the piece offers a clear roadmap for building controllable, transparent, and ethical AI agent systems. It is an essential resource for technical leaders exploring the boundaries of AI automation.
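
The four autonomy levels and the two metrics named above can be sketched as follows. The level names come from the summary; everything else is an assumption for illustration, including the `AgentRun` record and the definition of each rate as a simple fraction of all runs, which may differ from how the article measures them.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Tuple


class Autonomy(IntEnum):
    OBSERVE_AND_SUGGEST = 1
    PLAN_AND_PROPOSE = 2
    ACT_WITH_CONFIRMATION = 3
    ACT_AUTONOMOUSLY = 4


@dataclass
class AgentRun:
    level: Autonomy
    intervened: bool   # a human stepped in during the run
    rolled_back: bool  # the run's changes were later undone


def rates(runs: List[AgentRun]) -> Tuple[float, float]:
    """Return (intervention_rate, rollback_rate) as fractions of all runs."""
    if not runs:
        return 0.0, 0.0
    total = len(runs)
    interventions = sum(r.intervened for r in runs)
    rollbacks = sum(r.rolled_back for r in runs)
    return interventions / total, rollbacks / total
```

Tracking these rates per autonomy level would show whether raising an agent's level leads to more human corrections.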

130. Zhang Yueguang's First Interview After Two Years of Entrepreneurship: Miaoya is Not an AI Native Product, From Process to Context Design, One Way Door, and Otome Games

This podcast features an in-depth interview with Zhang Yueguang, the founder of the viral "Miaoya Camera." He reflects on his entrepreneurial journey after leaving ByteDance and Alibaba, offering a provocative thesis: Miaoya is not a true AI Native product, but rather an AI-enhanced Internet product. He argues that the AI era demands a paradigm shift from "Process-oriented design" to "Context-oriented design," where PMs focus on optimizing interaction boundaries rather than deterministic paths. The discussion covers his "One Way Door" product philosophy and his latest ventures in AI Otome games and Agent tools (Dokie).

a16z
youtube.com
01-19
7250 words · 29 min
92
Why NOW is the Golden Era to build AI apps.

This insightful piece synthesizes the a16z AI Apps team's strategic perspective on the AI cycle, framing it as the most rapid platform shift since PC, Cloud, and Mobile. It highlights three pivotal investment themes: the AI-native transformation of existing software categories, "Software as Labor" (where software replaces human tasks at higher price points), and "Walled Gardens" built on proprietary data moats. Through case studies like Eve and Salient, the authors argue that in an era of "Vibe Coding," defensibility is no longer about the code itself but about owning end-to-end workflows and proprietary data loops.

Lenny's Podcast
youtube.com
01-18
7323 words · 30 min
92
How a Meta PM ships products without ever writing code | Zevi Arnovitz

Can a non-technical PM build real apps? Meta's Zevi Arnovitz proves it's possible through a systematic AI workflow. Using Cursor and Claude, he automates development with "slash commands" and solves the code review hurdle via multi-model cross-validation. This session is a high-density guide for anyone looking to bridge the gap between product vision and technical execution using modern AI agents.

All-In Podcast
youtube.com
01-21
8750 words · 35 min
92
Satya Nadella on AI’s Business Revolution: What Happens to SaaS, OpenAI, and Microsoft?

In this insightful All-In podcast episode, Microsoft CEO Satya Nadella explores how AI is reshaping Microsoft and the global business landscape. He introduces the "Macro-delegate and micro-steer" paradigm, explaining how Microsoft achieved massive revenue growth while maintaining a flat headcount through structural shifts like "Full-stack builders." The discussion covers the evolution of AI Agents, the strategic logic behind the OpenAI partnership, and the global diffusion of the US tech stack.
