BestBlogs Issue #76: Skills

Hey there! Welcome to BestBlogs.dev Issue #76.

The most thought-provoking idea I encountered this week came from Anthropic's AI Engineer conference talk: stop building agents, start building skills. Their core argument? Current agents have intelligence but lack domain expertise—like asking a brilliant mathematician to do your taxes. Smart, yes. Experienced, no. Skills are essentially folders packaging procedural knowledge, supporting version control, team sharing, and seamless MCP integration. Coincidentally, MCP officially moved under Linux Foundation governance this week, racking up 37,000 stars in just 8 months—clear validation of the hunger for standardization. I wonder: as the "model + runtime + skill library + MCP" architecture takes shape, could Skills become the universal capability carrier across models and products, much like Docker images? This might be the pivotal moment when AI development shifts from reinventing wheels to assembling standardized components.

Here are 10 highlights worth your attention this week:

🤖 GPT-5.2 dropped, and OpenAI is positioning it as "AI for the working professional." Not just for programmers anymore—it's targeting lawyers, designers, and marketing managers. In evaluations across 44 real-world professional tasks, over 70% of outputs matched or exceeded the quality of experts with 14 years of experience. Pragmatic iteration, real-world focus.

🔬 Tsinghua's density law research, published in Nature Machine Intelligence, reveals AI's own "Moore's Law": training and inference efficiency doubles every 3.5 months. This explains how edge models are catching up to cloud giants—and predicts smartphones running self-learning personal LLMs by 2027.

📊 Stanford's study of 120,000 developers found a 0.40 correlation between codebase health and AI ROI. Clean code amplifies AI benefits; technical debt accelerates entropy. The kicker: a 14% PR increase might mask a 9% quality drop and 2.5x more rework—potentially negative ROI overall.

💡 Manus co-founder Zhang Tao finally addressed external skepticism, articulating the core philosophy: "Less structure, more intelligence." Zero Predefined Workflow hands task decisions entirely to the model, and Manus has maintained benchmark leadership on that basis. Application-layer teams, he argues, can compete with OpenAI through model selection flexibility alone.

🌐 a16z's 2026 predictions argue Agent-native infrastructure will become essential, with the core challenge shifting from compute to multi-agent coordination. The bigger insight: 99% of market opportunity lies in traditional verticals, not Silicon Valley tech circles. Enterprise software value moves from systems of record to intelligent execution layers.

🛠️ Tencent's engineering team published a comprehensive guide to persistent memory for AI agents. Using LangGraph, short-term memory leverages Checkpointer for single-session state; long-term memory uses Store for cross-session knowledge sharing. From InMemorySaver to PostgreSQL persistence to semantic search—complete with working code examples.

🎨 Zhipu open-sourced the GLM-4.6V series, with native Function Call capabilities baked into the vision model—"image as parameter, result as context" multimodal tool calling. The 9B Flash version outperforms Qwen3-VL-8B, API pricing dropped 50%, fully open source.

📈 Dify founder Lu Yu unpacked two years of startup lessons behind their 110k+ GitHub stars. Commitment to engineering value and model neutrality, pragmatic transition from high-code to intelligence, plus unexpected success in Japan. Deep understanding of AI applications' "last mile" problem.

🏆 OpenRouter and a16z's joint report, based on 100 trillion tokens of real usage data, reveals key shifts: Chinese open-source model share exploded from 1.2% to nearly 30%; reasoning-optimized models now exceed 50% of traffic; coding dominates over half of total usage. The "Cinderella slipper effect" is worth noting—model retention depends on perfectly solving a specific pain point at launch.

🧩 Investor Zhu Xiaohu's year-end AI industry review: no bubble in sight for at least three years. Competition has shifted from model capability to the battle for super-entry points. His advice for founders? "Diverge 15 degrees from consensus"—focus on vertical niches and grunt work that big tech won't touch.

Hope this issue sparks some new ideas. Stay curious, and see you next week!

数字生命卡兹克
mp.weixin.qq.com
12-11
3391 words · 14 min
95
GPT-5.2 Released: The Ultimate AI for the Hardworking Professional

OpenAI's 10th-anniversary release of GPT-5.2 centers on professional knowledge work capabilities. The article provides deep analysis of two key benchmarks: ARC-AGI-2 scores jumped from 17.6% to 52.9% on fluid intelligence tasks, while GDPval evaluation shows the model matches or exceeds professionals with 14 years of experience in over 70% of real-world tasks across 44 occupations. The author emphasizes this is a practical iteration that serves not just programmers but lawyers, designers, and marketing managers—addressing complex real-world professional challenges.

智谱
mp.weixin.qq.com
12-08
2446 words · 10 min
93
GLM-4.6V Open-source: Enabling Image Comprehension and Automated Task Completion

Zhipu AI open-sources the GLM-4.6V multimodal models with a key innovation: natively integrating Function Call capabilities into the vision model, enabling "images as parameters, results as context" multimodal tool calling. The 128k context window supports 150-page documents or hour-long videos, and the article showcases four real-world scenarios: intelligent image-text composition, visual shopping assistance, pixel-perfect frontend replication, and long-context document/video understanding. Performance-wise, the 9B Flash version outperforms Qwen3-VL-8B, while the 106B version matches competitors with twice the parameters, all with a 50% API price reduction and full open-source support across multiple frameworks.
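For a concrete feel of "images as parameters," here is a minimal sketch using Zhipu's OpenAI-style Python SDK. The model identifier, image URL, and search_product tool are illustrative assumptions, not confirmed API details.

```python
# Hedged sketch: multimodal tool calling where an image travels in the
# request and a function call can come back. Model name and tool are
# assumptions for illustration, not confirmed API details.
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="your-api-key")

tools = [{
    "type": "function",
    "function": {
        "name": "search_product",  # hypothetical tool for the shopping scenario
        "description": "Search a catalog for items matching a description.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="glm-4.6v",  # assumed identifier for the open-sourced series
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/jacket.jpg"}},
            {"type": "text", "text": "Find this jacket in the catalog."},
        ],
    }],
    tools=tools,
)
# If the model elects a tool call, run it and append the result as context
# for a follow-up turn; that is the "results as context" half of the idea.
print(response.choices[0].message)
```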

赛博禅心
mp.weixin.qq.com
12-06
6519 words · 27 min
92
V3→R1→V3.2|DeepSeek's Technical Advancement

A complete technical evolution map from DeepSeek V3 to V3.2. Covers MoE+MLA architecture, RLVR training, DSA sparse attention, self-verification/refinement, and GRPO improvements. Rich in technical details yet clearly explained with diagrams—the essential reference for understanding how open-source models match closed-source performance through engineering innovation.
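One covered technique is compact enough to sketch here: GRPO, as described in DeepSeek's published papers, drops PPO's learned value network and instead normalizes each sampled completion's reward against its group. A minimal sketch of that normalization step:

```python
# Group-relative advantage as used by GRPO: sample G completions per prompt,
# score each (e.g., an RLVR-style verifiable reward), normalize within the group.
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Center rewards on the group mean and scale by the group std."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against all-equal rewards
    return [(r - mean) / std for r in rewards]

# Four completions for one prompt: 1.0 if the answer verifies, else 0.0.
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))  # [1.0, -1.0, -1.0, 1.0]
```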

AINLP
mp.weixin.qq.com
12-10
14715 words · 59 min
92
A Survey of Unified Multimodal Understanding and Generation: Advances and Challenges in 83 Pages

This 83-page survey from Nanjing University, Institute of Automation CAS, and collaborators systematically reviews 750+ papers on unified multimodal understanding and generation models, constructing a comprehensive taxonomy covering encoding, decoding, modeling, and training strategies. The paper categorizes modeling approaches into external service integration, modular joint modeling, and end-to-end unified modeling, while thoroughly analyzing trade-offs between continuous, discrete, and hybrid representations. For researchers and engineers seeking to understand the full landscape of multimodal foundation models and choose appropriate architectural directions, this is an invaluable panoramic reference.

晚点聊 LateTalk
xiaoyuzhoufm.com
12-11
1685 words · 7 min
93
144: From "Large and Powerful" to "Small and Powerful": The Density Law, Scaling Laws in RL, and the Distributed Future of Intelligence

In this Late Talk episode, Professor Liu Zhiyuan from Tsinghua and Dr. Xiao Chaojun provide an in-depth discussion of their Density Law research published in Nature Machine Intelligence: AI's own "Moore's Law," under which model training and inference efficiency doubles every 3.5 months. The conversation challenges the industry's scaling-law obsession, systematically explaining four dimensions of improving model efficiency: architectural innovations (sparse attention, MoE), data governance (from L0 collection to L4 validation), learning algorithms (RL's scaling challenges), and hardware-software co-optimization. The Mianbi team shares practical experience deploying models in automotive smart cockpits and predicts that by 2027, phones will support personalized models with autonomous learning capabilities. They conclude by exploring three AGI phases (autonomous learning, AI collaboration, creative breakthroughs) alongside a vision of distributed intelligence.
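As a quick sanity check of what the 3.5-month doubling period implies (the two-year horizon below is my own illustrative choice):

```python
# Compounding the episode's claimed doubling period over two years.
months = 24
doublings = months / 3.5      # ~6.9 doublings
factor = 2 ** doublings       # ~116x density gain
print(f"{doublings:.1f} doublings -> ~{factor:.0f}x")
```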

AI Engineer
youtube.com
12-06
4890 words · 20 min
93
Don't Build Agents, Build Skills Instead – Barry Zhang & Mahesh Murag, Anthropic

Anthropic introduces a new paradigm for agent development: instead of building complex standalone agents, create composable skills. The talk addresses the core problem with current agents—they have intelligence but lack domain expertise, like a high-IQ mathematician doing your taxes: capable but inexperienced. Skills are essentially folders containing procedural knowledge, with a deliberately simple yet powerful design: version-controllable, team-shareable, progressively disclosed, and seamlessly integrated with MCP servers.
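To make the folder framing concrete, here is a minimal sketch of a runtime indexing skills lazily; the SKILL.md layout and frontmatter keys are illustrative assumptions, not Anthropic's exact format.

```python
# Sketch of progressive disclosure: always surface each skill's name and
# description, but load the full procedural instructions only on demand.
# File layout (<skills_dir>/<skill>/SKILL.md) and frontmatter keys are assumed.
from pathlib import Path

def parse_frontmatter(text: str) -> dict:
    """Tiny 'key: value' parser for the block between '---' markers."""
    _, header, _ = text.split("---", 2)
    return {k.strip(): v.strip() for k, v in
            (line.split(":", 1) for line in header.strip().splitlines())}

def load_skill_index(skills_dir: str) -> list[dict]:
    """Lightweight metadata for every installed skill (always in context)."""
    return [{"path": str(p), **parse_frontmatter(p.read_text())}
            for p in Path(skills_dir).glob("*/SKILL.md")]

def load_skill_body(skill_path: str) -> str:
    """Full instructions, fetched only when the agent picks this skill."""
    return Path(skill_path).read_text().split("---", 2)[2].strip()
```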

The presentation reveals an emerging universal agent architecture: model + runtime + skill library + MCP, where this modular design enables agents to acquire expertise on demand. Most exciting is the vision: building a collectively evolving knowledge base both within and across organizations, where agents continuously learn and improve through skills.

The GitHub Blog
github.blog
12-09
2047 words · 9 min
92
MCP joins the Linux Foundation: What this means for developers building the next era of AI tools and...

MCP officially transitions to Linux Foundation governance, evolving from Anthropic's open-source project into an industry-standard protocol. The core value is solving AI development's integration fragmentation: a unified protocol enables models to invoke tools and access context in standardized ways, supporting OAuth, remote servers, and long-running tasks. GitHub Octoverse data shows explosive AI development growth, with MCP gaining 37,000 stars in 8 months, validating the need for standardization.
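To show the standardized shape the protocol gives a tool, here is a minimal server sketch using the official MCP Python SDK's FastMCP helper; the note-taking tool itself is a made-up example.

```python
# Minimal MCP server: one tool, exposed over the protocol so any MCP client
# (editor, agent runtime, chat app) can invoke it the same way.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("notes")  # server name advertised to clients
NOTES: dict[str, str] = {}  # in-memory only; a real server would persist

@mcp.tool()
def add_note(title: str, body: str) -> str:
    """Store a note under a title and confirm."""
    NOTES[title] = body
    return f"Saved note '{title}'."

if __name__ == "__main__":
    mcp.run()  # stdio transport by default, the common local setup
```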

腾讯技术工程
mp.weixin.qq.com
12-08
17463 words · 70 min
92
Equipping AI Agents with Human-like Persistent Memory: A Practical Guide to Long- and Short-term Memory Management Based on LangGraph

This article systematically explains how to equip AI agents with persistent memory capabilities. Starting from core Agent Memory concepts, it details the implementation of short-term and long-term memory in the LangGraph framework: short-term memory manages conversation state through Checkpointer, while long-term memory enables cross-session knowledge sharing via Store. The article includes extensive code examples, covering everything from InMemorySaver to PostgreSQL database persistence, and from basic state management to semantic search implementations. It concludes with a comprehensive Multi-Agent system case study integrating the MCP protocol, demonstrating the combined application of interrupt mechanisms, memory management, and multi-agent collaboration. Ideal for developers building intelligent agent systems.
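As a taste of the pairing the article builds on, here is a minimal sketch combining an InMemorySaver checkpointer (short-term, per-thread) with an InMemoryStore (long-term, cross-session). Exact signatures vary across langgraph versions, and the stored preference is invented for illustration.

```python
# Short-term memory: checkpointer keyed by thread_id resumes a conversation.
# Long-term memory: store holds facts that survive across threads/sessions.
from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.graph import START, MessagesState, StateGraph
from langgraph.store.base import BaseStore
from langgraph.store.memory import InMemoryStore

store = InMemoryStore()
store.put(("users", "user_1"), "prefs", {"tone": "concise"})  # long-term fact

def chat(state: MessagesState, config: RunnableConfig, *, store: BaseStore):
    prefs = store.get(("users", "user_1"), "prefs")  # read long-term memory
    reply = f"[tone={prefs.value['tone']}] you said: {state['messages'][-1].content}"
    return {"messages": [("assistant", reply)]}

builder = StateGraph(MessagesState)
builder.add_node("chat", chat)
builder.add_edge(START, "chat")
graph = builder.compile(checkpointer=InMemorySaver(), store=store)

# Same thread_id -> same short-term state; a new thread_id starts fresh
# but still sees the long-term store.
config = {"configurable": {"thread_id": "session-1"}}
print(graph.invoke({"messages": [("user", "hello")]}, config))
```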

InfoQ 中文
mp.weixin.qq.com
12-05
14176 words · 57 min
93
Thoughts on AI-Native Databases

This article deeply analyzes the twin challenges traditional databases face in enterprise AI adoption—managing private data and model memory—and proposes three core characteristics for the AI-Native Database: Agent-Oriented Multi-Modal Hybrid Search, Modern Elastic Architecture (Serverless/Distributed), and Deep Data-Model Fusion (AI Functions). The author argues that AI-era search moves beyond singular vector search, necessitating a deep fusion of various search methods. They advocate for extending established relational databases to incorporate AI capabilities as the primary path forward. The piece is both theoretical and practical, introducing OceanBase’s open-sourced seekdb, a lightweight, all-in-one AI database solution for developers. This is a must-read for professionals focused on AI infrastructure and application development.
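The fusion argument is easy to demonstrate with a generic technique like reciprocal rank fusion, which merges ranked lists from a vector retriever and a full-text retriever; this is a common approach, not necessarily seekdb's actual implementation.

```python
# Reciprocal rank fusion (RRF): combine ranked doc-id lists from different
# retrievers; documents that rank well in several lists float to the top.
def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc3", "doc1", "doc7"]   # from embedding / ANN search
keyword_hits = ["doc1", "doc9", "doc3"]  # from full-text search
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# -> doc1 and doc3 lead, since both retrievers agree on them.
```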

AI Engineer
youtube.com
12-11
4337 words · 18 min
92
Can you prove AI ROI in Software Engineering? (120k Devs Study) – Yegor Denisov-Blanch, Stanford

Stanford's research involving 120,000 developers systematically reveals the key factors for measuring AI tools' ROI. The study shows that traditional metrics like lines of code and pull requests fail to accurately reflect AI's true impact, advocating for an engineering output model that simulates expert panel assessments. The most critical finding is that codebase hygiene correlates 0.40 with AI benefits—clean code amplifies AI effectiveness while technical debt accelerates entropy. The research also introduces an AI engineering practices benchmark to help teams identify maturity stages. A real-world case demonstrates how a 14% PR increase might mask a 9% code quality decline and 2.5x increase in rework, resulting in negative ROI. This provides tech leaders with a measurement framework that goes beyond surface metrics.
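A back-of-envelope version of that cautionary case; the 10% baseline rework share is my own illustrative assumption, not a number from the study.

```python
# 14% more PRs, 9% lower quality, 2.5x the rework: net useful output falls.
baseline_rework = 0.10                                  # assumed pre-AI share
before = 100 * (1 - baseline_rework)                    # 90.0 useful units
after = 100 * 1.14 * (1 - 0.09) * (1 - baseline_rework * 2.5)
print(before, round(after, 1))                          # 90.0 vs ~77.8
```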

42章经
xiaoyuzhoufm.com
12-06
1328 words · 6 min
93

Dify founder Lu Yu provides an in-depth review of his two-year startup journey, revealing the strategic thinking behind a GitHub project with 110k+ stars. He explains how Dify differentiated itself from competitors like LangChain, Coze, GPTs, and n8n through three core strategies: open source, B2B focus, and globalization. Key insights include the persistence in engineering value and model neutrality, the pragmatic transition path from high-code to intelligence, and the unexpected phenomenal success in the Japanese market. Even more valuable are his thoughts on solving the "last mile" problem of AI applications and redefining enterprise competitiveness in an AI-symmetric era. Essential listening for AI entrepreneurs, product managers, and anyone interested in enterprise AI applications.

真格基金
mp.weixin.qq.com
12-10
9047 words · 37 min
93
Zhang Tao Responds to Controversy for the First Time: Why Has Manus Not Been Replaced? | Tsinghua Campus Visit

Manus co-founder Zhang Tao responds systematically to external skepticism for the first time, offering deep insights into the core philosophy of general-purpose agents: Less structure, more intelligence. The article reveals how Manus maintains benchmark leadership through Zero Predefined Workflow, completely delegating task decisions to models. Zhang shares critical product pivots from AI browser to general agent, the role of Session Replay in early viral growth, and how application teams can compete against giants like OpenAI through model selection flexibility.

量子位
qbitai.com
12-09
4193 words · 17 min
92
Unveiling the 'Doubao Phone': Core Tech Open-Sourced Years Ago, GUI Agent Deployment for Two Years, Heralded as the 'World's First True AI Phone'

A complete technical evolution of UI-TARS, the core technology behind ByteDance's Doubao Phone: from the open-source v1 to UI-TARS-2, achieving system-level GUI automation through four core capabilities. The technical teardown reveals OS-level innovations: Virtual Display parallel execution, visual isolation design, and a dual-mode standard/Pro tech stack. A must-read for developers tracking AI agent deployment.

42章经
mp.weixin.qq.com
12-11
9785 words · 40 min
92
What Will the Next-Generation AI Interaction Look Like? | Chapter 42 Press AI Newsletter

The article systematically examines three directions for next-gen AI interaction: platformized personalized software addressing trust and distribution, AI voice input evolving into a core interaction layer with user data access, and innovative design patterns like parameter sliders and reverse onboarding. The key insight: future product design requires systems thinking—building structures that adapt across time scales, not just polishing UI details.

No Priors
youtube.com
12-11
11156 words · 45 min
92
No Priors Ep. 143 | With ElevenLabs Co-Founder Mati Staniszewski

ElevenLabs co-founder Mati Staniszewski shares how the company reached $300M ARR in just three years. The podcast explores the evolution of voice AI technology, from solving cross-language dubbing challenges to building comprehensive creative and agent platforms. Mati details their "lab" model that balances fundamental research with product development, enabling parallel work between research and engineering teams to quickly transform technical breakthroughs into product value. He emphasizes voice AI's tremendous potential in education, believing personalized AI tutors will revolutionize learning. A highly informative interview for readers interested in voice technology development, AI productization strategies, and startup growth.

腾讯科技
mp.weixin.qq.com
12-10
6877 words · 28 min
92
Elon Musk's Latest Interview: From AI and Mars to Short-form Videos, Exploring Unconventional Survival Philosophies

This Musk interview delves into AI's potential to reshape human labor, the harsh realities of Mars colonization, and technology's double-edged effects. Musk challenges mainstream assumptions with counterintuitive perspectives: Mars isn't a billionaire's escape pod but a high-risk hardcore proving ground; short-form videos erode deep thinking capacity; civilization's survival depends on maintaining sufficient "entertainment value." The interview reveals his daily routine of six-hour sleep cycles and relentless focus on information filtering and priority sorting. For readers contemplating technology trends and civilization's trajectory, this offers thought-provoking intellectual material.

新智元
mp.weixin.qq.com
12-06
6476 words · 26 min
92
100 Trillion Tokens Stun Silicon Valley! Half of the World's Computing Power Writes Code, the Other Half is Exploring Creative Applications?

This joint report from OpenRouter and a16z, based on 100 trillion tokens processed over the past year, reveals several key turning points in AI for 2025. Open-source models have stabilized at around 30% market share, with Chinese models surging from 1.2% to nearly 30%. Reasoning-optimized models now account for over 50% of traffic, marking AI's shift from text generation to complex task execution. Programming dominates overall usage at 50%+, while role-playing commands 52% of open-source model traffic. The report introduces the "Cinderella glass slipper effect," arguing that retention depends on solving specific pain points perfectly at launch rather than competing on price. A valuable industry snapshot grounded in massive real-world usage data.

MiniMax Founder Yan Junjie × Luo Yonghao: No Mountain Is Unscalable

A deep dive into how MiniMax, a Chinese AI unicorn, breaks through with technological idealism and first principles despite limited resources. Founder Yan Junjie and host Luo Yonghao discuss MiniMax's unique "full-modal parallelism" strategy (text, speech, video, music) and the organizational philosophy of relying on young, local Chinese talent rather than idolizing Silicon Valley experience. The podcast spans macro analyses of the US-China AI gap (a mere 5% tech gap but a 100x valuation difference) and micro-level workplace shifts, where the boundaries between product managers and engineers blur in the AI era. Ideal for readers interested in AGI pathways, startup management, and tech commercialization.

122. Zhu Xiaohu's Realistic Tales, Part III: The Feast and Bubble of Artificial Intelligence

Zhu Xiaohu's comprehensive year-end review of the AI industry covers bubble assessments, investment strategies, and US-China competition dynamics. He confidently states that an AI bubble won't materialize for at least three years, while analyzing OpenAI's strategic pivot from AGI pursuit to daily-active-user applications. A key insight emerges: AI competition has shifted from model capabilities to the battle for super-app entry points. For entrepreneurs, he recommends diverging 15 degrees from consensus and focusing on vertical scenarios and the "dirty work" that big tech avoids. Drawing on his deep understanding of mobile internet cycles, Zhu predicts China's advantages in data centers and open-source ecosystems will manifest within 5-10 years.

Founder Park
mp.weixin.qq.com
12-11
10752 words · 44 min
92
a16z Annual Forecast: By 2026, New AI Startup Opportunities Will Emerge in Vertical Industries, as AI Products Become Highly Customized

a16z's investment team delivers a comprehensive forecast of AI opportunities in 2026, spanning infrastructure, applications, and vertical industries. The report highlights that Agent-native infrastructure will become essential, with the core challenge shifting from computing power to multi-agent coordination. Consumer AI products will pivot from "help me" to "see me," with the latter offering stronger user retention. The most critical insight: 99% of market opportunities exist in traditional vertical industries, not Silicon Valley tech circles. The report also predicts that enterprise software value will shift from record systems to intelligent execution layers, and videos will evolve into interactive simulation environments. An invaluable industry reference for entrepreneurs and investors exploring AI opportunities.
