Featured Newsletter

BestBlogs.dev Highlights Issue #37

👋 Dear friends, welcome to this week's curated selection of articles in the AI field!

This week, the artificial intelligence domain continues its relentless pace of innovation, bringing forth a wave of exciting advancements. From groundbreaking models to practical applications and thought-provoking ethical considerations, this issue is packed with essential reads to keep you at the forefront of AI. Join us as we spotlight the key developments in the AI space this week, and stay ahead of the curve!

This Week's Highlights:

Chinese Models Rise in Prominence, Global Leaders Enhance Capabilities: Tongyi Qianwen's QwQ-32B, with 32 billion parameters, rivals the performance of far larger models, marking a significant leap for Chinese AI and open-source accessibility; the innovative Ovis2 multimodal architecture debuts, topping benchmarks and setting a new direction for multimodal models; Anthropic's Claude 3.7 Sonnet, with its hybrid reasoning and impressive coding prowess, once again pushes the boundaries of AI performance; and OpenAI's GPT-4.5 arrives, emphasizing emotional intelligence and world knowledge, hinting at the next wave of AI model evolution. A constellation of stars, charting new territories in the AI landscape.
The AI Agent Concept Takes Center Stage, Application Scenarios Explode: Monica's Manus emerges as a groundbreaking AI Agent product, defining the "Digital Agent" paradigm and moving the concept into real-world application; the rapid ascent of AI Coding tools like Lovable showcases AI's transformative potential in software development, democratizing access to technology creation; Taobao's Content AI Team provides a deep dive into AIGC content generation, revealing AI's substantial value in e-commerce; and Tencent Technology's "AGI Road" live series tackles the critical issue of AI hallucinations, prompting essential conversations about trust and reliability in AI. The Agent era is upon us, promising an exciting future for AI applications!
Developer Ecosystem Thrives, Open Source Power Amplified: Dify v1.0.0 launches, signaling a new era for AI application development platforms with its plugin-based architecture and an open Marketplace; a comprehensive overview of 50+ open-source AI Agent projects highlights the dynamic growth and creativity of the open-source community; and the open-sourcing of Alibaba Cloud's Tongyi Qianwen QwQ-32B and Ovis2 models further accelerates AI technology adoption and democratizes access to cutting-edge tools. Open source and open collaboration have become the driving forces, collectively building a thriving AI ecosystem!
"Client-Side Thinking" Emerges as a Key Competency, Reshaping Human-AI Collaboration: BestBlogs.dev presents an insightful analysis of "Client-Side Thinking," emphasizing precise problem definition, dynamic expectation calibration, and expert value judgment, identifying "Client-Side Thinking" as a core skill for the AI age and sparking essential reflections on the evolving human-AI partnership; Andrew Ng's interview with Anthropic's CPO explores the critical balance between model quality and user experience, alongside AI product release strategies, offering invaluable insights for AI product builders. A new era of human-AI collaboration is dawning, with "Client-Side Thinking" leading the charge!

🔍 Intrigued? Click the article links below to read more and immerse yourself in the world of AI technology, uncovering new horizons!

1. Qwen Inference Model QwQ-32B is Now Open Source!
2. Alibaba Releases New Inference Model QwQ-32B, Matching Performance of DeepSeek-R1 Full Version
3. Gemini 2.0 Deep Dive: Code Execution
4. CogView4: Zhipu AI's Open-Source Text-to-Image Model with Bilingual Input & Commercial License
5. A Comprehensive Overview of Large Language Models: From Transformer (2017) to DeepSeek-R1 (2025)
6. Open Operator, Serverless Browsers and the Future of Computer-Using Agents
7. Dify v1.0.0 Officially Released | Adaptable and Flexible, No Fear of Model Changes
8. The Age of AI: Essential AI Agents You Should Know
9. Video Cover Production with LLMs: Solutions and Practices
10. LLM Application Development: A Comprehensive Guide
11. Manus Arrives: Is This the 'GPT Moment' for AI Agents?
12. Li Jigang's Manus Review: The God Hand
13. Z Product | Top Global Products (Feb 24 - Mar 2), Two Chinese-Founded Teams in Top Three
14. Extensive Interview with Anthropic CPO: Moving Beyond Models, Regretting the Late Start on First-Party Products
15. Lovable: $17 Million+ ARR in 3 Months, User Retention Exceeds ChatGPT
16. AiPPT.cn: Achieving 10M Users & Profitability in One Year - A Conversation with PixelBloom CEO Zhao Chong
17. The Hallucination Trap: Can AI's Mistakes Destroy the Internet? | AGI Series, Part 4
18. Google's Top AI Minds: Jeff Dean and Noam Shazeer on Google's 25-Year Journey in AI
19. Stop Memorizing Prompts: The Real Scarce Skill in the AI Era is 'Demand-Oriented Thinking'
20. Tech Trends: Thirty Years of Hype Cycles
21. GPT-4.5 Goes Big, Claude 3.7 Reasons, Alexa+ Goes Agentic, and more...

Qwen Inference Model QwQ-32B is Now Open Source!

通义大模型

mp.weixin.qq.com

03-06

794 words · 4 min

This article announces the open-sourcing of the Qwen QwQ-32B inference model. The model performs well across multiple benchmarks. Its mathematical reasoning and coding proficiency are comparable to DeepSeek-R1, and it even surpasses DeepSeek-R1 in instruction following and tool usage. The model was optimized through two rounds of large-scale reinforcement learning, specifically targeting mathematical and programming tasks, as well as general capabilities. Furthermore, QwQ-32B integrates Agent-related capabilities, enabling critical thinking during tool use. It is now open-sourced on ModelScope and Hugging Face under the Apache 2.0 license, facilitating developer adoption and innovation.

Alibaba Releases New Inference Model QwQ-32B, Matching Performance of DeepSeek-R1 Full Version

机器之心

jiqizhixin.com

03-06

1755 words · 8 min

Alibaba has open-sourced a new inference model, QwQ-32B, which has 32 billion parameters yet performs comparably to the 671-billion-parameter DeepSeek-R1 Full Version, showing that a much smaller model can match a far larger one. Built on Qwen2.5-32B, the model scales up reinforcement learning (RL) from a cold start with a two-stage training approach, achieving significant performance improvements in mathematics and coding tasks. During RL training, the Qwen team derives feedback by verifying the correctness of generated answers and by running generated code on a code execution server, which steadily improves performance on mathematics and programming tasks. QwQ-32B has been open-sourced on Hugging Face and ModelScope, and integrates Agent-related capabilities, enabling it to think critically while using tools and to adjust its reasoning based on environmental feedback. The model performs well on benchmarks such as LiveBench, IFEval, and BFCL, even slightly surpassing DeepSeek-R1-671B. The open-source release of QwQ-32B helps promote AI research and applications. Looking ahead, the Qwen team plans to combine more powerful base models with RL backed by large-scale computing resources in pursuit of Artificial General Intelligence (AGI).
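The outcome-based feedback mentioned in this summary can be pictured with a small sketch. Everything below is an illustrative assumption rather than the Qwen team's actual tooling: the function names `math_reward` and `code_reward` are hypothetical, the math check is a simple numeric comparison, and a local `exec` call stands in for a real, sandboxed code execution server.

```python
from fractions import Fraction


def math_reward(model_answer: str, reference: str) -> float:
    """Hypothetical verifier: 1.0 if the final answer equals the reference, else 0.0."""
    try:
        # Compare numerically so that "0.5" and "1/2" count as the same answer.
        same = Fraction(model_answer.strip()) == Fraction(reference.strip())
    except (ValueError, ZeroDivisionError):
        # Fall back to exact string comparison for non-numeric answers.
        same = model_answer.strip() == reference.strip()
    return 1.0 if same else 0.0


def code_reward(source: str, func_name: str,
                test_cases: list[tuple[tuple, object]]) -> float:
    """Hypothetical stand-in for an execution server: 1.0 only if every test passes."""
    namespace: dict = {}
    try:
        # A real system would run untrusted code in an isolated sandbox, not exec().
        exec(source, namespace)
        func = namespace[func_name]
        passed = all(func(*args) == expected for args, expected in test_cases)
    except Exception:
        # Syntax errors, missing functions, and runtime failures all earn no reward.
        return 0.0
    return 1.0 if passed else 0.0
```

For instance, `code_reward("def add(a, b):\n    return a + b", "add", [((1, 2), 3)])` rewards a correct solution, while a version that subtracts instead earns nothing. A binary signal like this is one simple way to score outputs without a learned reward model.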

Gemini 2.0 Deep Dive: Code Execution

Google Developers Blog

developers.googleblog.com

03-06

615 words · 3 min

Google Gemini 2.0 introduces code execution capabilities, granting the model access to a Python sandbox for running code and learning from the results. Accessible through Google AI Studio and the Gemini API, Gemini models can now perform calculations, analyze complex datasets, and generate visualizations, leading to enhanced answer quality. This feature supports file input and Matplotlib chart output, broadening its application in areas like financial analysis and scientific research for more efficient data processing.

CogView4: Zhipu AI's Open-Source Text-to-Image Model with Bilingual Input & Commercial License

魔搭ModelScope社区

mp.weixin.qq.com

03-04

2805 words · 12 min

This article introduces CogView4, Zhipu AI's latest open-source text-to-image model. The model excels in complex semantic alignment and instruction following, supports arbitrary-length Chinese-English bilingual input, and generates images at arbitrary resolutions. CogView4 achieved a leading score on the DPG-Bench benchmark and is the first open-source image generation model under the Apache 2.0 License, which marks a significant advancement in the field. A key feature of CogView4 is its strength in processing Chinese text and rendering Chinese characters in images, making it well suited to the Chinese market. The article details CogView4's technical features, including two-dimensional rotary position encoding (2D RoPE) for modeling image position, a Flow-matching scheme for diffusion generation modeling, and a multi-stage training strategy on the DiT model architecture. Furthermore, CogView4 removes the traditional fixed token length limitation, improving training efficiency and giving users more creative control. The Apache 2.0 License permits commercial use, lowering the barrier to entry and promoting the adoption of text-to-image technology across diverse sectors.

A Comprehensive Overview of Large Language Models: From Transformer (2017) to DeepSeek-R1 (2025)

Datawhale

mp.weixin.qq.com

03-01

9623 words · 39 min

The article meticulously outlines the important advancements in the Large Language Model (LLM) field since the birth of the Transformer architecture in 2017. It begins by introducing the basic concepts of Language Models and Large Language Models, as well as the working principles of Autoregressive Language Models. Subsequently, the article reviews the development of Pre-trained Models such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer), as well as alignment techniques such as Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). Next, the article explores the emergence of Multimodal Models such as GPT-4V and GPT-4o, and the role of Open-Source and Open-Weight Models in promoting the democratization of AI (Artificial Intelligence) technology. At the same time, the article also mentions the role of Inference Models in solving complex problems. Finally, the article focuses on cost-effective Inference Models such as DeepSeek-R1, emphasizing their potential in lowering the barrier to AI use and promoting innovation, and anticipates future trends in versatility, multimodality, and reasoning capabilities.

Open Operator, Serverless Browsers and the Future of Computer-Using Agents

Latent Space

latent.space

02-28

14733 words · 59 min

In the booming era of AI Agents, Browserbase emerges as a crucial infrastructure provider, tackling the challenges of AI-Web interaction. Addressing the limitations of traditional web scraping due to modern websites' dynamic nature, Browserbase offers scalable and secure browser environments. It maintains a proxy super network to effectively counter anti-bot mechanisms, ensuring stable AI Agent operation. The company's open-source Stagehand framework streamlines AI-browser interaction with Act, Extract, and Observe APIs, lowering the barrier to AI-driven Web automation application development. Browserbase aims to bridge the gap between AI Agents and the Web world.

Dify v1.0.0 Officially Released | Adaptable and Flexible, No Fear of Model Changes

Dify

mp.weixin.qq.com

02-28

3499 words · 14 min

Dify v1.0.0 is officially launched, marking a significant leap for Dify as an AI Application Development platform. Key features of the new version include: the introduction of a plugin architecture that migrates models and tools to plugins, the addition of Agent Nodes, intelligent orchestration and decision scheduling support in Workflow and Chatflow, and the launch of a Marketplace, fostering a thriving plugin ecosystem in collaboration with the community, partners, and enterprise developers. Dify is committed to building the next-generation AI Application Development platform, realizing the four core capabilities of AI applications: reasoning, action, dynamic memory, and Multi-Modal I/O. By decoupling and opening up core capabilities through the Plugin Mechanism, the platform's flexibility is enhanced to meet the application development needs of developers in different scenarios. In the future, Dify will continue to improve developer documentation and toolchains, and invite global developers to participate in the co-construction of the platform through online and offline activities.

The Age of AI: Essential AI Agents You Should Know

山行AI

mp.weixin.qq.com

03-04

24171 words · 97 min

This article provides a comprehensive review of popular open-source AI Agent projects, organized by category (general, programming, data analysis, etc.). It details numerous projects, including Adala, Agent4Rec, and AgentForge, covering their features, applications, and relevant links. The overview helps developers quickly grasp the current AI Agent landscape and identify suitable open-source projects.

Video Cover Production with LLMs: Solutions and Practices

大淘宝技术

mp.weixin.qq.com

03-05

6812 words · 28 min

This article details Taobao's design and implementation of a multimodal LLM-based AI Agent for cover generation, addressing inconsistent user-uploaded cover quality that impacts click-through rates. Targeting static and dynamic covers, this solution employs a modular Agent architecture, leveraging multimodal LLM capabilities. It supports various business needs with a white-box approach, offering flexibility and efficiency. This is achieved through the collaboration of core modules (planning, memory, action, and reflection), intelligent marketing highlight generation, and automated decorative text layout, ultimately automating the production of high-quality covers. The article details the technical implementation of each module, including the ReKV-based streaming long video processing engine, the two-stage intelligent frame selection pipeline, intelligent generation of marketing highlights, and automated decorative text layout. Experimental results demonstrate that this solution significantly improves cover click-through rates and encourages content consumption.

LLM Application Development: A Comprehensive Guide

腾讯云开发者

mp.weixin.qq.com

03-04

20078 words · 81 min

This article provides an introductory guide to Large Language Model application development for developers without an AI background. It explains how LLMs play a role in business, emphasizing that developers can participate without a deep background in AI and mathematics. It details the application development process based on LLMs, including how to collaborate with LLMs using Prompt Engineering and implement complex functions through Function Calling. It delves into how LLMs can be applied to practical business scenarios such as knowledge-based question answering, using RAG (Retrieval-Augmented Generation) technology to address the context length limitations of LLMs, ensuring the relevance and accuracy of retrieval results. Finally, it highlights the potential of AI Agents.
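The RAG idea in this summary, retrieving only the most relevant passages and placing them in the prompt so the model is not limited by its context length, can be sketched as follows. This is a minimal illustration under stated assumptions, not the guide's implementation: the helper names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) are hypothetical, and a bag-of-words cosine similarity stands in for the embedding model and vector store a production system would typically use.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble a prompt containing only the retrieved context and the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

The resulting string would then be sent to whichever LLM the application uses. Because only the top `k` passages are included, the knowledge base can grow far beyond what fits in a single prompt.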

Manus Arrives: Is This the 'GPT Moment' for AI Agents?

极客公园

mp.weixin.qq.com

03-05

4869 words · 20 min

The article introduces Manus, an AI Agent product developed by Monica.im and billed as the world's first of its kind. Manus emphasizes delivering final results directly. Using a multi-agent architecture that simulates how humans work, it runs in an independent virtual machine and can call various tools to complete complex tasks. The article lists application cases of Manus in travel planning, stock analysis, educational content creation, insurance policy comparison, and B2B applications, demonstrating its ability to plan and execute tasks autonomously. Manus acts like a digital agent or intern, capable of autonomous learning and optimization based on user needs. Drawing on its understanding of user needs and its technical expertise, Monica.im has evolved from browser plug-ins to AI Agents, culminating in the launch of this product and raising expectations for the future of AI Agents.

Li Jigang's Manus Review: The God Hand

Founder Park

mp.weixin.qq.com

03-06

1071 words · 5 min

Li Jigang provides a hands-on review of Manus, a new product from Monica. Through examples including comics, animations, and SVG cards, the article showcases Manus's ability to generate various kinds of web content from Prompts. The author uses the concepts of the 'Ladder of Abstraction' and 'Abstraction Leak' to explain the trend toward higher levels of AI abstraction and simpler user interfaces, providing a theoretical basis for Manus's advantages. The article also explores the potential for AI to extend human capabilities, such as enhancing mobility through self-driving cars and robots and extending what human hands can do through Agents, ultimately achieving a 'God Hand'-like capability.

Z Product | Top Global Products (Feb 24 - Mar 2), Two Chinese-Founded Teams in Top Three

Z Potentials

mp.weixin.qq.com

03-06

4235 words · 17 min

This article summarizes the top ten new products on Product Hunt from February 24th to March 2nd. These products are primarily driven by AI technology, spanning image generation, social media analytics, and low-code/no-code (LCNC) tools, showcasing the latest technological innovations at the intersection of AI and various industries, and addressing practical challenges such as improving efficiency and democratizing access. Notably, products from Chinese-founded teams like OpenArt Consistent Characters and Currents AI have performed strongly, offering readers a quick overview of the innovative overseas product ecosystem.

Extensive Interview with Anthropic CPO: Moving Beyond Models, Regretting the Late Start on First-Party Products

Founder Park

mp.weixin.qq.com

03-04

20139 words · 81 min

Anthropic CPO Mike Krieger shared the company's strategic transformation in an interview, emphasizing the shift from a model provider to an AI partner, building deep collaborative relationships. Anthropic is heavily investing in first-party products to accelerate learning, enhance brand presence, and establish a sustainable competitive advantage. Krieger also shared his views on DeepSeek and Anthropic's reflections on product releases and marketing. The value of AI lies in its integration with workflows, not just providing the model itself.

Lovable: $17 Million+ ARR in 3 Months, User Retention Exceeds ChatGPT

海外独角兽

mp.weixin.qq.com

03-06

7498 words · 30 min

The article introduces Lovable, an AI Coding startup that enables non-technical individuals to quickly build and refine Web Apps using natural language and images through AI technology. Lovable's ARR grew from $0 to $17 Million within three months of launch, with excellent user retention, making it one of the fastest-growing startups in European history. The article analyzes Lovable's product features, team background, growth strategy, market competition, and future impact, while also highlighting potential risks such as intense competition in the AI Coding field and high dependence on partners like Supabase.

AiPPT.cn: Achieving 10M Users & Profitability in One Year - A Conversation with PixelBloom CEO Zhao Chong

Founder Park

mp.weixin.qq.com

03-05

16068 words · 65 min

The article compiles a conversation between Founder Park and Zhao Chong, founder and CEO of PixelBloom (AiPPT.com), on how AiPPT.cn reached over 10 million users and profitability within a year. AiPPT built strong brand recognition among users and differentiated itself from traditional PPT tools through an AI-native experience that addresses users' pain points in structuring content and organizing data. In terms of market strategy, AiPPT ran fine-grained channel and audience operations and partnered with key traffic sources, offering its capabilities to those partners. The company also built an integrated platform spanning user growth, R&D, content, and talent to support rapid product iteration and market expansion. Its revenue comes mainly from subscriptions and API revenue sharing. Zhao Chong also shared his views on how startups can break through in a market dominated by giants by competing on differentiation, emphasizing the importance of identifying and capitalizing on market gaps.

The Hallucination Trap: Can AI's Mistakes Destroy the Internet? | AGI Series, Part 4

腾讯科技

mp.weixin.qq.com

03-06

11055 words · 45 min

This article delves into the 'hallucination' problem of large language models, analyzing it from multiple perspectives, including technical principles, impact on information dissemination, and social governance. Experts point out that hallucinations are not simply technical defects but stem from the models' probabilistic, prediction-based nature and from information gaps in the training data. At the same time, human cognitive biases, the questioning of expert authority, and the dynamics of dissemination in the post-truth era exacerbate the spread of false information. The article explores strategies to address hallucinations, including companies reducing them through post-training and alignment, governments implementing effective regulation, and users improving their AI literacy. It also discusses the risks that large models may pose, such as a self-perpetuating cycle in which true and false information become indistinguishable, and the potential impact on individual consciousness. It advocates using 'confabulation' instead of the anthropomorphic term 'hallucination.' Finally, experts offer practical advice, emphasizing search verification, detailed questioning, and cross-checking against multiple sources.

Google's Top AI Minds: Jeff Dean and Noam Shazeer on Google's 25-Year Journey in AI

Founder Park

mp.weixin.qq.com

02-28

37866 words · 152 min

In this in-depth interview, Google Chief Scientist Jeff Dean and Transformer co-inventor Noam Shazeer reflect on their 25 years at Google, from early PageRank and MapReduce to today's Transformer, MoE, and the latest Gemini, while envisioning the future of Artificial General Intelligence (AGI). They share unique insights on Moore's Law and TPU development, revealing Google's strategic vision for hardware and algorithm co-design – the Pathways architecture. Noam Shazeer also predicts that “the world's GDP will grow a hundredfold in the near future” and anticipates “running millions of AI researchers in Google data centers and living to 3000.” The interview spans Google's early days, the interplay of computing power and algorithms, the birth of Transformer, breakthroughs in AI research, the evolution of AI hardware, and the challenges and opportunities in AGI research and development, showcasing the profound thinking and predictions of these two AI leaders on technological evolution and the future of AGI.

Stop Memorizing Prompts: The Real Scarce Skill in the AI Era is 'Demand-Oriented Thinking'

人人都是产品经理

woshipm.com

02-28

2671 words · 11 min

The article argues that, as AI technology develops rapidly, mastering 'Demand-Oriented Thinking' matters more than mastering prompt techniques. As AI models' understanding improves, prompt templates are gradually becoming obsolete. A qualified 'Client' should possess three core capabilities: precisely defining problems (by thoroughly understanding the target audience, excluding what is out of scope, and benchmarking), dynamically adjusting expectations (treating AI as a collaborative partner, producing MVP-style output, and iterating quickly), and professional oversight and value judgment (establishing an 'Input - Output' dual verification mechanism). The article also proposes three practical principles: turning instructions into stories, building a sense of layered requirements, and cultivating the ability to 'translate' for AI. It emphasizes that in the AI era, core competitiveness lies in accurately defining problems and injecting professional knowledge into the human-computer collaboration loop.

Tech Trends: Thirty Years of Hype Cycles

阮一峰的网络日志

ruanyifeng.com

03-07

4974 words · 20 min

This issue of the newsletter examines the cyclical nature of tech hype, reviewing the past thirty years of technological advancements. It highlights the real opportunities and wealth-generating potential behind the hype, while also acknowledging the risks. The newsletter encourages tech practitioners to seize these opportunities for rapid career growth. It also features innovative AI applications in mural restoration, detailing the use of computer technology to restore murals damaged during World War II. Furthermore, it addresses the gap between executives and employees, offering insights into technology trends, AI applications, and leadership perspectives.

GPT-4.5 Goes Big, Claude 3.7 Reasons, Alexa+ Goes Agentic, and more...

deeplearning.ai

03-05

3023 words · 13 min

This issue of the deeplearning.ai Batch explores the challenges of VAD (Voice Activity Detection) in voice interaction and introduces Kyutai Labs' Moshi model as a solution through continuous listening. The issue also covers Inception Labs' text generation diffusion model, Mercury Coder, emphasizing its diffusion-based nature and speed. Furthermore, it provides a comparative analysis of OpenAI's GPT-4.5, highlighting its large scale despite being a non-reasoning model, and Anthropic's Claude 3.7 Sonnet, underscoring its hybrid reasoning approach and user-controlled reasoning token budget. The article also mentions OpenAI's GPU shortage and Anthropic's Claude Code tool.
