The paper compares four ways for LLM agents to use websites for shopping tasks: raw HTML pages, RAG, MCP, and NLWeb.
It finds that agents browsing plain HTML pages are less accurate and much slower than the other three options.
It studies LLM-based assistants that search across online shops for products, compare options, and complete checkout.
These agents use four interface styles: HTML clicking, a retrieval-augmented generation (RAG) search index, Model Context Protocol (MCP) APIs, and a standardized NLWeb natural-language endpoint.
HTML means filling forms and following links, while the other three jump straight to structured product data.
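To make the contrast concrete, here is a minimal sketch of how the same product query might be expressed through each interface. The endpoint names, tool names, and payload shapes are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: one query ("wireless earbuds under $50") expressed
# through each interface style. All names and schemas are assumptions.

import json

query = "wireless earbuds under 50 dollars"

# 1) HTML browsing: fill a search form, follow links, then parse product
#    data back out of the rendered markup.
html_action = {
    "action": "fill_and_submit",
    "form": "#search-form",
    "fields": {"q": "wireless earbuds"},
    "then": "parse product cards from result HTML",
}

# 2) RAG: match the query against a pre-built index of product documents;
#    top hits come back as structured records.
rag_request = {"index": "products", "query": query, "top_k": 10}

# 3) MCP: call a typed tool exposed by the shop's MCP server.
mcp_tool_call = {
    "tool": "search_products",  # assumed tool name
    "arguments": {"text": "wireless earbuds", "max_price": 50.0},
}

# 4) NLWeb: send the natural-language question to a standardized endpoint
#    that returns structured (schema.org-style) results.
nlweb_request = {"endpoint": "/ask", "question": query}

for name, payload in [("HTML", html_action), ("RAG", rag_request),
                      ("MCP", mcp_tool_call), ("NLWeb", nlweb_request)]:
    print(f"{name}: {json.dumps(payload)}")
```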
The authors build simulated shops and give agents tasks like product search, vague search, finding the cheapest offer, and checkout.
They track how many of the correct products each agent returns, how often the answer is fully correct, the tokens spent, and the runtime.
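As a rough sketch of how such per-task scoring could work (my assumption, not the paper's exact evaluation code), one can compare the set of product IDs an agent returns against a gold set and compute precision, recall, F1, and an exact-match flag:

```python
# Assumed scoring sketch: set-based precision/recall/F1 over product IDs,
# plus an exact-match flag for fully correct answers.

def score_task(returned_ids: set[str], gold_ids: set[str]) -> dict:
    true_pos = len(returned_ids & gold_ids)
    precision = true_pos / len(returned_ids) if returned_ids else 0.0
    recall = true_pos / len(gold_ids) if gold_ids else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "exact_match": returned_ids == gold_ids,
    }

# Example: the agent finds two of three gold products plus one wrong item.
print(score_task({"p1", "p2", "p9"}, {"p1", "p2", "p3"}))
# -> precision 0.67, recall 0.67, f1 0.67, exact_match False
```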
On average, RAG, MCP, and NLWeb reach about 0.75 F1, while HTML stays near 0.67 and misses more of the target products.
HTML agents also burn more compute, needing about 3x more tokens and 5x more time per task than the other interfaces.
So the study recommends RAG agents, often with GPT-5 or the cheaper GPT-5 mini, and keeping HTML browsing only as a backup.
---
Paper Link – arxiv.org/abs/2511.23281
Paper Title: "MCP vs RAG vs NLWeb vs HTML: A Comparison of the Effectiveness and Efficiency of Different Agent Interfaces to the Web (Technical Report)"