2026-04-20 · 11 min

MCP server card, agent skills, and other standards you haven't heard of yet

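This post explores how web architecture is evolving from human browsing toward automated systems, takes a close look at emerging standards such as llms.txt, MCP server cards, and Agent Skills, and helps developers build a machine-readable interface for their sites.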

agents · mcp · llms · web · standards

The web was built for human eyeballs. We spent thirty years optimizing HTML for visual consumption, bolting on schema.org microdata when Google forced our hand, and largely ignoring the fact that machines might want to read our sites without scraping the DOM.

Then came the agents. In 2026, the web is crawling with autonomous systems trying to perform tasks. But agents can't "see" a website. They can't intuitively understand that the "Log In" button is a gateway to authenticated state, or that a site offers an API for executing trades. They rely on the DOM, which is a fundamentally broken interface for a machine.

We are currently in the awkward adolescence of agent-web interaction. To bridge the gap, a cottage industry of "agent-discovery standards" has exploded. Every month, a new .well-known file or text specification drops, promising to make your site "agent-ready."

Most of them are noise. A few are critical infrastructure.

This post is a survey of the emerging agent-discovery standards. I'll cover llms.txt, MCP server cards, agent skills declaration files, Web Bot Auth, Content Signals, and the x402 payment layer. I'll explain what each tries to solve, how they overlap, and whether you should actually bother implementing them on your own site.

The Contenders

llms.txt: The Accidental Standard

The llms.txt file was born out of sheer frustration. Before it existed, if an agent wanted to read your documentation, it had to scrape your site, parse out the navigation, filter out the boilerplate, and try to assemble a coherent context window. It was expensive, slow, and error-prone.

The premise of llms.txt is painfully simple: put a text file at the root of your domain (/llms.txt) that provides a markdown-formatted directory of your site's most critical information, optimized for an LLM's context window.

Adoption costs almost nothing. You write a markdown file, serve it statically, and agents immediately reason better about your documentation.

Why it caught on: It solves an immediate, painful problem for agent developers—context assembly. By providing a pre-digested index and direct links to clean markdown content, you save the agent tokens and time. It's the robots.txt of the agent era.
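A minimal sketch of generating such a file, assuming the commonly used layout of a title heading, a one-line blockquote summary, and sections of markdown links. The `Link` type, the `render_llms_txt` helper, and the example URLs are hypothetical names for illustration, not part of any specification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Link:
    title: str
    url: str
    note: str = ""


def render_llms_txt(site: str, summary: str, sections: Dict[str, List[Link]]) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical site and URLs, for illustration only.
    print(render_llms_txt(
        "Example Docs",
        "API reference and guides for the Example service.",
        {"Docs": [Link("Quickstart", "https://example.com/docs/quickstart.md", "five-minute setup")]},
    ))
```

Generating the file from the same source as the docs navigation is one way to reduce the staleness problem, since the index is rebuilt whenever the docs are.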

Criticisms: It's too simple. llms.txt is entirely read-only. It doesn't tell an agent how to do anything; it only tells it what exists. Furthermore, it's unstructured. While there are conventions (like linking to an llms-full.txt), there is no rigid schema. Agents still have to use their reasoning capabilities to parse the file, which feels like a step backward for a supposedly machine-readable standard. It also suffers from staleness—developers update their docs but forget to update their llms.txt.

MCP (Model Context Protocol) Server Cards

The Model Context Protocol (MCP) was Anthropic's attempt to standardize how models interact with external data sources and tools. While MCP itself is a protocol (running over stdio or HTTP, originally via SSE and more recently via Streamable HTTP), the MCP server card is a discovery mechanism.

An MCP server card is typically hosted at .well-known/mcp.json. It tells an agent: "Here is an MCP server you can connect to, here are the transports it supports, and here is a high-level summary of the tools and resources it exposes."

How discovery works: When an agent lands on your site, it checks for .well-known/mcp.json. If present, the agent can initiate an MCP connection (e.g., via Server-Sent Events). Once connected, the agent calls the standard MCP tools/list and resources/list methods (JSON-RPC requests, not separate HTTP endpoints) to introspect the server's capabilities dynamically.

The server card bridges the gap between static discovery (finding the server) and dynamic capability negotiation (using the server).
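The two halves of that handoff can be sketched in a few lines. This is an illustration under assumptions: the card layout used here (a `transports` list of objects with `type` and `url`) is invented for the example and is not a published schema, and the helper names are hypothetical. The JSON-RPC 2.0 envelope for `tools/list` is the part that follows MCP itself.

```python
import json
from typing import Optional, Sequence
from urllib.parse import urljoin


def server_card_url(origin: str) -> str:
    """Return the well-known location described above for the server card."""
    return urljoin(origin, "/.well-known/mcp.json")


def pick_endpoint(card_text: str, preferred: Sequence[str] = ("sse",)) -> Optional[str]:
    """Pick the first endpoint whose transport the agent supports.

    Assumes a hypothetical card shape: {"transports": [{"type": ..., "url": ...}]}.
    """
    card = json.loads(card_text)
    for transport in card.get("transports", []):
        if transport.get("type") in preferred:
            return transport.get("url")
    return None


def jsonrpc_request(method: str, request_id: int) -> str:
    """Build a JSON-RPC 2.0 request such as tools/list or resources/list."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": method})


if __name__ == "__main__":
    print(server_card_url("https://example.com/blog/post"))
    print(jsonrpc_request("tools/list", 1))
```

Static discovery ends at `pick_endpoint`; everything after that is ordinary MCP traffic over whichever transport was selected.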

Why it matters: MCP is the most robust standard for action. If you want an agent to execute code, query a database, or interact with a SaaS platform on behalf of a user, MCP is the established pipe. The server card simply makes that pipe discoverable from the open web, rather than requiring manual configuration by the user.

Agent Skills Declaration

While MCP is focused on generic tools and resources, the Agent Skills declaration (.well-known/agent-skills.json) is focused on high-level, domain-specific workflows.

An "Agent Skill" is essentially a parameterized prompt combined with tool execution logic. The declaration file tells visiting agents: "I have predefined, optimized skills for this specific site."

What fields it contains: A typical agent-skills.json defines a list of skills, each containing:

name: A unique identifier (e.g., github/create-pull-request).
description: What the skill does.
parameters: A JSON Schema of required and optional inputs.
entrypoint: The URI or protocol to invoke the skill.
context_requirements: What prior knowledge or authentication the agent needs.
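As a rough illustration of those five fields, here is one skill entry modeled in Python, with a pre-flight check an agent might run before invoking it. The entrypoint URL, the parameter names, and the `missing_inputs` helper are invented for the example. The check only looks at the schema's `required` list and is not a full JSON Schema validator.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Skill:
    name: str
    description: str
    parameters: dict  # JSON Schema object describing the inputs
    entrypoint: str
    context_requirements: List[str] = field(default_factory=list)


def missing_inputs(skill: Skill, inputs: dict) -> List[str]:
    """Return required parameter names that are absent from inputs."""
    required = skill.parameters.get("required", [])
    return [name for name in required if name not in inputs]


# Hypothetical entry; the URL and parameter names are assumptions.
create_pr = Skill(
    name="github/create-pull-request",
    description="Open a pull request from a branch.",
    parameters={
        "type": "object",
        "properties": {
            "repo": {"type": "string"},
            "branch": {"type": "string"},
            "title": {"type": "string"},
        },
        "required": ["repo", "branch"],
    },
    entrypoint="https://example.com/skills/create-pull-request",
    context_requirements=["authenticated-user"],
)

if __name__ == "__main__":
    print(missing_inputs(create_pr, {"repo": "acme/site"}))
```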

How it differs from MCP: MCP provides raw primitives (e.g., git_commit, git_push). Agent Skills provide compound workflows (e.g., "Create a PR fixing a bug described in issue #123"). MCP requires the agent to reason about how to sequence tools; Agent Skills offload that reasoning to the site owner, who provides a pre-baked, guaranteed-to-work execution path. It is a declarative API for agents.

Web Bot Auth

Authentication has been the bane of agentic workflows. For years, users had to copy-paste API keys into agent interfaces, or agents had to awkwardly attempt OAuth flows designed for humans in browsers.

Web Bot Auth is an emerging standard for signed bot identity. It allows an agent to cryptographically prove who it is (e.g., "I am a worker for the Gemini CLI system") and on whose behalf it is acting ("I am acting on behalf of user Kieran, who authorized me via standard delegation").

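The draft builds on HTTP Message Signatures (RFC 9421): signed requests carry Signature, Signature-Input, and Signature-Agent headers, and a well-known JWKS key directory lets the receiving site look up and verify the agent's public keys.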

Why it's necessary: As agents move from reading documentation to mutating state, trust becomes paramount. Web Bot Auth allows a site to differentiate between a malicious scraper and an authorized agent running a task for a paying customer. It enables sites to grant fine-grained permissions to agents without requiring the user to be in the loop for every single action.
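The shape of the check on the receiving side can be sketched as follows. Two loud assumptions: the canonical string layout is invented for this example, and HMAC-SHA256 with a shared key stands in for the public-key signature so the sketch runs on the standard library alone. A real deployment would verify an asymmetric signature against keys fetched from the JWKS.

```python
import hashlib
import hmac


def signing_input(method: str, path: str, agent: str, on_behalf_of: str, created: int) -> bytes:
    """Canonical bytes covered by the signature (layout invented for this sketch)."""
    return "\n".join([method.upper(), path, agent, on_behalf_of, str(created)]).encode()


def sign(key: bytes, method: str, path: str, agent: str, on_behalf_of: str, created: int) -> str:
    # Stand-in only: HMAC-SHA256 instead of a private-key signature.
    message = signing_input(method, path, agent, on_behalf_of, created)
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, signature: str, method: str, path: str, agent: str,
           on_behalf_of: str, created: int, now: int, max_age: int = 300) -> bool:
    """Reject stale or future-dated requests, then compare in constant time."""
    if not 0 <= now - created <= max_age:
        return False
    expected = sign(key, method, path, agent, on_behalf_of, created)
    return hmac.compare_digest(expected, signature)
```

Binding the method, path, identity, delegating user, and timestamp into the signed bytes is what stops a captured signature from being replayed against a different endpoint or long after it was issued.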

Content Signals

Cloudflare introduced Content Signals as a response to the massive backlash from publishers who were angry that AI companies were scraping their content for training data without compensation.

Content Signals (.well-known/content-signals.json or HTTP headers) is a machine-readable policy file. It goes beyond the binary Allow/Disallow of robots.txt.

What it solves: It allows a site owner to specify granular usage rights:

training: Can this content be used to train foundation models?
rag: Can this content be indexed for Retrieval-Augmented Generation?
summary: Can this content be summarized for immediate user display?
attribution_required: Must the agent provide a canonical link if the content is used?
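How a well-behaved agent might consult such a policy, as a sketch. The JSON document below is a hypothetical example built from the four fields listed above. Treating a missing field as a refusal is a conservative choice made for this example, since the text does not say what the default should be.

```python
import json

# Hypothetical policy document using the four fields listed above.
POLICY_JSON = '{"training": false, "rag": true, "summary": true, "attribution_required": true}'


def may_use(policy: dict, purpose: str) -> bool:
    """Allow a purpose only when the policy explicitly grants it."""
    return policy.get(purpose) is True


def usage_plan(policy: dict, purpose: str) -> dict:
    """Combine the permission with the attribution requirement."""
    allowed = may_use(policy, purpose)
    return {
        "allowed": allowed,
        "must_link_source": allowed and bool(policy.get("attribution_required", False)),
    }


policy = json.loads(POLICY_JSON)

if __name__ == "__main__":
    for purpose in ("training", "rag", "summary"):
        print(purpose, usage_plan(policy, purpose))
```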

The reality: It is legally untested and entirely honor-based. However, the major foundation model providers (Google, OpenAI, Anthropic) have agreed to respect it because the alternative is regulation. For site owners, it is the only viable mechanism to exert control over how their intellectual property is consumed by AI.

x402: The Payment Layer

Here is where my bias shows, as I spend most of my time working on x402.

If agents are going to perform meaningful work, they need to pay for things. The web lacks a native, machine-to-machine payment standard. x402 takes HTTP status code 402 (Payment Required), which has been reserved but largely unused since the early HTTP specifications, and formalizes it for the agent era.

When an agent attempts to access a premium resource or invoke a high-cost tool, the server responds with a 402 Payment Required along with an x402-invoice header containing a lightning network invoice or a smart contract payment payload.

What it solves: It enables micro-transactions for API usage, compute time, or premium content without requiring the user to have a pre-existing subscription with the site. The agent negotiates the payment using its own wallet (funded by the user), pays the invoice, and retries the request.
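The pay-and-retry loop can be sketched without any network code by passing the transport and the wallet in as functions. The `x402-invoice` header name comes from the description above. The `x402-proof` request header, the `fetch_with_payment` helper, and the tuple-shaped response are assumptions made for this example.

```python
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


def fetch_with_payment(
    send: Callable[[Dict[str, str]], Response],
    pay: Callable[[str], Optional[str]],
    max_attempts: int = 2,
) -> Response:
    """Call send(); on a 402, pay the invoice from the header and retry with proof."""
    headers: Dict[str, str] = {}
    response = send(headers)
    for _ in range(max_attempts - 1):
        status, resp_headers, _body = response
        invoice = resp_headers.get("x402-invoice")
        if status != 402 or invoice is None:
            break
        proof = pay(invoice)  # the wallet may refuse, e.g. when over budget
        if proof is None:
            break
        headers = {"x402-proof": proof}  # hypothetical header name
        response = send(headers)
    return response
```

Capping the number of attempts and letting the wallet decline are the two guards that keep an agent from paying repeatedly for a request that keeps failing.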

This transforms agents from cost centers (burning through a user's API quota) into autonomous economic actors.

How They Compose

These standards are not mutually exclusive; they are orthogonal and highly complementary. A fully "agent-ready" site in 2026 implements all of them.

Here is what the file tree of a modern, agent-optimized domain looks like:

kieran-site/
├── .well-known/
│   ├── mcp.json             (Exposes the site's dynamic action capabilities via MCP)
│   ├── agent-skills.json    (Pre-baked complex workflows for agents)
│   ├── content-signals.json (Usage policy for the site's text and data)
│   └── bot-auth.jwks        (Public keys for authenticating incoming agents)
├── public/
│   ├── llms.txt             (The human-readable directory for LLMs)
│   └── llms-full.txt        (The concatenated, token-optimized context dump)
└── api/
    └── execute-trade        (An endpoint that returns a 402 if the agent hasn't paid)

The Flow:

1. An agent navigates to the domain.
2. It reads llms.txt to understand the domain's purpose and layout.
3. It checks content-signals.json to ensure it is allowed to index the site for RAG.
4. It discovers .well-known/mcp.json and connects to the MCP server.
5. It reads .well-known/agent-skills.json and realizes there is a pre-baked skill for "Summarize recent posts and post them to Twitter".
6. The agent attempts to execute the skill. The server demands authentication.
7. The agent uses Web Bot Auth to sign the request, proving it is acting on behalf of an authorized user.
8. The server accepts the auth but returns a 402 Payment Required status because the skill costs $0.05 in compute.
9. The agent pays the invoice, retries the request, and the skill executes.

This is the holy grail of agentic web interaction: fully automated, secure, compensated, and discoverable.

What's Hype vs. What to Ship

The reality is that you are a busy developer and you do not have time to implement six different experimental standards. You need to know what actually moves the needle today, and what is just performative standards-boarding.

Hype: Agent Skills Declaration

Right now, agent-skills.json is a solution looking for a problem. Most agents are smart enough to use raw MCP tools to achieve their goals. Defining rigid, pre-baked skills is fragile. If the underlying API changes, the skill breaks. Furthermore, there is no centralized registry for these skills, meaning agents have to blindly hope they stumble across one that matches their current objective. Skip this until the major agent frameworks (LangChain, LlamaIndex, Semantic Kernel) converge on a unified schema.

Hype: Content Signals

Unless you are the New York Times, Reddit, or StackOverflow, foundation models do not care about your data enough to respect your granular content-signals.json. They are going to scrape you, or they are going to ignore you. If you don't want to be scraped, use a WAF. If you want to be scraped, do nothing. Implementing Content Signals is an exercise in virtue signaling for small to medium-sized sites.

Emerging, but Wait: Web Bot Auth & x402

Web Bot Auth is critical infrastructure, but the tooling isn't there yet. Setting up the JWKS endpoints and handling the signature verification is still too complex for a weekend project. Wait for services like Clerk or Auth0 to offer native "Bot Auth" toggles.

Similarly, x402 is the future of agent monetization, but unless your userbase consists entirely of crypto-native power users running autonomous wallets, it will just block usage. The UX of funding an agent wallet is still abysmal.

Opinion: What to Ship Today

There are exactly two standards you should implement on your site today.

1. llms.txt

Do this immediately. It takes five minutes. Write a markdown file explaining what your site is, where the API docs are, and what the core concepts are. Put it at /llms.txt.

Why? Because developers are already using agents (like Claude Desktop, Cursor, or Gemini CLI) to write code against your product. If an agent can cleanly read your llms.txt, the developer gets a working implementation faster. If it can't, the developer gets hallucinations and blames your product. llms.txt is pure DX (Developer Experience).

2. MCP Server Cards (.well-known/mcp.json)

If your site has a product that users interact with (an API, a database, a SaaS tool), you should be building an MCP server. Once you build it, expose it via the server card.

MCP is winning the tool-use war. Anthropic backed it, the open-source community adopted it, and it provides a clean, standardized way for agents to take action. The .well-known/mcp.json file is just the router that points the agent to your MCP endpoint.

If your site is just a blog, you don't need MCP. But if you have a product, MCP is the difference between an agent reading about your product and an agent using your product.

The Bottom Line

We are transitioning from a web of documents to a web of capabilities. The visual DOM will remain for humans, but a parallel, machine-readable web is being constructed underneath it.

You don't need to adopt every specification that hits Hacker News. But you do need to acknowledge that your site's audience is no longer exclusively human. Start small. Ship llms.txt. Explore MCP. Leave the rest to the standards committees until the dust settles.
