Navigating Media Publishing in an Agentic World

Surveying the landscape of standards and protocols that allow publishers to get paid in an agentic web.

VIOLET

April 7, 2026

Abstract patterned illustration in dark red with scattered white squares.

Let’s say you wanna watch a movie in theatres. A good movie please, an artsy flick with limited distribution, ideally subtitled, in a space with no teenagers and a lobby cocktail bar. You ask your AI assistant: “where can I watch an artsy film in an indie theatre in Toronto this weekend?” Your assistant does a couple of web searches and comes back with some suggestions. One of them appeals to you, so you go directly to the theatre’s website and buy tickets. Congratulations: you are destroying the publishing industry.

Here’s the problem: to get you those results, your AI agent had to read a number of different websites, including, in this case, torontoguardian.com, todocanada.ca, moviefone.com, cinemaclock.com, as well as sites like ticketmaster.ca, hotdocs.ca or tiff.net. These two groups of results are quite distinct: the latter set of websites exists exclusively to give you a place to go to buy tickets. The website is a storefront for a product, namely movie tickets. The former group, though, is selling you the content of the website itself.

Screenshot of an AI assistant returning a list of indie art house theatre showtimes in Toronto, with results sourced from todocanada.ca, revuecinema.ca, factorytheatre.ca, and torontoguardian.com.

This made sense in a world where you were the one doing the searching. If you landed on cinemaclock.com you’d see their banner ad. Landing on thestar.com would show you their paywall and invite you to subscribe to read the article. Social media complicated this picture: Twitter users would screenshot articles and serve as informal content mediators. AI has only made that mediation steeper. Now your agent is landing on those websites, often skipping their paywall, and synthesizing the results to present them to you. The AI agent relies on these sites to get you your results, but the people who do the work of maintaining and populating them are not being compensated at all.

Fortunately, the publishing industry is actively working on this problem of uncompensated, AI-driven content mediation. Multiple solutions are emerging to address its different aspects; some of these solutions are open, some are proprietary. Some are interoperable, while others lock publishers into specific vendors. This piece will attempt an overview of the current landscape. As such, some of the specific solutions identified will likely be outdated by the end of the year, but the different layers and their specific requirements will probably stay the same.

Layer 1: Visibility

Let’s say you run a website that is, you suspect, being crawled by agents. The first step in solving any problem is to quantify it: you need a way to identify which agents are crawling your website and how often. This turns out to be the hardest single problem to solve. There is an open standard, RFC 9309, that says robots should identify themselves via the “User-Agent” header–a piece of information carried by every request to your site that declares “I am entity X”. Another draft open standard, web bot auth, builds on RFC 9421 to provide a richer, cryptographically signed claim that lets an agent declare not just which agent it is, but which company is actually hosting it. This is increasingly important in a world of open-weight agents that can be whitelabeled and served by different entities.
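
As a rough sketch, here is what a server-side classifier built on those two signals might look like. The header names follow RFC 9309 (User-Agent) and the web bot auth draft (Signature, Signature-Agent); the agent names in the set are illustrative examples, not an authoritative list, and a real deployment would also verify the signature cryptographically rather than trusting its presence.

```python
# Illustrative sketch: classify a visitor by its voluntary self-disclosure.
# Agent names below are examples only; real code must verify signatures.

KNOWN_AI_AGENTS = {"GPTBot", "ClaudeBot", "PerplexityBot"}  # example values

def classify_visitor(headers: dict) -> str:
    """Return a coarse label for a request based on self-declared identity."""
    ua = headers.get("User-Agent", "")
    # A signed web-bot-auth claim names the hosting company, not just
    # an agent string -- a stronger signal than User-Agent alone.
    if "Signature" in headers and "Signature-Agent" in headers:
        return f"verified-bot:{headers['Signature-Agent']}"
    for agent in KNOWN_AI_AGENTS:
        if agent in ua:
            return f"self-declared-bot:{agent}"
    return "presumed-human"
```

Note the fall-through: anything that discloses nothing is indistinguishable from a human, which is exactly the gap the next paragraph describes.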

Both of these, however, are voluntary self-declarations, and there is no enforcement mechanism to make agents self-disclose at all, or to self-disclose honestly (you’ll find this is a leitmotif in this piece). Agents in the wild have been found to misrepresent themselves as humans running web browsers. Even when they don’t pretend to be humans, agents often self-report in problematic ways. Google’s AI agent, for example, is indistinguishable from Google’s regular crawler, so it’s impossible to tell “agent-mediated user traffic” apart from “Google indexing our page to make it search-discoverable.”

The best solution in this space seems to come from Cloudflare, whose AI Crawl Control product uses machine learning heuristics to identify traffic patterns as originating from AI agents. This can be enabled by any Cloudflare-fronted website for free today, which is great news if you’re already on Cloudflare, and locks you into a new vendor if you’re not.

Layer 2: Rights Expression

So you’ve either enabled crawl control or created a visibility layer to surface whatever honest user-agent data you’re able to get. And it turns out agents love to read your content; now you want to get paid when they access that content. The “how do I get paid” question decomposes into several layers, the first being simply “How do I state that I’d like to get paid?”

Fortunately, there is an open standard named Really Simple Licensing (RSL) that allows content providers to state, in machine-readable form, exactly what licensing terms apply to their content. This enables you to declare terms like “Pay me 5 cents per crawl via settlement protocol x402”, or “If you’re agent X, you may access through a separate licensing agreement with vendor Y,” or even “Please negotiate terms of access with my publishing collective at xyz.org.”
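
To make this concrete, here is a hedged sketch of generating a machine-readable licensing declaration. The element and attribute names below are illustrative, not the authoritative RSL schema (consult rslstandard.org for the real vocabulary); the point is simply that terms like price, currency, and settlement protocol become structured data an agent can parse.

```python
# Sketch only: element names are illustrative, NOT the official RSL schema.
import xml.etree.ElementTree as ET

def license_terms_xml(content_url: str, amount: str, currency: str, protocol: str) -> str:
    """Build a minimal machine-readable licensing declaration."""
    root = ET.Element("license")
    ET.SubElement(root, "content").set("url", content_url)
    pay = ET.SubElement(root, "payment")
    pay.set("amount", amount)      # e.g. "0.05" -- 5 cents per crawl
    pay.set("currency", currency)
    pay.set("protocol", protocol)  # e.g. "x402" as the settlement rail
    return ET.tostring(root, encoding="unicode")

terms = license_terms_xml("https://example.com/articles/", "0.05", "USD", "x402")
```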

The IETF has also put eyes on this layer, establishing an AI Preferences Working Group that aims to produce an open standard by August 2026, building on HTTP headers and the robots exclusion protocol of RFC 9309 to create a standardized, internet-native vocabulary that lets content rights holders express how robots may use their content. If ratified and widely adopted, this new standard could converge with RSL to produce one unified rights expression layer.
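
As a heavily hedged illustration of what such a vocabulary could look like: the working group’s current drafts discuss expressing preferences both in robots.txt and in an HTTP response header, with short-form categories for search and AI training. The header name and category tokens below are assumptions based on those drafts and may well change before ratification.

```python
# Hypothetical sketch of an AI-preferences response header value.
# Header name and tokens are draft-era assumptions, not a ratified standard.

def content_usage_value(allow_search: bool, allow_training: bool) -> str:
    """Serialize usage preferences as a comma-separated token list."""
    parts = [
        "search=y" if allow_search else "search=n",
        "train-ai=y" if allow_training else "train-ai=n",
    ]
    return ", ".join(parts)
```

A site that wants to stay search-discoverable while opting out of training would then emit something like `content_usage_value(True, False)`.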

As you may surmise, it’s vital that this layer operate through an open protocol precisely because it expresses settlement terms that may themselves operate through proprietary protocols. Vendor lock-in at this layer would permanently tie content providers to specific licensing agreements or settlement layers.

Layer 3: Settlement

You can think of RSL as a way to hand over a bill that says “You owe X dollars, we accept Visa, Mastercard, and cash.” The settlement layer is the credit card terminal that actually allows others to pay that bill. There are many solutions at this layer, and new ones seem to emerge every day. It’s impossible to provide a full accounting of everything that’s available at this layer, but we can draw some broad trends.

Firstly, there’s the open protocol x402. It allows agents to natively settle transactions in stablecoin, or fiat, with no middleman (other than the payment protocol itself) taking a cut. You provision a wallet for your agent, give it a budget, and the agent draws funds whenever it accesses content. This is the most technically open solution, which is inevitably why it has the highest barrier to adoption: current implementations require everyone to be on a blockchain, and traditional settlement protocols would need to implement new primitives to support x402 over fiat.
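
The x402 handshake itself is simple to sketch: the agent requests content, gets an HTTP 402 carrying machine-readable payment requirements, settles, then retries with proof of payment attached. The `X-PAYMENT` header name follows the x402 spec; the `settle()` stub below stands in for a real signed stablecoin transfer, and the requirement fields are simplified.

```python
# Schematic x402 flow: request -> 402 + payment requirements -> pay -> retry.
# settle() is a stub; a real client signs an on-chain transfer here.

def server(headers):
    if "X-PAYMENT" not in headers:
        # The 402 response advertises, machine-readably, how to pay.
        return 402, {"amount": "0.05", "asset": "USDC", "payTo": "0xPUBLISHER"}
    return 200, "full article text"

def settle(requirements):
    # Stub standing in for a signed stablecoin transfer.
    return "signed-transfer-of-" + requirements["amount"] + "-" + requirements["asset"]

def agent_fetch():
    status, body = server({})
    if status == 402:
        proof = settle(body)
        status, body = server({"X-PAYMENT": proof})
    return body
```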

This is where settlement layers like lobster.cash come in. lobster.cash integrates with Visa’s proprietary Intelligent Commerce tech to provide an API that lets agents create virtual Visa credit cards (with spending limits) that they then draw from to pay for content per crawl (or, indeed, to pay for anything). This lets content providers receive credit card payments directly. The agent credit cards can be paid off via conventional banking methods. If your website is on Cloudflare, their upcoming pay-per-crawl solution works similarly.

What the above solutions have in common is that they essentially present agents with a spot price for your content, then provide a per-crawl settlement mechanism. They’re basically microtransaction engines. This is challenging to implement: why would the agent pay instead of bouncing to the next search result? Solutions like TollBit work differently, acting as an intermediary between content creators and AI agents: negotiating rates with the latter, metering access, and passing on a portion of revenue to the former. When an AI provider has an agreement with TollBit, its agent may crawl TollBit-licensed content without paying at the point of consumption. This creates a mediated relationship between consumer and content producer, much as Spotify sits between listener and artist, with the result that content providers gain bargaining power they wouldn’t have on their own, at the cost of vendor lock-in and a portion of proceeds.

“Wait,” I hear you say, “isn’t Spotify awful for artists?” Oh yes! Surrendering your bargaining power to a middleman really only shifts the problem down: who negotiates with the negotiator? What happens as the portion of revenue that they pass on to you, the content provider, shrinks and shrinks? One solution for this problem is provided, again, by the music industry: collective management organizations (CMOs) are a long-established way for content providers to come together to bargain with organizations that want to consume their content. Publishing CMOs have already started to strike deals with AI companies. The open alternative is something like the RSL collective: an open collective of publishing platforms and online publishers that enables its members to collectively bargain for deals with AI providers.

Ultimately, the settlement layer is the most complex and important of all because, well, someone’s gotta move the money, and whoever does will take a cut. This layer will likely continue to see rapid development in the next few years, with a multitude of ad-hoc solutions emerging.

Layer 4: Enforcement

So you have a way to hand agents a bill; they have a way to settle the bill. But can you stop them from leaving your content restaurant with a full belly without paying the bill? This is the enforcement layer, and it subdivides into two distinct problems.

The first question is simple: how do you block access when payment hasn’t been settled? This is, broadly, as easy as setting up a paywall, with a couple of caveats. Most publishing organizations today use client-side paywalls that serve the whole content upfront, then use frontend code to hide it until users pay. This does nothing to deter agents–they can simply refuse to execute the paywall code. Migrating to server-side paywalls is the easy intervention here. Next, the paywall needs to be legible to agents as something they can interpret. To this end, server-side paywalls need to be served with the 402 Payment Required HTTP status code. There are also simple ways to associate a paywall response with an RSL license, allowing agents to quickly navigate the licensing gate. Integrated solutions like Cloudflare’s or TollBit’s do all of these things end-to-end for their users.
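
The server-side decision can be sketched in a few lines. The key property is that the article body never leaves the server unpaid; the unpaid response is a 402 whose `Link` header points the agent at the licensing terms. The `rel="license"` binding and the `X-PAYMENT` proof header are illustrative assumptions here, not a ratified convention.

```python
# Server-side paywall sketch: unpaid requests get a 402 plus a pointer
# to the license; content is only serialized after proof of payment.
# Header conventions below are illustrative assumptions.

ARTICLE = "the full article body, never sent to unpaid clients"

def respond(request_headers):
    """Return (status, headers, body) for an incoming request."""
    if request_headers.get("X-PAYMENT"):  # proof of settlement present
        return 200, {}, ARTICLE
    # A client-side paywall would ship ARTICLE here and hide it with JS,
    # which an agent simply ignores. Server-side, we ship nothing.
    return 402, {"Link": '<https://example.com/license.xml>; rel="license"'}, ""
```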

The next question is trickier. Once an agent has settled payment, how do you ensure it’s using the licensed content in the ways the license provides for? Agents could choose to, for instance, request data as though they were crawling it for a user response, then use it for training. Malicious agents could go as far as to pay a licensing fee, crawl the content, then rehost it on BitTorrent or IPFS. Again, Cloudflare has a solution here: they explicitly verify bots, and disallow any access to unverified bots. If you misbehave, your bot gets delisted and you can’t access content anymore.

Open solutions are also emerging in this space, including ERC-8004, which seeks to establish an open reputation system. When a bot hits your site, you can check its public reputation, based on interactions it’s had with other sites, to decide how to proceed. These reputation systems will almost certainly expand. For instance, we will likely soon need reputation systems not just for bots themselves but also for the prompts they’re running. When anyone can deploy an open-weight model like Qwen3.5 on commodity hardware, it becomes much harder to tell who the mediator is: is it the Qwen model itself? The Qwen code harness the model is running inside of? The company hosting the harness? The set of skills the harness has access to? The question “can we trust this agent” becomes a question of whether the intersection of all these things–model, harness, skills, organization, user–is trustworthy. Of course, as long as the cost to list a new bot is low, this lends itself to a game of whack-a-mole where new malicious bots are spun up to replace the banned ones.

This is why the ultimate enforcement layer is law enforcement. It’s already illegal to program a robot to crawl licensed content and rehost it on BitTorrent. The way this has conventionally been enforced is through lawsuits, and there’s no reason to think any technical solution will ever supplant a well-functioning judiciary. Legislative action will be needed to ensure a level playing field. Canada has taken some steps in this direction with Bill C-18, showing the federal government is aware of the need to protect the Canadian media landscape. Perhaps the determining factor here is the balance of power between plaintiffs and defendants. Lawsuits where multiple plaintiffs come together to sue big tech companies are much likelier to succeed than lawsuits where those plaintiffs stand alone.

Conclusion

Media publishing revenues have been on the decline for over two decades. Competition from amateur content on social media, the general willingness to trade personal data for free content, and the ease of implementing technical workarounds that hop paywalls have all contributed to a sense of urgency in media publishing. The emergence of AI agents can be another milestone en route to a world of ever-steeper mediation for content creators, or it can be a pivot point. Agents simplify discoverability of published content, make it possible to process and make sense of large amounts of written language quickly, and can potentially automate licensing negotiations and payment settlement. The building blocks are there to create a more equitable future for media publishers. Radical changes are alarming, but if media publishing is to survive, the future will belong to visionary organizations that can leverage this technological disruption to turn crisis into opportunity.

We're always on the lookout for brilliant teams doing bold things with emerging technology. Sound like you? Drop us a line at hello@hypha.coop