77 points by ananddtyagi 1 days ago | 71 comments

bobbiechen 1 days ago [-]

I disagree with the other top-level comments at the moment: I believe Web Bot Auth is a useful and non-centralized emerging standard for self-identifying bots and agents.

This press release today is a better statement of _why_ this feature exists (as opposed to the submission link, which is nuts-and-bolts of implementing): https://blog.cloudflare.com/signed-agents/

Web Bot Auth is a way for bots to self-identify cryptographically. Unlike the user agent header (which is trivially spoofed) or known IPs (painful to manage), Web Bot Auth uses HTTP Message Signatures using the bot's key, which should be published at some well-known location.

This is a good thing! We want bots to be able to self-identify in a way that can't be impersonated. This gives website operators the power to allow or deny well-behaved bots with precision. It doesn't change anything about bots who try to hide their identity, who are not going to self-identify anyways.

It's worth reading the proposal on the details: https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-... . Nothing about this is limited to Cloudflare.

I'm also working on support for Web Bot Auth for our Agent Identification project at Stytch https://www.isagent.dev . Well-behaved bots benefit from this self-identification because it enables a better Agent Experience: https://stytch.com/blog/introducing-is-agent/

binarymax 1 days ago [-]

I agree in principle, but I disagree that it should be designed and mandated by a private gatekeeper

jrochkind1 1 days ago [-]

What's now at the top has links to IETF drafts in the first paragraph. What am I missing?

A way to authenticate identity for crawlers so I can allow-list ones I want to get in, exempt them from turnstile/captcha, etc -- is something I need.

I'm not following what makes this controversial. Cryptographic verification of identity for web requests, sounds right.

binarymax 1 days ago [-]

I think about failure modes. What happens if cloudflare decides you are a bot and you’re not. What recourse do you have? What are the formal mechanisms to ensure a person is not blocked from the majority of the web because cloudflare is a middleman and you are a false positive?

jrochkind1 9 hours ago [-]

I am not following what any of that has to do with the Web Bot Auth protocol?

it seems like complaints about Cloudflare's anti-DOS protection services and how they have a monopoly on such, I get that.

I'm not seeing the connection to a protocol for bots/crawlers voluntarily cryptographically signing their http requests, so sites (anyone implementing the protocol not just cloudflare) can use it to authenticate known actors?

I am interested in using it to exempt bots/crawlers I trust/support/have an agreement with from the anti-bot measures I, like many, am being forced to implement to keep our sites up under an enormously increased wave of what is apparently AI-training-motivated repeat crawling. Right now these measures are keeping out bots I don't want to keep out too. I would like to be able to securely identify them to let them in.

delroth 14 hours ago [-]

Don't use a user agent that sends signed headers identifying you as a bot? How are any of the failure modes you mention not /improved/ by the spec proposal this comment section is about?

justincormack 14 hours ago [-]

This is not a spec sbout false positives, ir is about self identification as a bot.

jacobn 1 days ago [-]

Isn't that how most web standards got their start? One of the interested parties pushed something, then things evolved through the standards process?

(And then it can of course get derailed, but that's a separate story)

marginalia_nu 13 hours ago [-]

I generally agree it's a good thing. It stacks the incentives so that bots can meaningfully build a good reputation, and be rewarded for behaving well.

That said, I do think it's the whole procedure is more than a bit overcomplicated to the degree where I doubt it will be widely implemented. You could likely achieve almost the full effect with a request signing alone.

account42 12 hours ago [-]

> This is a good thing! We want bots to be able to self-identify in a way that can't be impersonated.

Who is we? I absolutely don't want that.

estearum 9 hours ago [-]

Earnest question: why not? I would think "option to prove who you are and guarantee not to be impersonated" is a pretty broadly appealing capability except to people trying to do the impersonating.

skeezyboy 8 hours ago [-]

>"option to prove who you are and guarantee not to be impersonated"

guaranteed as long as no attacker gets hold of the private key, which cannot be guaranteed

estearum 8 hours ago [-]

Yeah, I don't find this to be a compelling argument at all.

That's an argument against all authentication anywhere.

skeezyboy 7 hours ago [-]

> That's an argument against all authentication anywhere.

its a problem isnt it

estearum 2 hours ago [-]

everfrustrated 16 hours ago [-]

Isn't this somewhat equilivent to ensuring cookies are required?

Obviously this technology is different but the same sort of result.

What's the end game here? All humans end up having to use a unique encryption key to prove their humanness also?

pmontra 13 hours ago [-]

I understand your concern and we are probably headed into that direction, but that does not prove humanness any more than the subject of this post proves botness. They prove the knowledge of the value of a key.

sneak 19 hours ago [-]

The problem with this is that key generation is free, so being a well-behaved unknown bot is the same as being an unidentified bot, which means that you go in the block/captcha/throttle bucket.

It is only useful for whitelisting bots, not for banning bad ones, as bad ones can rotate keys.

Whitelisting clients by identity is the death of the open web, and means that nobody will ever be able to compete with capital on even footing.

blackswatter 1 days ago [-]

[dead]

mtrovo 1 days ago [-]

As much as I understand this is needed it rubs me the wrong way.

The standard looks fine as a distributed protocol until you have to register to pay a rent to Cloudflare, which they say will eventually trickle down into publishers pocket but you know what having a middleman this powerful means to the power dynamics of the market. Publishers have a really bad hand no matter what we do to save them, content as we know it will have to adapt.

Give it a couple more iterations and some MBA will come up with the brilliant idea of introducing an internet toll to humans and selling a content bundle with unlimited access to websites.

maxwellg 1 days ago [-]

Cloudflare is only the first to market with a solution. If this proposal catches on every WAF vendor under the sun will have it implemented before the next sales cycle. Enforcement of this standard will be commoditized down to nothing.

tick_tock_tick 1 days ago [-]

There is just too much spam and it's not clear that is a solvable problem without Cloudflare (or some other similar service). Maybe if they get big enough the incentives to spam will vanish and non Cloudflare sites can exist in peace (at-least until enough people leave Cloudflare that spam become profitable again).

palmfacehn 7 hours ago [-]

>3. Register your bot and key directory

Register with CF is the specific part I object to. Of all of the numerous hazards here centralizing the registration with CF is most clearly problematic. This part of the spec could have easily been an additional header linking to key data.

egorfine 13 hours ago [-]

I will employ every tool in my toolbox in different ways before I bend to this. Especially before I bend to cloudflare.

nembal 10 hours ago [-]

Web Bot Auth solves a real problem with a real standard. Per-request signatures make automated traffic accountable, and that is the right long-term primitive. In Cloudflare’s hands, the current implementation is built first for bots, not agents. “Signed agents” read like a label added to a bot-centric system, not a first-class agent identity fabric. The design also centers Cloudflare as the arbiter and on-ramp, which is great for reliability inside their network and great for their business moat, but not great for an open, decentralized agentic web.

While it builds on standards as the top poster notes, cloudflare's version is a business moat driven central registry service and nothing what the decentralized internet would/should look.

i wrote a but more about this on my blog if someone care to read https://blog.agentcommunity.org/2025-08-23-web_auth_box_not_...

jithinraj 1 days ago [-]

Web Bot Auth solves authentication (“who is this bot?”) but not authorization/usage control. We still need a machine-readable policy layer so sites can express “what this bot may do, under which terms” (purpose limits, retention, attribution, optional pricing) at a well-known path, robots.txt-like, but enforceable via signatures.

A practical flow:

1. Bot self-identifies (Web Bot Auth)

2. Fetch policy

3. Accept terms or negotiate (HTTP 402 exists)

4. Present a signed receipt proving consent/payment

5. Origin/CDN verifies receipt and grants access

That keeps things decentralized: identity is transport; policy stays with the site; receipts provide auditability, no single gatekeeper required. There’s ongoing work in this direction (e.g., PEAC using /.well-known/peac.txt) that aims to pair Web Bot Auth with site-controlled terms and verifiable receipts.

Disclosure: I work on PEAC, but the pattern applies regardless of implementation.

nerdsniper 1 days ago [-]

Why use a "web bot" instead of an API? Either can be driven by an AI "agent"...but this just seems like an "API key for a visual api interface", and rather wasteful in cost and resources. If a company could afford to pay a partner for an API key they wouldn't need this. If they can't afford to pay the partner for access -- they'd still be blocked with or without "Web Bot Auth". I don't understand what this is for.

I suspect I'm missing something, what am I missing?

observationist 1 days ago [-]

Part of it, at least, is people thinking they've solved some perceived problem and being told by their chatbot that it's a terrific, brilliant new innovation and they should build a whole new protocol spec for it.

mediaman 23 hours ago [-]

The website the human sees is the new API.

That's needed because many APIs are either nonexistent or extremely marginal in design and content coverage.

notatoad 1 days ago [-]

if you already have an api that exposes all the information that your parter who is willing to pay for an API key wants, then sure, that's perfect. but what if you don't have an API, or your API doesn't expose the information that crawlers are looking for? they want to crawl your website, they're willing to pay for the ability to crawl your website, but you don't want to build an API...

i'm sure the next step here will be a cloudflare product that sits in front of your website and blocks all bot traffic except for the bots that are verified to have paid for access. (or maybe that already exists?)

jimmydoe 9 hours ago [-]

ActivityPub had sth similar? Maybe just reuse that to identify the source identity, then determine if you want to trust that domain or person/bot?

mips_avatar 1 days ago [-]

Cloudflare's verified bots program is a terrible idea. They want to be the central chokepoint for agents, and they're doing it in shady ways like auto enrolling customers into blocking agents.

threatofrain 18 hours ago [-]

That’s not shady, that’s awesome customer value! Bot blocking as Default option is a great choice for all of us.

1gn15 18 hours ago [-]

It's discriminatory against robots and helps make the web even more locked down. DRM never works; the analog hole is always the nuclear option.

In the end, only people with non-mainstream browsers (or using VPN to escape country-level blocks, or Tor, or noJS) suffer.

It's like how anti-piracy measures only affect paying customers, while pirates ironically get a better experience. The best way to get around endless CAPTCHAs is to just use LLMs instead.

threatofrain 15 hours ago [-]

> It's discriminatory against robots

You would... lead your response with that argument? This has nothing to do with DRM. When people talk about how bots suck, the focus is on billion or trillion dollar businesses making everyone on the web pay.

There's also a reason why the bot conversation flared up; we've always had bots, but before the conversation centered on Google and SEO. Now the conversation centers on companies like OpenAI.

johneth 13 hours ago [-]

> It's discriminatory against robots

That's the entire point.

hoppp 22 hours ago [-]

I like that parsable signature in the http message however I dont quite understand how the system differentiates between human users and an llm agent controlling a browser

zb3 1 days ago [-]

Seems like Cloudflare wants to regulate the internet.. they should not have that power.

kylehotchkiss 1 days ago [-]

Disagree. Not everybody wants their sites scraped and their content used to train a model that they'll never see a penny from. Cloudflare is the only party who wants to build a system where both the models and individual sites have their interests respected.

ATechGuy 1 days ago [-]

Are you sure that CF can stop AI bots?

hsbauauvhabzb 1 days ago [-]

Do you have a better alternative?

ATechGuy 1 days ago [-]

Have you looked into open-source alternatives? I'm assuming that it's a pressing problem for you, and you have already explored alternatives.

tick_tock_tick 1 days ago [-]

I have, sadly they are basically worthless and often worse then worthless as they negatively impact the site.

ATechGuy 1 days ago [-]

Interesting. Care to list them here so that we all can learn.

yjftsjthsd-h 1 days ago [-]

https://anubis.techaro.lol/ ?

hsbauauvhabzb 24 hours ago [-]

Your browser is configured to disable cookies. Anubis requires cookies for the legitimate interest of making sure you are a valid client. Please enable cookies for this domain.

Thing is, my browser isn’t configured that way. So works well, I guess.

yjftsjthsd-h 23 hours ago [-]

The target was better than cloudflare, which also demands cookies but with more tracking. This is still better.

hsbauauvhabzb 20 hours ago [-]

I have not disabled cookies. Cloudflare works fine. Users being able to access a website is a pretty important metric when considering which is ‘better’.

specialp 21 hours ago [-]

I will tell you that we have had bot super fight mode on for a year and since then we have not had to address abusing traffic nor deal with legitimate people blocked. There is no way we could have achieved such balance. prior to that it was me blocking every Chinese AS under the sun as they shifted and bombarded us with traffic

1gn15 18 hours ago [-]

> nor deal with legitimate people blocked

How are you so sure of that? Their marketing?

account42 11 hours ago [-]

Simple: If you ignore people who get blocked or just also make sure they get blocked the same way in all ways they could reach you, then you don't have to deal with them and can just ignore the issue. Fun times ahead for us.

observationist 1 days ago [-]

Then put up a goddamn login wall.

The internet was designed to work the way it does for good reasons.

You not understanding those reasons is not an excuse for allowing a giant tech company to step in and be the gatekeeper for a huge portion of the internet. Nor to monetize, enshittify, balkanize, and fragment the web with no effective recourse or oversight.

Cloudflare shouldn't be allowed to operate, in my view.

acdha 22 hours ago [-]

> You not understanding those reasons is not an excuse for allowing a giant tech company to step in and be the gatekeeper for a huge portion of the internet.

Are you somehow under the impression that Cloudflare is forcing their service on other companies? They’re not stepping in, the people who own those sites have decided paying them is a better deal than building their own alternatives.

nemothekid 1 days ago [-]

>Then put up a goddamn login wall.

They did exactly that, they just outsourced it to cloudflare. The problem became bad enough that a lot of other people did the same thing.

If your argument is "companies shouldn't be allowed to outsource components to other companies, or cloudflare specifically", then sure, but good luck ever enforcing that.

ChrisArchitect 1 days ago [-]

URL should be blog post:

The age of agents: cryptographically recognizing agent traffic

https://blog.cloudflare.com/signed-agents/

(https://news.ycombinator.com/item?id=45052276)

cuuupid 1 days ago [-]

Cloudflare is the last party that should be running this for two reasons.

1. THey have already proven to be a bad faith actor with their "DDoS protection."

2. This is pretty much the typical Cloudflare HN playbook. They release soemthing targeted at the current wave and hide behind an ideological barrier; meanwhile if you try to use them for anything serious they require a call with sales who jumps you with absurdly high pricing.

Do other cloud providers charge high fees for things they have no business charging for? Absolutely. But they typically tell you upfront and don't run ideological narratives.

This is not a company we should be putting much trust in, especially not with their continued plays to become the gatekeepers of the internet.

tick_tock_tick 1 days ago [-]

1) how so? Pretty much everything they do for DDoS protection is at their customers choice. You might not like what people want for their site but lets not pretend that most companies aren't very happy with it.

2) Then don't use them? Either they provide enough value to pay them or they don't.

esseph 1 days ago [-]

Have you seen large cloud provider billing?????

There is a whole segment of tech designed around helping you understand and manage cloud costs, through consultations, automations, etc. It has spawned companies and career paths!

cuuupid 1 days ago [-]

Yes but they don’t hide that behind ideological nonsense, they own up to it. They’re a good faith actor with a high price tag

hsbauauvhabzb 1 days ago [-]

Ime, cloud cost centres are intentionally confusing and annoying. I get emails telling me to check their dashboard for billing info which I inevitably never do. It’s designed that way.

19 hours ago [-]

bgwalter 1 days ago [-]

Cloudflare is playing both sides: grok.com is served by Cloudflare.

realityfactchex 1 days ago [-]

No offense, but screw CloudFlare, screw their captchas for humans, and screw their wedging themselves between web operators and web users.

They can offer what they want for bots. But stop ruining the experience for humans first.

tick_tock_tick 1 days ago [-]

> screw their wedging themselves between web operators and web users

Web operators choose to use them; hell they even pay Cloudflare to be between them. Seriously I just think you don't understand how bad it is to run a site without someone in-front of it.

specialp 21 hours ago [-]

I run a site that is a primary source of information. We also have customers that subscribe and are very sensitive to heavy handed controls. Before cloudflare and after "AI" we had bots from all over just destroying our endpoints with bursts of mining traffic. While we would love to have more discoverability this is not that. Cloudflare is in a tough spot trying to arbitrate good traffic vs bad. From my experience they are doing this as good as one can.

mcspiff 1 days ago [-]

Couldn’t agree more — Much like running my own DNS or email server, I don’t think I’ll ever go back to running my own website directly on the internet. It’s just not worth the hassle. For stuff only I use, it sits behind my VPN. For anything that _must_ be public, it’s going behind a WAF someone else can run.

immibis 23 hours ago [-]

They don't have to, but they're tricked into doing so. Via marketing.

acdha 22 hours ago [-]

I miss the 90s, too, but these days anyone who wants to deal with current levels of bot traffic is probably going to look at a service like Cloudflare as much cheaper than the amount of ops time they’d otherwise spend keeping things up and secure.

immibis 9 hours ago [-]

You could just, like, not make a website that takes several seconds to handle each request.

I let bots hit Gitea 2-3 times per second on a $10/month VPS, and the only actual problem was that it doesn't seem to ever delete zip snapshots, filling up the disk when enough snapshot links are clicked. So I disabled that feature by setting the snapshots folder read-only. There were no other problems. I mention Gitea because people complain about having to protect Gitea a lot, for some reason.

acdha 4 hours ago [-]

Sure, I’ve been doing that since the 90s. I still pay for hardware and egress, and it turns out that everything has limits for the amount of traffic it can handle which bots can easily saturate. I’ve had sites which were mostly Varnish serving cached content at wire speed go down because they saturated the upstream.

immibis 2 hours ago [-]

I hope 2-3 requests per second is not that limit, or you're fucked.

vntok 9 minutes ago [-]

It is on a simple WordPress install with the top 4 most used plugins, when you don't have a Caching Reverse Proxy like Cloudflare to filter bad traffic and serve fully cached pages from POP nodes located near the visitors.

The alternative, of course, is to set up a caching system server-side (like Redis), which most people who set up their WordPress blog don't have the first idea how to do in a secure way.

acdha 1 hours ago [-]

It’s not, but you’re off by 3+ orders of magnitude on the traffic volume and ignoring the cost of serving non-trivial responses.