Cloudflare’s AI Crawler Update: A Strategic Fork in the Road for Publishers

Cloudflare’s new AI crawler management system, whose default changes roll out on September 15, creates a direct trade-off between blocking AI training and maintaining search engine visibility. The update reclassifies bots by behavior (Search, Agent, Training) and applies the strictest applicable rule to multi-purpose crawlers such as Googlebot, which crawls for both search and AI training. Publishers who enabled the older “Block AI bots” setting will automatically block Googlebot unless they manually adjust their settings before the deadline.

According to Cloudflare’s report, AI training now accounts for the majority of crawler requests on its network, up from roughly 20% in spring 2025. Daily AI agent requests surged more than 1,700% year over year. These figures underscore the escalating tension between content protection and discoverability.

For executives, this is not a minor settings change. It is a structural shift in how web content is accessed and monetized. The decision to block or allow AI crawlers now directly impacts search rankings, referral traffic, and potential revenue from AI training data licensing.

How Cloudflare’s Three-Category System Works

Cloudflare now sorts AI crawlers into three behaviors: Search (indexing for search engines), Agent (real-time bots acting for a user, like ChatGPT-User), and Training (crawling to train or fine-tune AI models). Each bot operator is expected to run separate crawlers for each behavior, allowing websites to permit or block based on intent.

Starting September 15, for new customers and new sites, Training and Agent crawlers will be blocked by default on pages that display ads, while Search remains allowed. Existing free customers who have not changed their settings will be migrated to these defaults. Additionally, Cloudflare will classify multi-purpose crawlers (those performing both Search and Training) by their overall behavior and apply the strictest rule that matches any part of it. For example, if a site blocks Training, Googlebot (which does both) will be blocked entirely.
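The strictest-rule logic described above can be sketched in a few lines of Python. This is only an illustration of the policy as the article describes it, not Cloudflare’s implementation or API: the names Behavior, DEFAULT_BLOCKED, CRAWLER_BEHAVIORS and is_blocked are hypothetical, "ExampleSearchBot" is a made-up crawler, and the behavior assignments for ChatGPT-User and Googlebot are taken from the article’s own examples.

```python
from enum import Enum


class Behavior(Enum):
    SEARCH = "search"
    AGENT = "agent"
    TRAINING = "training"


# Defaults described in the article for ad-supported pages:
# Training and Agent blocked, Search allowed. (Hypothetical constant name.)
DEFAULT_BLOCKED = frozenset({Behavior.TRAINING, Behavior.AGENT})

# Illustrative behavior assignments based on the article's examples.
# "ExampleSearchBot" is a made-up name, not a real crawler.
CRAWLER_BEHAVIORS = {
    "ExampleSearchBot": frozenset({Behavior.SEARCH}),
    "ChatGPT-User": frozenset({Behavior.AGENT}),
    "Googlebot": frozenset({Behavior.SEARCH, Behavior.TRAINING}),
}


def is_blocked(crawler: str, blocked=DEFAULT_BLOCKED) -> bool:
    """Strictest rule: block the crawler if any one of its behaviors is blocked."""
    return bool(CRAWLER_BEHAVIORS[crawler] & blocked)


if __name__ == "__main__":
    for name in CRAWLER_BEHAVIORS:
        verdict = "blocked" if is_blocked(name) else "allowed"
        print(f"{name}: {verdict}")
```

Under these assumptions, a crawler that mixes Search and Training inherits the Training block even though Search on its own is allowed, which is why the Googlebot case matters for publishers.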

Strategic Winners and Losers

Winners

Publishers and content creators gain granular control over AI crawlers, protecting ad revenue and content from unauthorized training. Cloudflare strengthens its value proposition, potentially increasing customer lock-in and setting industry standards. Licensed AI training data providers will see increased demand as unauthorized scraping is curtailed.

Losers

AI companies relying on web scraping face reduced access to free training data, raising costs. Search engines that use AI agents may be blocked by mistake (false positives), which would make their results less fresh. Small AI startups are disproportionately affected because they lack the resources for licensing deals.

Market Impact: The End of Open Web for AI Training

Cloudflare’s default rules set a precedent that other CDNs and hosting providers may follow, fundamentally altering the economics of AI data acquisition. The web is transitioning from an open-access model to a permission-based one where content access is negotiated. This shift could accelerate the development of private data marketplaces and increase the value of proprietary datasets.

For publishers, the immediate risk is losing Googlebot traffic. A site that blocks AI training without adjusting its Cloudflare settings will effectively block Googlebot, reducing its crawl frequency and potentially harming search rankings. The September 15 deadline is critical: free users who do not opt out will have their settings changed automatically.

What Executives Should Do Now

Review your Cloudflare AI blocking settings before September 15. If you previously enabled “Block AI bots,” you are likely to block Googlebot under the new multi-purpose crawler rule. Decide whether to keep Search crawlers enabled while blocking Training and Agent bots. Consider the long-term trade-off between protecting content from AI training and maintaining search visibility.

Cloudflare is also testing a content-use signal (immediate, reference, full) that could become an industry standard for expressing content usage preferences. Monitor this development as it may offer a more nuanced way to manage AI access without blocking search crawlers.

Bottom Line

Cloudflare’s update forces a strategic choice: protect content from AI training at the cost of search visibility, or allow AI training to maintain discoverability. There is no middle ground for multi-purpose crawlers. Executives must act before September 15 to align their bot management with their business objectives.

Source: Search Engine Journal

Intelligence FAQ

If you previously enabled “Block AI bots” or if you block Training crawlers, Googlebot will be blocked because it performs both Search and Training. You must manually adjust settings to allow Search while blocking Training.

Log in to your Cloudflare dashboard and review your AI crawler settings. If you want Googlebot to keep crawling your site, ensure Search crawlers are allowed. If you are a free user and do nothing, your settings will be updated to block Training and Agent crawlers by default on ad-supported pages.