Cloudflare’s New AI Crawler Policy: The End of Free Data Scraping

Cloudflare’s announcement that it will block mixed-use crawlers from ad-supported pages by September 15, 2026, is a direct answer to the question every publisher has been asking: how do we get paid when AI companies use our content? Cloudflare sits in front of over 20% of the web, which gives it the leverage to enforce a new norm. The company also says that over 50% of AI crawl traffic is wasted on re-fetching unchanged pages, a sign of how little free crawling has cost the AI companies doing it. For executives, this signals a structural shift in the economics of AI training data: free access is becoming a liability, and paid access is becoming the standard.

The Mechanics of the Block: What Changes on September 15, 2026

Cloudflare’s default settings will block crawlers that mix search, agent use, and training from any page hosting ads. The default applies to new customers, new sites from existing customers, and all existing free customers. The policy targets what Cloudflare calls “mixed-use” crawlers: bots that scrape for search indexing, AI training, and agentic services at the same time. By forcing these uses apart, Cloudflare aims to give publishers granular control, so that a crawler can be allowed for search but blocked for training. The change is commercial as well as technical. The Pay Per Crawl marketplace, now evolving into Pay Per Use, will let publishers charge AI companies when their content creates value, not just when it’s fetched. Initial partners Ceramic.ai and You.com will test the model, paying publishers when content appears in search results or premium access.
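The default rule described above can be sketched as a small decision function. This is only an illustration of the logic as the article describes it, not Cloudflare’s implementation: the `Purpose`, `Crawler`, and `Page` types are hypothetical, and treating any crawler that declares more than one purpose as “mixed-use” is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Purpose(Enum):
    SEARCH = auto()
    TRAINING = auto()
    AGENT = auto()


@dataclass(frozen=True)
class Crawler:
    name: str                # hypothetical crawler label
    purposes: frozenset      # set of Purpose values the crawler serves


@dataclass(frozen=True)
class Page:
    url: str
    has_ads: bool


def is_mixed_use(crawler: Crawler) -> bool:
    """Assumption: more than one declared purpose counts as mixed-use."""
    return len(crawler.purposes) > 1


def default_allows(crawler: Crawler, page: Page) -> bool:
    """Block mixed-use crawlers on ad-supported pages; allow everything else."""
    return not (page.has_ads and is_mixed_use(crawler))


def publisher_allows(crawler: Crawler, page: Page, allowed: frozenset) -> bool:
    """Granular control: a crawler passes only if the default rule allows it
    and every purpose it serves is one the publisher has allowed."""
    return default_allows(crawler, page) and crawler.purposes <= allowed
```

Under these assumptions, a search-only crawler passes on an ad-supported page when the publisher allows search, while a crawler that combines search and training is blocked there regardless of the publisher’s allow list.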

Why Google Is the Primary Target

Cloudflare explicitly calls out the “world’s largest search engine” (Google) for having access to “2x more information” than other AI companies. Google’s Googlebot crawls for Search, including AI Overviews and AI Mode, while a separate control, Google-Extended, lets publishers opt out of training. But Cloudflare argues that Google makes it difficult for publishers to remain discoverable without being used for AI. This policy forces Google to either separate its crawlers or negotiate paid access. Google’s response will be critical: if it complies, it sets a precedent; if it resists, it risks losing access to Cloudflare-protected content.
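To make the existing opt-out concrete, here is a minimal sketch using Python’s standard `urllib.robotparser` to evaluate a robots.txt file that allows Googlebot but disallows the Google-Extended token. The robots.txt content and the example URL are made up for illustration. The parser only shows how the rules resolve for each token; it says nothing about how Google applies them internally.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: opt out of training use, stay crawlable for Search.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/article"  # hypothetical page
googlebot_allowed = parser.can_fetch("Googlebot", url)
google_extended_allowed = parser.can_fetch("Google-Extended", url)

print("Googlebot:", googlebot_allowed)
print("Google-Extended:", google_extended_allowed)
```

The gap the article points to is visible here: the opt-out covers training, but Googlebot remains allowed, and per the article Googlebot also feeds AI Overviews and AI Mode.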

Strategic Winners and Losers

Winners: Publishers and Cloudflare

Publishers with ad-supported content gain a direct revenue stream from AI crawlers. Cloudflare itself wins by creating a new marketplace and strengthening its value proposition for publishers. Ceramic.ai and You.com gain early access to compensated content, potentially securing favorable terms before the market matures.

Losers: AI Companies and Small Startups

AI companies that rely on free crawler access face increased costs and reduced data availability. Google may lose free access to publisher content for AI Overviews and AI Mode. Small AI startups that cannot afford Pay Per Use fees may struggle to compete, which would consolidate power among well-funded players.

Market Impact: From Free-for-All to Compensated Access

The market is moving from a free-for-all scraping model to a compensated access model. CDNs like Cloudflare are becoming gatekeepers and marketplaces, fundamentally altering the economics of AI training data. This shift could accelerate if other CDNs adopt similar policies. Publishers now have leverage, but those not covered by the new defaults must actively opt in to the blocking, and the marketplace features also require opting in. The September 2026 deadline gives the industry time to adapt, but the direction is clear: AI companies must pay for content that generates value.

Outlook and Next Steps

Over the next 30 days, watch for reactions from Google, other CDNs, and AI companies. Google may announce separate crawlers or challenge Cloudflare’s policy. Other CDNs like Akamai and Fastly may follow suit. AI companies will likely accelerate negotiations with publishers and CDNs. For executives, the key action is to audit your content’s exposure to AI crawlers and consider opting into Cloudflare’s marketplace. The window for free access is closing.
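As a starting point for that audit, the sketch below counts requests from a handful of well-known crawler user-agent tokens in web server access logs. It assumes logs in the common Combined Log Format, where the user agent is the last quoted field. The token list is illustrative rather than exhaustive, and user-agent strings can be spoofed, so treat the counts as a rough signal only.

```python
import re
from collections import Counter

# Illustrative user-agent tokens; extend the tuple for the crawlers you care about.
CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "PerplexityBot", "Googlebot", "bingbot")

# Combined Log Format ends with: "referer" "user-agent"
UA_PATTERN = re.compile(r'"[^"]*" "(?P<ua>[^"]*)"\s*$')


def count_crawler_hits(log_lines):
    """Return a Counter of requests per crawler token found in the user agent."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # line is not in the assumed format
        user_agent = match.group("ua").lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break  # count each request once
    return counts
```

Calling `count_crawler_hits(open("access.log"))` on a log file (the filename is a placeholder) returns per-crawler request counts, which can be compared against total traffic to gauge exposure.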




Source: TechCrunch AI

Intelligence FAQ

AI companies must separate their crawlers for search, training, and agent use, or pay for access to ad-supported content. This increases costs and reduces free data availability.

Google’s mixed-use Googlebot will be blocked from ad pages unless it separates its crawlers or negotiates paid access. This could limit Google’s data for AI Overviews and AI Mode.