Does your site still exist for AI? Insights from the latest Arcep report

Arcep, the French telecommunications regulator, published a 104-page report in January 2026 dedicated to the impact of generative AI on the open internet. Based on an empirical study conducted on three AI services (Mistral, Gemini, Perplexity), this document provides concrete data on issues that directly concern SEO professionals and site publishers: traffic decline, opacity of sources cited by AIs, the rise of crawlers, and the emergence of GEO.

What to remember:

  • Search engine traffic will continue to decline with the spread of AI summaries.
  • Being well ranked on Google is no longer enough to appear in AI responses.
  • The rules of GEO are still unclear and specific to each AI service — which complicates any strategy.
  • Mastering crawlers and robots.txt is becoming an urgent technical issue for publishers.
  • Agentic AI will create a new layer of intermediation, potentially even more opaque than previous ones.

From SEO to GEO: a shift underway

The report's central finding is unambiguous: generative AIs are becoming new “entry points” to the internet, just like search engines or social networks. The user no longer navigates from link to link; they ask a question and receive a concise answer expressed in natural language. This shift from a “search engine” to an “answer engine” fundamentally redefines the rules of the game for publishers.

For SEO professionals, this translates into a major evolution of practices: traditional SEO strategies based on site structure, domain authority signals, and content freshness are no longer sufficient. The challenge is no longer to rank well in a list of results, but to be identified as a relevant source by a generative agent and actually cited in its answer. This new discipline is commonly referred to as GEO (Generative Engine Optimization).

The IMPACTIA study by PEReN, which analyzed 200,000 citations, provides valuable insight: the sources cited by generative AIs only partially overlap with Google’s top results. The intersection rate with Google’s top 5 ranges from 19% to 32% depending on the tool tested. In other words, ranking well on Google no longer guarantees being cited by an AI, and vice versa.
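To make the idea of an intersection rate concrete, here is a minimal sketch of one plausible way to compute it: the share of AI-cited sources that also appear in Google's top 5 for the same query. The function name, the example domains, and the definition itself are assumptions for illustration; the study's exact methodology may differ.

```python
def top_k_overlap(cited_urls, google_results, k=5):
    """Share of AI-cited URLs that also appear in Google's top k results.

    Assumed definition, for illustration only.
    """
    if not cited_urls:
        return 0.0
    top_k = set(google_results[:k])
    hits = sum(1 for url in cited_urls if url in top_k)
    return hits / len(cited_urls)


# Hypothetical data: four sources cited by an AI, six ranked Google results.
cited = ["a.example", "b.example", "c.example", "d.example"]
google = ["b.example", "x.example", "y.example",
          "z.example", "w.example", "a.example"]

rate = top_k_overlap(cited, google)
print(f"{rate:.0%}")  # 25%: only b.example is in the top 5 (a.example ranks 6th)
```

Tracking this figure per AI service, rather than as a single number, fits the finding that each tool behaves differently.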

Traffic in free fall: alarming figures

The Arcep report of January 2026 confirms what many publishers already see in their analytics: the widespread use of AI summaries reduces incoming traffic. The most striking example comes from a Pew Research Center study cited by Arcep: users exposed to a summary via Google AI Overviews click on an external source in only 8% of cases.

Even Wikipedia, although it is the domain most frequently cited by all the AIs studied (up to 19% of citations in the “history” category), reports a notable drop in traffic since the arrival of generative AI tools. If even the sites most referenced by these tools are seeing their visitor numbers fall, the situation for smaller publishers is logically even more worrying.

This trend also raises an economic question: visibility largely determines advertising revenue and subscriptions, so a sustained drop in traffic directly threatens the financial viability of independent publishers. News sites have reported sharp drops in audience since major platforms integrated generative AI features into their search interfaces.

Concentration of sources: 2% of domains capture 49% of citations

One of the most notable findings of the IMPACTIA study concerns the extreme concentration of sources used by AIs. Of the 9,206 domain names cited, the top 2% most referenced (that is, 185 domains) alone account for nearly 49% of the 200,000 citations analyzed. At the other extreme, 72% of domains are cited only between 1 and 10 times in total.
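The same concentration measure can be reproduced on any citation log. The sketch below is an illustration under stated assumptions: the function name and the toy counts are hypothetical, and the "top 2%" cutoff is rounded up, which is consistent with 2% of 9,206 domains giving 185.

```python
import math


def top_share(citation_counts, fraction=0.02):
    """Return (number of top domains, their share of all citations)."""
    counts = sorted(citation_counts, reverse=True)
    # Round up so that 2% of 9,206 domains yields 185, as in the study.
    n_top = math.ceil(len(counts) * fraction)
    total = sum(counts)
    share = sum(counts[:n_top]) / total if total else 0.0
    return n_top, share


# Hypothetical toy data: 3 heavily cited domains and 117 cited once each.
toy_counts = [60, 40, 20] + [1] * 117
n_top, share = top_share(toy_counts)
print(n_top, f"{share:.1%}")
```

Applied to the reported figures, 49% of 200,000 citations is about 98,000 citations spread over 185 domains, or roughly 530 citations per top domain on average.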

Wikipedia stands out as the dominant source in all tested categories. Surprisingly, homework help sites such as studysmarter.fr and alloprof.ca come next. The criteria behind these choices remain largely opaque: the algorithms used by AIs to select and weight sources are neither documented, standardized, nor accessible to publishers.

Extreme concentration of sources mobilized by AI – Source: Arcep

Another major finding: each AI service has its own citation biases. Some domains that are heavily cited by one AI are almost absent from its competitors. This variability makes it all the more difficult to build a coherent visibility strategy for publishers, who cannot rely on stable, universal rules as they can with traditional SEO.

Crawlers and robots.txt: a relationship that has become strained

On the technical side, the report points to increasing tension around the indexing robots (crawlers) used by AI players to collect data on the web. Since 2022, several site publishers have recorded a significant increase in bot-related traffic. Cloudflare estimates that crawler traffic could exceed human traffic as early as 2029.

The robots.txt protocol, originally designed to let publishers control the indexing of their pages, has been weakened. Some AI crawlers do not respect its directives, or even saturate servers to the point of causing outages. The Wikimedia Foundation says that 65% of its traffic now comes from robots.
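As a reminder of how robots.txt directives are meant to be read, here is a minimal sketch using Python's standard `urllib.robotparser`. The rule set and the URL are illustrative assumptions, not a recommended configuration; GPTBot and Googlebot are used only as example user-agent tokens.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: block one AI crawler token, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent, url="https://example.com/article"):
    """Tell whether the parsed rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(is_allowed("GPTBot"))     # False
print(is_allowed("Googlebot"))  # True
```

This only shows what the file declares. Compliance is voluntary, which is exactly the weakness described above: a crawler that ignores the file, or announces a different user agent, is not stopped by it.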

Crawlers do not always respect the robots.txt file – Source: Arcep

The central problem is the lack of a mechanism to technically distinguish AI crawlers from search engine crawlers: blocking some risks blocking others, and thus penalizing one's own organic search ranking. In the absence of such a mechanism, initial solutions are emerging:

  • Cloudflare's pay-per-crawl model (using the HTTP 402 “Payment Required” code), which allows publishers to monetize access to their content by AI bots.
  • The ai.txt project (Artificial Intelligence Access Protocol), an evolution of robots.txt that makes it possible to specify, in a granular way, the conditions for data use according to purpose: training, indexing, or agentic features.
  • Standardization work at the W3C and IETF to adapt existing protocols to the specificities of generative AI.
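To illustrate the principle behind the pay-per-crawl idea in the first item, the sketch below returns HTTP 402 to requests identified as AI crawlers that have not paid. It is not Cloudflare's implementation: the token list, the `has_paid` flag, and the user-agent matching are all hypothetical simplifications, and user-agent strings can be spoofed.

```python
from http import HTTPStatus

# Hypothetical list of substrings identifying AI crawlers (assumption).
AI_CRAWLER_TOKENS = ("gptbot", "examplebot-ai")


def status_for_request(user_agent, has_paid=False):
    """Choose a response status for a request, given its User-Agent header."""
    ua = user_agent.lower()
    is_ai_crawler = any(token in ua for token in AI_CRAWLER_TOKENS)
    if is_ai_crawler and not has_paid:
        # 402 signals that access is conditional on payment.
        return HTTPStatus.PAYMENT_REQUIRED
    return HTTPStatus.OK


print(int(status_for_request("Mozilla/5.0 (compatible; GPTBot/1.0)")))        # 402
print(int(status_for_request("Mozilla/5.0 (compatible; GPTBot/1.0)", True)))  # 200
print(int(status_for_request("Mozilla/5.0 (Windows NT 10.0) Firefox/120.0")))  # 200
```

A real deployment would need a more reliable way to identify crawlers than the user-agent string, which is the missing mechanism the report highlights.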

Agentic AI: the next challenge for service visibility

The report also emphasizes the emergence of agentic AI: systems capable not only of generating text but also of acting directly on behalf of the user, for example booking a ticket, making a purchase, or accessing a third-party service. In this model, it is the agent, not the user, that chooses which providers, applications and services to use.

Commercial agreements have already been concluded: OpenAI with Walmart, Etsy and Shopify; Claude (Anthropic) with Notion, Canva and Stripe; Copilot with OpenTable, Kayak and Instacart. Non-partner services simply risk not being offered to users, without users being aware of it. This logic of “closed indexing” constitutes a structural risk for any digital player that does not have the means to negotiate such agreements.

The article "Does your site still exist for AI? Insights from the latest Arcep report" was published on the site Abondance.