Google’s John Mueller Questions Need for LLM-Only Markdown Pages
Google’s John Mueller questions the need for LLM-only markdown pages, raising concerns about their usefulness, quality, and impact on search performance.
John Mueller, Google’s Search Advocate, has expressed skepticism about creating specialized Markdown or JSON web pages exclusively for large language models (LLMs). Mueller stated that LLMs have always worked well with standard HTML pages and sees no clear benefit in producing pages not visible to regular users.
Start of Discussion
The discussion began after Lily Ray posted on Bluesky asking whether websites should create dedicated Markdown or JSON pages for LLMs and serve those URLs specifically to bots, and whether Google could clarify its stance on the practice.
Ray asked:
“Not sure if you can answer, but starting to hear a lot about creating separate markdown / JSON pages for LLMs and serving those URLs to bots. Can you share Google’s perspective on this?”
The question highlights a growing trend in which publishers generate “shadow” versions of key content in simplified formats designed to be more easily processed by AI systems.
A more active discussion of the topic is taking place on X, where Ray wrote:
This has been the hot topic lately, I’ve been getting pitched by companies who do this https://t.co/rVnbPKUxZj
— Lily Ray 😏 (@lilyraynyc) November 23, 2025
HTML Still the Norm for LLMs
Mueller replied that LLMs have been trained on normal web pages, reading and parsing them, since the beginning:
“I’m not aware of anything in that regard. In my POV, LLMs have trained on – read & parsed – normal web pages since the beginning, it seems a given that they have no problems dealing with HTML. Why would they want to see a page that no user sees? And, if they check for equivalence, why not use HTML?”
When Ray pressed further on whether a separate format could “speed up getting key points across to LLMs,” Mueller responded that if file formats truly offered an advantage, the companies operating those models would be the ones to say so.
Mueller added:
“If those creating and running these systems knew they could create better responses from sites with specific file formats, I expect they would be very vocal about that. AI companies aren’t really known for being shy.”
Mueller noted that certain pages might perform better for AI systems than others, but he doesn’t believe that difference is tied to using HTML instead of Markdown:
“That said I can imagine some pages working better for users and some better for AI systems, but I doubt that’s due to the file format, and it’s definitely not generalizable to everything. (Excluding JS which still seems hard for many of these systems).”
Overall, Mueller’s remarks indicate that, from Google’s perspective, there’s no need to build bot-specific Markdown or JSON duplicates of your pages simply to ensure LLMs can interpret them.
Structured Data More Important Than File Format
In the discussion, some pointed out exceptions like OpenAI’s eCommerce product feeds, which rely on defined JSON schemas for clear data ingestion.
Matt Wright highlighted this as an example where structured data matters most because the AI platform requires a specific format.
Wright explains:
“Interestingly, the OpenAI eCommerce product feeds are live: JSON schemas appear to have a key role in AI search already.”
Wright also pointed to a LinkedIn thread in which Chris Long observed:
“Editorial sites using product schemas, tend to get included in ChatGPT citations.”
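For readers unfamiliar with product schemas, the following is a minimal Python sketch of what such markup can look like when embedded in an ordinary HTML page as JSON-LD. It uses the schema.org Product and Offer types; the function names, the sample product, and the choice of fields are illustrative assumptions, not anything specified by Mueller, Wright, Long, or OpenAI’s feed requirements.

```python
import json


def product_jsonld(name, description, price, currency, url):
    """Build a schema.org Product object as a plain dict.

    Hypothetical helper for illustration; the field selection is an
    assumption, not a platform requirement.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "url": url,
        "offers": {
            "@type": "Offer",
            # Prices are emitted as strings with two decimal places.
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
    }


def to_script_tag(data):
    """Serialize the dict into a JSON-LD script element for the HTML page."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

The point of the sketch is that the structured data lives inside the same HTML page users see, rather than in a separate bot-only URL.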
Practical Guidance for Site Owners
Mueller’s comments suggest that the best strategy is to optimize existing HTML pages: improve page speed and readability, and implement structured data schemas wherever platforms provide clear specifications.
Creating separate LLM-optimized Markdown or JSON versions is generally unnecessary unless explicitly required for integrations like product feeds.
What This Means for SEO and AI Integration
This exchange illustrates how quickly AI-driven changes in search technology translate into technical demands on SEO and development teams, often before any formal documentation exists.
For now, the takeaway is to maintain clean, well-structured HTML, minimize excessive JavaScript that hinders parsing, and use structured data compliant with platform requirements.
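One rough way to act on the JavaScript point is to check whether a page’s key phrases are present in the HTML as delivered, before any script runs. The sketch below uses only Python’s standard library; the class and function names and the sample markup are hypothetical, and it approximates a parser that does not execute JavaScript rather than reproducing how any particular AI crawler works.

```python
from html.parser import HTMLParser


class StaticText(HTMLParser):
    """Collect text that is present in the raw HTML without running scripts."""

    # Content inside these elements is not rendered as page text.
    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def static_text(html):
    """Return the text available to a parser that does not execute JavaScript."""
    parser = StaticText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def missing_phrases(html, phrases):
    """List the phrases that do not appear in the static text (case-insensitive)."""
    text = static_text(html).lower()
    return [p for p in phrases if p.lower() not in text]
```

Any phrase this reports as missing is content that only appears after client-side rendering, which is the kind of content Mueller says many of these systems still struggle with.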
Bottom Line
Keeping content in clean, well-structured HTML, with structured data where platforms specify it, remains the most reliable path until LLM providers issue more concrete format guidelines.