
AI SEO or GEO building ideas

I have to give you massive credit here: your diagnosis of the legacy tech bloat is spot on. A 70% 'Poor' PageSpeed rating across the industry is embarrassing, and building a clean, SSR-first infrastructure with pristine schema is exactly what the industry needs to fix the crawl budget issues. I am genuinely looking forward to seeing those technical benchmarks for @DealerInt.

Where I think the architecture still hits a wall is your assumption about how non-Google models acquire data, specifically regarding retail velocity versus index churn.

You mentioned that GMC feeds and SSR schema are how Perplexity and OpenAI 'know' a car arrived. GMC is fantastic for Google's ecosystem, but Google isn't sharing that structured feed with OpenAI or Anthropic. For those models to discover inventory without an aggregator, they are entirely reliant on their own web crawlers hitting your SSR schema.

Even with a lightning-fast site, standard web crawling is fundamentally incompatible with automotive retail velocity. While an average unit might sit for 30 to 60 days, the highly desirable, aggressively priced inventory (the exact cars users are actively querying AI for) often moves in a matter of days. If a foundational model's crawler only indexes a specific VDP once a week, the AI is going to confidently send shoppers to 404 pages and sold vehicles. Models cannot tolerate that level of hallucination risk.

The 'Aggregator Tax' isn't just a visibility tax; it's a data-licensing reality. Foundational models are striking massive enterprise data deals with centralized hubs precisely because they need a real-time API firehose, not because they want to rely on crawling decentralized local domains, no matter how fast or clean your pipe is.

I completely agree that your infrastructure will absolutely crush legacy platforms on standard Google crawlability. But until OpenAI decides to trust and query thousands of decentralized MCP endpoints instead of buying a clean, normalized data feed from a centralized network, the aggregators still hold the keys to the non-Google LLM intelligence layers.
 
Joe — fair clarification on the GMC distinction. To be precise: GMC is the discovery signal for Google's ecosystem specifically. What I should have said is that SSR-rendered structured schema on the VDP pages themselves is what non-Google crawlers directly consume. Same infrastructure argument, tighter mechanism.

But on inventory velocity — I'd argue our earlier conversation actually answers this. The MCP tool-call layer isn't a crawl solution. It's a real-time verification layer. The crawl solves discovery — does this dealer carry this type of inventory, are they local, are they credible. VDP-level precision — is this specific unit still available at this exact price right now — is exactly what the bottom-of-funnel MCP execution handles.

A direct tool-call to the dealer's live database at the moment the agent needs the answer, not a week-old crawl snapshot. So the failure mode you're describing isn't "crawl the inventory, get 404s." It's an incomplete architecture where the execution layer is missing. Discovery happens through the clean SSR pipe; the MCP tool-call then verifies live status before the recommendation fires. That's the hallucination problem solved at execution, not at the index.
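
To make that execution layer concrete, here's a rough sketch of the kind of tool-call I mean. Everything in it is hypothetical (the tool name, the inventory lookup); the point is the ordering, not the implementation:

```typescript
// Illustrative sketch only: an MCP-style tool that checks a unit's live status
// at the moment the agent is about to recommend it, instead of trusting a
// week-old crawl snapshot. checkVehicleAvailability and queryInventoryDb are
// hypothetical names, not a real SDK or dealer API.

interface AvailabilityResult {
  vin: string;
  inStock: boolean;
  price: number | null;    // current advertised price, or null if the unit is gone
  lastVerified: string;    // ISO timestamp of this live check
}

// Stand-in for a query against the dealer's live inventory system.
async function queryInventoryDb(vin: string): Promise<{ price: number } | null> {
  // A real implementation would hit the dealer's DMS or inventory database here.
  return null;
}

// The tool the agent calls before the recommendation fires.
export async function checkVehicleAvailability(vin: string): Promise<AvailabilityResult> {
  const unit = await queryInventoryDb(vin);
  return {
    vin,
    inStock: unit !== null,
    price: unit?.price ?? null,
    lastVerified: new Date().toISOString(),
  };
}
```

Discovery still comes from the crawled SSR page; the recommendation only fires once this live check confirms the unit is actually in stock at the listed price.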

On the data-licensing deals — they exist today, agreed. But those deals are precisely what WebMCP and OpenAI's MCP announcement are designed to disrupt long-term. The aggregators hold the keys right now because the alternative infrastructure isn't at scale yet. The clean pipe is the prerequisite for that shift — not the solution that waits for it. Looking forward to sharing those crawl-latency benchmarks here when they're ready. That's where the infrastructure argument becomes empirical rather than theoretical.
 
Eric's librarian analogy is perfect. The shift from "be the top result" to "be the source the AI references" changes the game for dealers. One thing I'd add: this same principle applies to on-site search, not just external search engines. If a shopper lands on your site and your inventory search only works through rigid dropdowns, you're forcing them to think like a database query. The dealers I've seen win are the ones making their on-site experience feel more like talking to a knowledgeable salesperson, where the shopper can express what they actually want in plain language. That's the same conversational paradigm that's making AI search engines outperform traditional ones.

Most dealership sites use filters to capture intent: price, body style, mileage, etc.

The problem is those assume the shopper already understands the difference between a Ford and a Chevy.

And you can't have any cars falling through the cracks.

For example:
A user says “I want something reliable under $20k that feels newer but not boring.”

That’s not a filter problem; it's an interpretation problem.
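
Just to make that concrete, here's a rough sketch of the kind of interpretation layer I'm describing, mapping plain language onto the filters a site already has. The rules and names are purely illustrative; a real system would use a language model rather than regex:

```typescript
// Illustrative only: translating a conversational request into the structured
// filters an inventory search already understands. Regex rules stand in for
// what would realistically be an LLM or a trained classifier.

interface InventoryFilters {
  maxPrice?: number;
  minYear?: number;
  excludeTrims?: string[];
}

function interpretQuery(query: string): InventoryFilters {
  const filters: InventoryFilters = {};

  // "under $20k" -> a price ceiling
  const priceMatch = query.match(/under \$?(\d+)k/i);
  if (priceMatch) filters.maxPrice = Number(priceMatch[1]) * 1000;

  // "feels newer" -> recent model years (an interpretation made on the shopper's behalf)
  if (/newer/i.test(query)) filters.minYear = new Date().getFullYear() - 4;

  // "not boring" -> bias away from base trims (again interpretation, not a filter the user typed)
  if (/not boring/i.test(query)) filters.excludeTrims = ["base"];

  return filters;
}

// interpretQuery("I want something reliable under $20k that feels newer but not boring")
// -> { maxPrice: 20000, minYear: <current year - 4>, excludeTrims: ["base"] }
```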

So:
  • Are the dealers you’ve seen succeeding replacing filters, or layering conversation on top of them?
  • What's driving the results: structured data, or a different interface?
 
The SEO narrative in automotive has gotten detached from reality. There are only 2 providers actually solving this problem: OneKeel and Horizon.
That's not true!

There are:
  • Hundreds of agencies
  • In-house dealer teams
  • Marketplace-driven SEO (Autotrader, Cars.com, etc.)
  • Technical platforms doing similar things under different names
Modern search isn’t a content posting problem. It’s an infrastructure problem.
At scale, yes ... SEO becomes about:
  • Internal linking
  • data structure
  • crawl efficiency
  • templating systems
However, you can rank with:
  • strong pages
  • targeted intent
  • clean UX
  • proper indexing
You seem to be jumping from “infrastructure matters” to “only infrastructure matters”.
  • programmatic content generation at scale

Yes, but it's dangerous!

Most programmatic content is garbage.

Google doesn’t reward:
  • thin templated pages
  • spun LLM junk
  • duplicate inventory descriptions
Scale without quality = index bloat + ranking suppression
  • originality and de-duplication controls (not recycled LLM output)

This is a real problem:
  • Dealer feeds = duplicate across hundreds of sites
  • OEM descriptions = reused everywhere
However, this is not some “secret sauce.”

Basic solutions:
  • rewrite descriptions
  • canonicalization
  • structured differentiation
  • a unified intelligence layer informed by real dealership data
  • integrated RAG pipelines for contextual accuracy
  • continuous learning loops tied to performance signals
The buzzword stacking can be a bit overwhelming.

Let’s translate:
  • “Unified intelligence layer” = database + logic
  • “RAG pipelines” = using data to generate content (not new)
  • “learning loops” = tracking performance and adjusting (basic SEO iteration)
None of this guarantees rankings.

Google ranks:
  • usefulness
  • relevance
  • authority
  • UX signals
Schema alone isn’t a strategy.
True

Schema is:
  • helpful
  • but not a direct ranking factor on its own
Dealer-written content isn’t scalable.

A single dealer writing 500 pages is unrealistic.

However:
  • The highest-converting pages are often:
    • human-written
    • specific
    • trust-driven
Scale doesn't beat human!
Social posts don’t move organic search in any meaningful way.

Social posting may not affect rankings directly; however, social traffic creates engagement, engagement is a brand signal, and brand signals feed into rankings.

Without a connected system that aligns data, content, and distribution in real time, results will be inconsistent at best, and misleading at worst.

That applies to any business system.

The industry needs to stop pretending otherwise.
I get the infrastructure argument, especially at scale, but I’m trying to separate what actually drives rankings vs what just improves internal efficiency.

For example, a lot of what you described (RAG pipelines, learning loops, orchestration) sounds like it improves content production and consistency, but where have you seen that directly translate into ranking gains over simpler systems?

Also curious how you think about this in the context of inventory-heavy sites:

If multiple dealers are all running similar “full-stack engines” on largely overlapping vehicle data, what actually becomes the differentiator in search?

Is it still content structure and distribution, or does it come down more to authority, UX, and engagement signals at that point?

Trying to understand where infrastructure stops being a competitive advantage and starts becoming table stakes.
 
While building out some new features for my own dealer SaaS project recently, I’ve run into exactly what Matt mentioned: legacy platforms stripping rich schema and blocking agents that aren't 'standard' Google bots. It’s a massive barrier for dealers who actually want to be 'the quotable source.'
I guess you would block for control and misuse risk, and because a lot of these platforms are 20 years old and have been patched so many times that letting new crawlers in could break things, expose inconsistencies, or create load spikes they can't handle.

To Gregg’s point about a 'testing ground'—I’d be interested in sharing some of the raw data I’m seeing regarding how AI engines are actually consuming (or failing to consume) different types of dealership inventory feeds. If we’re moving toward a conversational paradigm where a shopper asks, 'Find me a red SUV with 3rd row seating under $35k,' the dealer who wins isn't the one with the best blog—it's the one whose data isn't being throttled by their own provider.

If we were going to test it, we would need a controlled test environment to compare, for starters (a rough sketch of the variants follows this list):
  • standard dealer UX
  • fully structured / clean data exposure
  • and an intent-driven layer that interprets queries like “family SUV under 35k”
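
Something like this is how I'd pin the three variants down so the comparison stays controlled (just a sketch; the field names are placeholders):

```typescript
// Illustrative only: the three test variants serve identical inventory and differ
// solely in how the data is exposed and interpreted, so any difference in AI
// citations can be traced to the layer that changed.

interface TestVariant {
  id: "standard-ux" | "structured-data" | "intent-layer";
  ssrRendered: boolean;          // server-rendered HTML vs a JS-framework wall
  vehicleSchema: boolean;        // full JSON-LD Vehicle markup on every VDP
  conversationalSearch: boolean; // interprets queries like "family SUV under 35k"
}

const variants: TestVariant[] = [
  { id: "standard-ux",     ssrRendered: false, vehicleSchema: false, conversationalSearch: false },
  { id: "structured-data", ssrRendered: true,  vehicleSchema: true,  conversationalSearch: false },
  { id: "intent-layer",    ssrRendered: true,  vehicleSchema: true,  conversationalSearch: true  },
];
```
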
Does improved data accessibility alone actually change outcomes, or is it interpretation and presentation?

In the data you’re seeing, are AI systems consuming inventory feeds in a way that affects user results?
 
@DjSec — the blocking explanation makes complete sense. A platform that's been patched together for 20 years isn't going to gracefully handle a new crawler class. The blocking isn't malicious; it's self-preservation. The problem is that self-preservation is quietly costing dealers visibility they can't measure.

Your test structure is exactly the right framework. Those three layers map almost perfectly to what I'm observing in early data.

To answer your direct questions:

Data accessibility alone does move the needle — but not uniformly, and not on its own. What I'm seeing is that accessibility and interpretation operate on different query types.

Broad intent queries — "family SUV under $35k," "best truck for towing near me" — are almost entirely an accessibility problem. The AI can't surface a dealer as a candidate if the feed is throttled or the schema is malformed. Legacy sites are largely invisible here regardless of content quality.

Specific unit queries — "2024 Silverado LT white under $48k" — that's where interpretation takes over. Clean data gets you in the room, but whether the AI confidently recommends that specific unit depends on how the VDP content reads to a language model, not just whether the structured data is technically valid.

So accessibility is the floor. Interpretation is the ceiling. Most dealers don't have the floor yet — which is why that's where the controlled test should start.

On your second question: Yes, AI systems are consuming inventory feeds in ways that affect results, but the failure pattern is more interesting than just "feed not found." The more common issue is partial consumption — the model finds the dealer, finds the inventory category, but loses confidence at the VDP level because the page structure reads like it was built for a Google crawler from 2015, not a language model in 2026.

That's the gap the clean-pipe infrastructure closes first. Would be genuinely useful to run your controlled comparison — I'll have some crawl-latency benchmarks ready to contribute when we're set up.
 

Let's define what an “AI-ready” VDP actually looks like.

If the goal is to be the source an AI recommends, the page needs to do two things well:
  1. Be fully accessible (crawlable, fast, clean structure)
  2. Be easily understood by a language model
Right now, most VDPs are optimized for a 2015 Google crawler instead of a 2026 AI system.

Same inventory, same data, just different presentation layers; however, we need to describe the three versions for the test.

What does the Ultimate 2026 VDP look like?

The goal would be to see:
  • which versions actually get surfaced
  • and more importantly, which ones get recommended in conversational queries
Before we can test it, we need to define each version, so let's start with the “ideal” 2026 VDP.

What would have to be present on the page for an AI to confidently select that specific unit in a response?
 
Good framing. Let me take a first pass at the "ideal" version — push back where you'd define it differently.

For a 2026 AI to confidently recommend a specific unit in a conversational response, the VDP needs to satisfy three distinct confidence layers:

1. Machine trust layer (can the AI access it?)

- SSR-rendered HTML — no inventory behind a JS framework wall
- Sub-2 second TTFB — crawl budget goes to fast pages first
- Valid Vehicle schema with every critical field populated: make, model, year, trim, mileage, price, availability status, VIN, condition (a markup sketch follows this list)
- Availability status that updates in real time — "In Stock" as a schema property, not just text on the page
- Canonical URL that doesn't rotate or expire
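
A minimal sketch of what that layer-1 markup could look like, rendered server-side so nothing depends on the crawler executing JS. The property names are schema.org's Car and Offer vocabulary; the values and field selection are placeholders, not a spec:

```typescript
// Hypothetical sketch of the JSON-LD a layer-1 VDP would embed. Property names
// come from schema.org's Car/Offer vocabulary; every value below is a placeholder.

const vehicleJsonLd = {
  "@context": "https://schema.org",
  "@type": "Car",
  name: "2024 Chevrolet Silverado 1500 LT",
  brand: { "@type": "Brand", name: "Chevrolet" },
  model: "Silverado 1500",
  vehicleConfiguration: "LT",              // trim
  vehicleModelDate: "2024",
  vehicleIdentificationNumber: "VIN-PLACEHOLDER",
  itemCondition: "https://schema.org/UsedCondition",
  mileageFromOdometer: { "@type": "QuantitativeValue", value: 12500, unitCode: "SMI" },
  offers: {
    "@type": "Offer",
    price: 47500,
    priceCurrency: "USD",
    availability: "https://schema.org/InStock", // flipped the moment the unit sells
    url: "https://example-dealer.com/inventory/vin-placeholder",
  },
};

// Serialized into the SSR-rendered page so a crawler never has to execute JS:
const jsonLdScriptTag =
  `<script type="application/ld+json">${JSON.stringify(vehicleJsonLd)}</script>`;
```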

2. Language model comprehension layer (does the AI understand it?)

- A natural language description paragraph that reads like a knowledgeable salesperson wrote it — not a spec dump. "This 2024 Silverado 1500 LT is well-suited for towing up to 11,000 lbs and comes with the factory tow package already installed" beats a bullet list of specs every time
- Explicit answers to the questions AI shoppers actually ask: Is it good for a family? Does it fit a car seat? What's the payment at current rates? What's included in the price?
- Structured FAQ section on the VDP itself — not sitewide FAQ, unit-specific (sketched below)
- Plaintext price with no asterisks or "see dealer for details" obfuscation — AI models lose confidence on ambiguous pricing
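
Same caveat, just a sketch: the unit-specific FAQ can ride on the same JSON-LD pipe using schema.org's FAQPage vocabulary. The questions and answers here are placeholders written around the Silverado example above:

```typescript
// Hypothetical sketch: a unit-specific FAQ expressed as FAQPage JSON-LD,
// answering the questions AI shoppers actually ask about this exact vehicle.
// Question wording and answers are placeholders.

const vdpFaqJsonLd = {
  "@context": "https://schema.org",
  "@type": "FAQPage",
  mainEntity: [
    {
      "@type": "Question",
      name: "Can this 2024 Silverado 1500 LT tow a travel trailer?",
      acceptedAnswer: {
        "@type": "Answer",
        text: "Yes. This unit has the factory tow package installed and is rated to tow up to 11,000 lbs.",
      },
    },
    {
      "@type": "Question",
      name: "What is included in the advertised price?",
      acceptedAnswer: {
        "@type": "Answer",
        text: "The listed price is the selling price before tax, title, and license. There are no dealer add-ons.",
      },
    },
  ],
};
```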

3. Recommendation confidence layer (will the AI stake its reputation on it?)

- Dealer reputation signals on the page itself — review schema, rating, response rate (sketched below)
- Financing context — monthly estimate, not just MSRP
- Inventory scarcity signal — "2 in stock" rather than no quantity context
- Clear next action — phone, chat, reserve button. AI won't recommend a dead end.
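
And a short sketch of the reputation piece, again with placeholder values. AutoDealer and AggregateRating are real schema.org types; response rate has no standard property, so it would live in the page copy rather than the markup:

```typescript
// Hypothetical sketch: surfacing dealer reputation on the VDP itself via
// schema.org AggregateRating attached to an AutoDealer entity. Values are
// placeholders.

const dealerReputationJsonLd = {
  "@context": "https://schema.org",
  "@type": "AutoDealer",
  name: "Example Motors",
  aggregateRating: {
    "@type": "AggregateRating",
    ratingValue: 4.7,
    reviewCount: 1283,
  },
};
```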

The gap between most current VDPs and this spec is almost entirely layer 2. Layer 1 is an infrastructure problem most dealers don't control. Layer 3 is a trust problem most dealers ignore. But layer 2 — the comprehension layer — is pure content and presentation, and almost nobody has done it yet.

That's where I'd start the test. Same inventory, same schema — but version A is a standard VDP spec dump, version B has the natural language description and FAQ layer added. See which one gets cited.

What would you add or change?
 
This thread is pretty amazing. The conversation may be overly technical for the average dealer, but you guys are proving that a new breed of SEO considerations needs to be understood.

There is no doubt SEO is climbing the importance ladder because of AI, and your posts will be found for many months/years to come.
 

1. Machine trust layer (can the AI access it?)

- SSR-rendered HTML — no inventory behind a JS framework wall
- Sub-2 second TTFB — crawl budget goes to fast pages first
- Valid Vehicle schema with every critical field populated: make, model, year, trim, mileage, price, availability status, VIN, condition
- Availability status that updates in real time — "In Stock" as a schema property, not just text on the page
- Canonical URL that doesn't rotate or expire

1. Machine Trust = Agree

  • Fast site
  • Clean HTML (no JS walls)
  • Proper schema
  • Real availability
You're correct: if that fails, nothing else matters!

2. Language model comprehension layer (does the AI understand it?)

- A natural language description paragraph that reads like a knowledgeable salesperson wrote it — not a spec dump. "This 2024 Silverado 1500 LT is well-suited for towing up to 11,000 lbs and comes with the factory tow package already installed" beats a bullet list of specs every time
- Explicit answers to the questions AI shoppers actually ask: Is it good for a family? Does it fit a car seat? What's the payment at current rates? What's included in the price?
- Structured FAQ section on the VDP itself — not sitewide FAQ, unit-specific
- Plaintext price with no asterisks or "see dealer for details" obfuscation — AI models lose confidence on ambiguous pricing

2. Comprehension = Agree

  • Natural language
  • Real answers to real questions
  • Clear pricing
This is the most underrated layer in the industry right now.

3. Recommendation confidence layer (will the AI stake its reputation on it?)

- Dealer reputation signals on the page itself — review schema, rating, response rate
- Financing context — monthly estimate, not just MSRP
- Inventory scarcity signal — "2 in stock" rather than no quantity context
- Clear next action — phone, chat, reserve button. AI won't recommend a dead end.

3. Recommendation Confidence = Agree

  • Reviews
  • pricing clarity
  • next steps
However, I would add that AI doesn't just evaluate a single page on its own; it looks at consistency across pages, site-wide signals, entity-level trust, and even “does the dealer look reliable across the web?”

So those things would also need to be created for the test site.

The gap between most current VDPs and this spec is almost entirely layer 2.

I agree ...

Layer 1 is an infrastructure problem most dealers don't control.

Sadly, you are correct, and it is the most important part; it affects lawsuits, fines, conversions, rankings, and everything you do on your site. Since it is the most important part, we will build it to meet and exceed all current specs so it doesn't skew the test.

Same inventory, same schema — but version A is a standard VDP spec dump, version B has the natural language description and FAQ layer added. See which one gets cited.

What would you add or change?

Should we test to see if it affects:
  • discovery
  • ranking
  • selection across sources