
Google’s New Crawl Limits Reward Fast Dealership Websites

DjSec · Mar 17, 2025
Google recently updated how much of a page Googlebot will actually read when crawling. The documented per-page fetch limit dropped from ~15MB to ~2MB. This is not a monthly quota. It is a hard stop on how much of a single page Google processes before ignoring the rest.

For dealerships, this matters because most dealer sites are bloated!

This gives fast websites an even bigger advantage!

As Google becomes more cost-sensitive, page speed and crawl efficiency will only become more important. Dealerships with fast, clean sites will gain a measurable advantage, while bloated sites will lose more and more ground.
 
There are some nuances to this:
  • The 2MB Cutoff: For standard HTML pages, Googlebot now only processes the first 2MB of uncompressed data. Once this limit is reached, Googlebot stops the fetch and only indexes the downloaded portion.
  • Uncompressed Data: The limit applies to the raw HTML size, not the compressed version (like Gzip or Brotli) sent over the network. If your raw code is bulky, content at the bottom of the page may be ignored (a quick way to check your own pages is sketched after this list).
  • Subresources are Separate: This limit applies to the initial HTML file itself. Resources like CSS, JavaScript files, and images are fetched separately, and each has its own 2MB limit.
  • PDF Exception: Interestingly, Google significantly increased the limit for PDFs to 64MB, recognizing that documents are often much larger than web pages.
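For anyone who wants to check a specific page against these numbers, here is a minimal Python sketch (the dealership URL is a placeholder, and the 2MB threshold simply reflects the limit described above). It fetches one page, reports the uncompressed HTML size that the cap applies to, and uses gzip as a rough stand-in for the compressed size sent over the network:

# Minimal sketch: uncompressed HTML size (what the ~2MB cap applies to)
# vs. an approximate compressed transfer size. The URL is a placeholder.
import gzip
import requests

LIMIT_BYTES = 2 * 1024 * 1024  # the ~2MB uncompressed-HTML cap discussed above

def check_page(url):
    resp = requests.get(url, timeout=30)
    raw_html = resp.content                # uncompressed bytes (requests decodes gzip/deflate)
    compressed = gzip.compress(raw_html)   # rough proxy for the on-the-wire size
    print(url)
    print(f"  uncompressed HTML: {len(raw_html) / 1024:.1f} KB")
    print(f"  gzip (approx.):    {len(compressed) / 1024:.1f} KB")
    if len(raw_html) > LIMIT_BYTES:
        print("  over the ~2MB cap: anything past that point may be ignored")

check_page("https://www.example-dealership.com/new-inventory")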
 
It changes the outcome for dealerships.

Yes, subresources are fetched separately. That doesn’t help when the uncompressed HTML itself is bloated and Google stops processing before it reaches vehicle details, service content, or internal links.

Most dealership sites push meaningful content far down the DOM behind navigation, scripts, filters, and repeated components. Hitting a 2MB processing cap means Google is often left with only a partial understanding of the page.

This isn’t a problem for lean, fast, server-rendered sites. It is a problem for platform-based dealer sites that treat pages like applications instead of documents.

The result isn’t “Google can’t crawl the site.”
The result is lower crawl efficiency, slower indexing, and weaker signals.

And that opens a competitive gap.
 
No disagreement! I think legacy platforms are becoming increasingly problematic here.
 
How can I test my pages to see where my sites stand? Even better, is there a tool that will scrape my site and show the over-2MB pages?
A site crawler like the Screaming Frog SEO Spider does a nice job (a scripted batch check is also sketched after the steps below).

Otherwise, you can check pages one at a time with Chrome DevTools. This method is precise because it shows you the "Resource Size" (uncompressed) vs. the "Transferred Size" (compressed).
  1. Press F12 to open Developer Tools.
  2. Go to the Network tab.
  3. Refresh the page.
  4. Look for your main URL in the list (usually the first item, type "document").
  5. Hover over the value in the Size column.
    • Top number: Transferred size (compressed).
    • Bottom number: Resource size (this is the one that matters for the 2MB limit).
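For the "show me every over-2MB page" part of the question, a short script can batch-check a sitemap. A minimal Python sketch, assuming a standard sitemap.xml (the sitemap URL is a placeholder):

# Minimal sketch: pull URLs from a sitemap and flag pages whose uncompressed
# HTML exceeds ~2MB. The sitemap URL is a placeholder.
import xml.etree.ElementTree as ET
import requests

LIMIT_BYTES = 2 * 1024 * 1024
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def pages_over_limit(sitemap_url):
    sitemap = requests.get(sitemap_url, timeout=30)
    urls = [loc.text for loc in ET.fromstring(sitemap.content).findall(".//sm:loc", NS)]
    oversized = []
    for url in urls:
        html = requests.get(url, timeout=30).content   # uncompressed bytes
        if len(html) > LIMIT_BYTES:
            oversized.append((url, len(html)))
    return oversized

for url, size in pages_over_limit("https://www.example-dealership.com/sitemap.xml"):
    print(f"{size / 1024 / 1024:.2f} MB  {url}")

On most sites this will come back empty, but it still gives you raw HTML sizes to compare page templates against each other.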
 
2MB of raw HTML is still MASSIVE and has ZERO to do with how fast a site loads. Don't confuse the issues: load time is how long a page takes to load all assets and render, and HTML size is only a VERY small part of that.

For a raw HTML file to be over 2MB, it would need over TWO MILLION characters in the file. According to recent research, the real-world median size of raw HTML is 33KB, and even the 90th percentile is only 155KB. So basically, only MAYBE 1% of sites have HTML files large enough to hit the 2MB limit (and that for sure won't include any dealership sites).
 
The DevTools method above is solid. Measuring the raw HTML size is the right way to approach it.

The larger point isn’t panic over 2MB. It’s that dealer platforms often generate inefficient HTML structures that push meaningful content deep into the document behind navigation, filters, scripts, and repeated components.

Even if most pages don’t hit 2MB, crawl efficiency and content ordering still matter. Google processing the first portion of a page more heavily means architecture becomes more important, not less.

The issue isn’t just file size. It’s whether the primary vehicle or service content appears early and cleanly in the HTML, or sits buried inside platform-generated scaffolding.
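One rough way to see this in practice is to measure how deep into the raw HTML the primary content actually starts. A minimal sketch, assuming the first <h1> is a reasonable stand-in for the vehicle or service heading (the URL is a placeholder):

# Minimal sketch: report how far into the raw HTML the first <h1> appears,
# as a proxy for where the primary content starts. The URL is a placeholder.
import requests

def content_offset(url, marker=b"<h1"):
    html = requests.get(url, timeout=30).content
    position = html.find(marker)
    if position == -1:
        print(f"{url}: marker not found")
        return
    print(url)
    print(f"  total HTML:    {len(html) / 1024:.1f} KB")
    print(f"  marker offset: {position / 1024:.1f} KB ({position / len(html):.0%} of the document)")

content_offset("https://www.example-dealership.com/vehicle-details")

The later the marker shows up, the more scaffolding Google has to work through before reaching the content that actually describes the vehicle or service.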
 
HTML size and load time are different metrics, but they are not unrelated.

The browser must download, parse, and construct the DOM from the raw HTML before it can render. Larger documents increase transfer time, parsing time, and memory usage.

More importantly, the structure of the HTML determines what appears early in the document. That affects:
  • What Google processes first
  • What is seen before any crawl cutoff (simulated in the sketch after this list)
  • How efficiently the page's intent is understood
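To make that cutoff concrete, here is a minimal sketch that truncates a page's raw HTML at 2MB and counts how many links survive the cut versus how many exist in total (the URL is a placeholder; most pages won't hit the cap, but the same check works for any byte budget):

# Minimal sketch: truncate raw HTML at a byte budget and compare how many
# links fall inside the processed portion. The URL is a placeholder.
import re
import requests

LIMIT_BYTES = 2 * 1024 * 1024
HREF = re.compile(rb'href="([^"]+)"')

def links_before_cutoff(url, limit=LIMIT_BYTES):
    html = requests.get(url, timeout=30).content
    all_links = HREF.findall(html)
    seen_links = HREF.findall(html[:limit])   # what survives the cutoff
    print(f"{url}: {len(seen_links)} of {len(all_links)} links appear in the first {limit // (1024 * 1024)}MB")

links_before_cutoff("https://www.example-dealership.com/inventory")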
Even if 2MB only affects 1% of sites today, Google just signaled something important:

They are optimizing crawl cost.

And dealership sites are historically inefficient.

Dealer platforms routinely generate unnecessarily large, inefficient HTML structures that push meaningful content deep into the DOM. A 2MB processing cap amplifies architectural weaknesses. Lean, server-rendered sites that surface primary content early gain structural crawl advantages, regardless of whether they hit the hard limit.