Crawl budget
Search engines decide how many URLs of a website they crawl. This budget is not the same for all websites; it is determined primarily by the PageRank of the page: the higher the PageRank, the larger the crawl budget. The crawl budget also determines how often the most important pages of a website are crawled and how often a deep crawl takes place.
Best practice
- Implement a flat page architecture in which the path to each subpage is as short as possible and requires only a few clicks (navigation structure).
- Link the most important pages strongly through internal links.
- Add internal links from pages with many backlinks to pages that should be crawled more frequently.
- Exclude unimportant pages (e.g. login pages, contact forms, images) from crawling via robots.txt; see the sketch after this list.
- Control indexing and link following with robots meta tags (noindex, nofollow); note that a crawler must still fetch a page to read the tag, so this steers indexing rather than the crawl itself. See the meta tag example after this list.
- Offer an XML sitemap listing the URLs of the most important subpages; see the sitemap sketch after this list.
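As a sketch of the robots.txt exclusion above: the following file blocks all crawlers from a login page, a contact form, and an image directory. The paths are hypothetical placeholders and must be adapted to the actual site structure.

```text
# robots.txt - minimal sketch; the paths below are hypothetical examples
User-agent: *
Disallow: /login/
Disallow: /contact-form/
Disallow: /images/

# Optionally point crawlers at the XML sitemap (see below)
Sitemap: https://www.example.com/sitemap.xml
```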
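A robots meta tag is placed in the `<head>` of the page it applies to. A minimal example for a page that should neither be indexed nor have its links followed:

```html
<!-- Keep this page out of the index and do not follow its links.
     The crawler still has to fetch the page to read this tag. -->
<meta name="robots" content="noindex, nofollow">
```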
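And a minimal XML sitemap listing two subpages; the URLs and dates are placeholders. The `<lastmod>` and `<priority>` fields are optional hints to the crawler.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- List only the most important, canonical URLs -->
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://www.example.com/important-page/</loc>
    <priority>0.8</priority>
  </url>
</urlset>
```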
Check
- Are the most important pages reachable within a few clicks (max. 2)?
- Is robots.txt implemented correctly, excluding login pages and the like from crawling?