What is the maximum number of URLs allowed in a single sitemap?
The Sitemap Protocol 0.9 specification limits a single sitemap file to 50,000 URLs and 50 MB (52,428,800 bytes) uncompressed. Sites with larger catalogs must use a sitemap index file that references multiple child sitemaps, each within these limits.
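As a rough illustration of splitting a large catalog, the sketch below slices a URL list into groups of at most 50,000 and builds a sitemap index for the resulting child files. The function names and the example.com locations are hypothetical, and the sketch does not check the 50 MB size limit.

```python
from typing import Iterator, List
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # URL limit from the Sitemap 0.9 protocol


def chunk_urls(urls: List[str], size: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield successive slices of at most `size` URLs (hypothetical helper)."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap_index(child_locations: List[str]) -> str:
    """Return a sitemap index document that lists each child sitemap URL."""
    entries = "\n".join(
        # escape() encodes &, < and > so query strings stay valid XML
        f"  <sitemap><loc>{escape(loc)}</loc></sitemap>"
        for loc in child_locations
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</sitemapindex>\n"
    )
```

With 120,000 URLs, `chunk_urls` would yield three groups, and the index would then reference three child sitemap files.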
Does the lastmod date actually influence crawl frequency?
Google uses lastmod as a hint, not a guarantee. If lastmod values are accurate and updated when content changes, Google may crawl those URLs more frequently. However, setting lastmod to today's date on every page regardless of changes trains crawlers to ignore it.
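One way to keep lastmod honest is to derive it from the time the content was actually modified rather than from the time the sitemap was generated. The sketch below assumes you already store a timezone-aware modification timestamp per page; `format_lastmod` is a hypothetical helper that renders it in the W3C Datetime format the protocol expects.

```python
from datetime import datetime, timezone


def format_lastmod(modified: datetime) -> str:
    """Format a timezone-aware modification time as a W3C Datetime string in UTC."""
    if modified.tzinfo is None:
        # A naive datetime would be interpreted as local time, so reject it
        raise ValueError("expected a timezone-aware datetime")
    return modified.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S+00:00")
```

Passing the stored modification time, and never `datetime.now()`, is what keeps the value meaningful to crawlers.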
Is changefreq required, and does Google use it?
changefreq is optional, and Google has stated publicly that it largely ignores it in favour of its own crawl-frequency signals. If the element is present, however, its value must be one of always, hourly, daily, weekly, monthly, yearly, or never; any other value causes a validation error.
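A check along these lines can be sketched in a few lines. This is an illustration only, not this validator's actual implementation; `is_valid_changefreq` is a hypothetical name, and `None` stands for an absent element.

```python
from typing import Optional

# The seven values allowed by the Sitemap 0.9 schema (lowercase only)
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def is_valid_changefreq(value: Optional[str]) -> bool:
    """Return True when changefreq is absent (None) or one of the allowed values."""
    # The comparison is exact, so "Daily" or " daily " would be rejected
    return value is None or value in VALID_CHANGEFREQ
```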
Why must URLs in sitemaps use XML entity encoding?
Sitemaps are XML documents, so the characters &, ', ", <, and > must be encoded as &amp;, &apos;, &quot;, &lt;, and &gt; respectively. Unencoded ampersands in URLs with query parameters are among the most common sitemap errors.
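If you generate sitemaps with a script, the encoding can be delegated to a standard library helper. The sketch below uses Python's `xml.sax.saxutils.escape`; `encode_loc` is a hypothetical wrapper name, and it assumes the input URL has not already been entity-encoded.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; this mapping adds both quote characters
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def encode_loc(url: str) -> str:
    """Entity-encode a raw URL for use inside a <loc> element."""
    # Apply this once to the raw URL; running it twice would double-encode ampersands
    return escape(url, QUOTE_ENTITIES)
```

For instance, `encode_loc("https://example.com/?a=1&b=2")` returns `https://example.com/?a=1&amp;b=2`.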
What XML namespace declaration is required?
The root <urlset> element must include xmlns="http://www.sitemaps.org/schemas/sitemap/0.9". A missing or incorrect namespace declaration can leave the XML well-formed, but schema validators and Google's sitemap parser will reject the document.
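To check the declaration yourself, you can parse the file and inspect the root tag. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name is hypothetical, and it only looks at the root element.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def has_sitemap_namespace(xml_bytes: bytes) -> bool:
    """Return True when the root element is in the Sitemap 0.9 namespace."""
    root = ET.fromstring(xml_bytes)
    # ElementTree reports namespaced tags as "{namespace}localname"
    return root.tag in (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
```

Note that a near miss such as an `https://` variant of the namespace URI counts as a different namespace and fails this check.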
Can I include image or video URLs in my sitemap?
Yes, using the image (xmlns:image) and video (xmlns:video) namespace extensions. These require additional namespace declarations on the root element and follow separate sub-element schemas. This validator checks the core Sitemap 0.9 structure and flags unrecognised extensions.
Should sitemap URLs include trailing slashes?
Use whatever canonical form your server returns — with or without trailing slash — and be consistent. The canonical URL in your sitemap should match the canonical in your page's <link rel="canonical"> tag, otherwise Google may see a mismatch and prefer one version over the other.
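A quick consistency check can be sketched as follows. This is a heuristic, not part of this validator: it treats the less common trailing-slash style as the inconsistent one, skips the site root and file-like paths such as /feed.xml, and on a tie returns the slash-terminated group. The function name is hypothetical.

```python
from typing import List
from urllib.parse import urlsplit


def trailing_slash_mismatches(urls: List[str]) -> List[str]:
    """Return the URLs whose trailing-slash style is in the minority."""
    with_slash: List[str] = []
    without_slash: List[str] = []
    for url in urls:
        path = urlsplit(url).path
        last_segment = path.rsplit("/", 1)[-1]
        if path in ("", "/") or "." in last_segment:
            continue  # skip the site root and file-like paths
        (with_slash if path.endswith("/") else without_slash).append(url)
    # The shorter list is assumed to hold the inconsistent URLs
    return min(with_slash, without_slash, key=len)
```

Any URL it returns is worth comparing against the canonical tag on the page itself.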
How do I fix a sitemap that Search Console reports as 'could not be read'?
Start by pasting the sitemap content into this validator. The most common causes are XML parse errors (unclosed tags, unencoded ampersands), an incorrect or missing namespace declaration, or a BOM (byte order mark) at the start of the file. The error list will identify the specific issue and line.
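If you prefer to check a file locally first, two of these causes can be detected with the Python standard library. The sketch below is an illustration under that assumption rather than this validator's own logic; `diagnose` is a hypothetical name, and it covers only a UTF-8 BOM and XML parse errors.

```python
import codecs
import xml.etree.ElementTree as ET
from typing import List


def diagnose(raw: bytes) -> List[str]:
    """Return a list of human-readable problems found in raw sitemap bytes."""
    problems: List[str] = []
    if raw.startswith(codecs.BOM_UTF8):
        problems.append("file starts with a UTF-8 byte order mark")
        raw = raw[len(codecs.BOM_UTF8):]  # strip it so parsing can continue
    try:
        ET.fromstring(raw)
    except ET.ParseError as exc:
        # ParseError.position holds the (line, column) where parsing failed
        line, column = exc.position
        problems.append(f"XML parse error at line {line}, column {column}: {exc}")
    return problems
```

Reading the file in binary mode, for example with `open(path, "rb").read()`, keeps a BOM visible instead of letting a text decoder hide it.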