Sitemap Protocol
The format of a sitemap file is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  ...
</urlset>
Notes
- The date must be in W3C Datetime format, for example 2005-01-01 or 2004-10-01T18:23:17+00:00.
- The priority is relative to the other URLs on the same site, so setting every URL to the same value provides no benefit. The default value is 0.5, and the sitemap priority levels are normalised with this value as the median.
- The sitemap file must be UTF-8 encoded, and the characters &, ', ", > and < in data values must be entity-escaped as &amp;, &apos;, &quot;, &gt; and &lt;.
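The format and notes above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name build_sitemap and the dictionary-based entry format are assumptions made for the example. It relies on the standard library's ElementTree, which escapes &, < and > in element text and writes UTF-8 output.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return UTF-8 encoded sitemap bytes for an iterable of entry dicts.

    Hypothetical entry format: each dict needs a "loc" key; "lastmod",
    "changefreq" and "priority" are optional.
    """
    # Setting xmlns as a plain attribute declares the default namespace.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                # ElementTree entity-escapes &, < and > when serialising.
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    sitemap = build_sitemap([
        {
            "loc": "http://www.example.com/",
            "lastmod": "2005-01-01",
            "changefreq": "monthly",
            "priority": 0.8,
        },
    ])
    print(sitemap.decode("utf-8"))
```

Building the document through an XML library rather than string concatenation means a URL containing an ampersand is escaped automatically.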
Multiple sitemap files can be used, as long as each sitemap file has no more than 50,000 URLs and is no larger than 10MB (10,485,760 bytes). The sitemap files may be compressed using gzip to reduce bandwidth requirements. The uncompressed sitemap file must nonetheless still meet the 10MB limit.
If more than 50,000 URLs are needed, multiple sitemap files must be created and a sitemap index file must be provided to list all the individual sitemaps. The sitemap index file must also meet the same size and URL limit requirements described above.
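As a rough sketch of how these limits might be applied, the Python below splits a URL list into chunks and checks the uncompressed size before gzipping. The names split_urls and check_and_compress are hypothetical, and the default limits are simply the figures quoted in the text.

```python
import gzip

MAX_URLS = 50_000        # URL limit per sitemap file, as stated above
MAX_BYTES = 10_485_760   # 10MB uncompressed size limit, as stated above


def split_urls(urls, max_urls=MAX_URLS):
    """Split a list of URLs into consecutive chunks of at most max_urls."""
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def check_and_compress(sitemap_bytes, max_bytes=MAX_BYTES):
    """Gzip a serialised sitemap after checking its uncompressed size."""
    if len(sitemap_bytes) > max_bytes:
        raise ValueError(
            f"sitemap is {len(sitemap_bytes)} bytes uncompressed; "
            f"the limit is {max_bytes}"
        )
    return gzip.compress(sitemap_bytes)
```

Each chunk returned by split_urls would become one sitemap file, and each of those files would then be listed in the sitemap index.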
The format for a sitemap index is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml.gz</loc>
    <lastmod>2004-10-01T18:23:17+00:00</lastmod>
  </sitemap>
  ...
</sitemapindex>
Notes
- A Sitemap index file can only specify Sitemaps that are found on the same site as the Sitemap index file.
- The Sitemap index file must be UTF-8 encoded.
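A sitemap index could be generated along the following lines. This is only a sketch: the function name build_sitemap_index and the index URL in the usage example are hypothetical, and "same site" is interpreted here as the same scheme and host, which is an assumption rather than something the text spells out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(index_url, sitemaps):
    """Return UTF-8 encoded sitemap index bytes.

    sitemaps is an iterable of (loc, lastmod) pairs; lastmod may be None.
    """
    # Assumption: "same site" means identical scheme and host.
    index_site = urlsplit(index_url)[:2]
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in sitemaps:
        if urlsplit(loc)[:2] != index_site:
            raise ValueError(f"{loc} is not on the same site as {index_url}")
        node = ET.SubElement(root, "sitemap")
        ET.SubElement(node, "loc").text = loc
        if lastmod is not None:
            ET.SubElement(node, "lastmod").text = lastmod
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    index = build_sitemap_index(
        "http://www.example.com/sitemap_index.xml",  # hypothetical location
        [("http://www.example.com/sitemap1.xml.gz", "2004-10-01T18:23:17+00:00")],
    )
    print(index.decode("utf-8"))
```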
Sitemap File Location
The location of a sitemap file determines the set of URLs that can be included in that sitemap: a sitemap can only list URLs at or below the directory in which it is located. For example, a sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but cannot include URLs starting with http://example.com/images/.
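The location rule can be expressed as a simple prefix check. The helper below is a minimal sketch with a hypothetical name, in_sitemap_scope; it assumes that the scheme and host must also match and does not normalise case or percent-encoding.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, page_url):
    """Return True if page_url may be listed in the sitemap at sitemap_url."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    # Directory containing the sitemap file, with a trailing slash.
    base_dir = sitemap.path.rsplit("/", 1)[0] + "/"
    same_site = (sitemap.scheme, sitemap.netloc) == (page.scheme, page.netloc)
    return same_site and page.path.startswith(base_dir)
```

With the example from the text, in_sitemap_scope("http://example.com/catalog/sitemap.xml", "http://example.com/images/") returns False, while any URL starting with http://example.com/catalog/ returns True.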