A sitemap is a document that lists all of the pages of a website. There are two kinds of sitemaps: one built for human website visitors, and one built especially for search engines, the XML (eXtensible Markup Language) Sitemap.
The site map designed for humans lists and links to every page and area of a website, much like an index or “Table of Contents”. Most will display the pages hierarchically, listing them by category and location within the site. This type of sitemap helps users find the content they want quickly, and also helps search engine bots crawl the site more thoroughly.
The XML Sitemap is not meant for website visitors, and is not a viewable page of the site – as it is nothing but strings of code. It is written specifically for search engines as a way of communicating directly with them about what areas of the site are currently online, functioning, and ready to be crawled and indexed. An XML Sitemap is especially useful if your site has some areas which cannot be navigated to through the site itself, or if your site contains content which search engines have trouble crawling, like Flash or Ajax.
Uploading an XML sitemap for search engines has essentially replaced the older method of communicating with them, which was to submit URLs through an online form. As a site owner you can either submit your XML sitemap directly or upload it to your server and wait for the search engines to find it on their own. All of the major search engines have a method in place for formally submitting your sitemap to them, though you’ll usually need to set up webmaster accounts to do so. It’s also possible to direct all search engine crawlers to the location of your XML sitemap by adding a line to your website’s robots.txt file that consists of “Sitemap:” followed by the full URL of the sitemap.
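As a rough illustration of the robots.txt approach, the Python sketch below appends a Sitemap line to the contents of an existing robots.txt file. The helper name add_sitemap_directive, the sample robots.txt rules, and the example.com address are all placeholders chosen for this example, not part of any standard tool.

```python
def add_sitemap_directive(robots_txt: str, sitemap_url: str) -> str:
    """Return robots.txt content with a Sitemap line for sitemap_url appended."""
    directive = f"Sitemap: {sitemap_url}"
    lines = robots_txt.splitlines()
    # Leave the content untouched if this exact directive is already present.
    if any(line.strip() == directive for line in lines):
        return robots_txt
    lines.append(directive)
    return "\n".join(lines) + "\n"


# Placeholder robots.txt content and sitemap URL, for illustration only.
existing = "User-agent: *\nDisallow: /private/\n"
print(add_sitemap_directive(existing, "http://www.example.com/sitemap.xml"))
```

The resulting file keeps its existing rules and gains one extra line naming the sitemap, which is all a crawler needs in order to find it.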
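Each web page referenced on the Sitemap must be set up as its own “url” entry containing up to four fields: “loc” (the page’s full URL), “lastmod”, “changefreq”, and “priority”. Not all of those fields are necessary, though. Only “loc” is required; the “lastmod”, “changefreq”, and “priority” elements are optional.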
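To make the shape of a single entry concrete, here is a minimal Python sketch that builds one using the standard library’s xml.etree.ElementTree module. The element names come from the sitemaps.org protocol; the function name build_url_entry, the example.com address, and the sample date and values are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET


def build_url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Build one <url> element; only <loc> is required."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    # The three optional fields are written only when a value is supplied.
    if lastmod is not None:
        ET.SubElement(url, "lastmod").text = lastmod
    if changefreq is not None:
        ET.SubElement(url, "changefreq").text = changefreq
    if priority is not None:
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


# Placeholder values, for illustration only.
entry = build_url_entry(
    "http://www.example.com/",
    lastmod="2024-01-15",
    changefreq="monthly",
    priority=0.8,
)
ET.indent(entry)  # pretty-print the element; requires Python 3.9 or later
print(ET.tostring(entry, encoding="unicode"))
```

Calling build_url_entry with only the address produces an entry containing nothing but the “loc” element, which is still a valid entry.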
The “lastmod” element simply states when the particular page was last updated, and has no effect on when the search engines will re-crawl it.
The “changefreq” element is used to tell the search engines how often you anticipate the page changing. The options are “always”, “hourly”, “daily”, “weekly”, “monthly”, “yearly”, or “never”. Search engines treat the value as a hint rather than a command, so using it does not dictate how often a page is crawled.
The “priority” element is used to tell the search engines how important each individual page is relative to the other pages within the site, on a scale of 0.0 to 1.0, with 1.0 being the most important and 0.5 being the default. Once again, these numbers have no effect at all on which pages will or won’t appear in search results.
However, including these elements but using them incorrectly can work against you. For example, setting all pages of your site at priority 1.0 tells the search engines nothing, since priority only ranks your pages against one another, and it may lead them to disregard the values altogether.
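Mistakes of this kind are easy to catch before a sitemap is uploaded. The Python sketch below is one possible check, assuming each page is described by a simple dictionary; the function name check_entries, the dictionary layout, and the warning wording are all invented for this example. It uses the 0.0 to 1.0 priority range defined by the sitemaps.org protocol.

```python
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def check_entries(entries):
    """Return a list of human-readable warnings for a list of entry dicts."""
    warnings = []
    for entry in entries:
        loc = entry["loc"]
        changefreq = entry.get("changefreq")
        if changefreq is not None and changefreq not in ALLOWED_CHANGEFREQ:
            warnings.append(f"{loc}: unknown changefreq {changefreq!r}")
        priority = entry.get("priority")
        if priority is not None and not 0.0 <= priority <= 1.0:
            warnings.append(f"{loc}: priority {priority} is outside 0.0-1.0")
    priorities = [e["priority"] for e in entries if e.get("priority") is not None]
    # Identical priorities everywhere say nothing about relative importance.
    if len(priorities) > 1 and len(set(priorities)) == 1:
        warnings.append(
            f"all pages share priority {priorities[0]}; the values convey nothing"
        )
    return warnings


# Placeholder pages, for illustration only.
pages = [
    {"loc": "http://www.example.com/", "changefreq": "daily", "priority": 1.0},
    {"loc": "http://www.example.com/about", "changefreq": "sometimes", "priority": 1.0},
]
for warning in check_entries(pages):
    print(warning)
```

For the two sample pages, the check flags the unrecognized “sometimes” value and the fact that every page carries the same priority.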
If you have a large website, or a website with content that changes frequently, manually creating and maintaining your own sitemap may be more than you have time to deal with. There are many online automated XML sitemap creators, both free and paid versions, which will create a Sitemap for you to upload, though you will need to re-create it or update it yourself whenever pages are added to or removed from your website. There are also many automated XML sitemap generators available that are installed on your website’s server. When in place, they will create, maintain and update your XML Sitemap automatically.
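To give a sense of what a server-side generator does, here is a deliberately simplified Python sketch. It assumes a static site whose pages are .html files in a single folder tree, and it takes each file’s modification time as the “lastmod” date; real generators handle far more cases. The function name generate_sitemap, the public_html folder, and the example.com address are placeholders for this illustration.

```python
import os
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def generate_sitemap(site_root, base_url):
    """Return sitemap XML listing every .html file found under site_root."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for folder, _subdirs, files in sorted(os.walk(site_root)):
        for name in sorted(files):
            if not name.endswith(".html"):
                continue
            path = os.path.join(folder, name)
            # Turn the file path into a URL path relative to the site root.
            rel = os.path.relpath(path, site_root).replace(os.sep, "/")
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = f"{base_url.rstrip('/')}/{rel}"
            # Use the file's modification time as the page's lastmod date.
            mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
            ET.SubElement(url, "lastmod").text = mtime.strftime("%Y-%m-%d")
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder folder and address; a missing folder yields an empty sitemap.
    print(generate_sitemap("public_html", "http://www.example.com"))
```

Run on a schedule, or whenever content is published, a script along these lines keeps the sitemap in step with the site without any manual editing.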
While a robots.txt page usually tells the search engines what NOT to crawl and add to their database, an XML Sitemap tells them what they are welcome to include. Submitting a Sitemap will not guarantee the inclusion or regular crawling of any pages of a website, and will not improve search engine rankings for the pages included. As a website owner you want to make it as easy as possible for the search engines to visit your site and index its pages. If your website isn’t huge and uses clear and simple text-based navigation you probably don’t NEED an XML sitemap. But, since the opportunity to directly relay information about your website to the search engines exists in this document, it simply makes sense to have one whether it’s needed or not.