What is a Sitemap? The Unsung Hero for Website Visibility

Written by Brian Dean

What Is a Sitemap?

A sitemap is a blueprint of your website that helps search engines find, crawl and index all of your website’s content. Sitemaps also tell search engines which pages on your site are most important.

There are four main types of sitemaps:

  • Normal XML Sitemap: This is by far the most common type of sitemap. It’s usually an XML file that links to the different pages on your website.
  • Video Sitemap: Used specifically to help Google understand video content on your page.
  • News Sitemap: Helps Google find content on sites that are approved for Google News.
  • Image Sitemap: Helps Google find all of the images hosted on your site.

Why are Sitemaps Important?

Search engines like Google, Yahoo and Bing use your sitemap to find different pages on your site.

Sitemaps can help search engines find pages

As Google puts it:

“If your site’s pages are properly linked, our web crawlers can usually discover most of your site.”

In other words: you probably don’t NEED a sitemap. But it definitely won’t hurt your SEO efforts. So it makes sense to use them.

There are also a few special cases where a sitemap really comes in handy.

For example, Google largely finds webpages through links. And if your site is brand new and only has a handful of external backlinks, then a sitemap is HUGE for helping Google find pages on your site.

Or maybe you run an ecommerce site with 5 million pages. Unless you internal link PERFECTLY and have a ton of external links, Google’s going to have a tough time finding all of those pages. That’s where sitemaps come in.

With that, here’s how to set up a sitemap…and optimize it for SEO.

Best Practices

Create a Sitemap

Your first step is to create a sitemap.

If you use WordPress, you can get a sitemap made for you with the Yoast SEO plugin.

Yoast plugin

The main benefit of using Yoast to make your XML sitemap is that it updates automatically (dynamic sitemap).

So whenever you add a new page to your site (whether it’s a blog post or ecommerce product page), a link to that page will be added to your sitemap file automatically:

New link in sitemap

If you don’t use Yoast, there are lots of other plugins available for WordPress (like Google XML Sitemaps) that you can use to create a sitemap:

WordPress XML sitemap

What if you don’t use WordPress?

No worries. You can use a third-party sitemap generator tool like XML-Sitemaps.com. These will spit out an XML file that you can use as your sitemap.

XML sitemaps
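
If you’re curious what a generator actually produces, here’s a minimal sketch in Python. It assumes you already have a list of page URLs (the example.com URLs and the build_sitemap name are made up for illustration), and it only writes the required loc element for each page. Real generators also crawl your site to discover the URLs, which this sketch skips.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string that lists the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical URLs, used only for illustration.
pages = ["https://example.com/", "https://example.com/blog/first-post"]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

You’d save the output as sitemap.xml and upload it to your site.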

Either way, once your sitemap is created, I recommend manually taking a look at it.

Backlinko XML sitemap

(Your sitemap is usually found at site.com/sitemap.xml. But it depends on your CMS and what program you used to create your sitemap.)

It should display all of the pages on your site:

Backlinko – Posts XML sitemap
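
On a bigger site, eyeballing the file gets old fast. As a rough sketch, you could pull the URLs out of the sitemap with a few lines of Python and scan the list instead. The sample XML and the list_sitemap_urls helper below are invented for illustration, and the sketch assumes a plain urlset sitemap (not a sitemap index).

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the sitemaps.org schema.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(sitemap_xml):
    """Return every <loc> value found in a urlset sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


# Hypothetical sitemap content, used only for illustration.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

found = list_sitemap_urls(sample)
for url in found:
    print(url)
```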

If everything looks good, it’s time to submit your sitemap to Google.

Submit Your Sitemap To Google

To submit your sitemap, log in to your Google Search Console account.

Then, go to “Index” → “Sitemaps” in the sidebar.

Google Search Console – Navigation

If you already submitted your sitemap, you’ll see a list of “Submitted Sitemaps” on this page:

Google Search Console – Sitemaps

Either way, to submit your sitemap, enter your sitemap’s URL into this field:

Add a sitemap

And hit “Submit”.

Submit new sitemap

And if everything is set up correctly, you’ll start to see information on your sitemap on this page under the “Submitted Sitemaps” section:

Submitted sitemaps

Use the Sitemap Report to Spot Errors

Once Google has crawled your sitemap, click on it under “Submitted Sitemaps”:

Google Search Console – Submitted sitemaps

If you see “Sitemap index processed successfully”, then Google successfully crawled your sitemap.

Sitemap index processed

You can also click on the little bar chart icon to go to the Coverage Report for your sitemap:

See index coverage

This report shows you how many URLs Google found in your sitemap… and how many of those pages ended up in Google’s index:

Coverage report

For example, you can see that my sitemap contains links to 116 webpages. 109 are “valid” and 6 are “Excluded”.

"Valid" excluded

I can obviously ignore the valid pages.

But I do want to check out any “Excluded” pages to see what’s up.

It turns out that those 6 URLs in my sitemap are getting a “Duplicate, submitted URL not selected as canonical” message.

Excluded error message

And when I look at the URLs, I see that these are pages that I don’t even want indexed in the first place.

Excluded examples

So I should remove them from my sitemap.

Use Your Sitemap to Find Problems With Indexing

One of the cool things about using a sitemap is that it can give you a ballpark estimate of:

  • How many pages you WANT indexed
  • How many pages ARE indexed

For example, let’s say that your sitemap links to 5,000 pages.

But when you look at the Google Search Console, your site only has 2,000 pages indexed.

That’s a sign that something’s up. It could be that there’s a lot of duplicate content in those 5,000 pages. So Google isn’t indexing all of them.

Or it could be that the number of pages on your site exceeds your crawl budget.
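
To put a number on that gap, here’s a small sketch. It assumes you’ve exported a list of indexed URLs (for example from Search Console) and have your sitemap URLs in a second list. The indexing_gap function and the sample URLs are hypothetical, and the 2-out-of-5 ratio simply mirrors the 2,000-out-of-5,000 example above.

```python
def indexing_gap(sitemap_urls, indexed_urls):
    """Return (share of sitemap URLs that are indexed, sorted list of missing URLs)."""
    wanted = set(sitemap_urls)
    indexed = wanted & set(indexed_urls)
    missing = sorted(wanted - indexed)
    coverage = len(indexed) / len(wanted) if wanted else 0.0
    return coverage, missing


# Hypothetical data, used only for illustration.
in_sitemap = [f"https://example.com/page-{n}" for n in range(1, 6)]
in_index = ["https://example.com/page-1", "https://example.com/page-2"]

coverage, missing = indexing_gap(in_sitemap, in_index)
print(f"{coverage:.0%} of sitemap URLs are indexed")
print("Not indexed:", missing)
```

The list of missing URLs is the useful part: it tells you exactly which pages to check for duplicate content or crawl issues.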

Match Your Sitemaps and Robots.txt

It’s important that your sitemaps and Robots.txt work together.

In other words:

If you block a page in robots.txt or use the “noindex” tag on a page, you DON’T want it to appear in your sitemap.

Otherwise, you’re sending mixed messages to Google.

Your sitemap says: “This page is important enough to make it into our sitemap”. But when Googlebot lands on the page, it gets blocked.
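
One way to catch these mixed messages is to run your sitemap URLs through a robots.txt parser. The sketch below uses Python’s built-in urllib.robotparser. The robots.txt rules, the URLs and the blocked_sitemap_urls helper are made up for illustration. Keep in mind that Python’s parser may not match Googlebot’s rule handling exactly (wildcards, for instance), and that this only covers robots.txt, not “noindex” tags.

```python
from urllib.robotparser import RobotFileParser


def blocked_sitemap_urls(robots_txt, sitemap_urls, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and sitemap URLs, used only for illustration.
robots = """User-agent: *
Disallow: /private/
"""
urls = ["https://example.com/", "https://example.com/private/report"]

blocked = blocked_sitemap_urls(robots, urls)
print("In sitemap but blocked by robots.txt:", blocked)
```

Anything that shows up in that list should either come out of your sitemap or get unblocked.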

Sitemap Pro Tips

Huge Site? Break Things Up Into Smaller Sitemaps: Sitemaps have a limit of 50k URLs. So if you run a site with a ton of pages, Google recommends breaking up your sitemap into several smaller sitemaps.
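
Here’s a rough sketch of how that split could work. It chunks a big URL list into groups of 50,000 and builds a sitemap index file (part of the sitemaps.org protocol) that points to each smaller sitemap. The product URLs, the sitemap-1.xml naming scheme and the helper names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # Per-sitemap URL limit mentioned above.


def chunk_urls(urls, size=MAX_URLS):
    """Split a URL list into consecutive chunks of at most `size` URLs."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_locations):
    """Return a sitemap index XML string that points to each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_locations:
        sitemap_el = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap_el, "loc").text = loc
    return ET.tostring(index, encoding="unicode")


# Hypothetical URLs and file names, used only for illustration.
all_urls = [f"https://example.com/product/{n}" for n in range(120_000)]
chunks = chunk_urls(all_urls)
locations = [f"https://example.com/sitemap-{i}.xml" for i in range(1, len(chunks) + 1)]
index_xml = build_sitemap_index(locations)
print(len(chunks), "sitemaps needed")
```

Each chunk would then be written out as its own sitemap file, and you’d submit the index file to Google.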

Be Careful With Dates: URLs in your sitemap have a “last modified” date associated with them.

Sitemap last modified

I recommend changing these dates ONLY when you make significant changes to your site (or add new content to your site). Otherwise, Google warns that updating dates on pages that haven’t changed can be seen as a spammy tactic.
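
If your sitemap is generated by a script, one way to avoid touching dates by accident is to store a hash of each page’s content and only bump the date when the hash changes. This is just a sketch: the next_lastmod function, the sample HTML and the dates are hypothetical. Note that a hash flags ANY change, so you’d still need your own rule for what counts as “significant”.

```python
import hashlib


def next_lastmod(content, previous_hash, previous_lastmod, today):
    """Return (content hash, lastmod), keeping the old date if nothing changed."""
    current_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if current_hash == previous_hash:
        return previous_hash, previous_lastmod
    return current_hash, today


# Hypothetical content and dates, used only for illustration.
first_hash, first_lastmod = next_lastmod("<h1>Hello</h1>", None, None, "2024-01-10")
same_hash, same_lastmod = next_lastmod("<h1>Hello</h1>", first_hash, first_lastmod, "2024-03-01")
new_hash, new_lastmod = next_lastmod("<h1>Hello again</h1>", first_hash, first_lastmod, "2024-03-01")

print(same_lastmod)  # Unchanged page keeps its original date.
print(new_lastmod)   # Edited page gets the new date.
```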

Don’t Sweat Video Sitemaps: Video Schema has largely replaced the need for video sitemaps. A video sitemap definitely won’t hurt your page’s ability to get a video rich snippet. But it’s usually not worth the hassle.

Stay Under 50MB: Google and Bing both allow sitemaps that are up to 50MB. So as long as you’re under 50MB, you’re good.
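
A quick size check is easy to script. This sketch assumes the limit applies to the uncompressed file and that 50MB means 52,428,800 bytes, which is how the sitemaps.org protocol states it. The function name and sample XML are made up for illustration.

```python
MAX_SITEMAP_BYTES = 50 * 1024 * 1024  # 52,428,800 bytes


def is_within_size_limit(sitemap_xml):
    """Return True if the UTF-8 encoded sitemap fits within the size limit."""
    return len(sitemap_xml.encode("utf-8")) <= MAX_SITEMAP_BYTES


# Hypothetical sitemap content, used only for illustration.
small_sitemap = "<urlset><url><loc>https://example.com/</loc></url></urlset>"
print(is_within_size_limit(small_sitemap))
```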

HTML Sitemaps: This is basically the equivalent of an XML sitemap… but for users.

HTML sitemap

You don’t necessarily need these as Google and other search engines now rely on your XML sitemap. But if you think they’re useful for human visitors, an HTML sitemap probably isn’t going to hurt your SEO efforts.

Tips for Optimizing Sitemaps

  • Use XML Files to Structure Internal Links and External URLs

An XML sitemap is a list of URLs that points crawlers to your content and shows how your site is laid out. Listing your URLs there tells web crawlers which pages you consider important, and it helps reduce the number of orphan pages (pages that no internal links point to). That clarity is good for your overall SEO health, which bodes well for rankings.

Note

XML sitemaps don’t guarantee that your pages will be indexed. They only improve the chances.

  • Keeping the Root Directory Clean and Organized

The root directory stores the other folders and files on a domain, i.e., it’s the central location for all of the files and directories that make up a website. All web requests start at the root directory.

In theory, placing your sitemap outside the root directory is harmless. But it goes against the established protocol, because the location of a sitemap determines which URLs it can cover. In practice, search engines don’t seem to care much when sitemap.xml isn’t located in the root directory.

Avoid clogging your root directory with lots of loose files, as this can affect the responsiveness of your website.
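
To make the location point concrete, here’s a sketch of the scope rule as the sitemaps.org protocol describes it: a sitemap covers URLs on the same host that sit at or below its own directory. The in_sitemap_scope helper and the example URLs are hypothetical, and search engines may be more lenient than this strict check.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, page_url):
    """Return True if page_url is on the same host and under the sitemap's directory."""
    sitemap, page = urlsplit(sitemap_url), urlsplit(page_url)
    base_path = sitemap.path.rsplit("/", 1)[0] + "/"
    same_site = (sitemap.scheme, sitemap.netloc) == (page.scheme, page.netloc)
    return same_site and (page.path or "/").startswith(base_path)


# Hypothetical URLs, used only for illustration.
root_sitemap = "https://example.com/sitemap.xml"
nested_sitemap = "https://example.com/catalog/sitemap.xml"

print(in_sitemap_scope(root_sitemap, "https://example.com/blog/post"))
print(in_sitemap_scope(nested_sitemap, "https://example.com/blog/post"))
```

A sitemap in the root directory can cover the whole site, which is the main reason to keep it there.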

  • Include ALL Web Pages in the Sitemap File

As mentioned, sitemaps act as a pathway for Googlebot, taking it to all of the pages on your site, even when your internal linking isn’t great. Including all of your webpages in the sitemap file improves communication between your website and the search engines.

Tools to Easily Create a Sitemap

If you need to generate a sitemap faster, here’s a summary of the best and most convenient tools to consider:

  • Google Search Console tools
  • Bing Webmaster Tools
  • Paid online tools such as Yoast
  • Pulling sitemaps from websites you don’t own

10 Things to Exclude from Your Sitemaps

As a best practice, aim to include only the SEO-relevant pages in the sitemap. It’s a recommended method of effectively utilizing the crawl budget.

With this approach, search engines crawl your website more efficiently, which helps your important pages get indexed.

Aim to exclude:

  • Duplicate pages
  • Paginated pages
  • Non-canonical pages
  • Archive pages
  • Redirected pages (3xx), Missing pages (4xx) and Error pages (5xx)
  • Comment URLs
  • Noindex pages
  • Resource pages that are useful to site visitors but don’t serve as landing pages
  • Site search result pages
  • “Share via email” pages
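
Several of these exclusions are easy to automate if you have crawl data for each URL. The sketch below assumes you already know each page’s status code, whether it carries a noindex tag, and its canonical URL. The Page record, the sitemap_worthy check and the sample pages are all hypothetical. It covers redirects, errors, noindex pages and non-canonical pages. The other items on the list would need their own rules.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    status: int      # HTTP status code returned for the URL
    noindex: bool    # True if the page carries a noindex tag
    canonical: str   # Canonical URL declared by the page


def sitemap_worthy(page):
    """Keep only 200-status, indexable pages that are their own canonical."""
    return page.status == 200 and not page.noindex and page.canonical == page.url


# Hypothetical crawl data, used only for illustration.
crawled = [
    Page("https://example.com/", 200, False, "https://example.com/"),
    Page("https://example.com/old", 301, False, "https://example.com/new"),
    Page("https://example.com/thanks", 200, True, "https://example.com/thanks"),
    Page("https://example.com/shoes?sort=price", 200, False, "https://example.com/shoes"),
]

keep = [page.url for page in crawled if sitemap_worthy(page)]
print(keep)
```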

FAQs

How do I find the root directory in WordPress?

For WordPress sites, the /html folder typically serves as the root directory for your files. To access the root directory, you can use SSH, SFTP, or the File Manager.

Does a sitemap affect SEO?

Yes. Sitemaps list the priority pages on a website to guide search engines on what to crawl and index. That can help more of your pages show up in search results, making your site visible to more internet users and complementing your other SEO efforts.

Learn More

Build and submit a sitemap: A guide from Google on creating sitemaps… and getting them submitted to Google.

Using Sitemaps to help Google find content hosted on your site: Quick video from the Google Webmaster YouTube channel on how sitemaps can help your site appear higher and more often in the search results.
