What Is An XML Sitemap And How To Create One | SEO Guide

We’ve all been there. You just poured your heart, soul, six months, and a pile of cash into a new website. It’s beautiful. It’s fast. Your blog posts are masterpieces. You hit the “launch” button, lean back, and wait for the sweet, sweet sound of traffic rolling in.

And then… silence.

Just crickets.

You type your own brand name into Google, and thankfully, you show up. But what about your new products? Your brilliant blog post titles? You search for them. Nothing. Page five. Nothing.

It’s a gut-wrenching feeling. It feels like you just built a million-dollar department store in the absolute middle of the desert, but forgot to build any roads.

So, what went wrong?

There’s a very good chance you missed one of those small, boring-but-critical pieces of technical SEO: the XML sitemap.

This is the “road map.” It’s the file you literally hand to Google to make sure it can find every last page on your site.

If you’ve ever felt that launch-day panic, this guide is for you. We’re going to break down what this techy-sounding file actually is and give you a simple plan to create one. We will cover exactly what an XML sitemap is and how to create one, no matter how “non-technical” you think you are.

Key Takeaways

Alright, before we get into the weeds, let’s get the most important stuff on the table. If you only read this part, you’ll be 80% of the way there.

  • A Map for Robots, Not People: An XML sitemap is a file on your website that lists all your important pages. It’s not designed for human visitors; it’s written in a specific format for search engine crawlers (like Googlebot).
  • It’s About Discovery, Not Ranking: Submitting a sitemap does not guarantee you will rank #1. It doesn’t even guarantee you’ll be indexed. It simply tells Google, “Hey, these pages exist, and you should come look at them.” It helps Google find your content faster.
  • Crucial for Some, Helpful for All: If you have a massive website (50,000+ pages), a brand new website with few external links, or a site with lots of “orphan” pages (pages not well-linked internally), a sitemap is absolutely critical. For small, well-structured sites, it’s still a “best practice” that provides a direct line of communication to Google.
  • Your CMS Can Probably Do It: You most likely do not need to code this by hand. Platforms like WordPress (with plugins like Yoast or Rank Math), Shopify, and Wix create and maintain sitemaps for you automatically.
  • “Submit It” is the Final Step: Creating the file is only half the battle. You must submit your sitemap’s URL to Google Search Console to tell Google where it is.

Let’s Start at the Beginning: What Exactly is an XML Sitemap?

First off, let’s kill the tech-speak.

An XML sitemap is just a text file. That’s it. The “XML” part stands for “Extensible Markup Language,” which is just a fancy way of saying it’s formatted for machines to read, not humans.

Think of your website as a massive, sprawling city. Your pages are all the different buildings—the houses, the skyscrapers, the little corner shops.

Google’s “crawler” (you’ll hear us call it a “spider”) is like a tourist who just arrived in town with a mission: visit every single building.

Without a map, that tourist has to just wander. They’ll start on Main Street (your homepage) and follow every street (link) they find. Then they’ll follow the streets those streets lead to. They’ll probably find most things. But what if there’s a whole neighborhood (a blog category) that no streets lead to? The tourist might miss it entirely.

An XML sitemap is the official city map. You get to walk up to the tourist (Google) and hand it to them directly. You say, “Here. Here is a clean, organized list of every single building I want you to visit. Here are their exact addresses.”

Suddenly, the tourist’s job isn’t about aimless discovery. It’s about efficient visitation. They can just go right down your list.

You might have also heard of an “HTML sitemap.” That’s a different beast. An HTML sitemap is a page for your users—it’s often a bulleted list of all your main pages, usually linked in your website’s footer. It’s for humans. An XML sitemap is only for search engines.

But Wait, Doesn’t Google Find Pages on Its Own?

This is the most common question I get, and it’s a great one.

Yes, Google’s main way of finding new pages is by “crawling.” It lands on one page (like your homepage) and follows every single link it finds on that page. Then it follows all the links on those pages, and so on. It spiders its way through the web by following this endless trail of links.

So, if your website has perfect “internal linking” (where all your pages connect to each other logically), Google should, in theory, find everything.

But “in theory” is doing a lot of heavy lifting in that sentence.

So, Why Bother With a Sitemap at All?

Relying on crawling alone is a passive strategy. You’re just hoping Google finds everything.

A sitemap is proactive. You are telling Google exactly what matters.

Here’s when a sitemap shifts from a “nice-to-have” to an absolute “must-have”:

  • Your Site is Huge: Google doesn’t have infinite time or resources. It gives every site a “crawl budget” (a set amount of time and resources it will spend crawling you). If you run a massive e-commerce store with 200,000 products, Google’s crawler might get “tired” or just run out of its assigned budget before it finds all of them. A sitemap shows them the most important URLs first.
  • Your Site is Brand New: A new site has very few (or zero) external links pointing to it. Google may not even know you exist. A sitemap is the fastest way to roll out the welcome mat and say, “We’re open for business, come on in and crawl.”
  • You Have “Orphan” Content: These are pages that exist on your site but have few or no internal links pointing to them. Maybe it’s an old landing page you forgot about or a specific resource. Without a sitemap, Google might never find them.
  • Your Site Uses Rich Media: While Google is better at this now, specialized sitemaps for Video, Images, and Google News can provide extra context (like video duration or an image’s subject matter). This helps that content get indexed properly in those specific search verticals.
  • Your Internal Linking is… Messy: Let’s be honest. Is every new blog post you write perfectly linked from 3-4 other relevant posts? Do your new product pages get added to category pages instantly? If your site structure is deep or just a bit chaotic, a sitemap is your critical safety net.

In short, a sitemap plugs all the potential holes that a link-based crawl might miss.

Is a Sitemap a Magic SEO Bullet?

I need to be perfectly clear about this: No.

I’ve seen so many clients get this wrong. They think submitting a sitemap is the “on” switch for SEO. It is not.

A sitemap helps with indexation, not ranking.

This is the most important distinction to understand.

  • Indexation = Google knows your page exists and has added it to its massive database (its “index”). Your page is now eligible to show up in search results.
  • Ranking = Google has decided your page is a high-quality, relevant answer to a user’s query and chooses to show it as result #1 (or #10, or #50).

A sitemap gets your foot in the door. It gets you into Google’s giant library.

It does not determine if your book is placed on the front display or tossed in the dusty back corner. That job belongs to your content quality, your site speed, your user experience, and your backlinks. A sitemap is the foundation, not the skyscraper.

What Kind of Information Does a Sitemap Actually Tell Google?

Okay, let’s get slightly technical for just a second. If you were to crack open a sitemap.xml file, you wouldn’t see a pretty list. You’d see code.

But it’s simple code. Here’s a basic example for a site with two pages:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>https://www.mywebsite.com/</loc>
      <lastmod>2025-10-21</lastmod>
      <changefreq>daily</changefreq>
      <priority>1.0</priority>
   </url>
   <url>
      <loc>https://www.mywebsite.com/about-us</loc>
      <lastmod>2025-10-18</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.5</priority>
   </url>
</urlset>

Let’s break that down into plain English.

  • <urlset>: This is just the wrapper. It says, “Everything inside this file is part of the sitemap.”
  • <url>: This is the container for each individual page.
  • <loc>: This is the only required tag. It stands for “location.” It’s the full, absolute URL of the page.
  • <lastmod>: This stands for “last modified.” It tells Google the date the page at that URL was last changed (not the date the sitemap file itself was edited). This is the most important optional tag. If Google sees a recent date, it might think, “Oh, this content is fresh. I should re-crawl it.”
  • <changefreq>: This is your hint to Google about how often the page changes (e.g., daily, weekly, always, never).
  • <priority>: This is your hint of how important this page is relative to other pages on your site, on a scale of 0.0 to 1.0.
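
If you want to see those tags the way a crawler does, here is a minimal Python sketch that reads a sitemap and pulls the fields out of each entry. It uses only the standard library. The sample XML is inlined and the function name read_sitemap is mine, purely for illustration; this is not how Google itself parses the file.

```python
import xml.etree.ElementTree as ET

# Every tag in a sitemap lives in the sitemap protocol namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Inlined sample so the sketch runs without touching the network.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>https://www.mywebsite.com/</loc>
      <lastmod>2025-10-21</lastmod>
      <changefreq>daily</changefreq>
      <priority>1.0</priority>
   </url>
   <url>
      <loc>https://www.mywebsite.com/about-us</loc>
      <lastmod>2025-10-18</lastmod>
   </url>
</urlset>
"""


def read_sitemap(xml_text):
    """Return one dict per <url> entry; optional tags that are absent become None."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", default=None, namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", default=None, namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", default=None, namespaces=NS),
            "priority": url.findtext("sm:priority", default=None, namespaces=NS),
        })
    return entries


entries = read_sitemap(SAMPLE)
for entry in entries:
    print(entry["loc"], entry["lastmod"], entry["changefreq"], entry["priority"])
```

Notice that the second entry has no changefreq or priority and is still perfectly valid, because only the location tag is required.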

Should I Really Mess with ‘Priority’ and ‘Changefreq’?

My honest advice? Don’t worry about them.

Years ago, SEOs spent time meticulously setting priorities. We’d mark the homepage as 1.0, categories as 0.8, and blog posts as 0.5.

The problem? Everyone abused it. Everyone marked all their pages as 1.0 and daily, thinking it would trick Google into crawling them more.

Because of this, Google has publicly stated that they… mostly ignore these two tags.

The only tag that really matters, besides the required <loc>, is <lastmod>. Why? Because it’s a factual piece of data, not a subjective opinion. Your page was last modified on a specific date. If you keep this date accurate, it builds trust and helps Google crawl more efficiently. Good plugins do this for you automatically.
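
To make that concrete, here is a rough Python sketch of a generator that writes only the two tags worth caring about: the location and an honest last-modified date. The function name build_sitemap and the page list are made up for the example; a real plugin pulls these dates from your CMS database.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs.

    Only <loc> and <lastmod> are written; changefreq and priority are skipped on purpose.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # isoformat() gives YYYY-MM-DD, one of the date formats the protocol accepts.
        ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages; in practice the dates come from real edit timestamps.
xml_text = build_sitemap([
    ("https://www.mywebsite.com/", date(2025, 10, 21)),
    ("https://www.mywebsite.com/about-us", date(2025, 10, 18)),
])
print(xml_text)
```

The important design choice is where the date comes from. Stamping every page with today's date on each rebuild defeats the purpose, since the value stops being a fact.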

How I Learned About Sitemaps the Hard Way (A Cautionary Tale)

I want to tell you a quick story from my early days as a freelance web developer. It’s burned into my memory.

I landed my first big e-commerce client. They were a regional retailer with about 10,000 products, and they were migrating from an ancient, custom-built platform to a shiny new one. The project took months. I was focused on everything: data migration, design, page speed, URL structures, on-page SEO.

The launch day came. We flipped the switch.

It was… okay. Traffic from branded searches and existing customers was fine. But a week later, my client called, and he was not happy. “I’m searching for our new spring line,” he said, “and all I see are last year’s products. Where are the new ones? Why can’t anyone find them?”

My stomach just dropped into my shoes.

I dived into the analytics. He was right. Google wasn’t indexing any of the new product URLs. It was only crawling pages it already knew about from the old site.

In the total chaos of the migration, I had completely, totally forgotten about the XML sitemap.

The new platform had a feature to generate one, but it wasn’t turned on by default. And I had forgotten to submit the new sitemap location to Google Search Console. Google was either hitting an old, static sitemap file from the previous build or, worse, just had no map at all.

It had no “road map” to the 10,000 new product pages.

I spent that night frantically configuring the new sitemap generator, splitting the output into multiple sitemap files listed in one “sitemap index” file (more on that in a second) to keep it manageable, and submitting the new index file to GSC.

The result? Within 48 hours, GSC showed it was “processing.” Within a week, indexation of the new products shot up like a rocket. The client’s panic subsided.

I learned a powerful lesson: You can build the most beautiful store in the world, but if you forget to give Google the keys and the blueprint, you’re just a ghost. Never, ever overlook the technical basics.

Okay, I’m Convinced. How Do I Create an XML Sitemap?

This is the “how to” part of our guide. The good news is, for 99% of you, this will be incredibly easy. You have three main paths.

Method 1: The Easy Way (Your CMS Probably Does It)

If you are using a modern website platform, this is very likely already done for you.

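For WordPress Users: Since version 5.5, WordPress core generates a basic XML sitemap on its own (at /wp-sitemap.xml), but it offers almost no control. Most serious WordPress sites use an SEO plugin, which typically replaces the core sitemap with a more configurable one, so you’re probably covered either way.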

  • Rank Math: This is my personal favorite. If you install Rank Math, it automatically creates a sitemap and updates it. You can find the settings under Rank Math > Sitemap Settings. It creates a sitemap_index.xml file.
  • Yoast SEO: The other major player. During setup, Yoast asks you if you want to enable sitemaps. The setting is under Yoast SEO > Settings > General > Features. Scroll down to “XML sitemaps” and make sure it’s toggled “On.”
  • All in One SEO: Similar to the others, this feature is built-in and usually on by default.

With Rank Math and Yoast, your sitemap URL will almost always be: https://www.yourdomain.com/sitemap_index.xml (All in One SEO typically serves its sitemap at /sitemap.xml instead.)

For Shopify, Wix, or Squarespace Users: You’re done.

Seriously. That’s it. These platforms (known as “hosted” platforms) handle this for you automatically. You cannot turn it off, and you don’t need to configure it.

  • Shopify: Your sitemap is always at https://www.yourstore.com/sitemap.xml. It auto-updates when you add new products.
  • Wix & Squarespace: Both platforms automatically generate and update your sitemap at https://www.yourdomain.com/sitemap.xml.

For these platforms, you can skip to the section on “How to Tell Google About Your Sitemap.”

Method 2: The Pro Way (Using a Crawler Tool)

What if you have a custom-built site? Or what if you just want more control?

The best tool for the job is Screaming Frog SEO Spider. It’s a desktop program (PC, Mac, and Linux) that crawls your website just like Google does.

It’s my go-to tool for all technical SEO audits.

The free version will crawl up to 500 URLs, which is enough for many small sites. Here’s the process:

  1. Open Screaming Frog.
  2. Enter your homepage URL in the top bar and click “Start.”
  3. Let it crawl your entire site. When it’s finished, the progress bar will hit 100%.
  4. In the top menu, go to Sitemaps > XML Sitemap.
  5. A configuration box will pop up. This is where its power lies. You can choose to exclude “noindex” pages, “canonicalised” pages, or PDFs. (My advice: check all of these. You only want your main, indexable pages).
  6. Click “Export” and save the file as sitemap.xml.
  7. Now you just have to upload this file to your website’s root directory (we’ll cover that).

The catch with this method is that the sitemap is static. If you add new pages to your site, you have to remember to re-run the crawl and re-upload the file.

Method 3: The Online Generator (Fast and Free for Small Sites)

If you have a small, static HTML site (under 500 pages) and don’t want to download software, you can use a free online generator.

If you Google “XML sitemap generator,” you’ll find dozens. They all work the same way:

  1. You paste in your homepage URL.
  2. You click “Start” or “Generate.”
  3. The tool crawls your site (from the web, not your desktop).
  4. It presents you with a sitemap.xml file to download.

This is a great, fast, one-time solution. But just like Screaming Frog, it’s a static snapshot. It will not update automatically. This is only for sites that rarely change.

Wait, There’s More Than One Type of Sitemap?

Yes. As your site grows in size and complexity, you’ll find that one single sitemap file isn’t enough. This is where most people get confused, but it’s simple.

What’s a Sitemap Index File?

Search engines impose limits on sitemaps. A single sitemap.xml file cannot contain more than 50,000 URLs and cannot be larger than 50MB (when uncompressed).

My 10,000-product client was fine. But what about that 200,000-product e-commerce store?

The solution is a Sitemap Index.

A sitemap index is just a “sitemap of sitemaps.” It’s a simple file that doesn’t list any page URLs. Instead, it lists the locations of your other sitemaps.

This is what WordPress plugins like Rank Math and Yoast create by default. If you go to yourdomain.com/sitemap_index.xml, you’ll see something like this:

  • post-sitemap.xml (contains all your blog posts)
  • page-sitemap.xml (contains all your pages)
  • product-sitemap.xml (contains all your products)
  • category-sitemap.xml (contains all your category pages)

This is fantastic for organization and for large sites. When you submit to Google, you only submit the one index file URL (sitemap_index.xml). Google will see it and then automatically find all the “child” sitemaps listed within it.
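
Here is roughly what the splitting logic looks like, as a Python sketch using the 200,000-product store as the example. The child file names (sitemap-1.xml and so on) and the product URLs are hypothetical; plugins pick their own naming, and a real generator would also watch the 50MB size limit, which this sketch does not.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemap protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_index(base_url, file_count):
    """Return sitemap index XML that points at sitemap-1.xml through sitemap-N.xml."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n in range(1, file_count + 1):
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = f"{base_url}/sitemap-{n}.xml"
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical catalogue of 200,000 product URLs.
urls = [f"https://www.yourdomain.com/product/{i}" for i in range(200_000)]
batches = list(chunk(urls))
index_xml = build_index("https://www.yourdomain.com", len(batches))

print(len(batches))  # 4 child sitemaps of 50,000 URLs each
print(index_xml)
```

Each batch would be written out as its own ordinary sitemap file, and only the index goes to Google.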

Do I Need a Video Sitemap?

A video sitemap is a specialized sitemap that gives Google extra information about the videos embedded on your site. It can include the video’s title, a short description, the thumbnail URL, and the duration.

Should you use one? If videos are a core part of your business, yes. It can help your videos get a “rich snippet” in regular search results and show up more prominently in the “Videos” tab of Google search.

Most WordPress video or SEO plugins can generate this for you.

What About an Image Sitemap?

This is the same concept, but for images. It lists the URLs of the important images on your site. Older guides also mention caption and title tags, but Google has since deprecated those, so the image URL is what counts.

Should you use one? My opinion: For most sites, it’s not necessary.

Google is extremely good at finding images just by crawling your pages (as long as you’re using proper <img> tags). The most important things for image SEO are:

  1. A descriptive filename (e.g., blue-nike-running-shoe.jpg, not IMG_1234.jpg).
  2. Descriptive alt text (for accessibility and SEO).

SEO plugins like Rank Math do have an option to “Include Images in Sitemap.” This is a good, low-effort setting to turn on. It essentially bundles the image information inside your regular page sitemap, which is more efficient than creating a separate file.
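
For the curious, this is approximately what that “bundling” produces. The Python sketch below nests image entries inside each page entry using Google's image sitemap extension namespace. The function name and the example URLs are my own placeholders, and I can't promise any particular plugin emits exactly this markup.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """pages: dict mapping a page URL to the list of image URLs shown on that page."""
    # Both namespaces are declared once on the root element.
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:image": IMAGE_NS})
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, "image:image")
            ET.SubElement(image, "image:loc").text = image_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical page with one product photo.
image_xml = build_image_sitemap({
    "https://www.yourdomain.com/shoes": [
        "https://www.yourdomain.com/img/blue-nike-running-shoe.jpg",
    ],
})
print(image_xml)
```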

I Have a Sitemap File. Now What? (Best Practices)

Okay, you’ve got your sitemap.xml file (or your sitemap_index.xml URL). You’re not done yet. You have to put it somewhere and tell Google about it.

Where Do I Even Put the File?

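Your sitemap file should be placed in the root directory of your website. (Under the sitemap protocol, a sitemap can generally only list URLs at or below its own folder, so the root is the one spot that covers the whole site.)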

This just means it should be accessible at https://www.yourdomain.com/sitemap.xml.

If you’re on a CMS like WordPress or Shopify, this is done for you. You don’t need to do anything.

If you used a tool like Screaming Frog to create the file, you will need to use an FTP client (like FileZilla) or your web host’s “File Manager” to upload that sitemap.xml file to the main folder of your site (often called public_html or www).

How to Tell Google About Your Sitemap (The Most Important Step)

There are two ways to do this. You should do both.

Method A: Google Search Console (The “Must-Do”)

Google Search Console (GSC) is the free diagnostic dashboard for your website. If you haven’t set it up, stop reading this and go do it right now.

Once you’re in GSC:

  1. Select your website property in the top left.
  2. On the left-hand menu, under the “Indexing” section, click on “Sitemaps”.
  3. You’ll see a bar at the top that says “Add a new sitemap.”
  4. Your domain will already be filled in. You just need to type the rest of the URL.
    • sitemap_index.xml (if you’re on WordPress)
    • sitemap.xml (if you’re on Shopify/Wix or used a generator)
  5. Click “Submit”.

That’s it. GSC will say “Submitted.” It may take a few hours or even a few days for it to say “Success” and show the number of URLs it discovered.

Method B: Your robots.txt File (The “Belt-and-Suspenders”)

Your robots.txt file is another text file in your root directory. Its main job is to tell crawlers which pages not to visit.

But you can also use it to tell all search engines (Google, Bing, DuckDuckGo) where your sitemap is.

Just open your robots.txt file (you can often edit this with your SEO plugin) and add this line, usually at the very top or bottom:

Sitemap: https://www.yourdomain.com/sitemap.xml

(Obviously, replace the URL with your own).

This is a great fallback and ensures all crawlers can find your map, even if you haven’t submitted it to their specific “webmaster tools.”
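
If you want to double-check that a crawler can actually read that line, Python's standard library ships a robots.txt parser. This is a small sketch with a made-up robots.txt inlined as a string, assuming Python 3.8 or newer (that's when the site_maps() method was added).

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /checkout

Sitemap: https://www.yourdomain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)

# Sanity check: the sitemap itself must not be blocked by a Disallow rule.
sitemap_allowed = parser.can_fetch("*", "https://www.yourdomain.com/sitemap.xml")
print(sitemap_allowed)
```

To test your live file instead, you could call set_url() with your real robots.txt address and then read(), which fetches it over the network.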

What Pages Shouldn’t Be in My Sitemap?

This is where we get into advanced, high-impact SEO. A common mistake is to just include every single URL on the site.

This is bad. It’s like giving Google a map where half the locations are private homes, demolition sites, or dead-end alleys. It wastes Google’s time (their “crawl budget”) and sends mixed signals.

The Golden Rule: Your sitemap should only contain your high-quality, 200-status-code, canonical, indexable pages.

It is a list of your best content.

Let’s Get Specific: What Do I Exclude?

Here is your exclusion checklist. Your sitemap should never contain:

  • No-Indexed Pages: This is the #1 mistake. If you have a page with a “noindex” tag (telling Google not to index it), it must not be in your sitemap. This is a powerful mixed signal. You’re saying “Here’s my map, visit this page!” and “Hey, don’t index this page!” at the same time.
  • Non-Canonical URLs: If Page A has a rel="canonical" tag pointing to Page B (telling Google “Page B is the real version”), Page A must not be in the sitemap. Only Page B should be.
  • Redirects (301s, 302s): Why would you put a URL in your sitemap that just sends the crawler somewhere else? Don’t include the old URL. Make sure the new (destination) URL is included.
  • Error Pages (404s, 500s): Never. This is just a list of broken links.
  • Utility & Private Pages: Any page that is for users, not search engines.
    • Login pages
    • “My Account” dashboards
    • Shopping cart
    • Checkout pages
    • “Thank you for subscribing” pages
    • Password reset pages
  • Filtered/Faceted Navigation Pages: This is a huge one for e-commerce. If you have a category page for /shoes, that’s great. But when a user filters by ?color=red&size=10, it creates a new URL. You do not want these parameter-based URLs in your sitemap. It creates a massive amount of low-quality, duplicate content.
  • Internal Search Results: The pages your own site generates when a user searches for “blue shoes.”

Good SEO plugins (like Rank Math) and good generators (like Screaming Frog) will exclude most of these for you by default. But it’s always your job to double-check.
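
If you export crawl data (from Screaming Frog or anything else), the checklist above boils down to a handful of yes/no checks. Here is a hedged Python sketch. The Page record, the field names, and the list of utility path prefixes are all assumptions for the example, so adapt them to whatever your crawl export actually contains.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Page:
    url: str
    status: int      # HTTP status code seen during the crawl
    noindex: bool    # True if the page carries a noindex directive
    canonical: str   # URL named in rel="canonical" (the page's own URL if self-canonical)


# Hypothetical path prefixes for utility and internal-search pages.
UTILITY_PREFIXES = ("/login", "/account", "/cart", "/checkout", "/search")


def belongs_in_sitemap(page):
    """Apply the exclusion checklist: keep only 200, indexable, canonical, clean URLs."""
    parts = urlsplit(page.url)
    if page.status != 200:           # redirects and error pages
        return False
    if page.noindex:                 # noindex plus sitemap is a mixed signal
        return False
    if page.canonical != page.url:   # non-canonical duplicate
        return False
    if parts.query:                  # filtered or faceted parameter URL
        return False
    if parts.path.startswith(UTILITY_PREFIXES):
        return False
    return True


site = "https://www.yourdomain.com"
pages = [
    Page(f"{site}/shoes", 200, False, f"{site}/shoes"),
    Page(f"{site}/shoes?color=red&size=10", 200, False, f"{site}/shoes"),
    Page(f"{site}/old-page", 301, False, f"{site}/old-page"),
    Page(f"{site}/thank-you", 200, True, f"{site}/thank-you"),
    Page(f"{site}/cart", 200, False, f"{site}/cart"),
]
keep = [p.url for p in pages if belongs_in_sitemap(p)]
print(keep)
```

Only the plain /shoes category page survives, which is exactly the point.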

How Often Should I Update My Sitemap?

The best answer is: dynamically.

  • If you use a CMS plugin (Yoast, Rank Math): You never have to think about this. The moment you publish a new blog post, the plugin adds it to the sitemap file. The moment you delete a post, it’s removed. It’s all done for you.
  • If you used a static generator (Screaming Frog): You must update it manually. How often? As often as your site changes. If you add new pages once a month, you should re-run your crawl and re-upload your sitemap once a month.

You do not need to re-submit the sitemap to GSC every time it changes. Google will re-crawl your sitemap URL periodically on its own.

Help! Google Search Console Has an Error!

Don’t panic. GSC sitemap reports are one of the most useful diagnostic tools you have.

GSC Says “Couldn’t Fetch”

This is the most common error. It’s almost always a simple problem:

  1. You had a typo. You submitted sitemap.xml but your file is actually sitemap_index.xml.
  2. Your robots.txt is blocking it. You might have a rule like Disallow: /*.xml, which accidentally blocks your sitemap.
  3. Your site was down. Google tried to grab the file, and your server timed out.
  4. The file doesn’t exist. You submitted the URL, but you never actually uploaded the file.

The fix: Check those four things, fix the issue, and you can just wait. Google will try again.
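
Cause number two is the sneaky one, because wildcard rules are easy to misread. Below is a simplified Python sketch of Google-style pattern matching (an asterisk matches any run of characters, a trailing dollar sign anchors the end of the path). It is my own approximation for checking a single Disallow pattern and ignores Allow rules and rule precedence, so treat it as a quick sanity check rather than a full robots.txt engine.

```python
import re


def rule_blocks(pattern, path):
    """Return True if a single Disallow pattern matches the given URL path.

    Simplified matching: '*' is any run of characters, a trailing '$' anchors
    the end, and anything else is a prefix match.
    """
    if not pattern:
        return False  # an empty Disallow blocks nothing
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where the '*' was.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


print(rule_blocks("/*.xml", "/sitemap.xml"))   # the accidental block described above
print(rule_blocks("/*.xml", "/blog/my-post"))  # ordinary pages are unaffected
print(rule_blocks("/cart", "/sitemap.xml"))    # a normal prefix rule leaves the sitemap alone
```

I wrote the matcher by hand because, as far as I know, Python's built-in urllib.robotparser only does plain prefix matching and would not expand the wildcard.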

GSC Says “Sitemap has Errors”

This is even better. It means Google fetched the file but didn’t like what it found.

Click on the error, and GSC will tell you exactly what’s wrong. The most common errors are:

  • “URLs blocked by robots.txt”
  • “Submitted URL marked ‘noindex’”
  • “Submitted URL seems to be a 404”

This is not a failure. This is a free to-do list. Google is telling you exactly how to clean up your sitemap. Go into your SEO plugin or your sitemap file, remove those bad URLs, and your sitemap will be healthier for it.

My Sitemap Was Submitted, But My Pages Still Aren’t Indexed!

This is the big one. This brings us full circle.

Remember: A sitemap is a request for crawling, not a guarantee of indexing.

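If your sitemap is submitted, and GSC says “Discovered – currently not indexed” or “Crawled – currently not indexed,” the sitemap has done its job. It showed Google the page.

With “Discovered,” Google knows the URL exists but hasn’t crawled it yet, often because it is rationing how much it crawls your site. With “Crawled,” Google looked at the page and decided, for whatever reason, that it was not worthy of being in the main search index.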

This is no longer a sitemap problem. It’s a content quality problem.

This is often caused by:

  • Thin Content: The page has almost no unique text.
  • Duplicate Content: The text on this page is an exact (or very-near) copy of another page on your site or another site.
  • Poor Site Architecture: The page is so many clicks deep from the homepage that Google considers it unimportant.
  • It’s a New Site: Sometimes, it just takes time for Google to “trust” your site enough to index all its pages.

As Stanford University’s own web services guide puts it, a sitemap helps search engines find your content. It doesn’t, however, improve your actual ranking in search results. The quality of your content is what handles that.

Is an XML Sitemap Still Worth the Effort?

Yes. One hundred percent.

An XML sitemap is your direct, formal line of communication with Google and other search engines.

It makes sure Google has been told about all your important content, which is the critical first step to getting any search traffic at all. It speeds up the discovery of new pages, which is vital for news sites and e-commerce stores. And for large, complex sites, it’s the most reliable way to help your entire inventory get seen.

Best of all, with modern tools, it’s something you can “set and forget.”

You build the house. You write the content. You design the experience. But don’t forget to hand Google the blueprint. Don’t make them guess where the rooms are.

FAQ

What is an XML sitemap and why is it important for my website?

An XML sitemap is a specially formatted text file that lists all important pages of your website, designed for search engine crawlers like Googlebot. It is crucial because it acts as a road map, ensuring search engines can discover, crawl, and index all of your valuable content efficiently, especially on large or complex sites.

How does a sitemap help with SEO, and does it guarantee higher rankings?

A sitemap primarily assists with indexation by informing search engines about all the pages that exist on your website, aiding in their discovery and inclusion in search results. However, it does not guarantee higher rankings; ranking depends on content quality, site speed, backlinks, and user experience.

What are the essential elements included in an XML sitemap?

An XML sitemap includes tags such as <loc> for the URL of each page, <lastmod> for the last modified date, <changefreq> indicating how often the page updates, and <priority> to suggest the importance of the page relative to others. The only required element is <loc>.

How can I create and submit an XML sitemap if I use a CMS like WordPress or Shopify?

Most modern CMS platforms like WordPress (with plugins such as Yoast, Rank Math, or All in One SEO) generate sitemaps automatically. Shopify and Wix also create sitemaps for you, which are accessible at standard URLs like ‘yourdomain.com/sitemap.xml.’ You then submit this URL to Google Search Console or include it in your robots.txt file.

Posted in Technical SEO

About Author: Jurica Šinko

jurica.lol3@gmail.com

Hi, I'm Jurica Šinko, founder of Rank Your Domain. With over 15 years in SEO, I know that On-Page & Content strategy is the heart of digital growth. It's not just about keywords; it's about building a foundation that search engines trust and creating content that genuinely connects with your audience. My goal is to be your partner, using my experience to drive high-quality traffic and turn your clicks into loyal customers.
