Automatic Splitting
How SitemapHost handles large sitemaps by automatically splitting them
The Problem
Search engines impose strict limits on sitemap files:
- 50,000 URLs maximum per sitemap file
- 50MB maximum uncompressed file size
For sites with more than 50,000 pages, you need to split your sitemap into multiple files and create a sitemap index that references them all. Managing this manually is error-prone and time-consuming.
The Solution
SitemapHost automatically handles splitting for you. When you upload more than 50,000 URLs, we:
- Split your URLs into chunks of up to 50,000 (the last chunk may be smaller)
- Generate individual sitemap files for each chunk
- Create a sitemap index that references all child sitemaps
- Serve the index at your root sitemap URL
You don't need to do anything special to enable splitting. Just upload all your URLs and we handle the rest automatically.
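The four steps above can be sketched in a few lines of Python. This is an illustration only, not SitemapHost's implementation: the function names, the `chunk_size` parameter, and the `sitemap-N.xml` naming (borrowed from the example layout in this page) are assumptions, and `lastmod` is omitted for brevity.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'
MAX_URLS_PER_SITEMAP = 50_000


def split_into_chunks(urls, chunk_size=MAX_URLS_PER_SITEMAP):
    """Return consecutive slices of `urls`, each at most `chunk_size` long."""
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]


def render_child_sitemap(chunk):
    """Serialize one chunk of URL strings as a <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in chunk:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
    return XML_DECLARATION + ET.tostring(urlset, encoding="unicode")


def render_sitemap_index(base_url, chunk_count):
    """Serialize a <sitemapindex> that points at sitemap-1.xml .. sitemap-N.xml."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n in range(1, chunk_count + 1):
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap-{n}.xml"
    return XML_DECLARATION + ET.tostring(index, encoding="unicode")
```

Passing a small `chunk_size` makes the behavior easy to try locally: seven URLs with `chunk_size=3` yield three chunks and an index with three entries.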
How It Works
Before: Your Upload
You upload 150,000 URLs via the API or dashboard:
{
"domain": "sitemap.yoursite.com",
"urls": [
{ "loc": "https://yoursite.com/page-1" },
{ "loc": "https://yoursite.com/page-2" },
// ... 149,998 more URLs
]
}
After: Generated Files
SitemapHost generates the following structure:
sitemap.yoursite.com/
├── sitemap.xml (sitemap index)
├── sitemap-1.xml (URLs 1-50,000)
├── sitemap-2.xml (URLs 50,001-100,000)
└── sitemap-3.xml (URLs 100,001-150,000)
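The arithmetic behind this layout is a ceiling division over the URL count. The helper below is a hypothetical sketch (the name `plan_layout` is not part of any SitemapHost API) that reproduces the file names and URL ranges shown above.

```python
import math


def plan_layout(total_urls, chunk_size=50_000):
    """Return (file name, first URL number, last URL number) for each chunk."""
    files = []
    for n in range(math.ceil(total_urls / chunk_size)):
        first = n * chunk_size + 1
        last = min((n + 1) * chunk_size, total_urls)
        files.append((f"sitemap-{n + 1}.xml", first, last))
    return files


for name, first, last in plan_layout(150_000):
    print(f"{name}: URLs {first:,}-{last:,}")
```

If the total is not a multiple of 50,000, the final file simply holds the remainder.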
Sitemap Index
The main sitemap.xml becomes a sitemap index:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://sitemap.yoursite.com/sitemap-1.xml</loc>
<lastmod>2024-01-15T10:30:00Z</lastmod>
</sitemap>
<sitemap>
<loc>https://sitemap.yoursite.com/sitemap-2.xml</loc>
<lastmod>2024-01-15T10:30:00Z</lastmod>
</sitemap>
<sitemap>
<loc>https://sitemap.yoursite.com/sitemap-3.xml</loc>
<lastmod>2024-01-15T10:30:00Z</lastmod>
</sitemap>
</sitemapindex>
Child Sitemaps
Each child sitemap contains up to 50,000 URLs:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://yoursite.com/page-1</loc>
<lastmod>2024-01-15</lastmod>
</url>
<url>
<loc>https://yoursite.com/page-2</loc>
<lastmod>2024-01-14</lastmod>
</url>
<!-- ... up to 50,000 URLs -->
</urlset>
Submitting to Search Engines
When using automatic splitting, you only need to submit the main sitemap index URL to search engines. They will automatically discover and process all child sitemaps.
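To see what a crawler discovers from the index, you can parse it yourself. The following is a minimal sketch using Python's standard library. The function name is hypothetical, and it assumes you have already fetched the index as bytes.

```python
import xml.etree.ElementTree as ET

# Namespace used by both sitemap indexes and child sitemaps.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def child_sitemap_urls(index_xml: bytes):
    """Return the <loc> of every <sitemap> entry in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", NS)
        if loc.text
    ]
```

Each returned URL is a child sitemap that can be fetched and parsed the same way, looking for `sm:url/sm:loc` instead.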
robots.txt: Add only the main sitemap URL to your robots.txt:
User-agent: *
Allow: /
Sitemap: https://sitemap.yoursite.com/sitemap.xml
Incremental Updates
When you update your sitemap, we intelligently handle the changes:
- Adding URLs - New URLs are added to existing chunks or new chunks are created
- Removing URLs - Chunks are rebalanced to maintain optimal distribution
- Updating URLs - Only affected chunks are regenerated
The lastmod timestamp in the sitemap index is updated whenever any child sitemap changes, signaling to search engines that they should re-crawl.
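This page does not describe how affected chunks are detected. One plausible approach, shown here purely as an assumption, is to fingerprint each chunk and compare fingerprints between uploads. The function names and the choice of SHA-256 are illustrative.

```python
import hashlib


def chunk_fingerprints(urls, chunk_size=50_000):
    """Return one SHA-256 hex digest per consecutive chunk of URL strings."""
    fingerprints = []
    for start in range(0, len(urls), chunk_size):
        digest = hashlib.sha256()
        for loc in urls[start:start + chunk_size]:
            digest.update(loc.encode("utf-8"))
            digest.update(b"\n")  # separator so adjacent URLs cannot merge
        fingerprints.append(digest.hexdigest())
    return fingerprints


def changed_chunks(old_urls, new_urls, chunk_size=50_000):
    """Return 1-based numbers of chunks in the new upload that are new or differ."""
    old = chunk_fingerprints(old_urls, chunk_size)
    new = chunk_fingerprints(new_urls, chunk_size)
    return [n + 1 for n, fp in enumerate(new) if n >= len(old) or old[n] != fp]
```

With purely positional chunking like this, removing a URL near the start shifts every later chunk, so all of them would count as changed. A real system would likely need a more stable assignment of URLs to chunks to regenerate only what is necessary.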
Limits
| Limit | Value | Notes |
|---|---|---|
| URLs per chunk | 50,000 | Google/Bing limit |
| Max chunks | 500 | 25 million URLs total |
| File size per chunk | < 50MB | Automatically managed |
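As a quick pre-upload check, the limits in this table can be expressed in code. This is a sketch under stated assumptions: the function names are hypothetical, and 50MB is taken to mean 50 × 1024 × 1024 bytes.

```python
MAX_URLS_PER_CHUNK = 50_000
MAX_CHUNKS = 500
MAX_CHUNK_BYTES = 50 * 1024 * 1024  # assumption: 50MB read as 52,428,800 bytes


def chunks_needed(total_urls):
    """Ceiling division: how many chunks a given URL count requires."""
    return -(-total_urls // MAX_URLS_PER_CHUNK)


def check_upload(total_urls):
    """Return the chunk count, or raise if it exceeds the 500-chunk limit."""
    needed = chunks_needed(total_urls)
    if needed > MAX_CHUNKS:
        raise ValueError(
            f"{total_urls:,} URLs need {needed} chunks; the limit is {MAX_CHUNKS}"
        )
    return needed


def chunk_size_ok(xml_bytes):
    """True if a serialized child sitemap is under the per-file size limit."""
    return len(xml_bytes) < MAX_CHUNK_BYTES
```

For example, `check_upload(150_000)` returns 3, while any count above 25,000,000 raises `ValueError`.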
Next Steps
- Search Engine Notifications - Notify search engines when sitemaps update
- Upload API - Learn how to upload URLs programmatically