<p>One important configuration option for the sitemap generator is the date format. The <a href="http://www.w3.org/TR/NOTE-datetime">W3C datetime standard</a> allows you to choose the precision of your datetime (anything from just the year, like "1997", to a fraction of a second, like "1997-07-16T19:20:30.45+01:00"). If you don't specify a precision, we'll try to guess which one you want, and we'll use the default timezone of the local machine, which might not be what you prefer.</p>
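<pre name="code" class="java">
// Use DAY pattern (2009-02-07), Greenwich Mean Time timezone
W3CDateFormat dateFormat = new W3CDateFormat(Pattern.DAY);
dateFormat.setTimeZone(TimeZone.getTimeZone("GMT"));
WebSitemapGenerator wsg = WebSitemapGenerator.builder("http://www.example.com", myDir)
    .dateFormat(dateFormat).build(); // actually use the configured dateFormat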
wsg.addUrl("http://www.example.com/index.html");
wsg.write();</pre>
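<p>To see why both the precision and the timezone matter, here is a minimal standalone sketch. It uses the JDK's java.time classes as a stand-in, not SitemapGen4j's W3CDateFormat, and the class name W3CPrecisionSketch and its format helper are hypothetical names chosen for illustration. It formats one instant at several W3C precisions, then shows that the day-level precision can produce a different date depending on the timezone.</p>

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration class; not part of SitemapGen4j.
class W3CPrecisionSketch {

    // Formats the given instant with the given pattern, as seen from the given zone.
    static String format(ZonedDateTime instant, String pattern, String zone) {
        return instant.withZoneSameInstant(ZoneId.of(zone))
                .format(DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        // The example instant from the W3C note: 1997-07-16T19:20:30.45+01:00
        ZonedDateTime t = ZonedDateTime.of(1997, 7, 16, 19, 20, 30, 450_000_000,
                ZoneId.of("+01:00"));

        System.out.println(format(t, "yyyy", "+01:00"));                        // 1997
        System.out.println(format(t, "yyyy-MM-dd", "+01:00"));                  // 1997-07-16
        System.out.println(format(t, "yyyy-MM-dd'T'HH:mm:ss.SSxxx", "+01:00")); // 1997-07-16T19:20:30.45+01:00

        // Half past midnight at +01:00 is still the previous day in UTC,
        // so a day-level date changes with the timezone.
        ZonedDateTime justAfterMidnight = ZonedDateTime.of(2009, 2, 7, 0, 30, 0, 0,
                ZoneId.of("+01:00"));
        System.out.println(format(justAfterMidnight, "yyyy-MM-dd", "+01:00")); // 2009-02-07
        System.out.println(format(justAfterMidnight, "yyyy-MM-dd", "UTC"));    // 2009-02-06
    }
}
```

<p>The last two lines are the reason to set the timezone explicitly when you use a day-level pattern: the same lastMod instant can land on two different calendar days.</p>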
<h2>Lots of URLs: a sitemap index file</h2>
<blockquote><code>SitemapIndexGenerator sig = new SitemapIndexGenerator("http://www.example.com", new File("sitemap_index.xml"));<br>
for (int i = 0; i < 5; i++) sig.addUrl("http://www.example.com/sitemap"+i+".html", new Date(i));<br>
sig.write();<br>
</code></blockquote>
<p>One sitemap can contain a maximum of 50,000 URLs. (Some sitemaps, like Google News sitemaps, can contain only 1,000 URLs.) If you need to put more URLs than that in a sitemap, you'll have to use a sitemap index file. Fortunately, WebSitemapGenerator can manage the whole thing for you.</p>
<pre name="code" class="java">WebSitemapGenerator wsg = new WebSitemapGenerator("http://www.example.com", myDir);
for (int i = 0; i < 60000; i++) wsg.addUrl("http://www.example.com/doc"+i+".html");
wsg.write();
wsg.writeSitemapsWithIndex(); // generate the sitemap_index.xml
</pre>
<p>That will generate two sitemaps for 60K URLs: sitemap1.xml (with 50K urls) and sitemap2.xml (with the remaining 10K), and then generate a sitemap_index.xml file describing the two.</p>
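<p>The arithmetic behind that split can be sketched in a few lines of plain Java. This is only an illustration of the chunking described above, not WebSitemapGenerator's actual implementation; the class name SitemapSplitSketch and the split method are hypothetical.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of SitemapGen4j.
class SitemapSplitSketch {

    // Limit for regular web sitemaps, as stated above.
    static final int MAX_URLS_PER_SITEMAP = 50_000;

    // Returns how many URLs end up in each sitemap file, in order.
    static List<Integer> split(int totalUrls, int maxPerSitemap) {
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = totalUrls; remaining > 0; remaining -= maxPerSitemap) {
            sizes.add(Math.min(remaining, maxPerSitemap));
        }
        return sizes;
    }

    public static void main(String[] args) {
        List<Integer> sizes = split(60_000, MAX_URLS_PER_SITEMAP);
        for (int i = 0; i < sizes.size(); i++) {
            System.out.println("sitemap" + (i + 1) + ".xml: " + sizes.get(i) + " URLs");
        }
        // Prints:
        // sitemap1.xml: 50000 URLs
        // sitemap2.xml: 10000 URLs
    }
}
```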
<p>It's also possible to carefully organize your sub-sitemaps. For example, it's recommended to group URLs with the same changeFreq together (have one sitemap for changeFreq "daily" and another for changeFreq "yearly"), so you can modify the lastMod of the daily sitemap without modifying the lastMod of the yearly sitemap. To do that, just construct your sitemaps one at a time using the WebSitemapGenerator, then use the SitemapIndexGenerator to create a single index for all of them.</p>
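<pre name="code" class="java">WebSitemapGenerator wsg;
// generate foo sitemap (fileNamePrefix sets the output file name)
wsg = WebSitemapGenerator.builder("http://www.example.com", myDir).fileNamePrefix("foo").build();
for (int i = 0; i < 5; i++) wsg.addUrl("http://www.example.com/foo"+i+".html");
wsg.write();
// generate bar sitemap
wsg = WebSitemapGenerator.builder("http://www.example.com", myDir).fileNamePrefix("bar").build();
for (int i = 0; i < 5; i++) wsg.addUrl("http://www.example.com/bar"+i+".html");
wsg.write();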
// generate sitemap index for foo + bar
SitemapIndexGenerator sig = new SitemapIndexGenerator("http://www.example.com", myFile);
sig.addUrl("http://www.example.com/foo.xml");
sig.addUrl("http://www.example.com/bar.xml");
sig.write();</pre>
<p>You could also use the SitemapIndexGenerator to incorporate sitemaps generated by other tools. For example, you might use Google's official Python sitemap generator to generate some sitemaps, and use WebSitemapGenerator to generate some sitemaps, and use SitemapIndexGenerator to make an index of all of them.</p>
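<p>Because an index is just a list of sitemap locations, it does not matter which tool produced each sitemap. The sketch below builds such an index by hand to show the shape of the file, using the sitemapindex layout from the sitemaps.org protocol. It is an illustration only, not what SitemapIndexGenerator does internally (the real generator can also write fields such as lastmod), and the class name SitemapIndexSketch and its methods are hypothetical.</p>

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of SitemapGen4j.
class SitemapIndexSketch {

    // Builds a minimal sitemap index listing the given sitemap URLs.
    static String buildIndex(List<String> sitemapUrls) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<sitemapindex xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : sitemapUrls) {
            sb.append("  <sitemap><loc>").append(escape(url)).append("</loc></sitemap>\n");
        }
        sb.append("</sitemapindex>\n");
        return sb.toString();
    }

    // Escapes the characters that are not allowed raw in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // One sitemap from each tool; both URLs are made up for the example.
        System.out.print(buildIndex(Arrays.asList(
                "http://www.example.com/foo.xml",
                "http://www.example.com/python-generated.xml")));
    }
}
```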
<h2>Validate your sitemaps</h2>
<p>SitemapGen4j can also validate your sitemaps using the official XML Schema Definition (XSD). If you used SitemapGen4j to make the sitemaps, you shouldn't need to do this unless there's a bug in our code. But you can use it to validate sitemaps generated by other tools, and it provides an extra level of safety.</p>
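<p>For readers unfamiliar with XSD validation, here is a minimal sketch of the general technique using the JDK's javax.xml.validation API. It does not show SitemapValidator's internals. The schema in it is a deliberately tiny stand-in that only requires each url to contain a loc, not the official sitemap XSD, and the class name XsdValidationSketch and its isValid method are hypothetical.</p>

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical illustration class; not part of SitemapGen4j.
class XsdValidationSketch {

    static final String NS = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Toy stand-in schema (NOT the official sitemap XSD):
    // a urlset holding one or more url elements, each with exactly one loc.
    static final String TOY_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
        + "<xs:element name='urlset'><xs:complexType><xs:sequence>"
        + "<xs:element name='url' maxOccurs='unbounded'><xs:complexType><xs:sequence>"
        + "<xs:element name='loc' type='xs:anyURI'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true if the XML document conforms to the toy schema.
    static boolean isValid(String xml) {
        Schema schema;
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(TOY_XSD)));
        } catch (SAXException e) {
            // A broken schema is a programming error, not a validation result.
            throw new IllegalStateException("toy schema failed to compile", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false; // malformed XML or a schema violation
        }
    }

    public static void main(String[] args) {
        String good = "<urlset xmlns='" + NS + "'><url><loc>http://www.example.com/</loc></url></urlset>";
        String bad  = "<urlset xmlns='" + NS + "'><url><lastmod>2009-02-07</lastmod></url></urlset>";
        System.out.println(isValid(good)); // true
        System.out.println(isValid(bad));  // false: url has no loc
    }
}
```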
<p>It's easy to configure the WebSitemapGenerator to automatically validate your sitemaps right after you write them (but this does slow things down, naturally).</p>
<pre name="code" class="java">WebSitemapGenerator wsg = WebSitemapGenerator.builder("http://www.example.com", new File("."))
    .autoValidate(true).build(); // validate the sitemap after writing
wsg.addUrl("http://www.example.com/index.html");
wsg.write();</pre>
<p>You can also use the SitemapValidator directly to validate sitemaps. It has two methods: validateWebSitemap(File f) and validateSitemapIndex(File f).</p>
<h2>Google-specific sitemaps</h2>
<p>Google can understand a wide variety of custom sitemap formats that it defined itself, including Mobile sitemaps, Geo
sitemaps, Code sitemaps (for Google Code search), Google News sitemaps, and Video sitemaps. SitemapGen4j can
generate any or all of these types of sitemaps.</p>
<p>To generate a special type of sitemap, just use GoogleMobileSitemapGenerator, GoogleGeoSitemapGenerator,
GoogleCodeSitemapGenerator, GoogleNewsSitemapGenerator, or GoogleVideoSitemapGenerator
instead of WebSitemapGenerator.</p>
<p>You can't mix and match regular URLs with Google-specific sitemaps, so you'll also have to use a
GoogleMobileSitemapUrl, GoogleGeoSitemapUrl, GoogleCodeSitemapUrl, GoogleNewsSitemapUrl, or GoogleVideoSitemapUrl
instead of a WebSitemapUrl. Each of them has unique configurable options not available to regular web URLs.</p>