com.bssp.javasitemapper
Class Sitemap
java.lang.Object
com.bssp.javasitemapper.Sitemap
public class Sitemap
- extends java.lang.Object
Class Sitemap acts as the main container for the sitemap.xml file. It sets
up the "urlset" element required by every sitemap.xml file and holds the
URL containers and their encapsulated attributes.
The following example shows typical use of the sitemap API:
1. Sitemap sitemap = new Sitemap();
2.
3. SitemapURL url1 = new SitemapURL(
4. "http://www.example.com/");
5. sitemap.addURL(url1);
6.
7. SitemapURL url2 = new SitemapURL(
8. "http://www.example.com/article/","2008-07-29T17:12:02-07:00",
9. SitemapURL.CHANGE_FREQUENCY_DAILY,
10. SitemapURL.PRIORITY_ONE_POINT_ZERO);
11. sitemap.addURL(url2);
12.
13. SitemapURL url3 = new SitemapURL(
14. "http://www.example.com/article/this-is-a-sample");
15. url3.setPriority(SitemapURL.PRIORITY_ZERO_POINT_NINE);
16. sitemap.addURL(url3);
17.
18. for(int i=1;i<=2;i++) {
19. //for(int i=1;i<=40000;i++) {
20. SitemapURL url = new SitemapURL(
21. "http://www.example.com/article/" + i);
22. url.setLastModified("2007-04-18");
23. url.setChangeFrequency(SitemapURL.CHANGE_FREQUENCY_MONTHLY);
24. url.setPriority(SitemapURL.PRIORITY_ZERO_POINT_NINE);
25.
26. sitemap.addURL(url);
27. }
28.
29. SitemapOutput sitemapOutput = new SitemapOutput();
30. //sitemapOutput.setSitemapDomain("http://www.example.com");
31. //sitemapOutput.setSitemapURI(
32. // "JavaSource/com/bssp/sitemapcreatortesting/main/");
33. //sitemapOutput.setSitemapFilename("temp.xml");
34. sitemapOutput.output(sitemap);
35.
36. //SitemapOutput sitemapOutput = new SitemapOutput(
37. // "http://www.example.com");
38. //sitemapOutput.setSitemapURI(
39. // "JavaSource/com/bssp/sitemapcreatortesting/main/");
40. //sitemapOutput.setSitemapFilename("temp.xml");
41. //sitemapOutput.output(sitemap);
42.
43. //SitemapOutput sitemapOutput = new SitemapOutput(
44. // "http://www.example.com",
45. // "JavaSource/com/bssp/sitemapcreatortesting/main/");
46. //sitemapOutput.setSitemapFilename("temp.xml");
47. //sitemapOutput.output(sitemap);
48.
49. //SitemapOutput sitemapOutput = new SitemapOutput(
50. // "http://www.example.com",
51. // "JavaSource/com/bssp/sitemapcreatortesting/main/", "temp.xml");
52. //sitemapOutput.output(sitemap);
Line 1 instantiates the Sitemap object, which serves as the container for the
sitemap.
Lines 3-5 demonstrate the steps needed to create a URL. A location is
required for every URL in the "sitemap.xml" file; therefore, a SitemapURL
object cannot be created unless a URL location is passed to its
constructor. To add the URL to the sitemap, simply call Sitemap's addURL()
method, passing in the newly created SitemapURL object.
Lines 7-11 demonstrate use of an overloaded SitemapURL constructor that
allows the developer to set a URL's location, lastModified, frequency, and
priority attributes at instantiation.
Lines 13-16 demonstrate the setting of URL attributes (via setter methods)
without use of the overloaded SitemapURL constructor.
Lines 18-27 demonstrate the typical means by which a developer may populate
a Sitemap object with URLs and their attributes (via some looping scheme).
Lines 29-34 demonstrate the steps needed to create the sitemap.xml file.
This is done by instantiating a SitemapOutput object (line 29) and
calling SitemapOutput's output() method, passing in the previously created
Sitemap container object (line 34). Lines 30-33 demonstrate the setting of
SitemapOutput attributes via setter methods.
Lines 36-52 demonstrate alternative ways of creating the sitemap file.
Lines 36-41 demonstrate use of an overloaded SitemapOutput constructor
that sets a SitemapOutput object's sitemapDomain attribute at instantiation.
Similarly, lines 43-47 demonstrate use of an overloaded SitemapOutput
constructor that sets the sitemapDomain and sitemapURI attributes at
instantiation. Finally, lines 49-52 demonstrate use of an overloaded
SitemapOutput constructor that sets the sitemapDomain, sitemapURI, and
sitemapFilename attributes at instantiation. Attributes not set through a
constructor can be initialized via setter methods (lines 38-40, 46).
(NOTE: a domain you define does not need a trailing file separator, but a
file location you define does, as in the example.)
Lines 9-10, 15, and 23-24 demonstrate use of constants in initializing URL
attributes. Although it is recommended that developers use the constants
built into this package, limited validation has also been implemented in
case developers wish to utilize their own custom values.
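As a side note on custom values: the package's own validation rules are not documented here, so the following is only a minimal sketch of the kind of check the sitemap protocol implies, namely a priority between 0.0 and 1.0 and one of the seven protocol-defined change frequencies. The class CustomValueCheck, its method names, and the use of String arguments are assumptions made for illustration and are not part of this package.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of com.bssp.javasitemapper.
final class CustomValueCheck {
    // The seven change frequencies defined by the sitemap protocol.
    private static final Set<String> FREQUENCIES = new HashSet<>(Arrays.asList(
            "always", "hourly", "daily", "weekly", "monthly", "yearly", "never"));

    /** Returns true if the text parses as a number in the range 0.0 to 1.0. */
    static boolean isValidPriority(String priority) {
        try {
            double value = Double.parseDouble(priority);
            return value >= 0.0 && value <= 1.0;
        } catch (NumberFormatException | NullPointerException e) {
            // Not a number (or null), so it cannot be a valid priority.
            return false;
        }
    }

    /** Returns true if the text is one of the protocol's change frequencies. */
    static boolean isValidChangeFrequency(String frequency) {
        return frequency != null && FREQUENCIES.contains(frequency);
    }
}
```

A caller could run such checks on a custom value before handing it to a setter, falling back to one of the package constants when a check fails.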
The numbered Sitemap example above (lines 1-52) results in a sitemap file
similar to the one below (gzip compression, ".gz", is also provided to
reduce bandwidth requirements):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
</url>
<url>
<loc>http://www.example.com/article/</loc>
<lastmod>2008-07-29T17:12:02-07:00</lastmod>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://www.example.com/article/this-is-a-sample</loc>
<priority>0.9</priority>
</url>
<url>
<loc>http://www.example.com/article/1</loc>
<lastmod>2007-04-18</lastmod>
<changefreq>monthly</changefreq>
<priority>0.9</priority>
</url>
<url>
<loc>http://www.example.com/article/2</loc>
<lastmod>2007-04-18</lastmod>
<changefreq>monthly</changefreq>
<priority>0.9</priority>
</url>
</urlset>
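The gzip step mentioned above can be reproduced with the standard java.util.zip classes. The sketch below compresses sitemap text in memory and reads it back; it does not show how this API performs its own compression, and the class SitemapGzipSketch and its methods are hypothetical names used for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only; not part of com.bssp.javasitemapper.
final class SitemapGzipSketch {
    /** Encodes the XML as UTF-8 and returns it gzip-compressed. */
    static byte[] compress(String xml) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so close before reading the buffer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reverses compress(): inflates the bytes and decodes them as UTF-8. */
    static String decompress(byte[] gz) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Sitemap XML is highly repetitive, so it tends to compress well; the same streams can wrap a FileOutputStream to write a ".gz" file directly.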
Line 19 of the example (commented out) presents the situation where 40,000+
URLs are added to a Sitemap object. To accommodate the 10MB sitemap filesize
maximum, each sitemap file holds at most 40,000 URLs. (The sitemap protocol
allows a single sitemap to hold up to 50,000 URLs, but a 40,000-URL limit
keeps the average sitemap filesize at ~7MB and has the added benefit of
keeping the API relatively fast.) When this maximum is exceeded, URLs are
split among multiple sitemap files, with each sitemap file referenced in a
sitemap index file. Each sitemap file is named sitemap#.xml, where # is an
index number from 1 to 1000 (a sitemap index file can only reference a
maximum of 1000 different sitemap files). The sitemap index file is named
sitemap_index.xml by default. When a custom name is defined for a set
requiring multiple sitemaps (lines 33, 40, 46, 49-51), the sitemap index
file will use the custom name, thus extending the custom-naming
functionality to sitemap index files. As with single sitemap creation, gzip
compression (".gz") is also provided for multiple sitemap files to reduce
bandwidth requirements. Sitemap indexing requires no extra programming by
the developer: simply load the URLs into the Sitemap object and the API
will take care of the rest.
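The splitting rule described above can be sketched as follows. This is not the API's internal code: the class SitemapSplitSketch and its methods are hypothetical, and the single-file name "sitemap.xml" is an assumption. Only the 40,000-URL and 1000-file limits and the sitemap#.xml and sitemap_index.xml names come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of com.bssp.javasitemapper.
final class SitemapSplitSketch {
    static final int MAX_URLS_PER_FILE = 40_000;
    static final int MAX_FILES_PER_INDEX = 1_000;

    /** Number of sitemap files needed for the given URL count (rounds up). */
    static int fileCount(int urlCount) {
        return urlCount / MAX_URLS_PER_FILE + (urlCount % MAX_URLS_PER_FILE == 0 ? 0 : 1);
    }

    /** Splits the URLs into consecutive groups of at most MAX_URLS_PER_FILE. */
    static <T> List<List<T>> split(List<T> urls) {
        List<List<T>> groups = new ArrayList<>();
        for (int start = 0; start < urls.size(); start += MAX_URLS_PER_FILE) {
            int end = Math.min(start + MAX_URLS_PER_FILE, urls.size());
            groups.add(new ArrayList<>(urls.subList(start, end)));
        }
        return groups;
    }

    /** Default file names: one sitemap, or numbered sitemaps plus an index. */
    static List<String> fileNames(int urlCount) {
        int files = fileCount(urlCount);
        if (files > MAX_FILES_PER_INDEX) {
            throw new IllegalArgumentException("more than one index file would be needed");
        }
        List<String> names = new ArrayList<>();
        if (files <= 1) {
            names.add("sitemap.xml"); // assumed name for the single-file case
            return names;
        }
        for (int i = 1; i <= files; i++) {
            names.add("sitemap" + i + ".xml");
        }
        names.add("sitemap_index.xml");
        return names;
    }
}
```

With line 19 of the example enabled, the Sitemap object holds 40,003 URLs, so under these assumptions fileNames(40003) yields sitemap1.xml, sitemap2.xml, and sitemap_index.xml, with the second file holding the last three URLs.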
After outputting your file, be sure to specify the sitemap files' locations
in robots.txt using the following directive:
Sitemap: <sitemap_location>
<sitemap_location> should be the complete URL to the sitemap, such as
http://www.example.com/sitemap.xml.
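Writing these robots.txt lines is easy to automate. The sketch below builds one "Sitemap:" line per file from a domain and a list of file names; the class RobotsTxtSketch and its method are hypothetical and not part of this package, which does not generate robots.txt entries as far as this documentation states.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of com.bssp.javasitemapper.
final class RobotsTxtSketch {
    /** Builds one "Sitemap: <url>" line per file name, each ending in a newline. */
    static String sitemapLines(String domain, List<String> fileNames) {
        // Accept the domain with or without a trailing slash.
        String base = domain.endsWith("/") ? domain : domain + "/";
        StringBuilder lines = new StringBuilder();
        for (String name : fileNames) {
            lines.append("Sitemap: ").append(base).append(name).append('\n');
        }
        return lines.toString();
    }
}
```

For the domain http://www.example.com and the single file sitemap.xml, this produces the line "Sitemap: http://www.example.com/sitemap.xml", ready to be appended to robots.txt.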
It is also important to remember that where you put your sitemap file on
your Web server determines which URLs can be included in that sitemap. For
example, a sitemap placed in http://www.example.com/articles/ can only
include URLs in the articles/ directory and below. Thus, it is to your
advantage to place all your sitemaps at the root of your Web server so that
every URL on the site can be listed.
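The location rule above can be checked before URLs are added. The following minimal sketch tests whether a page URL falls under the directory that contains a given sitemap; it expects absolute URLs, and the class SitemapScopeSketch and its method are hypothetical names, not part of this package.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of com.bssp.javasitemapper.
final class SitemapScopeSketch {
    /** Returns true if pageUrl is on the same host and under the sitemap's directory. */
    static boolean isCoveredBy(String sitemapLocation, String pageUrl) {
        URI sitemap = URI.create(sitemapLocation);
        URI page = URI.create(pageUrl);
        // Directory part of the sitemap path, e.g. "/articles/" for "/articles/sitemap.xml".
        String sitemapPath = sitemap.getPath();
        String directory = sitemapPath.substring(0, sitemapPath.lastIndexOf('/') + 1);
        // Treat a bare domain such as "http://www.example.com" as the root path.
        String pagePath = page.getPath().isEmpty() ? "/" : page.getPath();
        return sitemap.getScheme().equalsIgnoreCase(page.getScheme())
                && sitemap.getAuthority().equalsIgnoreCase(page.getAuthority())
                && pagePath.startsWith(directory);
    }
}
```

A sitemap at http://www.example.com/articles/sitemap.xml covers http://www.example.com/articles/2008/a.html but not http://www.example.com/about.html, whereas a sitemap at the server root covers both.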
Note: In its current version, the API only allows for the creation of 1
index file (which implies only 1000 sitemap files, or based on its current
build, only 40,000,000 URLs). We hope to improve functionality in the next
version.
- Version:
- 0.9.9, 2008-11-13
- Author:
- Victor Ngo
- See Also:
- Sitemap, SitemapURL, SitemapOutput
Constructor Summary
Sitemap()
This constructor creates a new Sitemap object that will serve as a
container for building the sitemap file(s).
Method Summary
void addURL(SitemapURL url)
This method adds a URL to the sitemap.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Sitemap
public Sitemap()
- This constructor creates a new Sitemap object that will serve as a
container for building the sitemap file(s).
addURL
public void addURL(SitemapURL url)
- This method adds a URL to the sitemap.
- Parameters:
- url - an object of type SitemapURL containing the attributes of a
URL to be added to the sitemap.