VUFind and sitemaps

In an effort to improve SEO (search engine optimization), I have done my best to implement sitemaps for the “Catholic Portal’s” VUFind installation.

Sitemaps are XML files listing all the individual files/resources of a website. The intention and structure of these files are documented at Sitemaps.org. By exposing a site’s content in this way, Internet robots/spiders can slurp up the URLs listed in the sitemap files, go directly to the resources without crawling, and index the content found there. In short, sitemaps make it easier for Internet indexers to do their job.
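To make the structure concrete, here is a minimal Python sketch of what a sitemap file contains. It is not how VUFind builds its files; the function name build_sitemap and the example.org URLs are made up for illustration. The only parts taken from the Sitemaps.org protocol are the urlset, url, loc, and changefreq elements and the namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, changefreq="monthly"):
    """Return a sitemap document (as a string) listing the given URLs."""
    # Serialize with a default namespace instead of an ns0: prefix
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical record URLs, for illustration only
    records = [
        "http://www.example.org/Record/1",
        "http://www.example.org/Record/2",
    ]
    print(build_sitemap(records))
```

An indexer reading such a file needs nothing more than the loc values; everything else is a hint.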

Implementing sitemaps in VUFind is relatively trivial: edit a configuration file (web/conf/sitemap.ini) and run the sitemap file generator (php util/sitemap.php). See the VUFind documentation for more detail. Here at Portal Central I configured sitemap.ini with the following values:

  • frequency = monthly
  • countPerPage = 10000
  • fileName = sitemap
  • fileLocation = /shared/catholic_portal/data/data/sitemaps/
  • baseSitemapUrl = http://www.catholicresearch.net/sitemaps
  • baseSitemapFileName = baseSitemap
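As a rough illustration of how countPerPage, fileName, and baseSitemapUrl interact, the following Python sketch works out how many sitemap files a given number of records would need and what their URLs might look like. The function plan_sitemap_files and the "-N.xml" naming pattern are my assumptions for the sake of the example; check the files the generator actually writes before relying on the names.

```python
import math


def plan_sitemap_files(record_count, count_per_page=10000,
                       file_name="sitemap",
                       base_url="http://www.catholicresearch.net/sitemaps"):
    """List the sitemap URLs needed to cover record_count records.

    The "<file_name>-<n>.xml" pattern is an assumed naming scheme,
    not something confirmed from the VUFind source.
    """
    pages = math.ceil(record_count / count_per_page)
    return [f"{base_url}/{file_name}-{n}.xml" for n in range(1, pages + 1)]


if __name__ == "__main__":
    # A hypothetical index of 25,000 records needs three files
    for url in plan_sitemap_files(25000):
        print(url)
```

An index file (named by baseSitemapFileName) would then point to each of these files, and that index is the one location a search engine needs to be told about.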

The only configuration which differs from the norm is the value of baseSitemapUrl. Instead of putting the sitemap files in the root of the VUFind filesystem, I am having them saved in a directory called sitemaps. While such a thing is discouraged by the folks at Sitemaps.org, it keeps my filesystem clean, and more importantly, it makes it easier for me to migrate from one version of VUFind to another. Besides, the Google Webmaster Tools ask for the specific location of one’s sitemap files, and I don’t really need them to be discoverable by too many other indexers. Compared to Google, all the other indexers pale.
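The reason Sitemaps.org discourages a subdirectory is its location rule: by default, a sitemap file may only list URLs that sit at or below its own directory. The Python sketch below shows that rule in action. The function in_sitemap_scope and the example.org URLs are hypothetical, and the check is a simplified reading of the protocol, not a validator.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, page_url):
    """Return True if page_url falls under the directory holding sitemap_url."""
    s, p = urlsplit(sitemap_url), urlsplit(page_url)
    # Directory part of the sitemap's path, always ending in "/"
    directory = s.path.rsplit("/", 1)[0] + "/"
    same_host = (s.scheme, s.netloc) == (p.scheme, p.netloc)
    return same_host and p.path.startswith(directory)


if __name__ == "__main__":
    page = "http://www.example.org/Record/1"  # hypothetical record URL
    print(in_sitemap_scope("http://www.example.org/sitemap.xml", page))
    print(in_sitemap_scope("http://www.example.org/sitemaps/sitemap.xml", page))
```

A sitemap in the root covers the whole site, while one in /sitemaps/ nominally covers only that directory. Handing the location directly to a search engine's webmaster tools is, as far as I can tell, what lets the files live in a subdirectory anyway.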

Because I put all of the sitemap files in a separate directory, I needed to edit my httpd.conf file so VUFind does not try to interpret the directory as an action. My configuration follows:

  # sitemaps
  Alias /sitemaps /shared/catholic_portal/data/data/sitemaps/
  <Directory "/shared/catholic_portal/data/data/sitemaps">
    Options FollowSymLinks ExecCGI +Includes +Indexes
    AllowOverride FileInfo
    Order deny,allow
    Allow from all
  </Directory>

Is this an overly complicated solution? Maybe. Is it cleaner? In my opinion, yes. But heck, all of this is only a difference in configuration.

Author: Eric Lease Morgan

I am a librarian first and a computer user second. My professional goal is to discover new ways to use computers to provide better library services. I use much of my time here at the University of Notre Dame developing and providing technical support for the Catholic Research Resources Alliance -- the "Catholic Portal".