Archive for May, 2011

Harvesting metadata

Tuesday, May 24th, 2011

It is imperative for CRRA member institutions to make their metadata available for harvesting via a Web server.

A couple of years ago, when the “Portal” was just beginning, the modus operandi for ingesting MARC and EAD metadata was to send it to Notre Dame, save it on a local hard disk, and index it. That process worked then, but as we grow it becomes less and less scalable.

Nowadays the preferred method of getting your metadata to the Portal is through harvesting. Here is how it works:

  1. Create metadata – Use whatever process you desire to create and edit your metadata. Much of what we suggest is outlined in a previous posting affectionately called “the recipe”.
  2. Export metadata – If your metadata is in MARC format, then query your integrated library system for all things destined for the Portal, and save the result to a single file using the UTF-8 character set. If your metadata is in EAD format, then export it as individual files making sure they are well-formed and valid.
  3. Expose metadata – In either case, MARC records or EAD files, the next step is to save the metadata on a Web server. Create or have created a directory on a Web server. Put the file of MARC records and/or the EAD files in the directory. There is no need to create a Web page. Just make sure the directory’s contents are listed automatically and by default. A good example is the work done by Marquette University.
  4. Share the URL(s) – Once the files are on a Web server, they will have URLs. In the case of MARC records, send Notre Dame the URL of the MARC file. In the case of EAD files, send the URL of the directory.
  5. Repeat – This is a never-ending process. Go to Step #1. As you create, edit, and export new or different metadata, save it in the Web-accessible directory. There is no need to send the updates to Notre Dame. They will be harvested on a regular basis. There is no need to denote which records are new, changed, or deleted. Previously indexed records will be discarded and the whole lot will be re-indexed.
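
To make Steps #2 and #3 a bit more concrete, here is a minimal sketch in Python of two checks an institution might run before sharing its URL(s). It is not the harvester used at Notre Dame. The function names (`list_ead_urls`, `is_well_formed`) and the `example.edu` address are made up for illustration, and the code assumes the Web server's automatic directory listing is an ordinary HTML page of links. The second function only tests whether a file is well-formed XML; it does not validate the file against the EAD DTD or schema.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag found in a directory listing."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_ead_urls(listing_html, directory_url):
    """Return absolute URLs of the .xml files linked from a listing page.

    Hypothetical helper: assumes EAD files end in .xml and that the
    listing is plain HTML, as most automatic indexes are.
    """
    collector = LinkCollector()
    collector.feed(listing_html)
    return [
        urljoin(directory_url, href)
        for href in collector.hrefs
        if href.lower().endswith(".xml")
    ]


def is_well_formed(xml_text):
    """True if the text parses as XML. This is not a validity check."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # A stand-in for what a Web server's automatic listing might return.
    listing = '<a href="../">Parent</a> <a href="smith.xml">smith.xml</a>'
    print(list_ead_urls(listing, "http://example.edu/ead/"))
    print(is_well_formed("<ead><eadheader/></ead>"))
```

Because the whole lot is re-indexed on every harvest, a check like this can simply be rerun against the entire directory each time files are added or changed.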

There are many benefits to this process. First, the data gets duplicated. “Lots of copies keep stuff safe.” Second, Internet spiders and robots will find your data, index it, and make it accessible via their indexes. That is a good thing. Third, it gives you more control over the data and reduces the risk of Notre Dame losing it.

Just like the previous “recipe”, what is described above is only an outline. Each institution will differ slightly in its implementation. If you have any questions, then please don’t hesitate to ask.

“Catholic Portal” usability efforts

Friday, May 13th, 2011

This page has become the home page for the usability efforts of the “Catholic Portal”.

The Digital Access Committee had a conference call on Thursday, May 12. The purpose of the meeting was to discuss usability studies. The resources (time and money) required to do the studies were emphasized. Similarly, the need to have the studies done with the intended audience of the Portal (upperclassmen, graduate students, faculty, and scholars) was also stressed.

Ideally each institutional member of the Committee will facilitate and complete a set of usability studies by Christmas. In that vein, the following tentative list of who will do studies, and when, has been drafted:

  • Seton Hall during June/July
  • University of Toronto during July/August
  • Marquette University during August/September
  • Georgetown University during late August/early September
  • Catholic Theological Union during September
  • Villanova University during September/October

Individual committee members are expected to communicate with the committee as a whole by May 27 with more definite commitments.

For more information about the usability studies, see “Doing usability against the ‘Catholic Portal’”.

CRRA-Tech

Friday, May 13th, 2011

This is the home page for a mailing list called CRRA-Tech.

The Catholic Research Resources Alliance (CRRA) or “Catholic Portal” brings together data and metadata for the purposes of Catholic research and scholarship. This process is facilitated through a number of groups dealing with administrative issues, collection issues, metadata issues, etc. CRRA-Tech is a mailing list intended to support and discuss the computer technology issues of the CRRA such as but not limited to the harvesting of content and metadata, the validation of content and metadata, indexing technologies, library “discovery systems”, the programming languages (PHP, Java, Perl, and JavaScript) used, log file analysis, cascading stylesheets, debugging tools, the role of open source software, etc. In short, CRRA-Tech provides a forum for discussing the computer infrastructure of the Portal.

If supporting research and scholarship through the use of computer technology is a part of your daily work and if your employer is a member of the CRRA, then consider subscribing to CRRA-Tech. To subscribe:

  1. address a message to listserv@listserv.nd.edu
  2. in the body of the message put “subscribe crra-tech”
  3. send it away

You ought to get back a couple of confirmations, and you will be done.
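
If you would rather script the subscription than use a mail client, the three steps above might look like the following Python sketch. The list address and the message body come straight from the steps. Everything else is an assumption for illustration: the function names, the sender address, and a mail server on `localhost` that is willing to relay for you.

```python
from email.message import EmailMessage
import smtplib


def build_subscribe_message(sender):
    """Build the subscription request described in the steps above."""
    msg = EmailMessage()
    msg["From"] = sender  # your own address (hypothetical in this sketch)
    msg["To"] = "listserv@listserv.nd.edu"
    msg.set_content("subscribe crra-tech")
    return msg


def send_message(msg, smtp_host="localhost"):
    """Send the message; assumes smtp_host will relay mail for you."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)


if __name__ == "__main__":
    message = build_subscribe_message("you@example.edu")
    print(message)  # inspect the message before sending it
    # send_message(message)  # uncomment once the SMTP host is known
```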

CATLA Spring Conference

Tuesday, May 3rd, 2011

On Friday, April 15, I had the honor and pleasure of giving a presentation to the Chicago Area Theological Library Association. This posting documents the experience.

To and from Berrien Springs (MI)

The Chicago Area Theological Library Association held its Spring Conference at Andrews University in Berrien Springs (Michigan). The conference was small; about 15 people attended. After the business meeting I gave a presentation on “next-generation library catalogs”, digital humanities, and the “Catholic Portal”. In a nutshell I compared & contrasted database applications (traditional library catalogs) with indexes (“discovery systems”). I then demonstrated a few text analysis tools and at the same time explained how these tools can be used to supplement the “close” reading process. Finally I described and demonstrated the “Catholic Portal”, and I showed how the ideas of “next-generation” library catalogs and text mining have been incorporated into it. I got lucky with the last part of the presentation because I had upgraded the Portal the previous day, and nothing went wrong during the demonstration.

After a vegetarian lunch in the University’s dining hall, we returned to the conference room for a set of lightning talks:

  • Kate Ganski (University of Wisconsin, Milwaukee) described a needs-based marketing campaign which looked rather innovative and energetic
  • Lisa Gonzalez (Catholic Theological Union) enumerated a number of cool (and “kewl”) tech tools to share with patrons
  • Alan Krieger (University of Notre Dame) described the Hesburgh Libraries’ newly created theological reading room and how it was being used
  • Matt Ostercamp (North Park University) outlined ways to promote traditional reading in libraries, and of all the lightning talks, this one complemented my presentation the most
  • Karl Stutzman (Associated Mennonite Biblical Seminary) reported on the process his library is going through to implement Primo

After the talks we were given a very nice tour of the University’s library and archive. Hosting the largest collection of Seventh-day Adventist materials in the world, the University archive was quite impressive. They actively digitize their holdings and provide a home for a wide variety of materials. I was also impressed with the library’s service to the community. Specifically, they operate a charitable giving program where they receive new (and used) books from a variety of sources and then ship these books to fledgling libraries all over the world. They are putting their university’s values into practice.

I had a good time, and I appreciate the opportunity. “Thank you, Lisa G., for inviting me!”