<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Catholic Research Resources Alliance (CRRA) Blog</title>
	<atom:link href="http://www.catholicresearch.net/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.catholicresearch.net/blog</link>
	<description>Supporting Catholic research &#38; scholarship</description>
	<lastBuildDate>Mon, 14 May 2012 14:59:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Using OAI-PMH to populate the &#8220;Catholic Portal&#8221; is not straightforward</title>
		<link>http://www.catholicresearch.net/blog/2012/05/oai/</link>
		<comments>http://www.catholicresearch.net/blog/2012/05/oai/#comments</comments>
		<pubDate>Mon, 14 May 2012 14:59:47 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=498</guid>
		<description><![CDATA[Using OAI-PMH to populate the &#8220;Catholic Portal&#8221; is not straightforward, and this posting outlines some of my investigations in this regard. Introduction As you may or may not know, OAI-PMH is a &#8220;standard&#8221; protocol designed for harvesting metadata. It only understands six commands (or in OAI-PMH parlance, &#8220;verbs&#8221;). These commands are sent to remote computers [...]]]></description>
			<content:encoded><![CDATA[<p>
Using OAI-PMH to populate the &#8220;<a href="http://www.catholicresearch.net/">Catholic Portal</a>&#8221; is not straightforward, and this posting outlines some of my investigations in this regard.
</p>
<h2>Introduction</h2>
<p>
As you may or may not know, <a href="http://www.openarchives.org/OAI/openarchivesprotocol.html">OAI-PMH is a &#8220;standard&#8221; protocol designed for harvesting metadata</a>. It only understands six commands (or in OAI-PMH parlance, &#8220;verbs&#8221;). These commands are sent to remote computers in the form of URLs, and the remote computer is expected to respond in the form of specifically shaped XML streams. These commands include:
</p>
<ul>
<li>Identify &#8211; Lists who manages the repository and what type of content it contains.</li>
<li>ListMetadataFormats &#8211; Lists the various metadata schemes used to describe the repository&#8217;s content. At least one of these schemes must be Dublin Core.</li>
<li>ListSets &#8211; Specifies how the repository&#8217;s content is subdivided. There can be zero or more of these subdivisions.</li>
<li>ListIdentifiers &#8211; Returns a list of keys pointing to specific records in the repository.</li>
<li>ListRecords &#8211; An enhanced version of ListIdentifiers, this verb downloads whole records, not just identifiers.</li>
<li>GetRecord &#8211; Given a specific identifier, this verb retrieves a single record.</li>
</ul>
<p>
Through a conversation of these verbs and the returned XML streams, metadata between computers can be exchanged. It is then up to the computer doing the harvesting to implement some sort of cool and interesting service with the harvested content. Here at Catholic Portal Central we want to index the metadata and provide immediate access to remote digitized content.
</p>
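<p>
To make the exchange concrete, here is a minimal Python sketch of how a verb and its arguments become a request URL. The function name build_request and any base URL passed to it are hypothetical; only the six verb names come from the protocol itself.
</p>

```python
from urllib.parse import urlencode

# The six verbs defined by the OAI-PMH protocol
VERBS = ("Identify", "ListMetadataFormats", "ListSets",
         "ListIdentifiers", "ListRecords", "GetRecord")


def build_request(base_url, verb, **arguments):
    """Return the URL for a single OAI-PMH request (hypothetical helper)."""
    if verb not in VERBS:
        raise ValueError("unknown OAI-PMH verb: " + verb)
    # The verb always travels as a query parameter; extra arguments such
    # as metadataPrefix or set are appended in the order they are given.
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)
```

<p>
For example, build_request("http://example.org/oai", "ListRecords", metadataPrefix="oai_dc") would yield a URL asking a (made-up) repository for all of its records in Dublin Core.
</p>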
<h2>Investigations</h2>
<p>
At least three Catholic Research Resources Alliance (CRRA) members have OAI-PMH repositories: Duquesne University, Boston College, and Loyola University Chicago. Using <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2012/05/oai-explorer.pl">a little Perl script</a>, I most recently investigated the content of the repositories of Boston College and Loyola University Chicago. Through this process I learned what metadata formats they support and what sets are used to subdivide their collections, and I output Dublin Core metadata from a few selected sets.
</p>
<p>
The harvested Dublin Core metadata was typical of OAI-PMH repositories: thin, a bit ambiguous, and somewhat inconsistent across repositories. It is thin because many of the Dublin Core elements are left unpopulated. It is ambiguous because many of the fields are repeated, and the values of repeated elements are of different types. For example, a description field may be empty, or it may contain an abstract of the work, the full text of the work, or the process used to digitize the material. It is inconsistent because things like dates, names, and subject entries are formatted differently. In some places names are listed in first name/last name order. Other times it is last name/first name order. Dates can be anything from &#8220;February 12, 2012&#8221; to &#8220;2012-02-12&#8221; to &#8220;Twelfth Century&#8221;. None of this is new to the world of OAI-PMH. It is typical.
</p>
<p>
All is not lost. There are patterns to this apparent randomness. Using my script I can sometimes output titles, descriptions, subject headings, and URLs of digitized objects. For example, here is such a list from the Loyola University Chicago repository:
</p>
<blockquote>
<p>
item: 46<br />
key: oai:content.library.luc.edu:coll6/45<br />
title(s): Letter to the Secretary of the Literary Agency of London, 1908<br />
title(s): Catholic Women Poets<br />
identifier(s): cudahy219e3<br />
identifier(s): 003_kayesmith_1908;pg3.jpg<br />
identifier(s): http://content.library.luc.edu/u?/coll6,45<br />
subject(s): Shelia Kaye-Smith; poets; women poets; Catholic poets<br />
subject(s): Local<br />
description(s): third page of letter requesting appointment<br />
description(s): does not suit you any other time up to 4 15 will do Would you kindly send a reply to me c o Miss F E Walters Girton College Cambridge With apologies for troubling you believe me Yours faithfully Sheila Kaye Smith<br />
description(s): Master file scanned at 600 dpi RGB in reflective mode from original document using MicroTek ScanMaker 1000XL<br />
description(s): http://www.luc.edu.archives<br />
type: image
</p>
</blockquote>
<p>
From this output it becomes apparent that the first title is the title of the artifact, the third identifier is the URL of the digitized object, the first subject field is a delimited list of keywords, the first description is a sort of abstract, and the type field contains a value denoting what kind of digitized thing is in question. Thus, the output follows a pattern, and computers are very good at patterns. Therefore a computer program could easily be written to read this particular OAI-PMH output and store it in the Portal&#8217;s index.
</p>
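<p>
The positional pattern just described could be sketched in Python along the following lines. This is only an illustration under assumptions: the record is taken to be a dictionary of lists keyed by Dublin Core element name, and the output field names (title, url, keywords, abstract, format) are hypothetical, not the Portal&#8217;s actual Solr fields.
</p>

```python
def pick(values, position):
    """Return the value at the given position, or None when it is absent."""
    try:
        return values[position]
    except IndexError:
        return None


def map_record(record):
    """Map one Dublin Core record (a dict of lists) to flat index fields.

    The positions follow the pattern seen in the Loyola sample; the
    output field names are hypothetical.
    """
    # The first subject value is treated as a semicolon-delimited list
    subjects = pick(record.get("subject", []), 0) or ""
    return {
        "title": pick(record.get("title", []), 0),
        "url": pick(record.get("identifier", []), 2),
        "keywords": [s.strip() for s in subjects.split(";") if s.strip()],
        "abstract": pick(record.get("description", []), 0),
        "format": pick(record.get("type", []), 0),
    }
```

<p>
A mapping like this only holds for repositories that follow the same positional conventions, which is exactly why the approach is fragile.
</p>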
<h2>Next steps</h2>
<p>
My next steps are two-fold. First, I will harvest and index some of the metadata from selected Loyola University Chicago OAI-PMH sets. Second, I will let colleagues from various CRRA committees (specifically the Digital Access Committee as well as the Collection Committee) peruse the results. In the end I hope to get feedback on how to proceed. Should I index more content? Less? None? If more, then how should records be displayed, and exactly how ought the Dublin Core metadata be mapped to VuFind&#8217;s underlying Solr index fields?
</p>
<p>
All of this work is entirely feasible. At the same time it is not enormously scalable. Hand-crafting the parsing of OAI-PMH output, and hand-crafting how it all gets mapped to Solr&#8217;s index, is time-consuming and fragile. The Portal Home Planet can easily do this work for no more than a dozen different repositories, but after that some other means of production will need to be examined.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/05/oai/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>April 2012 Update</title>
		<link>http://www.catholicresearch.net/blog/2012/04/april-2012-update/</link>
		<comments>http://www.catholicresearch.net/blog/2012/04/april-2012-update/#comments</comments>
		<pubDate>Mon, 23 Apr 2012 21:03:20 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=493</guid>
		<description><![CDATA[CRRA UPDATE April 2012 This month’s update includes: A Focus on Members, from Janice Welburn, Chair, CRRA Board of Directors To guide us in developing effective strategies for successful member engagement, the Board has set up a Membership committee and I’m delighted to welcome a current Board member, Evelyn Minick, University Librarian, Saint Joseph’s University, [...]]]></description>
			<content:encoded><![CDATA[<p align="center"><strong>CRRA UPDATE</strong></p>
<p align="center"><strong>April 2012</strong></p>
<p>This month’s update includes:</p>
<ul>
<li>A Focus on Members, from Janice Welburn, Chair, CRRA Board of Directors<br />
<em>To guide us in developing effective strategies for successful member engagement, the Board has set up a Membership committee and I’m delighted to welcome a current Board member, Evelyn Minick, University Librarian, Saint Joseph’s University, as the chair </em>… <em>The Committee’s major objectives are to grow the membership and ensure retention of current members …</em></li>
<li>CRRA Collections Spotlight: The Philadelphia Archdiocesan Historical Research Center Catholic Newspaper Collection, by Shawn Weldon<br />
<em>The Philadelphia Archdiocesan Historical Research Center (PAHRC) holds one of the largest collections of Catholic newspapers in the United States …</em></li>
<li>Update on the Digital Access Committee (DAC), from Demian Katz, DAC Chair<br />
<em>In spite of changes, DAC has pressed forward with several initiatives.  The </em><em><a href="../../">Catholic Portal</a></em><em>, still the centerpiece of CRRA&#8217;s website, is under continuous improvement, both in response to member feedback gathered during usability testing and due to new features in the underlying VuFind software …</em></li>
<li>Mark Your Calendars: All-Members Meeting, Anaheim, CA, June 25-26, 2012, all are invited;<br />
Archival Networks and EAD Consortia at SAA in August (San Diego); Fall Symposium at DePaul University, Oct. 15-16, 2012</li>
<li>Position Announcement: Duquesne University</li>
</ul>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p align="center"><strong>A Focus on Members<br />
</strong><em>from Janice Welburn Chair, CRRA Board of Directors</em></p>
<p> The new strategic plan affirms the importance of a strong value proposition for members.  To guide us in developing effective strategies for successful member engagement, the Board has set up a Membership committee and I’m delighted to welcome a current Board member, Evelyn Minick, University Librarian, Saint Joseph’s University, as the chair.  Evelyn’s deep commitment to our mission, keen insights into member expectations and effective leadership of the task force that developed a multi-tiered dues schedule, make her an excellent choice to guide our membership development and support. While we may add other members over time, I am pleased to announce the initial membership:</p>
<ul>
<li>Kris Brancolini, Dean of the Library, Loyola Marymount University, Los Angeles</li>
<li>Theresa Byrd, University Librarian, University of San Diego; also a Board member</li>
<li>Melody McMahon, Director of the Paul Bechtold Library, Catholic Theological Union, Chicago</li>
<li>Tom Messner, Library Director, Barry University, Miami Shores, FL</li>
<li>Laverna Saunders, Library Director, Duquesne University, Pittsburgh</li>
<li>Bob Seal, Dean of Libraries, Loyola University Chicago</li>
<li>Kathy Webb, Dean of University Libraries, University of Dayton</li>
<li>Jennifer Younger, ex officio, Executive Director, CRRA</li>
</ul>
<p>The Committee’s major objectives are to grow the membership and ensure retention of current members.  It is advisory to the Board.  Although the Committee plays a central role, it is important to emphasize that the Committee will consult broadly with members on needs and expectations of membership, as well as actively seek suggestions from individuals and committees on prospective members.  We want to continue our participative tradition of reaching out to potential members as noted in our protocol for inviting new members.  The charge to the Membership Committee will be accessible shortly along with the full roster on our website.</p>
<p>&nbsp;</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p align="center"> <strong>THE PHILADELPHIA ARCHDIOCESAN HISTORICAL RESEARCH CENTER</strong></p>
<p align="center"><strong>CATHOLIC NEWSPAPER COLLECTION</strong></p>
<p> The Philadelphia Archdiocesan Historical Research Center (PAHRC) holds one of the largest collections of Catholic newspapers in the United States. These newspapers were collected by the American Catholic Historical Society of Philadelphia which was founded in 1886 to collect material documenting the history of Catholicism in the United States. The ACHS collections, including manuscripts, newspapers, periodicals, pamphlets, books, artifacts and graphic material, were given to the Archdiocese of Philadelphia in the 1930’s. In 1989, the ACHS Collection was merged with the Archives of the Archdiocese of Philadelphia to form PAHRC.</p>
<p>The newspaper collection contains Catholic newspapers from throughout the United States as well as some Catholic newspapers from Canada, England, Ireland, France and Italy.  The collection contains over 300 titles, representing 35 states and the District of Columbia, and covers the period primarily from the 1820’s through the 1940’s. The bulk of the collection dates from the 1840’s through the 1920’s.</p>
<p>Included are early and prominent Catholic newspapers such as <em>The Catholic Press/The United States Catholic Press</em> (Hartford), <em>The Catholic Miscellany</em> (Charleston), <em>The Catholic Herald</em> (Philadelphia), <em>The Catholic Mirror</em> (Baltimore), <em>The Catholic Advocate</em> (Louisville), <em>The Pilot</em> (Boston), <em>The Catholic Telegraph</em> (Cincinnati) and <em>The Freeman’s Journal</em> (New York City). The collection also contains many ethnic newspapers, including Irish-American, German-American and Polish-American newspapers, as well as newspapers published for a juvenile audience, society newspapers and papers published for the support of Catholic institutions.</p>
<p>Notable are some of the first black Catholic newspapers published in the United States. There is a good run of the <em>American Catholic Tribune, </em>originally published in Cincinnati and later in Detroit, for the years 1887-1894. There are some issues of <em>The Journal</em>, a Philadelphia black Catholic newspaper that was published for a few months in 1892. The collection also includes Volume I, Number 1 (February 18, 1905) of <em>The Catholic Herald</em>, a black Catholic newspaper in Washington, D.C. which may be the only issue published. For more information on black Catholic newspapers and periodicals in the PAHRC collection see the following: <a href="http://www.pahrc.net/index.php/black-catholic-periodicals/">http://www.pahrc.net/index.php/black-catholic-periodicals/</a></p>
<p>The collection also contains other rare titles such as <em>Sina Sapa Wocekiye Taeyanpaha</em>, a North Dakota newspaper published in the Sioux language, <em>The Catholic Visitor</em> (Richmond, Virginia), <em>The New Jersey Catholic Journal</em> (Trenton, New Jersey) and perhaps the only significant collection of <em>Redpath’s Illustrated Weekly</em>, a primarily Irish national newspaper published in New York City by the journalist and social activist James Redpath.</p>
<p>Despite the size and research value of the collection, there are some issues that impact its usefulness. Although there are very large runs of issues for the major newspapers, none of the titles is complete and there are gaps and issues missing. Most of the rarer newspapers may contain only a few issues or a few years of the paper. The most pressing issue is that the collection is maintained primarily in hard copy and a significant number of the newspapers are in very fragile condition and in need of immediate conservation and preservation. One of the advantages of membership in the CRRA is the opportunity to cooperate with other repositories facing these same issues to create a comprehensive online inventory and directory of North American Catholic newspapers and to facilitate the eventual digitization of the various collections. One of my goals as a member of the CRRA Newspapers Taskforce is to assist in the realization of these projects.</p>
<p>To view the contents of the newspaper collection at PAHRC see the following:  <a href="http://pahrc.pastperfect-online.com/30664cgi/mweb.exe?request=clicksearch;dtype=d;subset=0;_t1101=newspapers">http://pahrc.pastperfect-online.com/30664cgi/mweb.exe?request=clicksearch;dtype=d;subset=0;_t1101=newspapers</a></p>
<p><em>&#8211;Shawn Weldon, PAHRC</em></p>
<p><em>Member, Collections Committee</em></p>
<p><em>Member, Catholic Newspapers Task Force</em></p>
<p>&nbsp;</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>Update on the Digital Access Committee (DAC), from Demian Katz, DAC Chair</strong></p>
<p>The Digital Access Committee has had some recent membership changes, bidding farewell to Ann Hanlon (Marquette) and welcoming new member Megan Bernal (DePaul).</p>
<p>In spite of changes, DAC has pressed forward with several initiatives.  The <a href="../../">Catholic Portal</a>, still the centerpiece of CRRA&#8217;s website, is under continuous improvement, both in response to member feedback gathered during usability testing and due to new features in the underlying VuFind software used to run it.  Additionally, DAC has begun looking at some new software that can be used to expand and improve CRRA&#8217;s online presence.  The Concrete5 Content Management System is an open source tool for building websites, and DAC hopes to use it to improve the quality and simplify the maintenance of the informational pages that accompany the Catholic Portal on catholicresearch.net.</p>
<p>Archon and Archivists&#8217; Toolkit are both packages for building EAD files for archival description, and DAC has been weighing the benefits of installing one of these packages to help members build finding aids.  Finally, DAC is also preparing to support the <a href="../../About/Contact">Newspapers Task Force</a> as needed as efforts progress.</p>
<p>&nbsp;</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p align="center"><strong>Mark Your Calendars!  Upcoming Events</strong></p>
<p align="center"><strong> All-Members Meeting</strong></p>
<p align="center"><strong>Anaheim, CA</strong></p>
<p align="center"><strong>June 25-26, 2012</strong></p>
<p align="center"><strong><br />
</strong></p>
<p>CRRA colleagues,<br />
As you make plans for the ALA conference and/or the following ATLA conference, we hope you will also make time for the CRRA All-Members meeting.  This announcement also appeared in the CRRA March Update but we thought it might be good to send it out again after the Easter holiday. <em>–Jennifer</em></p>
<p>You are invited to the annual All-Members meeting.  While we don’t know specific locations at this time, we will hold our events in easily accessible locations. The <a href="http://gocalifornia.about.com/gi/o.htm?zi=1/XJ&amp;zTi=1&amp;sdn=gocalifornia&amp;cdn=travel&amp;tm=37&amp;f=00&amp;su=p1090.11.200.ip_&amp;tt=2&amp;bt=0&amp;bts=0&amp;zu=http%3A//www.rideart.org/">Anaheim Resort Transit Trolley</a> has numerous routes connecting hotels, restaurants, shops, convention center and the Crystal Cathedral, and we will provide directions for getting to CRRA events.  Later we will ask for RSVP’s from those attending Monday’s dinner and/or Tuesday’s meeting and/or lunch so as to provide appropriately for a dinner reservation, and for breaks and lunch on Tuesday. On Monday evening, June 25, we will meet for dinner at a casual restaurant. We meet about 6:30.  We will make a group reservation.</p>
<p>We meet on Tuesday, June 26, from 9:00 a.m. through 12:30 p.m. followed by lunch (optional).  Our agenda is focused on mission-support for the next year: identifying top priorities, ideas for forming local teams and expanding our understanding of Catholic Studies.  With the announcement that the Board has adopted a five year strategic plan, we will be asking committees to develop their annual goals in this context and will be inviting all members to participate in identifying high priorities for the coming year.</p>
<p align="center"><strong><br />
</strong><strong>Agenda</strong></p>
<ul>
<li>Welcome, Janice Welburn, chair, Board of Directors</li>
<li>Annual goals, objectives and priorities – Moderator, Pat Lawton</li>
<li>Forming institutional teams – Panel discussion TBA</li>
<li>Catholic Studies and challenges facing Catholic educators – Rev. James Heft, S.M., President, Institute for Advanced Catholic Studies at the University of Southern California and Member, CRRA Leadership Council</li>
</ul>
<p>We look forward to meeting with as many of you as can be there. Please share this invitation with any others at your institution who may also be in Anaheim.  Traditionally, our meetings are open to others interested in our mission and activities. If you know of others who might like to attend, you can share this information or request that Pat or Jennifer do so.  See you there.</p>
<p><em>Jennifer Younger<br />
CRRA Executive Director</em></p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><em>Attending SAA in San Diego in August?  This session on Networks and EAD Consortia may be of interest:<br />
</em><strong><br />
Archival Networks and EAD Consortia</strong><br />
EAD consortia and aggregators of archival resources share broad interests in the ongoing exchange of information about each others&#8217; projects and programs.  Why reinvent the wheel?</p>
<p><span style="text-decoration: underline;">Where</span>: SAA 76th Annual Meeting, San Diego Hilton Bayfront &#8212; room to be determined.  Please consult conference program for location details, once available.</p>
<p><span style="text-decoration: underline;">When</span>: Thursday, August 9, 2012, 12:00-1:15 pm</p>
<p><span style="text-decoration: underline;">Goal</span>: to increase communication across consortia, in order to share expertise and develop a common vision for broader archival description and discovery networks.</p>
<p><span style="text-decoration: underline;">Agenda</span>: brief regional/statewide/national program updates, followed by structured discussion.  Additional agenda details forthcoming.</p>
<p>Anyone interested is welcome to attend.</p>
<p><em>Jodi Allison-Bunnell, Orbis Cascade and NWDA<br />
Jennifer Schaffner, OCLC Research<br />
Adrian Turner, Online Archive of California and the California Digital Library</em></p>
<p><strong>Fall Symposium at DePaul University,</strong><strong> </strong><strong>Oct. 15-16, 2012<br />
</strong>Continuing on the success of the November 2011 Duquesne Symposium, plans are underway for a Fall Symposium to be held at DePaul University Oct. 15-16.  Please hold this date, and watch the <em>CRRA Update</em> for further details.<strong></strong></p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>Position Available: Reference &amp; Instruction Librarian, Duquesne University</strong></p>
<p><strong> </strong>Duquesne University</p>
<p>Gumberg Library</p>
<p>Reference &amp; Instruction Librarian<br />
<strong>NATURE OF WORK:</strong><br />
This non-tenured library faculty position reports to the Director of Information Services.  This is primarily a public service position with significant instructional and liaison duties.  Knowledge of information sources, interpersonal skills, instructional skills, and technology skills are of highest importance for this position.  Provides reference service and instruction to enable members of the Duquesne University community and guest users to find and effectively make use of library resources and other information sources.</p>
<p>For the full posting, please see: <a href="http://www.duq.edu/hr/faculty/faculty-jobs-openings/gumberg.cfm">http://www.duq.edu/hr/faculty/faculty-jobs-openings/gumberg.cfm</a></p>
<p>&nbsp;</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong><em>CRRA Update</em></strong> is an electronic newsletter distributed via email to provide members with an update of CRRA activities.  Please contact Pat Lawton at 574.631.1324 or email <a href="mailto:plawton@nd.edu">plawton@nd.edu</a> with your questions, comments, or news to share. We welcome your news items!</p>
<p>&#8212;&#8212;&#8212;<br />
CRRA Calendar: <a href="http://tiny.cc/Calendar798">http://tiny.cc/Calendar798</a><br />
CRRA Contact page: <a href="../../About/Contact">http://www.catholicresearch.net/About/Contact</a><br />
CRRA blog: <a href="../">http://www.catholicresearch.net/blog/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/04/april-2012-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Statistical reports against the &#8220;Catholic Portal&#8221;</title>
		<link>http://www.catholicresearch.net/blog/2012/04/statistical-reports-against-the-catholic-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2012/04/statistical-reports-against-the-catholic-portal/#comments</comments>
		<pubDate>Wed, 18 Apr 2012 00:15:30 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=489</guid>
		<description><![CDATA[This text describes the beginnings of a set of statistical reports describing the use of the &#8220;Catholic Portal&#8221;. More specifically, the Portal&#8217;s Web server log files are read on a daily basis, normalized, and saved to an underlying database. A number of queries are then applied to the database to create rudimentary lists of tabulations. [...]]]></description>
			<content:encoded><![CDATA[<p>
This text describes the beginnings of a set of statistical reports describing the use of the &#8220;<a href="http://www.catholicresearch.net/">Catholic Portal</a>&#8221;.
</p>
<p>
More specifically, the Portal&#8217;s Web server log files are read on a daily basis, normalized, and saved to an underlying database. A number of queries are then applied to the database to create rudimentary lists of tabulations. Each of the reports is described below:
</p>
<ul>
<li><a href="http://www.catholicresearch.net/tmp/hosts.txt">Hosts</a> &#8211; This report lists the Internet address or name of the top 100 computers using the Portal. To the best of our ability, the list excludes Internet robots and spiders, but the list needs to be updated. As of this writing, it is quite likely that many of the top computers are still robots, and the host named university.archives.nd.edu is probably the most frequent user of the Portal with shunat236-189.shu.edu coming in at a close second.</li>
<li><a href="http://www.catholicresearch.net/tmp/page-count.txt">Page count</a> &#8211; This is a list of the number of hits the Portal received on any given day. Obviously the script creating this report needs to be updated in order to output data for the current year.</li>
<li><a href="http://www.catholicresearch.net/tmp/query-strings.txt">Query strings</a> &#8211; This is a tabulation of the most frequently used search terms applied against the Portal. The &#8220;null&#8221; query is probably a simple hit against the &#8220;browse&#8221; link at the bottom of the Portal&#8217;s home page and/or simply clicking the search box&#8217;s Find button. The queries in quotes are probably from clicks on hot linked search results. </li>
<li><a href="http://www.catholicresearch.net/tmp/referrers-all.txt">Referrers</a> &#8211; This is a list of the websites where people came from before they visited the Portal. A whole lot of these websites are places where blog postings about the Portal appear. Many are spam. Some are HTML versions of the EAD finding aids. Further down the list one can begin to see Google searches.</li>
<li><a href="http://www.catholicresearch.net/tmp/referrers-engines.txt">Referrers engines</a> &#8211; This report is exactly like the Referrers report except it only includes search engines (Google, Yahoo, and Bing).</li>
<li><a href="http://www.catholicresearch.net/tmp/tabs.txt">Tabs</a> &#8211; This is a list of the most frequently used links across the top of the Portal&#8217;s home page. </li>
<li><a href="http://www.catholicresearch.net/tmp/top-records.txt">Top records</a> &#8211; This is a tabulation of the most frequently viewed records in the Portal. The first item on the list is an error, but as of this writing the most frequently viewed record is something from Catholic University of America.</li>
<li><a href="http://www.catholicresearch.net/tmp/types-of-searches.txt">Types of searches</a> &#8211; From this report it is obvious that the overwhelming majority of the searches applied against the Portal are free text searches. Nobody uses the advanced search form.</li>
<li><a href="http://www.catholicresearch.net/tmp/whose-records.txt">Whose records</a> &#8211; This is a list of the names of the libraries/institutions whose records are viewed most frequently. </li>
</ul>
<p>
For a more technical description of how these reports are generated, see the blog posting entitled &#8220;<a href="http://www.catholicresearch.net/blog/2011/01/data-warehousing-web-server-log-files/">Data warehousing Web server log files</a>&#8221; as well as a follow-up posting called &#8220;<a href="http://www.catholicresearch.net/blog/2011/09/progress-with-statistics-reporting/">Progress with statistics reporting</a>&#8221;.
</p>
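<p>
As a rough illustration of the kind of query behind a report such as Hosts, here is a minimal Python sketch using SQLite. The table name requests, its columns, and both function names are assumptions made for the example; the Portal&#8217;s actual database schema is not described in this posting.
</p>

```python
import sqlite3


def open_database(path=":memory:"):
    """Open a database and create a hypothetical table of normalized log entries."""
    connection = sqlite3.connect(path)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS requests (host TEXT, day TEXT, query TEXT)"
    )
    return connection


def top_hosts(connection, limit=100):
    """Return (host, hits) pairs for the busiest hosts, most active first."""
    sql = ("SELECT host, COUNT(*) AS hits FROM requests "
           "GROUP BY host ORDER BY hits DESC, host ASC LIMIT ?")
    return connection.execute(sql, (limit,)).fetchall()
```

<p>
The other tabulations (query strings, referrers, and so on) would follow the same group-and-count shape against different columns.
</p>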
<p>
These reports can be improved in any number of ways. First, they could be represented graphically &#8212; pie charts, histograms, etc. Second, they could be re-generated on a month-by-month basis to look for trends over time. Luckily just about all the necessary data has been preserved. Alternatively, a peek at the Portal&#8217;s Google Analytics site may illuminate additional trends.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/04/statistical-reports-against-the-catholic-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Transforming schema-based EAD files</title>
		<link>http://www.catholicresearch.net/blog/2012/04/transforming-schema-based-ead-files/</link>
		<comments>http://www.catholicresearch.net/blog/2012/04/transforming-schema-based-ead-files/#comments</comments>
		<pubDate>Tue, 10 Apr 2012 15:37:23 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=485</guid>
		<description><![CDATA[This posting describes my solution for transforming schema-based EAD files for the &#8220;Catholic Portal&#8221;. In a sentence, the solution boils down to removing all the namespaces from the input. For the longest time the EAD files harvested for the Portal were validated against the EAD DTD. These files have no namespace declarations, and transformations [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting describes my solution for transforming schema-based EAD files for the &#8220;Catholic Portal&#8221;. In a sentence, the solution boils down to removing all the namespaces from the input.
</p>
<p>
For the longest time the EAD files harvested for the Portal were validated against the EAD DTD. These files have no namespace declarations, and transformations were relatively easy. It was almost trivial for me to add unitid attributes to did-level elements. It was almost trivial for me to loop through the input files to extract did-level elements for indexing. Using a stylesheet I found through the Library of Congress, it was easy for me to convert the EAD into an HTML file for online reading.
</p>
<p>
When I started getting EAD files generated from the venerable Archivist&#8217;s Toolkit, my processes broke because these new files were validated against the EAD schema, which brings two or three namespaces with it. None of my XPath statements worked, since unprefixed names in XPath 1.0 match only elements in no namespace. A number of people offered a number of suggestions. Some of them required the use of XSLT 2.0, which is not an option for me. Others thought I should update my existing stylesheets to accommodate the namespaces, but that would have been too complicated and not scalable.
</p>
<p>
In the end, I chose a different solution which was alluded to by a number of other people &#8212; remove the namespaces. Each person offered a slightly different take on the problem, but in the end I went for a brute force method I found in the TEI community Web space:
</p>
<pre><code>&lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;
  &lt;xsl:output method="xml" indent="no" /&gt;
  &lt;xsl:template match="/|comment()|processing-instruction()"&gt;
    &lt;xsl:copy&gt;
      &lt;xsl:apply-templates /&gt;
    &lt;/xsl:copy&gt;
  &lt;/xsl:template&gt;
  &lt;xsl:template match="*"&gt;
    &lt;xsl:element name="{local-name()}"&gt;
      &lt;xsl:apply-templates select="@*|node()" /&gt;
    &lt;/xsl:element&gt;
  &lt;/xsl:template&gt;
  &lt;xsl:template match="@*"&gt;
    &lt;xsl:attribute name="{local-name()}"&gt;
      &lt;xsl:value-of select="." /&gt;
    &lt;/xsl:attribute&gt;
  &lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</code></pre>
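<p>
For readers who do not have an XSLT processor at hand, the same idea can be sketched in Python with the standard library. This is only an illustration and not the code used for the Portal; the function names are made up here, and unlike the stylesheet above the default parser silently drops comments and processing instructions.
</p>

```python
import xml.etree.ElementTree as ET


def local_name(qualified):
    """Return the part of an ElementTree name after any '{namespace}' prefix."""
    return qualified.rsplit("}", 1)[-1]


def strip_namespaces(xml_text):
    """Parse xml_text and return it re-serialized with every namespace removed."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Skip anything whose tag is not a plain string (comments, PIs).
        if not isinstance(element.tag, str):
            continue
        element.tag = local_name(element.tag)
        # Rebuild the attributes under their local names only.
        attributes = {local_name(key): value for key, value in element.attrib.items()}
        element.attrib.clear()
        element.attrib.update(attributes)
    return ET.tostring(root, encoding="unicode")
```

<p>
As with the stylesheet, two attributes that differ only by namespace would collide after stripping, so the approach suits files where local names are already unique.
</p>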
<p>
Consequently, my XML processing pipeline now looks like this:
</p>
<ol>
<li>harvest EAD files</li>
<li>validate them</li>
<li>strip namespaces</li>
<li>add unitids</li>
<li>transform them into HTML</li>
<li>index them</li>
<li>done</li>
</ol>
<p>
The next thing to do is improve Step #5 since the generic EAD to HTML transformation is just that &#8212; too generic.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/04/transforming-schema-based-ead-files/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Moving to VuFind version 1.3</title>
		<link>http://www.catholicresearch.net/blog/2012/03/moving-to-vufind-version-1-3/</link>
		<comments>http://www.catholicresearch.net/blog/2012/03/moving-to-vufind-version-1-3/#comments</comments>
		<pubDate>Fri, 23 Mar 2012 14:25:12 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=478</guid>
		<description><![CDATA[We here at &#8220;Catholic Portal Central&#8221; are spending time and effort moving to VuFind version 1.3. To this end I have implemented a number of things as per our usability studies as well as begun to skin the underlying &#8220;blueprint&#8221; theme. Give it a whirl and share your thoughts &#8212; http://vufind.library.nd.edu]]></description>
			<content:encoded><![CDATA[<p>
We here at &#8220;Catholic Portal Central&#8221; are spending time and effort moving to VuFind version 1.3. To this end I have implemented a number of <a href="http://www.catholicresearch.net/blog/2012/03/prioritized-list-of-fixesenhancements-for-the-portal/">things as per our usability studies</a> as well as begun to skin the underlying &#8220;blueprint&#8221; theme. Give it a whirl and share your thoughts &#8212; <a href="http://vufind.library.nd.edu">http://vufind.library.nd.edu</a>
</p>
<p><a href="http://vufind.library.nd.edu/"><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2012/03/screendump-300x227.png" alt="" title="screendump" width="300" height="227" class="aligncenter size-medium wp-image-479" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/03/moving-to-vufind-version-1-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Linking CRRA items to member libraries: A Prototype</title>
		<link>http://www.catholicresearch.net/blog/2012/03/linking-crra-items-to-member-libraries-a-prototype/</link>
		<comments>http://www.catholicresearch.net/blog/2012/03/linking-crra-items-to-member-libraries-a-prototype/#comments</comments>
		<pubDate>Mon, 19 Mar 2012 20:47:12 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=468</guid>
		<description><![CDATA[I have implemented a prototype for linking items found in the &#8220;Catholic Portal&#8221; to CRRA member institutions. The Problem The vast majority of the content in the &#8220;Portal&#8221; is not digitized. Consequently, when items of interest are identified, the reader is left hanging because the Portal does not support document delivery. &#8220;Now that I have [...]]]></description>
			<content:encoded><![CDATA[<p>
I have implemented a prototype for linking items found in the &#8220;Catholic Portal&#8221; to CRRA member institutions.
</p>
<h2>The Problem</h2>
<p>
The vast majority of the content in the &#8220;Portal&#8221; is not digitized. Consequently, when items of interest are identified, the reader is left hanging because the Portal does not support document delivery. &#8220;Now that I have found this item, how do I get it?&#8221;
</p>
<h2>The Solution</h2>
<p>
The solution is not perfect, but rather a step in the right direction. Instead of delivering the item, the solution is to provide a means for the reader (I don&#8217;t use the word &#8220;user&#8221; anymore) to easily connect with the member institution libraries through a directory. Specifically, create a directory of member institution libraries/archives complete with names, addresses, and other pieces of contact information. Hyperlink each and every search result to specific entries in the directory and thus enable readers to get in touch with member institutions.
</p>
<p>
I have implemented this in the <a href="http://vufind.library.nd.edu/Search/Results?lookfor=&#038;type=AllFields&#038;submit=Find/" target="_blank">Portal&#8217;s &#8220;sandbox&#8221;</a>. Search for any item, and from both the search results page and the detailed holdings pages, the reader can click on the name of the institution&#8217;s library and be shown a (bogus) directory.
</p>
<p><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2012/03/results.png"><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2012/03/results-300x226.png" alt="" title="Search results" width="300" height="226" class="aligncenter size-medium wp-image-472" /></a></p>
<p><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2012/03/details.png"><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2012/03/details-300x226.png" alt="" title="Item details" width="300" height="226" class="aligncenter size-medium wp-image-473" /></a></p>
<p>
The implementation was much easier than I anticipated, and the key was found in the identifiers of each indexed record. (All puns intended.) Each indexed record in the Portal is prefixed with a code denoting the library holding the item. For example, Boston College&#8217;s code is bcu, and Loyola Marymount University&#8217;s code is lmu. When search items are returned, VuFind&#8217;s IndexRecord record driver is called. In that code I am able to extract each record&#8217;s identifier, and parse out its first three characters &#8212; the code. I then pass this code and the library&#8217;s name on to my template for display:
</p>
<blockquote>
<pre><code>$interface->assign('CRRALibrary', $this->fields['building'][0]);
$interface->assign('CRRAKey', substr ($this->fields['id'], 0, 3 ));
</code></pre>
</blockquote>
<p>
In the template I hyperlink the holding library&#8217;s name with the directory&#8217;s URL, and specifically, a named anchor for the library:
</p>
<blockquote>
<pre><code>&lt;a href='http://zoia.library.nd.edu/tmp/directory.html#{$CRRAKey}'&gt;{$CRRALibrary}&lt;/a&gt;
</code></pre>
</blockquote>
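<p>
The same logic, taking the first three characters of the identifier and using them as a named anchor, can be sketched outside of VuFind in a few lines of Python. The directory URL and the function names below are placeholders invented for illustration; only the three-character prefix convention comes from the Portal.
</p>

```python
# Hypothetical directory location; the prototype above used a temporary URL.
DIRECTORY_URL = "http://example.org/directory.html"


def library_key(record_id):
    """Return the three-character library code that prefixes a record identifier."""
    return record_id[:3]


def directory_url(record_id, base=DIRECTORY_URL):
    """Return the directory URL ending in the named anchor for the holding library."""
    return "{}#{}".format(base, library_key(record_id))
```

<p>
Because the link depends only on the identifier, no lookup table is needed at search time; the directory page just has to carry one anchor per library code.
</p>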
<p>
The directory I created was rudimentary at best, and it will be up to me and others to determine how the directory gets created and what it looks like.
</p>
<p>
Hooray for open source software and object oriented programming techniques!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/03/linking-crra-items-to-member-libraries-a-prototype/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Prioritized list of fixes/enhancements for the &#8220;Portal&#8221;</title>
		<link>http://www.catholicresearch.net/blog/2012/03/prioritized-list-of-fixesenhancements-for-the-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2012/03/prioritized-list-of-fixesenhancements-for-the-portal/#comments</comments>
		<pubDate>Wed, 07 Mar 2012 13:19:18 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=463</guid>
		<description><![CDATA[Based on our usability studies and conference call from the other day I have created a (more or less) prioritized list of fixes/enhancements to be applied to the &#8220;Portal&#8221;: add a note to the email dialog box denoting how the from field is mandatory and requires an email address create a directory of institutions, [...]]]></description>
			<content:encoded><![CDATA[<p>
Based on our usability studies and conference call from the other day I have created a (more or less) prioritized list of fixes/enhancements to be applied to the &#8220;Portal&#8221;:
</p>
<ul>
<li>add a note to the email dialog box denoting how the from field is mandatory and requires an email address</li>
<li>create a directory of institutions, and from search results hyperlink institutions&#8217; names to the directory</li>
<li>update the &#8220;Portal&#8221; look &#038; feel (theme) so it is based on the &#8220;blueprint&#8221; theme</li>
<li>turn off the &#8220;Suggested Topics&#8221; feature</li>
<li>fix the author searches so when author names are clicked the content displays correctly</li>
<li>make the login links float to the right instead of the left</li>
<li>change the red text &#8212; such as the text in the search box &#8212; to black</li>
<li>change the login label to read &#8220;Login / Create account&#8221;</li>
</ul>
<p>On my mark. Get set. Go.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/03/prioritized-list-of-fixesenhancements-for-the-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to make MARC and EAD metadata available in the &#8220;Catholic Portal&#8221;</title>
		<link>http://www.catholicresearch.net/blog/2012/02/recipe/</link>
		<comments>http://www.catholicresearch.net/blog/2012/02/recipe/#comments</comments>
		<pubDate>Wed, 22 Feb 2012 15:48:54 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=459</guid>
		<description><![CDATA[This is a set of (draft) prescriptive instructions describing how to make MARC and EAD metadata available in the &#8220;Catholic Portal&#8221;. Introduction At its core, the &#8220;Portal&#8221; is an index &#8212; a list of pointers to content items. Access to this index is implemented through a form-based interface. Readers enter queries into the form, and [...]]]></description>
			<content:encoded><![CDATA[<div>
<p>This is a set of (draft) prescriptive instructions describing how to make MARC and EAD metadata available in the &#8220;<a href="http://www.catholicresearch.net/">Catholic Portal</a>&#8221;.</p>
<h2>Introduction</h2>
<p>At its core, the &#8220;Portal&#8221; is an index &#8212; a list of pointers to content items. Access to this index is implemented through a form-based interface. Readers enter queries into the form, and items are returned. Readers are then expected to select items of interest from the returned list, and use them for the purposes of research and scholarship. In order to implement this functionality, each content item in the index requires, at the very least, three elements: 1) a unique identifier, 2) a human-readable description of the item, and 3) a location code where the item can be acquired.</p>
<p>The MARC and EAD metadata schemes are well-suited for indexing. After making sets of MARC records and/or EAD files transparently accessible on a Web server, it is easy to harvest the metadata, integrate it into the Portal&#8217;s index, and provide access to the content items.</p>
<p>The balance of this posting describes how to make MARC and EAD files available for harvesting.</p>
<h2>MARC</h2>
<p>Here&#8217;s the short version. Export all the MARC records from your integrated library system you think are apropos to the &#8220;Catholic Portal&#8221; making sure they are encoded using the UTF-8 character set. Save the resulting file on a Web server, and tell Eric Morgan the URL of the resulting file. Eric will do the rest.</p>
<p>Here&#8217;s the long version. Remember, every record in the Portal needs a unique identifier, a human-readable description, and a location code. For MARC records, this means every record first needs a value in the 001 field. Any value will do as long as it is unique to your set of records. Second, each MARC record needs something in the 245 field. At the very least this will be the human-readable description. All the other descriptive and analytic fields will supplement this description. Third, each MARC record needs to have a location code, and this is the item&#8217;s call number. This value will most likely be extracted from the 090 field.</p>
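<p>
A quick pre-flight check along these lines can be sketched in Python. The sketch assumes each record has already been reduced to a plain dictionary mapping MARC tags to values, which is a shape chosen here for illustration and not the format of any particular MARC library; the function names are likewise made up.
</p>

```python
# Tags named in the text: 001 (identifier), 245 (title), 090 (call number).
REQUIRED_TAGS = ("001", "245", "090")


def missing_tags(record):
    """Return the required tags that are absent or empty in one record."""
    return [tag for tag in REQUIRED_TAGS if not str(record.get(tag, "")).strip()]


def duplicate_ids(records):
    """Return the 001 values that occur more than once in a set of records."""
    seen, repeated = set(), set()
    for record in records:
        identifier = record.get("001")
        if identifier is None:
            continue  # reported by missing_tags instead
        if identifier in seen:
            repeated.add(identifier)
        seen.add(identifier)
    return sorted(repeated)
```

<p>
Since the call number is only &#8220;most likely&#8221; in the 090 field, a missing 090 is better treated as a warning than as a reason to reject a record.
</p>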
<p>Helping you decide which MARC records to extract from your integrated library system is beyond the scope of this document. But once you have figured that out it is recommended you denote which items are to be extracted by updating them with a local note. Here at the University of Notre Dame, we put the letters CRRA in field 590 subfield a. Once this is done it is relatively easy for the systems librarian to do a search for CRRA in field 590 subfield a, and dump the resulting records to a file. Alternatively, the systems librarian might search for all items whose call numbers begin with BX and dump the resulting set. The process you use to denote and export your MARC records depends on your local environment.</p>
<p>When exporting your MARC records from your integrated library system, it is imperative the records be encoded using the UTF-8 character set and not something else. The Portal&#8217;s underlying indexer does not deal very well with encodings of another kind. If your system does not export records as UTF-8, and it exports things in MARC-8 instead, then use an open source application called <a href="http://www.indexdata.com/yaz/doc/yaz-marcdump.html">yaz-marcdump from Index Data</a> to transform your records from one encoding into another. Once yaz-marcdump is installed you can execute a command like the following to do the transformation:</p>
<blockquote>
<p>
      <code>yaz-marcdump -f MARC-8 -t UTF-8 -o marc -l 9=97 input.mrc &gt; output.mrc</code>
    </p>
</blockquote>
<p>The command translates MARC records from (-f) MARC-8 encoding to (-t) UTF-8 encoding. It outputs (-o) the result as MARC records, and inserts the letter a (ASCII character 97) into the leader (-l) at position 9. It uses the file named input.mrc as input, and it outputs the result to a file named output.mrc.</p>
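<p>
To see whether a file needs this treatment, one can look at the same leader position the command rewrites. The following Python sketch assumes the raw bytes of a single MARC record are already in hand; the function name is invented for illustration. In MARC 21, leader position 9 is &#8220;a&#8221; for Unicode and blank for MARC-8.
</p>

```python
def declares_utf8(record):
    """Return True when a raw MARC record's leader marks it as Unicode.

    The leader is the first 24 bytes of the record, and position 9
    (counting from zero) holds the character coding scheme.
    """
    leader = record[:24]
    if len(leader) != 24:
        raise ValueError("a MARC leader is 24 bytes long")
    return leader[9:10] == b"a"
```

<p>
Note that the leader only states what the record claims to be; a record flagged with &#8220;a&#8221; can still contain bytes that are not valid UTF-8.
</p>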
<p>Every time you export your records, you should export everything that you feel is relevant to the Portal. Do not worry about additions, changes, or deletions. We here at Portal Central handle this issue by deleting all of your records locally and re-indexing the whole lot.</p>
<p>After the records have been exported, save them on a Web server, and finally, tell Eric Morgan the URL of the resulting file. Please don&#8217;t change the URL once it has been shared. Eric will harvest the records and incorporate them into the index. As of this writing it is a good idea to tell Eric when new records are available, but at some point in time this won&#8217;t be necessary.</p>
<h2>EAD</h2>
<p>Here&#8217;s the short version. Use validated EAD files to encode the content you deem apropos to the Portal. Save all the EAD files in a single directory on a Web server making sure each file is given a .xml extension. Tell Eric Morgan the URL of the directory, and he will take care of the rest.</p>
<p>Here&#8217;s the longer version. Use whatever tool you desire to create EAD files describing the archival content you deem appropriate for the Portal. There are any number of available editors and applications facilitating this process. Make sure the resulting EAD files validate against the EAD DTD or schema. It doesn&#8217;t really matter which one, but right now validation against the DTD is easier to handle here at Portal Central.</p>
<p>Each <code>did</code>-level element in your EAD files will eventually become a record in the Portal&#8217;s index. During pre-processing here at Portal Central, unique <code>unitid</code> attributes will be added to each <code>did</code>-level element, if no <code>unitid</code> attributes exist in the first place. This pre-processing satisfies the need for unique identifiers. You need to do nothing with regard to unique identifiers.</p>
<p>Each <code>did</code>-level <code>unittitle</code> element will recursively be combined with its parent <code>did</code>/<code>unittitle</code> element to form a human-readable description of each content item. Consequently, there is nothing you need to do with regard to human-readable descriptions.</p>
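<p>
The recursive combination of titles can be pictured with a short Python sketch. It is not the Portal&#8217;s actual code. It assumes namespaces have already been stripped, treats any element with its own <code>did</code>/<code>unittitle</code> child as a &#8220;did-level&#8221; element, and joins titles with a separator chosen here for readability.
</p>

```python
import xml.etree.ElementTree as ET


def describe(element, inherited=()):
    """Yield a description for every element that has its own did/unittitle.

    Each description is the chain of titles from the top of the finding
    aid down to the element, joined with " / ".
    """
    titles = inherited
    title = element.find("did/unittitle")
    if title is not None:
        # Flatten inline markup and normalize whitespace inside the title.
        text = " ".join("".join(title.itertext()).split())
        titles = inherited + (text,)
        yield " / ".join(titles)
    for child in element:
        # The did was handled above; descend into everything else.
        if child.tag != "did":
            yield from describe(child, titles)
```

<p>
Calling <code>list(describe(ET.fromstring(text)))</code> on a finding aid returns one description per component, each of which carries enough context to make sense on its own in a list of search results.
</p>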
<p>The location of items found in EAD files is conveyed in three ways. First, the name of your hosting institution and library/archive will be associated with each search result, so the need for location information will be satisfied, but only in a rudimentary way. Second, location information is reinforced through the url attribute of the <code>eadid</code> element. Specifically, you are expected to include a value in the url attribute of the <code>eadid</code> element. This value is expected to point to a human-readable version of your EAD file on your Web server. Portal search results include hot links with a label similar to &#8220;View finding aid at owning institution&#8221;. The hot links will be the same as the value in the url attribute. Your human-readable version of the EAD file is then expected to include instructions and contact information describing how to acquire items of interest. Finally, search results will include a second hot link with a label similar to &#8220;View finding aid in Portal display&#8221;. These hot links will point to a local HTML file transformed from the original EAD. Again, location and contact information should be a part of the HTML because it was a part of the original EAD.</p>
<p>In summary, create complete and valid EAD files making sure you include values in the url attributes of the <code>eadid</code> elements.</p>
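<p>
Before publishing a batch of files, the presence of the url attribute can be checked with something like the following Python sketch. The function name is invented for illustration, and the check compares local element names so that DTD-based and schema-based files are handled alike.
</p>

```python
import xml.etree.ElementTree as ET


def finding_aid_url(ead_text):
    """Return the url attribute of the first eadid element, or None if missing or empty."""
    root = ET.fromstring(ead_text)
    for element in root.iter():
        # Compare local names so files with or without namespaces both work.
        if isinstance(element.tag, str) and element.tag.rsplit("}", 1)[-1] == "eadid":
            return element.get("url", "").strip() or None
    return None
```

<p>
A file for which this returns None would still be indexed under the first rule above, but it would lack the link back to the owning institution.
</p>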
<p>Once you have created your EAD files, save them in a single directory on a Web server, and tell Eric Morgan the URL of the directory. Make sure each EAD file ends with a .xml extension. Eric will then regularly harvest all the .xml files from your directory, re-validate them, make sure they include <code>url</code> attributes, add unique identifiers to each <code>did</code>-level element, and index each <code>did</code>-level element.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/02/recipe/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Philadelphia Archdiocesan Historical Research Center (PAHRC) records</title>
		<link>http://www.catholicresearch.net/blog/2012/02/philadelphia-archdiocesan-historical-research-center-pahrc-records/</link>
		<comments>http://www.catholicresearch.net/blog/2012/02/philadelphia-archdiocesan-historical-research-center-pahrc-records/#comments</comments>
		<pubDate>Tue, 07 Feb 2012 16:21:46 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=454</guid>
		<description><![CDATA[Just under 1,100 records from the Philadelphia Archdiocesan Historical Research Center (PAHRC) have been added to the &#8220;Portal&#8221; &#8212; http://bit.ly/uG92RG]]></description>
			<content:encoded><![CDATA[<p>Just under 1,100 records from the Philadelphia Archdiocesan Historical Research Center (PAHRC) have been added to the &#8220;Portal&#8221; &#8212; <a href="http://bit.ly/uG92RG" target="_blank">http://bit.ly/uG92RG</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/02/philadelphia-archdiocesan-historical-research-center-pahrc-records/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Content from the University of Dayton</title>
		<link>http://www.catholicresearch.net/blog/2012/01/content-from-the-university-of-dayton/</link>
		<comments>http://www.catholicresearch.net/blog/2012/01/content-from-the-university-of-dayton/#comments</comments>
		<pubDate>Mon, 16 Jan 2012 15:12:29 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=452</guid>
		<description><![CDATA[Twenty-nine records from the Archives at the University of Dayton have been added to the &#8220;Catholic Portal&#8221; &#8212; http://bit.ly/weVl8h]]></description>
			<content:encoded><![CDATA[<p>Twenty-nine records from the Archives at the University of Dayton have been added to the &#8220;Catholic Portal&#8221; &#8212; <a href="http://bit.ly/weVl8h">http://bit.ly/weVl8h</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2012/01/content-from-the-university-of-dayton/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Indexing PastPerfect metadata for the &#8220;Catholic Portal&#8221;</title>
		<link>http://www.catholicresearch.net/blog/2011/12/indexing-pastperfect-metadata-for-the-catholic-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2011/12/indexing-pastperfect-metadata-for-the-catholic-portal/#comments</comments>
		<pubDate>Thu, 15 Dec 2011 16:03:50 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=449</guid>
		<description><![CDATA[Using VuFind&#8217;s inherent ability to index OAI metadata, I have successfully been able to index metadata coming from a PastPerfect implementation. Starting somewhere near version 1.2, VuFind supports the indexing of arbitrary metadata types. Content from OAI repositories was the original example. Later, I figured out how to index EAD files. This was a break [...]]]></description>
			<content:encoded><![CDATA[<p>
Using VuFind&#8217;s inherent ability to index OAI metadata, I have successfully been able to index metadata coming from a PastPerfect implementation.
</p>
<p>
Starting somewhere near version 1.2, <a href="http://vufind.org/">VuFind</a> supports the indexing of arbitrary metadata types. Content from OAI repositories was the original example. Later, I figured out how to index EAD files. This was a breakthrough for the &#8220;Portal&#8221;. Give credit to open source software.
</p>
<p>
With the addition of the <a href="http://www.pahrc.net/">Philadelphia Archdiocesan Historical Research Center (PAHRC)</a> into the Catholic Research Resources Alliance, a new metadata format needed to be accepted &#8212; metadata other than EAD or MARC. PAHRC uses &#8220;cataloging&#8221; software called <a href="http://www.museumsoftware.com/">PastPerfect</a>. From what I can tell, it is a sophisticated FoxPro/Microsoft Access database application. It provides the means for institutions to do data entry, and have their holdings searched, and ultimately displayed on the Web.
</p>
<p>
PastPerfect can export its metadata in a form of Dublin Core. After working closely with <strong>Shawn Weldon</strong>, <strong>Faith Charlton</strong> (both of PAHRC), and <strong>Brian Gomez</strong> (Past Perfect, Inc), the <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/12/pahrc.xml">metadata exported by PAHRC</a> was tweaked to be less ambiguous and more accurate. Once this was done I was able to harvest the metadata, parse it into something usable by VuFind&#8217;s Solr indexer, and make it available through the Portal. I did this with a script called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/12/pastperfect-index.pl">pastperfect-index.pl</a>. The result is a <a href="http://bit.ly/uG92RG">set of searchable records from PAHRC</a>.
</p>
<p>
My current implementation is specific to PAHRC, and when other PastPerfect libraries/archives come on board, it will not be too difficult to abstract my implementation to support other institutions. That work is left to the future, when and if it occurs.
</p>
<p>
Fun with open source software!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/12/indexing-pastperfect-metadata-for-the-catholic-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Duplicate records in the &#8220;Catholic Portal&#8221;</title>
		<link>http://www.catholicresearch.net/blog/2011/12/duplicate-records-in-the-catholic-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2011/12/duplicate-records-in-the-catholic-portal/#comments</comments>
		<pubDate>Fri, 09 Dec 2011 19:25:46 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=445</guid>
		<description><![CDATA[There is some concern about duplicate records in the &#8220;Catholic Portal&#8221;, and this posting introduces the topic to a wider audience. The &#8220;Catholic Portal&#8221; is intended to contain links to and content of a rare and infrequently held nature. Every once in a while search results return duplicate records. For example, yesterday, it was brought [...]]]></description>
			<content:encoded><![CDATA[<p>
There is some concern about duplicate records in the &#8220;Catholic Portal&#8221;, and this posting introduces the topic to a wider audience.
</p>
<p>
The &#8220;Catholic Portal&#8221; is intended to contain links to and content of a rare and infrequently held nature. Every once in a while search results return duplicate records. For example, yesterday, it was brought to our attention that there are <a href="http://bit.ly/syVUpt">five records with the title <cite>Life Of Mrs. Eliza A. Seton</cite></a>. On one hand, few if any of these records are duplicates because between the five of them they are held by two different institutions, and each institution owns multiple editions. In the sense of a &#8220;catalog&#8221;, this is perfectly acceptable, if not expected. On the other hand, the Portal is not a catalog but rather an index, and each of the five items is really a variation on a theme. Should these records be merged?
</p>
<p>
<strong>Demian Katz</strong> shared with me and the Portal&#8217;s Digital Access Committee a query that can be applied to the Portal&#8217;s underlying Solr index, shown here with carriage returns added for readability:
</p>
<blockquote>
<pre><code>http://localhost:8080/solr/biblio/select/?
q=*%3A*&#038;rows=0&#038;start=0&#038;facet=true&#038;facet.mincount=2&#038;
facet.limit=-1&#038;facet.field=oclc_num&#038;facet.field=isbn
</code></pre>
</blockquote>
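<p>
The query is easier to read, and to vary, when its parameters are listed one per line. The Python sketch below rebuilds the same URL from its parts; the function name and its arguments are invented for illustration, while the host, core, and field names are the ones in the query above.
</p>

```python
from urllib.parse import urlencode

# Same host, port, and core as the query quoted above.
SELECT_URL = "http://localhost:8080/solr/biblio/select/"


def duplicate_key_query(fields=("oclc_num", "isbn"), mincount=2):
    """Return a Solr URL that facets on the given fields and returns no documents."""
    params = [
        ("q", "*:*"),                  # match every record
        ("rows", 0),                   # only the facet counts are wanted
        ("start", 0),
        ("facet", "true"),
        ("facet.mincount", mincount),  # keep values seen at least this often
        ("facet.limit", -1),           # no cap on the number of facet values
    ]
    params.extend(("facet.field", field) for field in fields)
    return SELECT_URL + "?" + urlencode(params)
```

<p>
Raising <code>mincount</code> to 3 would list only the keys shared by three or more records, which is a quick way to find the most heavily duplicated items.
</p>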
<p>
The result of this query is a list of OCLC and ISBN numbers which occur in the index at least two times. According to <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/12/duplicates.xml">the result, which only matches on the OCLC or ISBN keys</a>, there are no records in the index appearing more than three times. Furthermore, there are about 1,100 duplicated OCLC numbers and about 300 duplicated ISBN numbers. Considering the total number of records (93,000) in the index, this represents a total duplication rate of approximately 1.5%. Is this value too high?
</p>
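<p>
The arithmetic behind that percentage is simple enough to spell out. The sketch below uses the rounded counts quoted above and treats every duplicated key as one duplicated record, which is a simplifying assumption: a record carrying both a duplicated OCLC number and a duplicated ISBN would be counted twice.
</p>

```python
def duplication_rate(duplicated_keys, total_records):
    """Return duplicated keys as a percentage of all records."""
    return 100.0 * duplicated_keys / total_records


# About 1,100 OCLC numbers plus about 300 ISBNs, out of about 93,000 records.
rate = duplication_rate(1100 + 300, 93000)
```

<p>
Rounded to one decimal place, the result is the 1.5% figure given above.
</p>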
<p>
In an ideal world, there would be no duplicate records and/or duplicates would be merged into a single record. Unfortunately, the definition of &#8220;duplicate&#8221; is ambiguous, and a process for eliminating duplicates has not been implemented. To a Walt Whitman scholar, the difference between various editions of Leaves of Grass is definitely significant. Thus, sometimes the differences between editions are very important. Other times and for other people, they are not so important. In an ideal world, there would be no duplicates and a single record would warrant a de-duplication process, but the expense of de-duplicating that single record may be very high, especially if there is no de-duplication process in place. How many records &#8212; or what percentage of records &#8212; warrants a de-duplication process, especially considering the other things that have been set as priorities for the Portal? Honestly, I don&#8217;t know the answer.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/12/duplicate-records-in-the-catholic-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Survey of Digitized Rare Catholica &#8211; Results</title>
		<link>http://www.catholicresearch.net/blog/2011/11/survey-of-digitized-rare-catholica-results/</link>
		<comments>http://www.catholicresearch.net/blog/2011/11/survey-of-digitized-rare-catholica-results/#comments</comments>
		<pubDate>Tue, 22 Nov 2011 19:51:11 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=433</guid>
		<description><![CDATA[Marta Deyrup and Martha Loesch, catalogers at (CRRA institution) Seton Hall University, and Pat Lawton, digital projects librarian for the CRRA, have released the results of their Survey of Digitized Rare Catholica held by Catholic universities, colleges, seminaries and archives in the U.S. and Canada. You may view the [...]]]></description>
			<content:encoded><![CDATA[<table summary="" border="0" cellspacing="0" cellpadding="0" align="right">
<tbody>
<tr>
<td>
<table summary="" cellspacing="0" cellpadding="0">
<caption align="bottom">            </caption>
<tbody>
<tr>
<td><img id="||CPIMAGE:355685|" title="Bible Text" src="http://www.shu.edu/images/hartmann6.jpg" alt="Bible Text" width="150" height="100" border="0" hspace="5" vspace="0" /></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<p>Marta Deyrup and Martha Loesch, catalogers at (CRRA institution) Seton Hall University, and Pat Lawton, digital projects librarian for the CRRA, have released the results of their<em> Survey of Digitized Rare Catholica</em> held by Catholic universities, colleges, seminaries and archives in the U.S. and Canada. You may view the <a id="http://bit.ly/Survey_report|" href="http://bit.ly/Survey_report">Summary Report of Results</a> and the <a id="http://bit.ly/Survey_results|" href="http://bit.ly/Survey_results">results data</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/11/survey-of-digitized-rare-catholica-results/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Portal surgery</title>
		<link>http://www.catholicresearch.net/blog/2011/11/portal-surgery/</link>
		<comments>http://www.catholicresearch.net/blog/2011/11/portal-surgery/#comments</comments>
		<pubDate>Fri, 11 Nov 2011 16:24:23 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=427</guid>
		<description><![CDATA[I was recently told to delete thousands upon thousands of records from the &#8220;Catholic Portal&#8221;, and through the magic of Solr&#8217;s Web-based API and a full-featured HTTP client I was able to do this surgery with laser beam accuracy. Specifically, I needed to delete all of the records in the Portal from the University [...]]]></description>
			<content:encoded><![CDATA[<p>
I was recently told to delete thousands upon thousands of records from the &#8220;Catholic Portal&#8221;, and through the magic of Solr&#8217;s Web-based API and a full-featured HTTP client I was able to do this surgery with laser beam accuracy.
</p>
<p>
Specifically, I needed to delete all of the records in the Portal from the University of Notre Dame Archives because the Archives wanted to totally replace what finding aids were available. This meant deleting more than 100,000 records from the underlying index. After a bit of investigation, I learned that the following one-liner from the command line would do the trick:
</p>
<blockquote>
<p>
<code>curl "http://localhost:8080/solr/biblio/update?commit=true" -H "Content-Type: text/xml" --data-binary '&lt;delete&gt;&lt;query&gt;id:unaead_*&lt;/query&gt;&lt;/delete&gt;'</code>
</p>
</blockquote>
<p>
In short, curl is a command-line HTTP client. It is being told to first connect to the local host on port 8080. It is then told to find all the records matching the query &#8220;id:unaead_*&#8221; and delete them from the index named biblio. Once that is done, the underlying index is expected to commit the changes. Deleting these records took about ten minutes. I was then able to use my previously created scripts to harvest, validate, transform, and index the Archives&#8217; content painlessly.
</p>
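<p>
For anyone who would rather script the same surgery, here is a minimal Python sketch of the request that the one-liner sends. The function name build_delete_request is hypothetical, and the host, port, index name, and query are simply the values from the curl example above. The sketch only builds the request and prints it. Nothing is sent, since a delete-by-query cannot be undone.
</p>

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request


def build_delete_request(solr_url, index, query):
    """Build, but do not send, a Solr delete-by-query request."""
    # Assemble the delete message as XML; ElementTree escapes the query text.
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    payload = ET.tostring(delete, encoding="unicode").encode("utf-8")

    # commit=true asks Solr to commit once the deletion has been applied.
    url = f"{solr_url}/{index}/update?" + urlencode({"commit": "true"})
    return Request(url, data=payload,
                   headers={"Content-Type": "text/xml"}, method="POST")


if __name__ == "__main__":
    # Values taken from the curl example; adjust them for a different index.
    request = build_delete_request("http://localhost:8080/solr",
                                   "biblio", "id:unaead_*")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # Sending is left out on purpose, because it permanently removes records:
    # urllib.request.urlopen(request)
```

<p>
Printing the request before sending it is a cheap way to double-check the query, which matters when a single wildcard stands for more than 100,000 records.
</p>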
<p>
It is a pleasure when things work in the way they were designed! Now if I could only get my local indexing process to work faster.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/11/portal-surgery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VuStuff II: A Travelogue</title>
		<link>http://www.catholicresearch.net/blog/2011/11/vustuff-ii-a-travelogue/</link>
		<comments>http://www.catholicresearch.net/blog/2011/11/vustuff-ii-a-travelogue/#comments</comments>
		<pubDate>Tue, 01 Nov 2011 18:13:17 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=423</guid>
		<description><![CDATA[On Wednesday, October 12, 2011 I had the opportunity to attend and present at the second annual VuStuff meeting held at Falvey Library, Villanova University (Philadelphia). This posting documents my experience there, but in a nutshell, this small and intimate meeting provided a venue for interesting discussion on the topic of modern librarianship. Liberty Bell [...]]]></description>
			<content:encoded><![CDATA[<p>
On Wednesday, October 12, 2011 I had the opportunity to attend and present at the second annual <a href="http://vustuff.org/">VuStuff</a> meeting held at Falvey Library, Villanova University (Philadelphia). This posting documents my experience there, but in a nutshell, this small and intimate meeting provided a venue for interesting discussion on the topic of modern librarianship.
</p>
<table align='center'>
<tr valign='top' align='center'>
<td><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/11/vustuff/bell.JPG" width="160" height="120" alt="liberty bell" /><br />Liberty Bell</td>
<td><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/11/vustuff/sandwich.JPG" width="160" height="120" alt="cheese steak sandwich" /><br />cheese steak sandwich</td>
</tr>
</table>
<p>
<strong>Joe Lucia</strong> (Villanova University) initialized the meeting and set the stage by recommending a book called <cite><a href="http://www.amazon.com/gp/product/0520258827/ref=as_li_tf_tl?ie=UTF8&#038;tag=infomotions-20&#038;linkCode=as2&#038;camp=217145&#038;creative=399369&#038;creativeASIN=0520258827">The Googlization of Everything</a><img src="http://www.assoc-amazon.com/e/ir?t=infomotions-20&#038;l=as2&#038;o=1&#038;a=0520258827&#038;camp=217145&#038;creative=399369" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></cite>. It advocates the creation of an open knowledge commons similar to the ones at the root of the fledgling <a href="http://cyber.law.harvard.edu/research/dpla">Digital Public Library of America</a>. To paraphrase his remarks, &#8220;Everything we do in our shop here embraces the open knowledge commons concept&#8230; Libraries are not just purveyors of content, but also creators of content &#8212; The New Resource Sharing. We [librarians] can become agents of information creation.&#8221;
</p>
<p>
The first presentation was given by <strong>Amy Baker Williams</strong> (University of Pittsburgh), and she described her process for conserving the maps of local coal mines. In the Pittsburgh (Pennsylvania) area there are many coal mines dating back as far as 1750. Some of the oldest maps of the mines date from 1850. A few years ago some miners were trapped in a mine, and if maps of the mines had been easily accessible, then rescue efforts would have been simplified. Since then concerted efforts have been made to preserve, digitize, and make accessible as many of these coal mining maps as possible in order to prevent similar accidents from happening in the future. I found the process used to flatten the maps to be the most interesting. Basically they are re-hydrated and unrolled. Moving the maps from the conservation lab to the scanning location was also interesting because, ironically, the maps are rolled up again for transportation as well as long-term storage. For more detail, <a href="http://pitt.edu/~aeb59/">see the website</a>.
</p>
<p>
My presentation was next, and I shared with the audience how the <a href="http://www.catholicresearch.net/">Catholic Research Resources Alliance</a> (CRRA) is using <a href="http://vufind.org/">VuFind</a> to implement the &#8220;Catholic Portal&#8221;. I first described the mission and history of the CRRA. I then outlined the Portal&#8217;s technical architecture as well as the process I used to index EAD files. Finally, I described how text mining functions have been integrated into the Portal&#8217;s interface emphasizing the possibilities for libraries in general.
</p>
<table align='center'>
<tr valign='top' align='center'>
<td><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/11/vustuff/library.JPG" width="160" height="120" alt="library" /><br />Falvey Library</td>
<td><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/11/vustuff/mural.JPG" width="160" height="120" alt="mural" /><br />mural</td>
</tr>
</table>
<p>
During lunch we broke up into groups, and I sat with the folks interested in the digital humanities. For the most part we went around the table sharing common war stories. Most of our initiatives were fledgling, but there was plenty of enthusiasm.
</p>
<p>
After lunch a sort of &#8220;unconference&#8221; session was facilitated by <strong>David Upsal</strong> (Villanova University). The discussion topic that made itself apparent was the challenge of the profession to serve both traditional librarianship as well as librarianship in the current environment. If my memory serves me correctly, some of the suggested solutions included more resources (people and money), permission to &#8220;play&#8221; with new technology, a redefinition of library purpose, and greater collaboration between different types of libraries (public, academic, etc.).
</p>
<p>
The next presentation was given by <strong>Eric Zino</strong> (LYRASIS) who described how LYRASIS has been working with the Sloan Foundation and the Internet Archive to facilitate the digitization of 20,000,000 pages of library content. Approximately 160 libraries have been participating in the <a href="http://www.lyrasis.org/MassDig.aspx">project with LYRASIS</a>. Subsidized by the Foundation, participants package up their content and ship it to the Internet Archive. The content gets digitized, returned to the owning library, and the digital versions are made accessible at the Archive. From my perspective, this is exactly how any other library works with the Archive, except in this case LYRASIS does a bit of hand-holding during the process. Not all media is digitized by the Archive though. Some things, such as microfilm, are scanned by a different vendor &#8212; Creekside Digital.
</p>
<p>
The last presentation of the day was given by <strong>Bob Behary</strong> (Duquesne University), and he shared with the audience how Duquesne is digitizing a newspaper called the <cite>Pittsburgh Catholic</cite>. The project was initiated by a Catholic order called the Spiritans (the founding order of Duquesne University) with evangelism at its root. At first digitized versions of the newspaper were put on CDs and distributed. This has evolved over time, and now the content is housed in a ContentDM system. The collection has proven useful in a number of ways, including: local &#038; regional church histories, literature allusions (such as Emily Dickinson), and United States history. Behary listed a number of key considerations for any digitization effort: 1) get administrative support, 2) make sure the project fits within the mission of the institution, 3) make sure to use sustainable technology, and 4) ensure knowledgeable research advocates are a part of the process.
</p>
<table align='center'>
<tr valign='top' align='center'>
<td><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/11/vustuff/vuie.JPG" width="120" height="160" alt="Vuee award" /><br />Vuee Award</td>
<td><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/11/vustuff/stairs.JPG" width="160" height="120" alt="stairs" /><br />Art Museum staircase</td>
</tr>
</table>
<p>
I believe the meeting was attended by fifty to seventy-five people. Most were from the immediate area, and it offered an easy opportunity for professional development. Kudos to the folks at Villanova for hosting the event. Just before the meeting concluded I was awarded the second annual &#8220;Vuee&#8221; for best presentation. It is a small shoebox-sized container in the shape of a book. I was very flattered. &#8220;Thank you very much!&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/11/vustuff-ii-a-travelogue/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Indexing EAD files in the &#8220;Catholic Portal&#8221; with VUFind</title>
		<link>http://www.catholicresearch.net/blog/2011/10/indexing-ead-files-in-the-catholic-portal-with-vufind/</link>
		<comments>http://www.catholicresearch.net/blog/2011/10/indexing-ead-files-in-the-catholic-portal-with-vufind/#comments</comments>
		<pubDate>Tue, 25 Oct 2011 19:26:23 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=418</guid>
		<description><![CDATA[This posting describes how EAD files are indexed in the &#8220;Catholic Portal&#8221; with VUFind. VUFind is a &#8220;next-generation library catalog&#8221; or &#8220;discovery system&#8221; application. Its primary purpose is to index bibliographic metadata and provide a reader-friendly interface to the result. The heart of this process is a Solr index made up of many bibliographic-like fields. [...]]]></description>
			<content:encoded><![CDATA[<div>
<p>This posting describes how EAD files are indexed in the &#8220;<a href="http://www.catholicresearch.net/">Catholic Portal</a>&#8221; with VUFind.</p>
<p><a href="http://vufind.org/">VUFind</a> is a &#8220;next-generation library catalog&#8221; or &#8220;discovery system&#8221; application. Its primary purpose is to index bibliographic metadata and provide a reader-friendly interface to the result. The heart of this process is a <a href="http://lucene.apache.org/solr/">Solr</a> index made up of many bibliographic-like fields. These fields are the usual suspects including a host of variants on author, title, institution, building, collection, language, format, physical description, publisher, published date, edition, description (note), contents, URL, call number, ISSN, ISBN, OCLC number, series, topic, genre, geographic, era, illustration, full text, and record type. In order for EAD files to be searchable in the Portal, they need to have their metadata extracted, the metadata needs to be mapped to Solr fields, and the metadata needs to be added to the index. The balance of this posting describes this in more detail.</p>
<h2>Pre-processing</h2>
<p>Before any indexing can take place, bits of pre-processing are applied against the EAD files. In a nutshell, this pre-processing (and the Perl scripts doing the work) includes:</p>
<ol>
<li>harvesting the EAD files from a remote HTTP server and caching them locally (<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/10/indexing/ead-harvest.pl">ead-harvest.pl</a>) &#8211; Done so the balance of the work can be done.</li>
<li>validating the EAD files against the DTD and/or schema (<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/10/indexing/ead-validate.pl">ead-validate.pl</a>) &#8211; Done because we don&#8217;t want to practice GIGO (Garbage In, Garbage Out).</li>
<li>adding unique identifiers to each did-level element of the EAD files (<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/10/indexing/ead-transform.pl">ead-transform.pl</a>) &#8211; The Solr indexer requires unique identifiers for each indexed item. This process provides the identifiers and also makes it easy to hyperlink directly to a place in the EAD through the use of HTML anchors.</li>
<li>transforming the EAD files into HTML and making the results Web accessible (<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/10/indexing/ead-transform.pl">ead-transform.pl</a>) &#8211; Done because links to remote versions of the EAD files break, and humans do not read XML very well.</li>
</ol>
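<p>Step #3 above can be sketched in a few lines of Python. This is an illustration only, not the contents of ead-transform.pl. The function name add_identifiers, the zero-padded counter, and the &#8220;demo&#8221; prefix are hypothetical, and the sketch assumes EAD files without an XML namespace. The only detail taken from the Portal is that the identifier is stored in the id attribute of each unitid element.</p>

```python
import xml.etree.ElementTree as ET


def add_identifiers(root, prefix):
    """Give the unitid of every did element a unique id attribute."""
    count = 0
    # Copy the matches into a list first, since the tree may be modified below.
    for did in list(root.iter("did")):
        unitid = did.find("unitid")
        if unitid is None:
            # Assumption: create a unitid when the finding aid lacks one.
            unitid = ET.SubElement(did, "unitid")
        count += 1
        unitid.set("id", f"{prefix}_{count:05d}")
    return count


if __name__ == "__main__":
    # A tiny stand-in for a finding aid, built in memory for the demonstration.
    ead = ET.Element("ead")
    archdesc = ET.SubElement(ead, "archdesc")
    ET.SubElement(archdesc, "did")
    series = ET.SubElement(ET.SubElement(archdesc, "dsc"), "c01")
    ET.SubElement(ET.SubElement(series, "did"), "unitid").text = "Box 1"

    print(add_identifiers(ead, "demo"), "identifiers added")
    for unitid in ead.iter("unitid"):
        print(unitid.get("id"), unitid.text)
```

<p>Because the identifiers follow document order, the same value can double as an HTML anchor when the EAD file is transformed in Step #4.</p>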
<h2>Indexing</h2>
<p>The bulk of the indexing process centers around the acquisition of metadata, and it is completely handled by a Perl script named <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/10/indexing/ead-index.pl">ead-index.pl</a>:</p>
<ol>
<li>The process begins by looking up the name of the institution and the name of the library where the EAD file was created. These values are located in a rudimentary tab-delimited database.</li>
<li>Next, the value for record type is denoted. It is always &#8220;EAD&#8221;.</li>
<li>Third, a value for format is denoted. It is always &#8220;Archival material&#8221;.</li>
<li>Next, the language of the material is extracted from the /ead/archdesc/did/langmaterial/language element. If no language is specified, then language is denoted as &#8220;Unknown&#8221;.</li>
<li>Each did-level element from the EAD file is then examined, pulling out its unique identifier (the id attribute of the unitid element created in Step #3 of pre-processing), title (the unittitle element), and date (the unitdate element). The title metadata is a bit special since it is really a concatenation of all the parent title values of the given did element. This is done because each item in an EAD file is a part of the entire collection, and this enhanced title is intended to provide context.</li>
<li>At this point the metadata for each did-level element has been extracted and is mapped to a select number of Solr fields, namely:
<ul>
<li>id -&gt; unique identifier;</li>
<li>title -&gt; title</li>
<li>title_auth -&gt; title</li>
<li>title_full -&gt; title</li>
<li>title_fullStr -&gt; title</li>
<li>title_full_unstemmed -&gt; title</li>
<li>title_short -&gt; title</li>
<li>title_sort -&gt; title</li>
<li>publishDate -&gt; date</li>
<li>format -&gt; always &#8220;Archival material&#8221;, from Step #3</li>
<li>institution -&gt; the name of the library&#8217;s hosting institution, from Step #1</li>
<li>building -&gt; the name of the library, from Step #1</li>
<li>fullrecord -&gt; An XML snippet containing the unique identifier, title, date, as well as two URLs pointing to HTML versions (local and remote) of the EAD file</li>
<li>recordtype -&gt; always &#8220;EAD&#8221;, from Step #2</li>
<li>language -&gt; language, from Step #4</li>
</ul>
</li>
<li>Finally, the metadata is added to VuFind&#8217;s underlying Solr index.</li>
</ol>
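<p>The steps above can be sketched in Python as follows. This is a simplified illustration, not the contents of ead-index.pl. The function names, the &#8220; / &#8221; separator used to join parent titles, and the sample institution names are assumptions, as is the absence of an XML namespace. The fullrecord field and the final call that adds each document to Solr are left out.</p>

```python
import xml.etree.ElementTree as ET

# The Solr title fields that all receive the same contextual title.
TITLE_FIELDS = ("title", "title_auth", "title_full", "title_fullStr",
                "title_full_unstemmed", "title_short", "title_sort")


def text_of(element, path, default=""):
    """Return the whitespace-normalized text found at path, or the default."""
    found = element.find(path)
    if found is None:
        return default
    return " ".join("".join(found.itertext()).split()) or default


def solr_documents(ead, institution, building):
    """Yield one dictionary of Solr fields per did element that has an id."""
    language = text_of(ead, "archdesc/did/langmaterial/language", "Unknown")

    def walk(element, parent_titles):
        titles = parent_titles
        did = element.find("did")
        if did is not None:
            title = text_of(did, "unittitle")
            if title:
                titles = parent_titles + [title]
            unitid = did.find("unitid")
            if unitid is not None and unitid.get("id"):
                doc = {"id": unitid.get("id")}
                for field in TITLE_FIELDS:
                    # Parent titles are prepended to give each item context.
                    doc[field] = " / ".join(titles)
                doc.update(publishDate=text_of(did, "unitdate"),
                           format="Archival material", recordtype="EAD",
                           institution=institution, building=building,
                           language=language)
                yield doc
        for child in element:
            if child.tag != "did":
                yield from walk(child, titles)

    yield from walk(ead, [])


if __name__ == "__main__":
    # A tiny stand-in for a pre-processed finding aid, built in memory.
    ead = ET.Element("ead")
    archdesc = ET.SubElement(ead, "archdesc")
    top = ET.SubElement(archdesc, "did")
    ET.SubElement(top, "unitid", id="demo_00001")
    ET.SubElement(top, "unittitle").text = "Smith Papers"
    series = ET.SubElement(ET.SubElement(archdesc, "dsc"), "c01")
    item = ET.SubElement(series, "did")
    ET.SubElement(item, "unitid", id="demo_00002")
    ET.SubElement(item, "unittitle").text = "Correspondence"
    ET.SubElement(item, "unitdate").text = "1901-1910"

    for doc in solr_documents(ead, "Example University", "Example Archives"):
        print(doc["id"], doc["title"], doc["publishDate"], doc["language"])
```

<p>Walking the tree recursively keeps the list of parent titles at hand, which is what makes the concatenated, context-providing title inexpensive to build.</p>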
<h2>Discussion</h2>
<p>The indexing process is far from perfect. For example, in the current process, the entire head element of the EAD file is ignored. While it contains very rich metadata, such as controlled vocabulary terms and abstracts, these values describe the collection as a whole and do not necessarily apply to each individual did-level element.</p>
<p>Second, creating EAD files is laborious in the first place. There are not enough resources in most archival departments to describe did-level elements with much more detail than title and date. It would be nice to have a narrative summary describing each did-level element, a more specific format, some key words or controlled vocabulary, a consistently formatted date, etc. But again, creating such metadata for each did-level element is expensive. Consequently, indexed items are not described as robustly as possible.</p>
<p>Third, while VuFind&#8217;s implementation of Solr is bibliographic in nature, it is heavily weighted towards bibliographic metadata describing books. OCLC number. Call number. ISBN &amp; ISSN. Edition. Etc. There are no fields for EAD-specific things such as postal addresses, provenance, or biographies.</p>
<p>Again, the process is not perfect, but it does enable the Catholic Research Resources Alliance to amalgamate the metadata of its member institutions and provide a searchable index to the result. Suggestions for improvement are welcome.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/10/indexing-ead-files-in-the-catholic-portal-with-vufind/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>&#8220;Advancing Catholic Scholarship&#8221; Symposium at Duquesne Nov. 9-10</title>
		<link>http://www.catholicresearch.net/blog/2011/10/advancing-catholic-scholarship-symposium-at-duquesne-nov-9-10/</link>
		<comments>http://www.catholicresearch.net/blog/2011/10/advancing-catholic-scholarship-symposium-at-duquesne-nov-9-10/#comments</comments>
		<pubDate>Wed, 12 Oct 2011 18:08:50 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=412</guid>
		<description><![CDATA[Colleagues, The registration deadline for this CRRA/Duquesne sponsored event is this Friday, October 15, 2011.  We are pleased that many of you have already registered for the event and if you have thought about registering, please do so now.  There is no fee to register. The event features Catholic scholars, archivists, and librarians gathering together [...]]]></description>
			<content:encoded><![CDATA[<p>Colleagues,</p>
<p><strong>The registration deadline for this CRRA/Duquesne sponsored event is this Friday, October 15, 2011.  </strong>We are pleased that many of you have already registered for the event and if you have thought about registering, please do so now.  There is no fee to register.</p>
<p>The event features Catholic scholars, archivists, and librarians gathering together to consider the state of Catholic scholarship and how we can act together to advance and enhance freely available global access and discovery of important Catholic resources. The event will take place at Duquesne University (Pittsburgh) on Nov. 9-10.</p>
<p>We encourage librarians, scholars, and archivists interested in learning more about opportunities to make scholarly resources accessible to join in and meet new friends and colleagues.</p>
<p>A full roster of <strong>events and registration information is available at <a href="http://bit.ly/Duquesne_Symposium">http://bit.ly/Duquesne_Symposium</a> .</strong>   <strong>The registration deadline is this Friday, October 15, 2011.</strong></p>
<p>We hope that you will join us in what promises to be a stimulating and productive conversation about Catholic scholarly research and the ways in which librarians and archivists support this research.</p>
<p>On behalf of Duquesne University and the Catholic Research Resources Alliance (CRRA),<br />
•         Jennifer Younger, chair, Board of Directors at younger.1@nd.edu<br />
•         Laverna Saunders, University Librarian, Gumberg Library, Duquesne University at lsaunders@duq.edu<br />
•         Pat Lawton, CRRA Digital Projects Librarian at plawton@nd.edu</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/10/advancing-catholic-scholarship-symposium-at-duquesne-nov-9-10/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Archdiocese of Chicago</title>
		<link>http://www.catholicresearch.net/blog/2011/10/archdiocese-of-chicago/</link>
		<comments>http://www.catholicresearch.net/blog/2011/10/archdiocese-of-chicago/#comments</comments>
		<pubDate>Thu, 06 Oct 2011 20:35:55 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=405</guid>
		<description><![CDATA[Last week a number of us visited the archives of the Archdiocese of Chicago, and I went away thoroughly impressed. Fireproof walls and doors. Systematic digitization. The implementation of retention policies. The papers of cardinals, rows and rows of baptismal records, and even the transcripts of school children. Very professional. Large. Seemingly well-equipped. Knowledgeable staff. [...]]]></description>
			<content:encoded><![CDATA[<p>Last week a number of us visited the archives of the Archdiocese of Chicago, and I went away thoroughly impressed. Fireproof walls and doors. Systematic digitization. The implementation of retention policies. The papers of cardinals, rows and rows of baptismal records, and even the transcripts of school children. Very professional. Large. Seemingly well-equipped. Knowledgeable staff. They are taking their responsibility seriously.</p>

<a href='http://www.catholicresearch.net/blog/2011/10/archdiocese-of-chicago/archive-01/' title='Archive of Archdiocese of Chicago'><img width="150" height="150" src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/10/archive-01-e1317933168912-150x150.jpg" class="attachment-thumbnail" alt="Archive of Archdiocese of Chicago" title="Archive of Archdiocese of Chicago" /></a>
<a href='http://www.catholicresearch.net/blog/2011/10/archdiocese-of-chicago/archive-02/' title='Archive of Archdiocese of Chicago'><img width="150" height="150" src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/10/archive-02-e1317933248295-150x150.jpg" class="attachment-thumbnail" alt="Archive of Archdiocese of Chicago" title="Archive of Archdiocese of Chicago" /></a>
<a href='http://www.catholicresearch.net/blog/2011/10/archdiocese-of-chicago/archive-03/' title='Archive of Archdiocese of Chicago'><img width="150" height="150" src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/10/archive-03-e1317933277726-150x150.jpg" class="attachment-thumbnail" alt="Archive of Archdiocese of Chicago" title="Archive of Archdiocese of Chicago" /></a>

]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/10/archdiocese-of-chicago/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA July/August 2011 Update</title>
		<link>http://www.catholicresearch.net/blog/2011/09/crra-julyaugust-2011-update/</link>
		<comments>http://www.catholicresearch.net/blog/2011/09/crra-julyaugust-2011-update/#comments</comments>
		<pubDate>Mon, 19 Sep 2011 15:29:28 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=395</guid>
		<description><![CDATA[The July/August 2011 CRRA Update is now available at: http://bit.ly/crra_JulyAug2011. CRRA Update July/August 2011 We begin a new semester and a month of many welcomes!  Join us in welcoming the Catholic Theological Union, board members, board and committee chairs, a new baby (Otto Ray Katz), the Five Year Strategic Plan Task Force members, and executive [...]]]></description>
			<content:encoded><![CDATA[<p>The <strong>July/August 2011 <em>CRRA Update</em></strong> is now available at: <a href="http://bit.ly/crra_JulyAug2011">http://bit.ly/crra_JulyAug2011</a>.</p>
<p align="center"><strong>CRRA Update</strong></p>
<p align="center"><strong>July/August 2011</strong></p>
<p>We begin a new semester and a month of many welcomes!  Join us in welcoming the Catholic Theological Union, board members, board and committee chairs, a new baby (Otto Ray Katz), the Five Year Strategic Plan Task Force members, and executive director, Jennifer Younger.<strong> </strong></p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>Welcome, Catholic Theological Union!</strong></p>
<p>We are pleased to welcome Catholic Theological Union (CTU) <em>The Paul Bechtold Library</em>, under the leadership of Melody McMahon, to the CRRA.  Lisa Gonzalez, Electronic Resources Librarian, will join Melody in her work with CRRA, and will serve as a member of the Digital Access Committee.  Welcome, CTU, Melody and Lisa!  One of our goals for the coming year is to implement an instance of Archivists’ Toolkit, for which CTU has generously volunteered to act as a pilot user.</p>
<p><em><a href="http://www.ctu.edu/library">Catholic Theological Union</a></em> is the largest Catholic graduate school of theology and ministry in the United States. Founded in 1968 in the spirit of Vatican II, there are currently 32 religious orders that send students to CTU, as well as lay students from the United States and around the world.  The Bechtold Library collection is particularly strong in materials pertaining to religious orders, Franciscan studies and catechetical materials. The library contains over 150,000 volumes, and has a varied art collection that was included in the Art and Architecture in Illinois Libraries project.</p>
<p>The library houses both the Weber-Killgallon Center collection of catechetical materials and the Stuhlmueller Room, which contains the personal library of Carroll Stuhlmueller. Stuhlmueller, a Passionist priest and Biblical scholar, served on the faculty of CTU until his death in 1994. In addition, collections housed in the CTU archives include the papers of catechists Gerard Weber and Irene H. Murphy, the archives of the North American Academy of Liturgy, and the papers of the Women Religious Imprisoned Under Eastern European Communism project.</p>
<p><strong>Melody Layton McMahon, Director of the Paul Bechtold Library</strong></p>
<p>Melody Layton McMahon started her career as a music librarian (Juilliard, Cleveland Institute of Music) and then worked for twelve years at John Carroll University, a Jesuit liberal arts university. Always having an interest in theological librarianship, she came to CTU in 2008. She has been an active participant in the Catholic Library Association, the Ohio Theological Library Association, the Chicago Area Theological Library Association, and the American Theological Library Association, serving on a number of committees.</p>
<p>In recent years, Melody’s vocational interests have extended into writing, including numerous reviews, an article on faculty-library collaboration in <em>Theological Education</em> (2005), co-editing an anthology of writings on theological librarianship (<em>A Broadening Conversation: Classic Readings in Theological Librarianship</em>, 2006), and serving as the Critical Reviews Editor for the recently launched online journal <em>Theological Librarianship: an Online Journal of the American Theological Library Association</em> (<a href="http://www.theolib.org/">www.theolib.org</a>). (M.S., School of Library Service, Columbia University; M.A., St. Mary Seminary and Graduate School of Theology)</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>Lisa Gonzalez, Electronic Resources Librarian, Paul Bechtold Library</strong><br />
<strong>Lisa</strong> <strong>Gonzalez</strong> has worked in libraries in Illinois and California, including Trinity Christian College and Azusa Pacific University. Lisa earned her M.L.I.S. from Dominican University and an M.A. in Theology from Fuller Theological Seminary. She has worked as the Electronic Resources Librarian at CTU since September 2008, and currently serves as the Communications Officer for CATLA, as well as serving as a member of the I-OPAC Team for the I-Share group catalog in Illinois.</p>
<p>Lisa serves as the newest member of the Digital Access Committee – welcome, Lisa!</p>
<p><strong><br />
Thank you to Departing Board members and Welcome to the 2011/12 Board </strong></p>
<p>We owe thanks and a hearty “job well done” to three Board members who have served since the Board was set up in February 2008. Each of them has contributed mightily to defining and carrying out the CRRA mission, not only through their Board service but also in leading CRRA committees.  We have benefited from their unique blends of vision, commitment and proactive leadership and we thank you for giving so generously of your time and talents.</p>
<ul>
<li>Tom Leonhardt, St. Edward’s University and chair, Digital Access Committee</li>
<li>Tim Meagher, The Catholic University of America and chair, Scholars Advisory Committee</li>
<li>Bob O’Neill, Boston College and chair, Collections Committee</li>
</ul>
<p>In selecting new Board members, the Board considered factors relevant to overall Board composition.  We agreed it is desirable to have representation from small and large institutions as well as from sustaining members. We also wanted to bring in members from institutions not currently represented on the Board and to consider individuals who have expressed prior interest in Board service.  Although we did not specifically discuss geographic or gender distribution, the new Board is geographically diverse and gender-balanced. With the inclusion of the new position of executive director as an <em>ex officio</em> member, the Board increased from nine to ten members.</p>
<p><strong>Roster of CRRA Board of Directors for 2011/12 </strong>(in alphabetical order)</p>
<ul>
<li>Theresa Byrd, University Librarian, University of San Diego</li>
<li>Tyrone Cannon, Dean, University Libraries, University of San Francisco</li>
<li>Steve Connaghan, Director of Libraries, The Catholic University of America</li>
<li>Artemis Kirk, University Librarian, Georgetown University</li>
<li>Joe Lucia, University Librarian and Director of Falvey Memorial Library, Villanova University</li>
<li>Evelyn Minick, University Librarian, Saint Joseph’s University</li>
<li>Susan Ohmer, Director, Office of Digital Asset Management, University of Notre Dame</li>
<li>Tom Wall, University Librarian, Boston College</li>
<li>Janice Welburn, Dean, University Libraries, Marquette University and Board chair</li>
<li>Jennifer Younger, Executive Director, CRRA, <em>ex officio  </em></li>
</ul>
<p><strong>CRRA Welcomes New Board and Committee Chairs</strong><strong><br />
</strong>Stepping up to carry on the good work of our illustrious founding Board and committee chairs (Jennifer Younger, Tim Meagher, Bob O’Neill, and Tom Leonhardt) are Janice Welburn (Board of Directors), Jean McManus (Scholars Advisory Committee), Matt Blessing (Collections Committee) and Demian Katz (Digital Access Committee).  We warmly welcome you and the opportunities your leadership will afford the Alliance. Please join us in welcoming our new board and committee chairs!</p>
<p><strong>Janice Welburn, Chair, Board of Directors<br />
Janice Welburn</strong>, Dean of University Libraries at Marquette University, continues Marquette’s role as a founding member of the CRRA.   Janice is the recipient of the prestigious 2011 Association of College and Research Libraries’ (ACRL) Academic/Research Librarian of the Year award. The award, sponsored by YBP Library Services, recognizes an outstanding member of the library profession who has made a significant national or international contribution to academic/research librarianship and library development.</p>
<p>As Board member and co-chair of the Budget &amp; Personnel Committee, Janice brings a thoughtful approach to understanding and advancing our mission. At Marquette, she actively engages a team that supports the CRRA through its activities, including the contribution of metadata records to the portal, committee participation, and leadership of the successful collaborative CLIR grant for cataloging unique materials at Catholic, St. Catherine and Marquette, all of which have contributed significantly to our growth and success.  This year, Janice anticipates the development of a five-year plan for the CRRA that will inspire and guide our activities going forward.</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>Jean McManus</strong>, <strong>Chair, Scholars Advisory Committee</strong><br />
<strong>Jean McManus</strong> has an A.B. in English from Bryn Mawr College and an M.A. in Library Science from the University of Chicago. She has worked primarily in reference services and collection development, with formative stints in the worlds of serials and interlibrary loan. Since 1997, Jean has been at the University of Notre Dame (UND), and has been involved with CRRA and UND’s Team Catholic Portal since 2008.</p>
<p><strong>Matt Blessing, Chair, Collections Committee<br />
Matt Blessing</strong> is the head of the Department of Special Collections and University Archives at Marquette University. Prior to joining Marquette he served as director of the Mass Communications History Center at the Wisconsin Historical Society.  Matt also currently serves on the boards of Wisconsin Heritage Online and the Wisconsin Historical Records Advisory Board.  He is also active within the Midwest Archives Conference.</p>
<p>Marquette University serves as the archives for numerous Catholic groups and organizations, including the Catholic Worker movement, Religious Formation Conference, Catholic Library Association, Women&#8217;s Ordination Conference, and National Catholic Rural Life Movement, as well as several historic American Indian schools and missions. Marquette also preserves the manuscripts of prominent Catholic writers, including J.R.R. Tolkien, Dorothy Day, and Penny Lernoux.</p>
<p>Matt has served as a member of the Collections Committee since the 2007 meeting at Notre Dame.</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>Demian Katz, Chair, Digital Access Committee (DAC)<br />
Demian Katz</strong> has a B.S. in Computer Science from West Chester University and an M.L.I.S. from the University of Pittsburgh.  Over the course of his career, he has worked both in and out of libraries as a computer programmer and as a provider of reference services.  He is currently employed by Villanova University, where he serves as the lead developer of the VuFind discovery software, and he is greatly enjoying the opportunity to apply both aspects of his background to a single job while working with an enthusiastic open source development community that includes the CRRA members behind the Catholic Portal.</p>
<p>Demian has served on the DAC since 2010. He is also the proud father of the newly arrived <strong>Otto Ray Katz</strong>, born at 8:12 on 8/12!  Welcome to the world, Otto, and congratulations, Demian and family!</p>
<p>&nbsp;</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>CRRA Welcomes Executive Director </strong></p>
<p><em>Posted on behalf of Janice Welburn, chair, Board of Directors and co-chair, Budget &amp; Personnel Committee and Artemis Kirk, co-chair, Budget &amp; Personnel Committee </em></p>
<p><em><br />
</em><strong>Jennifer Younger, CRRA Executive Director<br />
</strong>In 2009, CRRA took decisive action to expand our capacity to carry out our mission by hiring a digital projects librarian, Pat Lawton, as our first staff member. As a result, we have in the last two years been able to set and achieve ambitious goals.  Working collaboratively across CRRA committees, the Board and individual members, we have grown from eight to twenty-seven members and the portal now provides access to over two hundred thousand items held by CRRA members. Scholars, librarians and archivists are noting the collections through their blogs and providing valuable input on highly valued resources to add to the portal.  Although many search queries come from individuals looking for the CRRA or the portal, we also see searches in which users are finding materials, such as the recent search query for “Monsignor Martin B. Hellriegel” whose papers are at Boston College with related materials held by Marquette University.</p>
<p>Just as our alliance has grown, so have our needs. In January 2011, the Board determined that an executive director would assist the CRRA in continuing its growth and impact.  On behalf of the Board, we are pleased to announce that Jennifer Younger began her appointment as the Executive Director on July 1, 2011. Together, she and Pat Lawton form an effective partnership that will lead and serve us well. Please join us in welcoming Jennifer. &#8212; <em>Janice Welburn and Artemis Kirk</em></p>
<p><strong> </strong></p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>CRRA to Develop Five Year Strategic Plan/Task Force Members Named</strong></p>
<p><em>From Janice Welburn, chair, Board of Directors and the Board of Directors</em></p>
<p>In June 2011, the CRRA Board of Directors approved the establishment of a task force for the development of a five year strategic plan (FY2012/13-2016/17) to identify key directions and goals for the CRRA and the Catholic portal. The plan will include a statement of core values, vision and mission, the benefits of CRRA in advancing Catholic scholarship, as well as directions and strategies to carry out the mission and deliver value to stakeholders.</p>
<p>We proposed a task force of five to seven members to include representation from CRRA committees, the Board, members at large, and Pat Lawton, Jennifer Younger and Terry Ehling, Strategic Consultant, as ex officio members (<a href="../../info/Updates/CRRA%20Update_May_June_2011.pdf">CRRA May/June Update</a>). The responses to the call for volunteers came quickly and we are pleased to name the following individuals to the CRRA Five Year Strategic Planning Task Force.  After consultation with the members, we will appoint a chair. As noted in the announcement, the Task Force will solicit input from members as part of its work. We can achieve our mission only through the direction and participation of our members. We thank you for volunteering to serve.</p>
<p>Task Force Members include:</p>
<p>&nbsp;</p>
<ol>
<li>Jonathan Bengtson, University of St. Michael’s College and member, Scholars Advisory Committee</li>
<li>Stephanie Clark, Georgetown and member-at-large</li>
<li>Ann Hanlon, Marquette University, member of Digital Access Committee</li>
<li>Ingrid Hsieh-Yee, Catholic University and member-at-large</li>
<li>Joe Lucia, Villanova University and Board member</li>
<li>Lorraine Olley, University of Saint Mary at the Lake/Mundelein Seminary and member, Collections Committee</li>
<li>Diane Parr Walker, Notre Dame and member-at-large</li>
<li>Tom Wall, Boston College and Board member</li>
</ol>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>Mark Your Calendars … for the Duquesne Symposium November 9-10, 2011</strong></p>
<p align="center"><strong>Advancing Catholic Scholarly Research: A Symposium at<br />
Duquesne University<br />
</strong>November 9-10, 2011</p>
<p><strong> </strong>We are pleased to invite you, members of your staff, students and faculty at your institution to a symposium to be held November 9-10, 2011 at Duquesne University. Invited Catholic scholars and librarians will discuss the “state of the art” of Catholic scholarship, directions that scholarship is headed, and how libraries, archives, and member organizations support and nurture future Catholic scholars and scholarship.</p>
<p>We are honored to have Leslie Tentler (keynote speaker), <em>Professor of History</em>, Catholic University; Paula M. Kane, <em>John and Lucine O&#8217;Brien Marous Associate Professor of Contemporary Catholic Studies</em>, University of Pittsburgh; Joseph P. Lucia, <em>University Librarian and Director</em>, Falvey Memorial Library, Villanova University; Dr. Kevin Mongrain, <em>Ryan Chair for Newman Studies</em>, Duquesne University and <em>Executive Director</em>, National Institute for Newman Studies; and Dr. Michael Galligan-Stierle, <em>President and CEO</em>, ACCU, speaking to the symposium theme, “Advancing Catholic Scholarship.”</p>
<p>Other events will include a <em>Digital Projects Showcase</em>, highlighting innovative technologies, best practices, future trends, and scholarship related to our theme. CRRA members are especially encouraged to participate in the showcase by submitting <strong><em>poster proposals</em> </strong>by <strong>September 26: &lt;</strong><a href="../../info/events/Call_for_Posters.pdf">http://www.catholicresearch.net/info/events/Call_for_Posters.pdf</a>&gt;.</p>
<p>A full roster of events and registration information is available at <a href="http://bit.ly/Duquesne_Symposium">http://bit.ly/Duquesne_Symposium</a>.<strong> </strong>The <strong><em>registration deadline</em> is October 15, 2011</strong>. Registration is open and space is limited.</p>
<p>We hope you will join us in what promises to be a stimulating and productive conversation about Catholic scholarly research and the ways in which librarians and archivists support this research.</p>
<p>On behalf of Duquesne University and the Catholic Research Resources Alliance (CRRA),</p>
<p>·         Jennifer Younger, chair, Board of Directors at <a href="mailto:younger.1@nd.edu">younger.1@nd.edu</a></p>
<p>·         Laverna Saunders, University Librarian, Gumberg Library, Duquesne University at <a href="mailto:lsaunders@duq.edu">lsaunders@duq.edu</a></p>
<p>·         Pat Lawton, CRRA Digital Projects Librarian at <a href="mailto:plawton@nd.edu">plawton@nd.edu</a></p>
<p>&nbsp;</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><strong>Please mark your calendars …<br />
</strong>… to attend a roundtable discussion on <strong>“<em>Building a Catholic Archival Network</em>”</strong> at the American Catholic Historical Association (ACHA) meeting in Chicago on Friday, January 6, 2012 from 9:30-11:30 a.m. at the Chicago Downtown Marriott, Northwestern Room. This topic is of vital importance to many of us, and we invite you to join in as participants present their vantage points on realizing a network of Catholic archival resources.</p>
<p>Speakers will include:</p>
<p>Emilie Gagnet Leumas [Archdiocese of New Orleans, <a href="mailto:lleumas@arch-no.org">lleumas@arch-no.org</a>]<br />
Patricia A. Lawton [Catholic Research Resources Alliance, <a href="mailto:plawton@nd.edu">plawton@nd.edu</a>]<br />
Ellen D. Pierce [Maryknoll Mission Archives; <a href="mailto:epierce@maryknoll.org">epierce@maryknoll.org</a>]<br />
Chair: Robert E. Carbonneau [Passionist Historical Archives, NJ; <a href="mailto:RobCarb@cpprov.org">RobCarb@cpprov.org</a>]</p>
<p>Thank you, Malachy McCarthy, for taking the lead to assemble this discussion! We look forward to seeing you in Chicago.<br />
<strong><em>CRRA Update</em></strong> is an electronic newsletter distributed via email to provide members with an update of CRRA activities.  Please contact Pat at 574.631.1324 or email <a href="mailto:plawton@nd.edu">plawton@nd.edu</a> with your questions, comments, or news to share.</p>
<p><strong>Announcing the Self-Subscribe Option to <em>CRRA Update<br />
</em></strong>We are testing a self-subscribe option for those interested in receiving the <em>CRRA Update.</em>  The self-subscribe option will make it easy for individuals from member institutions and beyond to sign up.</p>
<p>For current subscribers, there is nothing you need to do!  You will remain on our mailing list and will continue to receive Updates until you tell us to stop.  : )</p>
<p>We have tested the self-subscribe option over the last couple of months and it has worked well.  Please encourage colleagues interested in receiving our newsletter to sign up. We will monitor the list of subscribers and will continue to add names of all on our mailing list to our Contacts page under “Other Institutional Contacts.”</p>
<p>To <span style="text-decoration: underline;">self-subscribe to the <em>CRRA Update</em>:</span></p>
<ol start="1">
<li>Address a message to listserv@listserv.nd.edu</li>
<li>Enter “subscribe crra-updates-l” in the body of the message</li>
<li>Leave the subject line blank</li>
<li>Send</li>
</ol>
<p>You will then be asked to confirm your subscription, after which you will be subscribed. List subscribers are added to the <em>CRRA list of other institutional contacts</em>: &lt;<a href="../../About/Contact">http://www.catholicresearch.net/About/Contact</a>&gt;.</p>
<p>We appreciate any feedback you may have concerning this option.</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p><em>All<strong> CRRA events</strong></em> and events of possible interest to members are posted to the <em><a href="http://tiny.cc/Calendar798">CRRA calendar</a></em>. Please bookmark this link for future reference.</p>
<p>Check our progress and news on the <strong><em>CRRA blog</em></strong>: <a href="../">http://www.catholicresearch.net/blog/</a>.</p>
<div align="center">
<hr align="center" size="2" width="100%" />
</div>
<p>&nbsp;</p>
<p>&#8212;&#8212;&#8212;<br />
CRRA Calendar: <a href="http://tiny.cc/Calendar798">http://tiny.cc/Calendar798</a><br />
Duquesne/CRRA Symposium:<br />
- Call for Posters <a href="../../info/events/Call_for_Posters.pdf">http://www.catholicresearch.net/info/events/Call_for_Posters.pdf</a><br />
- Symposium Details <a href="http://bit.ly/Duquesne_Symposium">http://bit.ly/Duquesne_Symposium</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/09/crra-julyaugust-2011-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Progress with statistics reporting</title>
		<link>http://www.catholicresearch.net/blog/2011/09/progress-with-statistics-reporting/</link>
		<comments>http://www.catholicresearch.net/blog/2011/09/progress-with-statistics-reporting/#comments</comments>
		<pubDate>Thu, 15 Sep 2011 21:07:11 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=391</guid>
		<description><![CDATA[Progress is being made when it comes to &#8220;Catholic Portal&#8221; VuFind statistics reporting. Yesterday I broke down and re-wrote my log file import application. Instead of parsing the log, ingesting the results, and then post processing, I re-wrote the application so it does all of this in one pass. I also enhanced the program so [...]]]></description>
			<content:encoded><![CDATA[<p>Progress is being made when it comes to &#8220;Catholic Portal&#8221; VuFind statistics reporting.
</p>
<p>Yesterday I broke down and re-wrote my log file import application. Instead of parsing the log, ingesting the results, and then post-processing, the application now does all of this in one pass. I also enhanced the program so it takes command-line input. Specifically, if no arguments are supplied, it imports yesterday&#8217;s log data. Otherwise it expects two inputs: 1) a beginning date, and 2) an ending date. Given these inputs, I can drop the entire database and re-create it almost effortlessly. The script is called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/09/log-load.pl">log-load.pl</a>, and it is now running under cron so the database gets updated daily.
</p>
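<p>The command-line behavior described above might be sketched as follows. This is only a minimal Python illustration, not the contents of log-load.pl (which is a Perl script); the function name date_range and the YYYY-MM-DD date format are assumptions made for the example.</p>

```python
from datetime import date, timedelta


def date_range(args, today=None):
    """Return the list of dates whose log data should be imported.

    No arguments: yesterday only.
    Two arguments: every day from the beginning date through the
    ending date, inclusive. The YYYY-MM-DD input format is an
    assumption of this sketch.
    """
    today = today or date.today()
    if not args:
        return [today - timedelta(days=1)]
    if len(args) != 2:
        raise ValueError("expected no arguments, or a beginning and an ending date")
    begin, end = (date.fromisoformat(a) for a in args)
    if begin > end:
        raise ValueError("the beginning date must not follow the ending date")
    days = (end - begin).days
    return [begin + timedelta(days=n) for n in range(days + 1)]
```

<p>Called with an empty list, the function returns a single date (yesterday), which suits a daily cron job; called with two dates, it returns every day in the range, which suits rebuilding the database from scratch.</p>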
<p>The next step is to create and automate reporting functions. I have already created a number of SQL queries. They are designed to be run from a shell script which outputs results to plain text files. These plain text files are presently put on the Web in a <a href="http://www.catholicresearch.net/tmp/">temporary location</a>. This process is rather brain-dead. The next steps will include creating some sort of Web-based front-end allowing readers (increasingly I don&#8217;t use the word &#8220;users&#8221;) to complete some sort of form and get real-time results.
</p>
<p>Wish me luck.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/09/progress-with-statistics-reporting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Advancing Catholic Scholarship: A Symposium at Duquesne [call for posters, details]</title>
		<link>http://www.catholicresearch.net/blog/2011/08/advancing-catholic-scholarship-a-symposium-at-duquesne-call-for-posters-details/</link>
		<comments>http://www.catholicresearch.net/blog/2011/08/advancing-catholic-scholarship-a-symposium-at-duquesne-call-for-posters-details/#comments</comments>
		<pubDate>Fri, 26 Aug 2011 19:21:19 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=385</guid>
		<description><![CDATA[   Dear CRRA Colleagues, We are pleased to invite you, members of your staff, students and faculty at your institution to a symposium to be held November 9-10, 2011 at Duquesne University. Invited Catholic scholars and librarians will discuss the “state of the art” of Catholic scholarship, directions that scholarship is headed, and how libraries, [...]]]></description>
			<content:encoded><![CDATA[<p><strong> <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/08/duq-crra1.jpg"><img class="aligncenter size-medium wp-image-387" title="duq crra" src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/08/duq-crra1-300x59.jpg" alt="" width="300" height="59" /></a></strong></p>
<p><strong> </strong>Dear CRRA Colleagues,</p>
<p>We are pleased to invite you, members of your staff, students and faculty at your institution to a symposium to be held November 9-10, 2011 at Duquesne University. Invited Catholic scholars and librarians will discuss the “state of the art” of Catholic scholarship, directions that scholarship is headed, and how libraries, archives, and member organizations support and nurture future Catholic scholars and scholarship.</p>
<p>We are honored to have Leslie Tentler, <em>Professor of History</em>, Catholic University; Paula M. Kane, <em>John and Lucine O&#8217;Brien Marous Associate Professor of Contemporary Catholic Studies</em>, University of Pittsburgh; Joseph P. Lucia, <em>University Librarian and Director</em>, Falvey Memorial Library, Villanova University; Dr. Kevin Mongrain, <em>Ryan Chair for Newman Studies</em>, Duquesne University and <em>Executive Director</em>, National Institute for Newman Studies; and Dr. Michael Galligan-Stierle, <em>President and CEO</em>, ACCU, speaking to the symposium theme, “Advancing Catholic Scholarship.”</p>
<p>Other events will include a <em>Digital Projects Showcase</em>, highlighting innovative technologies, best practices, future trends, and scholarship related to our theme. CRRA members are especially encouraged to participate in the showcase by submitting <strong><em>poster proposals</em> </strong>by <strong>September 26: &lt;</strong><a href="../../info/events/Call_for_Posters.pdf">http://www.catholicresearch.net/info/events/Call_for_Posters.pdf</a>&gt;.</p>
<p>A full roster of events and registration information is available at <a href="http://bit.ly/Duquesne_Symposium">http://bit.ly/Duquesne_Symposium</a>.<strong>   </strong>The <strong><em>registration deadline</em> is October 15, 2011</strong>.</p>
<p>We hope that you will join us in what promises to be a stimulating and productive conversation about Catholic scholarly research and the ways in which librarians and archivists support this research.<br />
On behalf of Duquesne University and the Catholic Research Resources Alliance (CRRA),</p>
<p>·         Jennifer Younger, chair, Board of Directors at <a href="mailto:younger.1@nd.edu">younger.1@nd.edu</a></p>
<p>·         Laverna Saunders, University Librarian, Gumberg Library, Duquesne University at <a href="mailto:lsaunders@duq.edu">lsaunders@duq.edu</a></p>
<p>·         Pat Lawton, CRRA Digital Projects Librarian at <a href="mailto:plawton@nd.edu">plawton@nd.edu</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/08/advancing-catholic-scholarship-a-symposium-at-duquesne-call-for-posters-details/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VUFind and sitemaps</title>
		<link>http://www.catholicresearch.net/blog/2011/08/vufind-and-sitemaps/</link>
		<comments>http://www.catholicresearch.net/blog/2011/08/vufind-and-sitemaps/#comments</comments>
		<pubDate>Fri, 12 Aug 2011 14:36:53 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=380</guid>
		<description><![CDATA[In an effort to improve SEO (search engine optimization) I have done my best to implement sitemaps against the &#8220;Catholic Portal&#8217;s&#8221; VUFind implementation. Sitemaps are XML files listing all the individual files/resources of a website. The intention and structure of these files is documented at Sitemaps.org. By exposing a site&#8217;s content in this way Internet [...]]]></description>
			<content:encoded><![CDATA[<div>
<p>In an effort to improve SEO (search engine optimization) I have done my best to implement sitemaps against the &#8220;Catholic Portal&#8217;s&#8221; VUFind implementation.</p>
<p>Sitemaps are XML files listing all the individual files/resources of a website. The intention and structure of these files are documented at <a href="http://sitemaps.org/">Sitemaps.org</a>. By exposing a site&#8217;s content in this way, Internet robots/spiders can slurp up the URLs in the sitemap files, go directly to the resources without crawling, and index the content found there. In short, sitemaps make it easier for Internet indexers to do their job.</p>
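<p>To make the file structure concrete, here is a minimal Python sketch that builds a sitemap of the kind described at Sitemaps.org: a urlset element containing one url entry, with a loc child, per resource. It is not how VUFind generates its files; the function name build_sitemap and the record URLs in the usage note are hypothetical.</p>

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, changefreq="monthly"):
    """Return a sitemap document, as a string, listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # loc is the only required child; changefreq is an optional hint.
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")
```

<p>For example, build_sitemap(["http://www.example.org/Record/1", "http://www.example.org/Record/2"]) returns a document with two url entries, each of which an indexer can fetch directly.</p>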
<p>Implementing sitemaps in VUFind is relatively trivial. Edit a configuration file (web/conf/sitemap.ini), and run the sitemap file generator (php util/sitemap.php). See the <a href="http://vufind.org/wiki/search_engine_optimization">VUFind documentation</a> for more detail. Here at Portal Central I configured sitemap.ini with the following values:</p>
<ul>
<li>frequency = monthly</li>
<li>countPerPage = 10000</li>
<li>fileName = sitemap</li>
<li>fileLocation = /shared/catholic_portal/data/data/sitemaps/</li>
<li>baseSitemapUrl = http://www.catholicresearch.net/sitemaps</li>
<li>baseSitemapFileName = baseSitemap</li>
</ul>
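<p>The countPerPage, fileName, and baseSitemapUrl values work together: the Sitemaps.org protocol allows at most 50,000 URLs per file, so a large index is split across several files whose locations are then listed in a sitemap index. The Python sketch below only illustrates that arithmetic. The function name plan_sitemap_files and the &#8220;name-N.xml&#8221; naming scheme are assumptions, and VUFind&#8217;s actual file names may differ.</p>

```python
def plan_sitemap_files(record_count, count_per_page=10000,
                       file_name="sitemap",
                       base_url="http://www.catholicresearch.net/sitemaps"):
    """Return the URLs of the sitemap files needed to cover every record.

    The "name-N.xml" naming scheme is an assumption of this sketch,
    not necessarily what VUFind writes to disk.
    """
    # Ceiling division: a partly filled last page still needs its own file.
    pages = -(-record_count // count_per_page)
    return ["%s/%s-%d.xml" % (base_url, file_name, n)
            for n in range(1, pages + 1)]
```

<p>With these settings, an index of roughly 280,000 records would be spread over 28 files, each comfortably under the protocol&#8217;s per-file limit.</p>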
<p>The only configuration which differs from the norm is the value of baseSitemapUrl. Instead of putting the sitemap files in the root of the VUFind filesystem I am having them saved in a directory called sitemaps. While such a thing is discouraged by the folks at Sitemaps.org, it keeps my filesystem clean, and more importantly, it makes it easier for me to migrate from one version of VUFind to another. Besides, the Google Webmaster tools ask for the specific location of one&#8217;s sitemap files and I don&#8217;t really need them to be discoverable by too many other indexers. All the other indexers pale in comparison.</p>
<p>Because I put all of the sitemap files in a separate directory, I needed to edit my httpd.conf file so VUFind does not try to interpret the directory as an action. My configuration follows:</p>
<blockquote>
<pre>
  # sitemaps
  Alias /sitemaps /shared/catholic_portal/data/data/sitemaps/
  &lt;Directory "/shared/catholic_portal/data/data/sitemaps"&gt;
    Options FollowSymLinks ExecCGI +Includes +Indexes
    AllowOverride FileInfo
    Order deny,allow
    Allow from all
  &lt;/Directory&gt;
</pre>
</blockquote>
<p>Is this an overly complicated solution? Maybe. Is it cleaner? In my opinion, yes. But heck, all of this is only a difference in configuration.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/08/vufind-and-sitemaps/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to upgrade VUFind</title>
		<link>http://www.catholicresearch.net/blog/2011/08/how-to-upgrade-vufind/</link>
		<comments>http://www.catholicresearch.net/blog/2011/08/how-to-upgrade-vufind/#comments</comments>
		<pubDate>Mon, 08 Aug 2011 19:22:33 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=377</guid>
		<description><![CDATA[These are notes (to myself, mostly) on how to upgrade VUFind from a local &#8220;sandbox&#8221; version to a production version. But they are also documented here, just in case I win the lottery and start enjoying umbrella drinks on some Caribbean island. We here at &#8220;Catholic Portal&#8221; Central generally run two simultaneous versions of [...]]]></description>
			<content:encoded><![CDATA[<p>These are notes (to myself, mostly) on how to upgrade VUFind from a local &#8220;sandbox&#8221; version to a production version. But they are also documented here, just in case I win the lottery and start enjoying umbrella drinks on some Caribbean island.
</p>
<p>We here at &#8220;Catholic Portal&#8221; Central generally run two simultaneous versions of VUFind. One is production, and the other is a test/sandbox implementation. Since they both run on the same hardware it is necessary to configure them differently. Here is a list of things I need to change when upgrading from the &#8220;sandbox&#8221; to production:
</p>
<ul>
<li>the HTTP port from 88 to 80</li>
<li>the path to the VUFind directory</li>
<li>the Jetty port from 8088 to 8080</li>
<li>the path to Solr</li>
<li>the name of the MySQL database storing user accounts</li>
<li>the name of the MySQL schema</li>
<li>the location of the MySQL schema</li>
<li>the environment variables defined in vufind.sh</li>
<li>the name of the MySQL schema file, from vufind_test.ini to vufind.ini</li>
</ul>
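<p>The two port changes in the list above are the easiest items to script. The Python sketch below is a hypothetical helper, not part of VUFind: it rewrites only whole-number tokens, so changing 88 to 80 does not also mangle 8088. Because it would touch any stand-alone 88 or 8088 in a file, the result should be reviewed before it replaces a real configuration file.</p>

```python
import re

# Sandbox-to-production port changes taken from the list above.
PORT_CHANGES = {"88": "80", "8088": "8080"}


def promote_ports(config_text):
    """Swap sandbox ports for production ports in a block of config text."""
    def swap(match):
        number = match.group(0)
        # Numbers that are not sandbox ports are returned unchanged.
        return PORT_CHANGES.get(number, number)

    # \b\d+\b matches each complete run of digits, never part of a longer one.
    return re.sub(r"\b\d+\b", swap, config_text)
```

<p>The paths, database names, and environment variables in the list still need to be changed by hand, or with a similar table of substitutions.</p>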
<p>I also need to copy files from the previous production environment to the new production environment, though I suppose I would not need to do this if I copied them when I initially created my sandbox. In any event, they include:
</p>
<ul>
<li>interface/crra/css/common.css (for local styling)</li>
<li>interface/crra/css/style-about.css (for local styling)</li>
<li>layout-about.tpl (for local design)</li>
<li>the contents of interface/crra/About (for local links)</li>
<li>the contents of services/About (for local links)</li>
<li>the contents of web/etc (for the directory, mostly)</li>
</ul>
<p>All of this doesn&#8217;t count the local hacks/additions I&#8217;ve made to VUFind such as EADRecord.php, tweaks to MARCRecord.php, and the various Perl scripts used for indexing.
</p>
<p>All in all though, most of the changes needed to upgrade from the sandbox to production have been listed here. Off to buy that winning lottery ticket now&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/08/how-to-upgrade-vufind/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VUFind, version 1.1 or so</title>
		<link>http://www.catholicresearch.net/blog/2011/08/vufind-version-1-1-or-so/</link>
		<comments>http://www.catholicresearch.net/blog/2011/08/vufind-version-1-1-or-so/#comments</comments>
		<pubDate>Mon, 08 Aug 2011 18:03:02 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=373</guid>
		<description><![CDATA[I believe I have just finished upgrading the production version of VUFind &#8212; the software driving the &#8220;Catholic Portal&#8221; &#8212; to version RC3107 which is somewhere between version 1.1 and 1.2. This upgrade addresses at least a couple of usability issues, specifically: wording in regards the linking of online finding aids toggling the check box [...]]]></description>
			<content:encoded><![CDATA[<p>I believe I have just finished upgrading the production version of VUFind &#8212; the software driving the &#8220;<a href="http://www.catholicresearch.net/">Catholic Portal</a>&#8221; &#8212; to version RC3107 which is somewhere between version 1.1 and 1.2. This upgrade addresses at least a couple of usability issues, specifically:
</p>
<ol>
<li>wording regarding the linking of online finding aids</li>
<li>toggling the check box associated with filters</li>
</ol>
<p>With this version there are also quite a number of additional records in the underlying index &#8212; around 280,000. This is because the finding aids (EAD files) have been indexed more completely.
</p>
<p>Next steps include:</p>
<ul>
<li>turning on Google site maps to facilitate better SEO (search engine optimization)</li>
<li>installing version 1.2 in the &#8220;sandbox&#8221; environment</li>
<li>implementing the results of other CRRA member usability studies</li>
<li>figuring out how to speed up the indexing process</li>
<li>figuring out how to check for valid MARC records, thus paving the way for better automated updating</li>
</ul>
<p>Wish us luck.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/08/vufind-version-1-1-or-so/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA May/June 2011 Update</title>
		<link>http://www.catholicresearch.net/blog/2011/06/crra-mayjune-2011-update/</link>
		<comments>http://www.catholicresearch.net/blog/2011/06/crra-mayjune-2011-update/#comments</comments>
		<pubDate>Mon, 20 Jun 2011 18:52:06 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=365</guid>
		<description><![CDATA[The May/June 2011 CRRA Update is now available at: http://bit.ly/crra_MayJune2011 In this issue you will find news items regarding: Tom Leonhardt Retires from CRRA Board of Directors and Digital Access Committee (DAC) Welcome to CRRA’s Six New Members From the Board: NEH Challenge Grant Proposal Deferred; CRRA to Develop Five Year Strategic Plan The Collections [...]]]></description>
			<content:encoded><![CDATA[<p>The<strong> May/June 2011 <em>CRRA Update</em></strong> is now available at: <a href="http://bit.ly/crra_MayJune2011">http://bit.ly/crra_MayJune2011</a></p>
<p>In this issue you will find news items regarding:</p>
<ul>
<li>Tom Leonhardt Retires from CRRA Board of Directors and Digital Access Committee (DAC)</li>
<li>Welcome to CRRA’s Six New Members</li>
<li>From the Board: NEH Challenge Grant Proposal Deferred; CRRA to Develop Five Year Strategic Plan</li>
<li>The Collections Committee Update: Developing Critical Mass Around Three Themes</li>
<li>Member Institutions Continue Portal Usability Studies</li>
<li>CRRA in ACCU News</li>
<li><em><strong>Mark your calendars</strong></em> to attend “Advancing Catholic Scholarly Research: A Symposium at Duquesne University.” November 9-10, 2011, Pittsburgh, PA.  Details at: <a href="http://bit.ly/lg0oCH">http://bit.ly/lg0oCH</a></li>
</ul>
<div>
<hr size="2" />
</div>
<p><strong>CRRA Welcomes Six New Members</strong></p>
<p>Welcome to our newest members!</p>
<ul>
<li><a href="http://www.barry.edu/libraryservices/about/archives.htm">Barry University</a> (Miami Shores, FL);</li>
<li><a href="http://www.ctu.edu/library">Catholic Theological Union</a> (Chicago);</li>
<li><a href="http://www2.cnr.edu/home/library/index.htm">The College of New Rochelle</a> (New Rochelle, NY);</li>
<li><a href="http://www.pahrc.net/">Philadelphia Archdiocesan Historical Research Center</a> (PAHRC) (Wynnewood, PA);</li>
<li><a href="http://mcentegart.sjcny.edu/Archives/">St. Joseph’s College</a> (Brooklyn, NY); and</li>
<li><a href="http://www.vocations.org/FMLibrary/default.htm">University of St. Mary at the Lake/Mundelein University</a> (Mundelein, IL).</li>
</ul>
<p>Look for more on our newest members in future updates.</p>
<div>
<hr size="2" />
</div>
<p><strong>Tom Leonhardt</strong></p>
<p>After a distinguished career, Tom Leonhardt is retiring from St. Edward’s University, Austin, TX and returning to the Pacific Northwest.</p>
<p>From the beginning, Tom’s thoughtful voice has informed, encouraged and supported the development of the Alliance through his leadership on the Board and as Chair of the Digital Access Committee. Tom leaves with these words:</p>
<p>“I have been privileged to have been a part of CRRA from the time it was one person&#8217;s vision through an exciting period of growth led by another person with a shared vision and the organizational and leadership skills to make it all work. Working with other directors of Catholic university libraries brought me into a group that not only included collaborative achievements but the good fellowship that one finds over drinks and shared meals. There are other associations, too, that come with CRRA, the Digital Access Committee being the one that I know best. Thanks to you all for your fellowship and shared memories.”</p>
<p>Thank you, Tom, for your many contributions in shaping the CRRA.  We will miss you and wish you the very best.</p>
<div>
<hr size="2" />
</div>
<p><strong>NEH Challenge Grant Proposal Deferred: <em>From the Board</em> </strong></p>
<p>One goal in the strategic plan for this year was to submit a grant proposal to the NEH Challenge Grant Program. However, as the Board and Terry Ehling (strategic consultant) reviewed the proposal, the Board decided we would benefit from doing a five-year strategic plan before submitting the proposal. The proposal spoke clearly to the mission and positive impact on Catholic scholarship, but it was not clear what the financial target of the grant should be. The purpose of an NEH challenge grant is to assist the grantee in building capacity and infrastructure to carry out its mission. In looking ahead, we recognized that the annual goals for 2011/12 mark the third year of the current strategic plan and realized this is a good time to develop a five-year strategic plan. Doing so, we believe, will allow us to create a bolder and even stronger proposal for what we can do over the next five years in supporting the needs scholars on our own campuses have expressed in focus groups and usability studies. In the context of the longer-term directions and goals, we will also be able to determine an appropriate financial target to assist in building capacity and infrastructure.</p>
<p><strong>Developing a Five Year Strategic Plan </strong></p>
<p>The Board is setting up a Task Force to develop a five-year (FY2012/13-2016/17) strategic plan &lt;<a href="http://bit.ly/crra_5yrPlan">http://bit.ly/crra_5yrPlan</a>&gt; to identify key directions and goals for the CRRA and the Catholic portal. The plan will include a statement of core values, vision and mission; a statement on the benefits of CRRA in advancing Catholic scholarship; and directions and strategies to carry out the mission and deliver value to stakeholders. While the planning process is underway, we will continue to implement the goals for the coming year (2011-12) and especially the three priorities identified in the March membership meeting.</p>
<p>We propose a task force of five to seven members with representation from each committee, the Board and members at large. Pat Lawton, Jennifer Younger and Terry Ehling will be ex officio members. If you are interested, please send an email to Pat or Jennifer, who will create a list of interested members for the Board.</p>
<p>Once the task force is in place, more information will be forthcoming on the process, time line, key questions and most importantly, opportunities for member input. We can achieve our mission and aims only through the direction and participation of our members.</p>
<p>&nbsp;</p>
<div>
<hr size="2" />
</div>
<p><strong>Portal Anecdote: User Connected with Parish History</strong></p>
<p>A variety of interesting requests come through the “Contact us” link on our homepage and one of the most rewarding to date came recently from a woman “working on a history of the Church of the Magdalene and gathering all print sources that [she] can find.”  She referenced a record for a parish history listed in the portal.  In a short period of time, the item was located, scanned, and sent to the church via the requester.  She was pleased to receive the document, and we were pleased to know that the portal had enabled access to a valued resource to one very real user.</p>
<p>Users are finding us!  If you have portal success stories to share in future Updates, please pass them on.</p>
<div>
<hr size="2" />
</div>
<p><strong><em>CRRA in the News</em> …</strong></p>
<p><strong>ACCU Summer 2011 Newsletter on the CRRA/Duquesne Symposium</strong></p>
<p>The <em>Summer 2011 ACCU Update</em> includes an announcement of the upcoming November Symposium hosted by Duquesne University and the CRRA.  See the full article on page 12 of ACCU’s newsletter, available at: <a href="http://www.accunet.org/i4a/pages/index.cfm?pageid=3389">http://www.accunet.org/i4a/pages/index.cfm?pageid=3389</a>.</p>
<p>For details about the CRRA/Duquesne Symposium, see <a href="http://bit.ly/lg0oCH">http://bit.ly/lg0oCH</a>.</p>
<div>
<hr size="2" />
</div>
<p><strong>The Collections Committee Update: Developing Critical Mass around Three Themes</strong></p>
<p>The Collections Committee states goal 1.1 of the <a href="http://bit.ly/lay6Za">2011/2012 strategic plan</a> as: “Encourage development of critical mass around three themes: Women religious, Vatican 2, and Catholic social action.” The Committee emphasizes that highlighting these themes is not prescriptive but rather <em>suggestive</em>. The idea is to encourage members to think about their own holdings in these areas and to add them to the portal, thereby supporting the portal goal to build collections.</p>
<p>If you have rare or uncommon materials in these subject areas, these would be especially welcome additions to the portal. Of course, contributions relating to any of the <a href="../../About/CRRA#portal">twelve primary collecting themes</a> are helpful and highly desirable. If you have any questions about priorities, please contact <a href="mailto:plawton@nd.edu">Pat Lawton</a> or any member of the <a href="../../About/Contact#collections">Collections Committee</a>.</p>
<p>&nbsp;</p>
<div>
<hr size="2" />
</div>
<p><em>All <strong>CRRA events</strong></em> and events of possible interest to members are posted to the <em><a href="http://tiny.cc/Calendar798">CRRA calendar</a></em>; please bookmark this link for future reference.</p>
<p>Check our progress and news on the <strong><em>CRRA blog</em></strong>: <a href="../">http://www.catholicresearch.net/blog/</a>.</p>
<div>
<hr size="2" />
</div>
<p><strong><em>CRRA Update</em></strong> is an electronic newsletter distributed via email to provide members with an update of CRRA activities.  Please contact Pat at 574.631.1324 or email <a href="mailto:plawton@nd.edu">plawton@nd.edu</a> with your questions, comments, or news to share.</p>
<p>&#8212;&#8212;&#8212;<br />
CRRA Calendar: <a href="http://tiny.cc/Calendar798">http://tiny.cc/Calendar798</a><br />
CRRA Goals for 2011/2012: <a href="http://bit.ly/lay6Za">http://bit.ly/lay6Za</a><br />
Duquesne/CRRA Symposium: <a href="http://bit.ly/lg0oCH">http://bit.ly/lg0oCH</a><br />
Task Force to Develop a Five Year Strategic Plan: <a href="http://bit.ly/crra_5yrPlan">http://bit.ly/crra_5yrPlan</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/06/crra-mayjune-2011-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PastPerfect</title>
		<link>http://www.catholicresearch.net/blog/2011/06/pastperfect/</link>
		<comments>http://www.catholicresearch.net/blog/2011/06/pastperfect/#comments</comments>
		<pubDate>Fri, 03 Jun 2011 18:09:03 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=362</guid>
		<description><![CDATA[This posting outlines the possibilities for ingesting PastPerfect content into the &#8220;Catholic Portal&#8221;. As membership in the Catholic Research Resources Alliance (CRRA) grows, so does the number of metadata formats the &#8220;Catholic Portal&#8221; is expected to support. When the CRRA was just beginning, MARC was the predominant metadata format. After the content of university archives [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting outlines the possibilities for ingesting PastPerfect content into the &#8220;Catholic Portal&#8221;.
</p>
<p>
As membership in the Catholic Research Resources Alliance (CRRA) grows, so does the number of metadata formats the &#8220;Catholic Portal&#8221; is expected to support. When the CRRA was just beginning, MARC was the predominant metadata format. After the content of university archives was recognized as significant, EAD became very important. Some institutions use neither MARC nor EAD to describe their special collections but instead use systems like ContentDM. These sorts of systems are often accessible via OAI-PMH, and thus, at the very least, harvestable Dublin Core is available. In order to support discovery, all of these types of metadata need to be parsed, mapped to VuFind&#8217;s underlying Solr schema, and indexed.
</p>
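<p>To make the OAI-PMH route a little more concrete, here is a minimal sketch of asking a repository for unqualified Dublin Core and reading the titles out of the response. The <code>ListRecords</code> verb, the <code>oai_dc</code> metadata prefix, and the <code>resumptionToken</code> argument come from the protocol itself. The function names and any base URL passed in are assumptions for illustration, and this is not the code used by the Portal.</p>

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request for unqualified Dublin Core."""
    if resumption_token:
        # A resumptionToken is an exclusive argument; it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def titles_and_token(response_xml):
    """Return (list of dc:title strings, resumption token or None)."""
    root = ET.fromstring(response_xml)
    titles = [t.text for t in root.iter(DC_NS + "title") if t.text]
    token = root.find(".//" + OAI_NS + "resumptionToken")
    return titles, (token.text if token is not None and token.text else None)
```

<p>A harvester would fetch the first URL, hand the response to <code>titles_and_token</code>, and keep requesting with the returned token until no token comes back.</p>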
<p>
It has come to my attention that some of the CRRA&#8217;s membership may be using an application called <a href="http://www.museumsoftware.com/">PastPerfect by PastPerfect Software, Inc.</a> to describe their collections. After a bit of investigation, I learned that PastPerfect supports a number of exportable metadata formats. One of those formats is an XML file complete with Dublin Core elements. Here is a sample record:
</p>
<pre>
&lt;?xml version="1.0" encoding="windows-1252" standalone="yes"?&gt;
&lt;metadata&gt;
  &lt;dc-record&gt;
    &lt;type&gt;text&lt;/type&gt;
    &lt;type&gt;original&lt;/type&gt;
    &lt;type&gt;cultural&lt;/type&gt;
    &lt;format&gt;23 cm. 292. p. Includes index.&lt;/format&gt;
    &lt;title&gt;Guide to the Use of Books and Libraries&lt;/title&gt;
    &lt;title&gt;Book&lt;/title&gt;
    &lt;description&gt;Guide to the Use of Books and Libraries....&lt;/description&gt;
    &lt;subject&gt;Book&lt;/subject&gt;
    &lt;subject&gt;1. Reference books.&lt;/subject&gt;
    &lt;subject&gt;2. Libraries--Handbooks, manuals, etc.&lt;/subject&gt;
    &lt;subject&gt;3. Library.&lt;/subject&gt;
    &lt;subject&gt;Gates, Jean Key&lt;/subject&gt;
    &lt;creator&gt;Gates, Jean Key&lt;/creator&gt;
    &lt;contributor&gt;Wright, Richard R. and Susan Gamer, editors.&lt;/contributor&gt;
    &lt;publisher&gt;McGraw-Hill Book Company&lt;/publisher&gt;
    &lt;date&gt;1979&lt;/date&gt;
    &lt;identifier&gt;2000.4.3&lt;/identifier&gt;
    &lt;language&gt;English&lt;/language&gt;
    &lt;coverage&gt;1979 - 1979&lt;/coverage&gt;
    &lt;coverage&gt;New York, NY&lt;/coverage&gt;
  &lt;/dc-record&gt;
&lt;/metadata&gt;
</pre>
<p>
The XML is straight-forward enough and seems to be well-formed, but there does not seem to be any DTD or schema for validation. The content in each of the elements comes straight from the data entry of the PastPerfect system, so things like &#8220;1. &#8221; or &#8220;2. &#8221; in the subject elements are included apparently because that is what someone typed in. Similarly, subheadings delimited by &#8220;--&#8221; and the multiple values in the format element are reminiscent of MARC records and their ISBD (International Standard Bibliographic Description) codes. The repeating elements like coverage or type make things challenging, but not insurmountable. In short, the issues surrounding the mark-up are relatively minor. It is not ideal, but it is functional.
</p>
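<p>As a rough illustration of how manageable these issues are, the sketch below reads an export shaped like the sample record, gathers repeated elements into lists, and strips the typed-in enumerators from subjects. The function name <code>parse_pastperfect</code> is made up, and the clean-up rule is only one assumption about what the data needs.</p>

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Leading enumerators such as "1. " or "2. " typed into subject fields.
ENUMERATOR = re.compile(r"^\d+\.\s+")


def parse_pastperfect(xml_bytes):
    """Turn each dc-record into a dict mapping element name to a list of values."""
    records = []
    for node in ET.fromstring(xml_bytes).iter("dc-record"):
        record = defaultdict(list)
        for child in node:
            value = (child.text or "").strip()
            if child.tag == "subject":
                value = ENUMERATOR.sub("", value)
            # Skip empty values and exact repeats within the same element.
            if value and value not in record[child.tag]:
                record[child.tag].append(value)
        records.append(dict(record))
    return records
```

<p>Keeping every element as a list means repeated coverage or type values survive intact, and deciding which value maps to which index field can be left to a later step.</p>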
<p>
The bigger issue surrounds linking to the original item. While each metadata record includes a unique identifier, there is seemingly no way to enable the reader to either see a full-record display at the hosting institution or see the item being described; the PastPerfect records are not associated with an actionable URI (Uniform Resource Identifier). This means each record, in order to be used and before it is indexed, will need to be associated with the postal address of the library/archive using PastPerfect, and readers will need to get in touch with the librarian/archivist if they want to use the item.
</p>
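<p>One way to picture that association step is sketched here, assuming each record is held as a dictionary of lists keyed by element name. The contact table, the institution code, and the output field names are all hypothetical; they are not part of PastPerfect or of the Portal&#8217;s schema.</p>

```python
# Hypothetical contact details keyed by a made-up institution code.
CONTACTS = {
    "example-archive": {
        "name": "Example Parish Archive",
        "address": "123 Example Street, Anytown, ST 00000",
    },
}


def add_contact(record, institution_code, contacts=CONTACTS):
    """Return a copy of the record with holding-institution contact fields."""
    contact = contacts[institution_code]
    enriched = dict(record)
    enriched["institution"] = contact["name"]
    enriched["institution_address"] = contact["address"]
    # Scope the local identifier so it stays unique across institutions.
    local_ids = record.get("identifier") or ["unknown"]
    enriched["id"] = institution_code + ":" + local_ids[0]
    return enriched
```

<p>With something like this in place, a record display could at least tell the reader whom to contact about item 2000.4.3, even though no link to the item exists.</p>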
<p>
We don&#8217;t live in an ideal metadata world, and increasingly it seems the best we can hope for is well-formed and valid metadata. Whether our metadata is complete or accurate is completely in the hands of people, not computers.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/06/pastperfect/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Harvesting metadata</title>
		<link>http://www.catholicresearch.net/blog/2011/05/harvesting-metadata/</link>
		<comments>http://www.catholicresearch.net/blog/2011/05/harvesting-metadata/#comments</comments>
		<pubDate>Tue, 24 May 2011 15:33:37 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=358</guid>
		<description><![CDATA[It is imperative for CRRA member institutions to make their metadata available for harvesting via a Web server. A couple of years ago, when the &#8220;Portal&#8221; was just beginning, the modus operandi for ingesting MARC and EAD metadata was to send it to Notre Dame, save it on local hard disk, and index it. That [...]]]></description>
			<content:encoded><![CDATA[<p>
It is imperative for CRRA member institutions to make their metadata available for harvesting via a Web server.
</p>
<p>
A couple of years ago, when the &#8220;Portal&#8221; was just beginning, the <i>modus operandi</i> for ingesting MARC and EAD metadata was to send it to Notre Dame, save it on a local hard disk, and index it. That process worked then, but as we grow it becomes less and less scalable.
</p>
<p>
Nowadays the preferred method of getting your metadata to the Portal is through harvesting. Here is how it works:
</p>
<ol>
<li><strong>Create metadata</strong> &#8211; Use whatever process you desire to create and edit your metadata. Much of what we suggest is outlined in a previous posting affectionately called &#8220;<a href="http://www.catholicresearch.net/blog/2010/08/making-your-content-available/">the recipe</a>&#8221;.</li>
<li><strong>Export metadata</strong> &#8211; If your metadata is in MARC format, then query your integrated library system for all things destined for the Portal, and save the result to a single file using the UTF-8 character set. If your metadata is in EAD format, then export it as individual files making sure they are well-formed and valid.</li>
<li><strong>Expose metadata</strong> &#8211; In either case, MARC records or EAD files, the next step is to save the metadata on a Web server. Create or have created a directory on a Web server. Put the file of MARC records and/or the EAD files in the directory. There is no need to create a Web page. Just make sure the directory&#8217;s contents are listed automatically and by default. A <a href="http://www.marquette.edu/library/crra/">good example</a> is the work done by Marquette University.</li>
<li><strong>Share the URL(s)</strong> &#8211; Once the files are on a Web server, they will have URLs. In the case of MARC records, send Notre Dame the URL of the MARC file. In the case of EAD files, send the URL of the directory.</li>
<li><strong>Repeat</strong> &#8211; This is a never-ending process. Go to Step #1. As you create, edit, and export new or different metadata, save it in the Web-accessible directory. There is no need to send the updates to Notre Dame. They will be harvested on a regular basis. There is no need to denote which records are new, changed, or deleted. Previously indexed records will be discarded and the whole lot will be re-indexed.</li>
</ol>
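<p>From the harvesting side, the steps above might look something like the following sketch. It reads an automatically generated directory listing, picks out the links ending in .xml, and checks that each file is at least well-formed. The function names are invented for illustration, the listing format is assumed to be plain HTML anchors, and this is not the harvester actually run at Notre Dame.</p>

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collect the href of every anchor in a directory listing page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def ead_urls(listing_html, directory_url):
    """Return absolute URLs of the .xml files linked from a listing."""
    collector = LinkCollector()
    collector.feed(listing_html)
    return [urljoin(directory_url, h) for h in collector.hrefs
            if h.lower().endswith(".xml")]


def is_well_formed(xml_bytes):
    """True if the bytes parse as XML; this does not validate against the EAD DTD."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


def harvest(directory_url):
    """Fetch a listing and yield (url, well_formed) for each EAD file."""
    with urlopen(directory_url) as response:
        listing = response.read().decode("utf-8", errors="replace")
    for url in ead_urls(listing, directory_url):
        with urlopen(url) as response:
            yield url, is_well_formed(response.read())
```

<p>Because the whole directory is re-read on every pass, nothing needs to track which files are new, changed, or deleted, which matches the discard-and-re-index approach described in Step #5.</p>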
<p>
There are many benefits to this process. First, the data gets duplicated. &#8220;Lots of copies keep stuff safe.&#8221; Second, Internet spiders and robots will find your data, index it, and make it accessible via their indexes. That is a good thing. Third, it gives you more control over the data and reduces the risk of Notre Dame losing it.
</p>
<p>
Just like the previous &#8220;recipe&#8221;, what is described above is only an outline. Each institution will differ slightly in its implementation. If you have any questions, then please don&#8217;t hesitate to ask.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/05/harvesting-metadata/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>&#8220;Catholic Portal&#8221; usability efforts</title>
		<link>http://www.catholicresearch.net/blog/2011/05/dac-minutes-may-12-2011/</link>
		<comments>http://www.catholicresearch.net/blog/2011/05/dac-minutes-may-12-2011/#comments</comments>
		<pubDate>Fri, 13 May 2011 19:18:36 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=349</guid>
		<description><![CDATA[This page has become the home page for the usability efforts of the &#8220;Catholic Portal&#8221;. The Digital Access Committee had a conference call on Thursday, May 12. The purpose of the meeting was to discuss usability studies. The resources (time and money) required to do the studies were emphasized. Similarly, the need to have the [...]]]></description>
			<content:encoded><![CDATA[<p>
This page has become the home page for the usability efforts of the &#8220;Catholic Portal&#8221;.</p>
<p>
The Digital Access Committee had a conference call on Thursday, May 12. The purpose of the meeting was to discuss usability studies. The resources (time and money) required to do the studies were emphasized. Similarly, the need to have the studies done with the intended audience of the Portal &#8212; upperclassmen, graduate students, faculty, and scholars &#8212; was stressed.
</p>
<p>
Ideally each institutional member of the Committee will facilitate and complete a set of usability studies by Christmas. In that vein, the following tentative list of who will do studies, and when, has been drafted:
</p>
<ul>
<li>Seton Hall during June/July</li>
<li>University of Toronto during July/August</li>
<li>Marquette University during August/September</li>
<li>Georgetown University during late August/early September</li>
<li>Catholic Theological Union during September</li>
<li>Villanova University during September/October</li>
</ul>
<p>
Individual committee members are expected to communicate with the committee as a whole by May 27 with more definite commitments.
</p>
<p>
For more information about the usability studies, see &#8220;<a href="http://www.catholicresearch.net/blog/2011/03/doing-usability-against-the-catholic-portal/">Doing usability against the &#8216;Catholic Portal&#8217;</a>&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/05/dac-minutes-may-12-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA-Tech</title>
		<link>http://www.catholicresearch.net/blog/2011/05/crra-tech/</link>
		<comments>http://www.catholicresearch.net/blog/2011/05/crra-tech/#comments</comments>
		<pubDate>Fri, 13 May 2011 18:22:44 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=346</guid>
		<description><![CDATA[This is the home page for a mailing list called CRRA-Tech. The Catholic Research Resources Alliance (CRRA) or &#8220;Catholic Portal&#8221; brings together data and metadata for the purposes of Catholic research and scholarship. This process is facilitated through a number of groups dealing with administrative issues, collection issues, metadata issues, etc. CRRA-Tech is a mailing [...]]]></description>
			<content:encoded><![CDATA[<p>
This is the home page for a mailing list called CRRA-Tech.
</p>
<p>
The <a href="http://www.catholicresearch.net/">Catholic Research Resources Alliance</a> (CRRA) or &#8220;Catholic Portal&#8221; brings together data and metadata for the purposes of Catholic research and scholarship. This process is facilitated through a number of groups dealing with administrative issues, collection issues, metadata issues, etc. CRRA-Tech is a mailing list intended to support and discuss the computer technology issues of the CRRA such as, but not limited to, the harvesting of content and metadata, the validation of content and metadata, indexing technologies, library &#8220;discovery systems&#8221;, the programming languages (PHP, Java, Perl, and JavaScript) used, log file analysis, cascading stylesheets, debugging tools, the role of open source software, etc. In short, CRRA-Tech provides a forum for discussing the computer infrastructure of the Portal.
</p>
<p>
If supporting research and scholarship through the use of computer technology is a part of your daily work and if your employer is a member of the CRRA, then consider subscribing to CRRA-Tech. To <a href='mailto:listserv@listserv.nd.edu?body=subscribe%20crra-tech'>subscribe</a>:
</p>
<ol>
<li>address a message to listserv@listserv.nd.edu</li>
<li>in the body of the message put &#8220;subscribe crra-tech&#8221;</li>
<li>send it away</li>
</ol>
<p>
You ought to get back a couple of confirmations, and you will be done.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/05/crra-tech/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CATLA Spring Conference</title>
		<link>http://www.catholicresearch.net/blog/2011/05/catla-spring-conference/</link>
		<comments>http://www.catholicresearch.net/blog/2011/05/catla-spring-conference/#comments</comments>
		<pubDate>Tue, 03 May 2011 21:59:27 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=338</guid>
		<description><![CDATA[On Friday, April 15 I had the honor and pleasure of giving a presentation to the Chicago Area Theological Library Association. This posting documents the experience. The Chicago Area Theological Library Association held its Spring Conference at Andrews University in Berrien Springs (Michigan). The conference was small, about 15 people attended. After the business meeting [...]]]></description>
			<content:encoded><![CDATA[<p>
On Friday, April 15 I had the honor and pleasure of giving a presentation to the Chicago Area Theological Library Association. This posting documents the experience.
</p>
<div id="attachment_343" class="wp-caption aligncenter" style="width: 330px"><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/05/catla.gif"><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/05/catla.gif" alt="To and from Berrien Springs (MI)" title="To and from Berrien Springs (MI)" width="320" height="240" class="size-full wp-image-343" /></a><p class="wp-caption-text">To and from Berrien Springs (MI)</p></div>
<p>
The <a href="http://catla.blogspot.com/2011/02/schedule-for-spring-2011-catla.html">Chicago Area Theological Library Association held its Spring Conference at Andrews University</a> in Berrien Springs (Michigan). The conference was small; about 15 people attended. After the business meeting I gave a presentation on &#8220;next-generation library catalogs&#8221;, digital humanities, and the &#8220;Catholic Portal&#8221;. In a nutshell I compared &#038; contrasted database applications (traditional library catalogs) with indexes (&#8220;discovery systems&#8221;). I then demonstrated a few text analysis tools and at the same time explained how these tools can be used to supplement the &#8220;close&#8221; reading process. Finally I described and demonstrated the &#8220;Catholic Portal&#8221;, and I showed how the ideas of &#8220;next-generation&#8221; library catalogs and text mining have been incorporated into it. I got lucky with the last part of the presentation because I had upgraded the Portal the previous day, and nothing went wrong during the demonstration.
</p>
<p>
After a vegetarian lunch in the University&#8217;s dining hall, we returned to the conference room for a set of lightning talks:
</p>
<ul>
<li><strong>Kate Ganski</strong> (University of Wisconsin, Milwaukee) described a needs-based marketing campaign which looked rather innovative and energetic</li>
<li><strong>Lisa Gonzalez</strong> (Catholic Theological Union) enumerated a number of cool (and &#8220;kewl&#8221;) tech tools to share with patrons</li>
<li><strong>Alan Krieger</strong> (University of Notre Dame) described the Hesburgh Libraries&#8217; newly created theological reading room and how it was being used</li>
<li><strong>Matt Ostercamp</strong> (North Park University) outlined ways to promote traditional reading in libraries, and of all the lightning talks, this one complemented my presentation the most</li>
<li><strong>Karl Stutzman</strong> (Associated Mennonite Biblical Seminary) reported on the process his library is going through to implement Primo</li>
</ul>
<p>
After the talks we were given a very nice tour of the University&#8217;s library and archive. Hosting the largest collection of Seventh-day Adventist materials in the world, the University archive was quite impressive. They actively digitize their materials and provide a home for a wide variety of materials. I was also impressed with the library&#8217;s service to the community. Specifically, they operated a charitable giving program where they received new (and used) books from a variety of sources and then shipped these books to fledgling libraries all over the world. They were putting their university&#8217;s values into practice.
</p>
<p>
I had a good time, and I appreciate the opportunity. <em>&#8220;Thank you, Lisa G., for inviting me!&#8221;</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/05/catla-spring-conference/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>April 2011 Update</title>
		<link>http://www.catholicresearch.net/blog/2011/04/april-2011-update/</link>
		<comments>http://www.catholicresearch.net/blog/2011/04/april-2011-update/#comments</comments>
		<pubDate>Fri, 29 Apr 2011 17:08:45 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=333</guid>
		<description><![CDATA[The CRRA April 2011 Update is now available at: http://bit.ly/Update_April2011. Highlights in this issue include: SAVE THE DATE! &#8220;Advancing Catholic Scholarly Research: A Symposium&#8221; November 9-10, 2011 Duquesne University Pittsburgh, PA CRRA welcomes new members: College of the Holy Cross and Creighton University!  Look for more on our newest members in next month&#8217;s Update. Welcome [...]]]></description>
			<content:encoded><![CDATA[<p>The <strong><em>CRRA April 2011 Update</em></strong> is now available at: <a href="http://bit.ly/Update_April2011">http://bit.ly/Update_April2011</a>.</p>
<p>Highlights in this issue include:</p>
<div><strong>SAVE THE DATE!</strong></div>
<div>&#8220;Advancing Catholic Scholarly Research: A Symposium&#8221;</div>
<div>November 9-10, 2011</div>
<div>Duquesne University</div>
<div>Pittsburgh, PA</div>
<ul>
<li>CRRA welcomes new members: <strong>College of the Holy Cross</strong> and <strong>Creighton University</strong>! Look for more on our newest members in next month&#8217;s <em>Update.</em></li>
<li>Welcome to <strong>Terry Ehling, Strategic Consultant</strong> to the CRRA!</li>
<li><strong>News on membership dues</strong><br />
The Membership Dues Task Force met with members of the Board and other member library directors to discuss its recommendation to adopt a multi-tiered dues structure and a proposal for dues for the coming year.</li>
<li>Highlights of the CRRA All-Members Meeting in Philadelphia. On Wed., March 30, <a title="Philadelphia participants" rel="35 CRRA members and friends" href="http://bit.ly/lWmMQD" target="_blank">35 CRRA members and friends</a> participated in the CRRA annual meeting. Agenda items included goals for fiscal year 2010/2011 (see proposed goals at <a title="draft goals" rel="httpbit.lyDraftPlan2012" href="http://bit.ly/DraftPlan2012" target="_self">http://bit.ly/DraftPlan2012</a>), <a title="usability studies" rel="usability studies" href="http://bit.ly/l7nZqN" target="_blank">usability studies</a>, and membership dues.</li>
</ul>
<p>See <a href="http://bit.ly/Update_April2011">http://bit.ly/Update_April2011</a> for more on these and other news items.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/04/april-2011-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA All-Members Meeting: A Travelogue</title>
		<link>http://www.catholicresearch.net/blog/2011/04/crra-all-members-meeting-a-travelogue/</link>
		<comments>http://www.catholicresearch.net/blog/2011/04/crra-all-members-meeting-a-travelogue/#comments</comments>
		<pubDate>Tue, 05 Apr 2011 13:41:24 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=315</guid>
		<description><![CDATA[Just about this time last week I was attending the CRRA All-Members Meeting in Philadelphia (March 29-30, 2011). This posting documents the experience. The Meeting began Tuesday afternoon, March 29, at Villanova University where attendees were treated to a number of show &#38; tell presentations describing the digital library goings-on of the Falvey Library. Joseph [...]]]></description>
			<content:encoded><![CDATA[<p>
Just about this time last week I was attending the CRRA All-Members Meeting in Philadelphia (March 29-30, 2011). This posting documents the experience.
</p>
<p>
<img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/04/meeting/crra-meeting.gif" alt="slideshow" width="190" align="right" hspace="10" vspace="5">The Meeting began Tuesday afternoon, March 29, at Villanova University where attendees were treated to a number of show &amp; tell presentations describing the digital library goings-on of the Falvey Library. Joseph Lucia began by listing a number of well-articulated reasons why open source software is akin to the values of librarianship. Most notably, he alluded to the <a href="http://www.law.duke.edu/pd/papers/boyle.pdf">Second Enclosure</a> and the very real threats to the public commons. Other presentations outlined local digitization efforts using <a href="http://code.google.com/p/tesseract-ocr/">Tesseract</a>, their institutional repository implementation, scholarly publishing with <a href="http://pkp.sfu.ca/?q=ojs">Open Journal Systems</a>, and their newly released digital library software called <a href="http://vudl.org/">VUDL</a>. I am continually impressed with the work being done by the folks at Villanova. The administration has a vision and a plan, and puts the plan into practice. &#8220;We do things for the sake of scholarship&#8230; We collaborate and find partners.&#8221; This approach to digital librarianship seems to me to be the best long-term strategy and ensures sustainability. It is not so much about getting more money but instead about setting priorities and allocating resources accordingly.
</p>
<p>
The main event took place the following day at St. Joseph&#8217;s University. Attended by thirty people or so, this particular All-Members Meeting was the largest to date. This is not surprising since the Catholic Research Resources Alliance (CRRA) now has about twenty members. We&#8217;re growing! The morning&#8217;s session focused on two issues. The first was automated procedures for getting member metadata (MARC and EAD) into the &#8220;Portal&#8221;. This is where I elaborated on the <a href="http://www.catholicresearch.net/blog/2010/08/making-your-content-available/">&#8220;recipe&#8221;</a>. Attendees seemed to think the procedure was feasible. The second discussion surrounded digitization efforts. While no plan was articulated, many people believe it is necessary to include more full-text content in the Portal. This can be done through local digitization projects, coordination with Villanova&#8217;s established program, or through the harvesting of content from the &#8216;Net. Some of this discussion bled into the interpretation of the <a href="http://www.catholicresearch.net/blog/2010/09/collection-policy-statement-for-the-catholic-portal/">Portal&#8217;s collection policy</a>. It seems to me as if the policy may not be as prescriptive as necessary. Many people seem confused by it and desire clarification regarding their content selections. After lunch most of us participated in a discussion regarding <a href="http://www.catholicresearch.net/blog/2011/03/doing-usability-against-the-catholic-portal/">usability studies</a>. This is where I outlined how we did usability here at Notre Dame and the expectations for institutional CRRA members.
</p>
<p>
In summary, the Alliance feels like it is moving ahead at a measured pace.
</p>
<p>
Finally, I relished the location of this particular meeting. Let me explain. United States Highway Route 1 starts in Key West (Florida), goes through every major city along the East Coast, and terminates at the Canadian border in Maine. United States Highway Route 30 starts at the Atlantic Ocean in New Jersey, crosses the country as the Lincoln Highway (for the most part), and ends in Oregon at the Pacific Ocean. These two cross-country highways intersect at the western-most edge of Philadelphia. The following movie was taken at this intersection &#8212; a true crossroads of America.
</p>
<p align="center">
<iframe title="YouTube video player" width="360" height="293" src="http://www.youtube.com/embed/xXkJbFVkaVg" frameborder="0" allowfullscreen></iframe>
</p>
<p>
Some day I would love to start at the end of either one of these roads, slowly drive to the other end, and take photographs all along the way.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/04/crra-all-members-meeting-a-travelogue/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Digital Humanities Forum &#8212; A Travelogue</title>
		<link>http://www.catholicresearch.net/blog/2011/04/digital-humanities-forum-a-travelogue/</link>
		<comments>http://www.catholicresearch.net/blog/2011/04/digital-humanities-forum-a-travelogue/#comments</comments>
		<pubDate>Mon, 04 Apr 2011 18:26:25 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=311</guid>
		<description><![CDATA[This is the briefest of travelogues &#8212; a description of what went on at the Digital Humanities Forum, February 24, 2011. On Thursday, February 24, the Hesburgh Libraries and the Catholic Research Resources Alliance (CRRA) sponsored the Digital Humanities Forum. The purpose of the event was to raise the awareness of the digital humanities across [...]]]></description>
			<content:encoded><![CDATA[<p>
This is the briefest of travelogues &#8212; a description of what went on at the Digital Humanities Forum, February 24, 2011.
</p>
<p>
On Thursday, February 24, the Hesburgh Libraries and the Catholic Research Resources Alliance (CRRA) sponsored the <a href="http://www.catholicresearch.net/blog/2011/01/crrand-digital-humanities-forum-february-24-25-2011/">Digital Humanities Forum</a>. The purpose of the event was to raise the awareness of the digital humanities across campus just a little bit. To that end we hosted two speakers and a couple of hands-on workshops.
</p>
<p>
The first speaker was <strong>Art Crivella</strong> (Crivella West). Crivella has recently become passionate about digital libraries and was instrumental in getting the <a href="http://www.newmanstudiesinstitute.org/library.aspx">digital access to Cardinal Newman content</a> off the ground. &#8220;Books are very hard to let go of&#8221;, and &#8220;We are developing ways to read 90 million pages [of text]&#8221;, he said. He then described some of the work his company has been doing in this regard. Digitize the &#8220;best&#8221; text possible. Make it perfect. Divide the corpus into parts: published, oratory, tracts, and personal writings. Compile lists of words and phrases denoting broad subject areas, emotional connotations, and philosophic concepts. Implement a system &#8212; which looked a lot like a concordance &#8212; allowing scholars to search the corpus and identify relevant passages of text. Select paragraphs from the search results, click a button, and find similar paragraphs. He summarized by saying, &#8220;Algorithms of the 22nd Century are just as important as the works, sets, and commentary of collections. The sum of these things constitute the library of the near future.&#8221;
</p>
<table align='center'>
<tr align='center'>
<td><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/04/forum/speakers.jpg" alt="speakers" height="150" /><br />speakers</td>
<td><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/04/forum/workshop.png" alt="workshop" height="150" /><br />workshop</td>
</tr>
</table>
<p>
The second speaker was <strong>Ron Snyder</strong>, the Director of Advanced Technology at ITHAKA, the parent organization hosting JSTOR. For the majority of the time, Snyder described the functionality of a JSTOR site called <a href="http://dfr.jstor.org">Data For Research (DFR)</a>. Built with open source software (most notably Solr), and accessible via <a href="http://dfr.jstor.org/??view=text&#038;&#038;helpview=about_api">a RESTful application programming interface (API)</a>, Data For Research provides a way to search the JSTOR content and apply data mining techniques against the result. The site outputs bibliographies, word frequencies, n-grams, keywords based on TF-IDF, and references. It supports a bit of visualization, and data sets can be delivered to programmers&#8217; desktops in the form of XML or CSV files. Snyder compared the DFR&#8217;s Web interface to a form of sculpture. &#8220;The DFR tool is like an ice sculpture where you whittle down your results.&#8221; This is true since it is entirely possible to access the sum of the JSTOR content by first entering a couple of keywords and then using the resulting facets to reduce the set to a few items. The following day, Friday, Snyder facilitated two workshops. The first was akin to a traditional bibliographic instruction session, and the second was a brief tutorial on how to use the API.
</p>
<p>
The first day&#8217;s session was attended by approximately 50 people. Just more than half were from the University, and the balance were members of the Catholic Research Resources Alliance. The workshops were attended by fewer people, but by a similar mix. The feedback I received from the event was more or less positive. Words used to describe it included &#8220;interesting&#8221;, &#8220;intense&#8221;, and &#8220;thought provoking&#8221;. I believe the Digital Humanities Forum accomplished its goal.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/04/digital-humanities-forum-a-travelogue/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Doing usability against the &#8220;Catholic Portal&#8221;</title>
		<link>http://www.catholicresearch.net/blog/2011/03/doing-usability-against-the-catholic-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2011/03/doing-usability-against-the-catholic-portal/#comments</comments>
		<pubDate>Mon, 28 Mar 2011 18:28:23 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=305</guid>
		<description><![CDATA[This posting describes a process for iteratively studying usability issues against the &#8220;Catholic Portal&#8221; with the expectation that it will be applied by each institutional member of the Digital Access Committee within the current calendar year. The posting is divided into the following sections: why do usability expectations of institutional Committee members locally required resources [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting describes a process for iteratively studying usability issues against the &#8220;Catholic Portal&#8221; with the expectation that it will be applied by each institutional member of the Digital Access Committee within the current calendar year. The posting is divided into the following sections:
</p>
<ul>
<li><a href="#why">why do usability</a></li>
<li><a href="#expectations">expectations of institutional Committee members</a></li>
<li><a href="#resources">locally required resources</a></li>
<li><a href="#work">doing the work</a></li>
<ul>
<li><a href="#questions">work with the Committee to refine the questions</a></li>
<li><a href="#practice">practice with the technology</a></li>
<li><a href="#schedule">schedule testers</a></li>
<li><a href="#do">do studies</a></li>
<li><a href="#evaluate">evaluate results</a></li>
<li><a href="#return">return results to Committee</a></li>
</ul>
<li><a href="#summary">summary and conclusion</a></li>
</ul>
<p>This document is also available as a <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/03/usability/usability-against-the-portal-(print).pdf">PDF document for printing</a>, a second PDF document designed as a <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/03/usability/usability-against-the-portal-(slides).pdf">set of slides</a>, and just for fun, an <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/03/usability/usability-against-the-portal.epub">EPUB file</a> for your mobile device.</p>
<h2><a name="why">Why do usability?</a></h2>
<p>
Why do usability? Because very few things in life are truly intuitive. We &#8212; the totality of the Catholic Research Resources Alliance &#8212; have worked hard to build a tool facilitating Catholic scholarship. This tool is functional, and by that we mean it is always online, does not crash, and does not return incorrect information. In order for the &#8220;Portal&#8221; to be successful, it needs to go beyond functionality to usability. This means it needs to be easy-to-use, contain a limited amount of jargon, and be perceived by its intended audience as a time saver.
</p>
<h2><a name="expectations">Expectations</a></h2>
<p>
Usability is an iterative process benefiting from the experience of many people. For these reasons, each institutional member of the Catholic Research Resources Alliance&#8217;s Digital Access Committee is expected to conduct usability studies against the Portal before the end of this calendar year. Using this posting as a framework, this means five or six usability studies will be done against the Portal before Christmas. The exact times when these studies will be done, and the exact way they are facilitated are left up to each institutional member as long as this document is used as a &#8220;recipe&#8221; and the balance of the Committee is consulted. This is a group process.
</p>
<h2><a name="resources">Locally required resources</a></h2>
<p>
Fiscally speaking, usability is not an expensive process. Instead, the greatest costs are measured in people&#8217;s time. In order to do this work each institutional member of the Digital Access Committee will need:
</p>
<ul>
<li>a team of at least 2 people, but 3 or 4 is much better</li>
<li>approximately 40 hours of time to be spent by the team</li>
<li>money to compensate testers&#8217; time</li>
<li>usability software, optional</li>
</ul>
<p>
The most important resource is the people. At a minimum, two are required. One will facilitate tests. The other will record observations. But there are many tasks that need to be done besides facilitation and observation. It is better to have three or four people on the team. This will make scheduling testers easier, reduce the possibility of overwhelming anybody, and take advantage of different people&#8217;s individual talents.
</p>
<p>
The whole usability process will consume approximately 40 hours of staff time. If you assume 6 to 8 hour-long usability studies will be facilitated at your institution, and two people are doing the work, then actually doing the tests will &#8220;cost&#8221; about 16 hours of time. Evaluating the totality of the tests may take each person another 2 or 3 hours, so the cost is increased by 4 to 6 hours. Scheduling people to participate is one of the most difficult parts of usability and requires a great deal of coordination. Expect to spend another 4 hours just finding and getting qualified and representative people to participate in your study. As the year progresses, we expect the Portal to change, and consequently the usability studies will change. Time will need to be spent coordinating with the Digital Access Committee on these changes, another 3 to 4 hours. If you plan to take advantage of usability software, time will need to be spent purchasing the software as well as practicing with it. Finally, time will need to be spent documenting the experience so it can most effectively be shared with the Committee.
</p>
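<p>
The arithmetic behind the 40-hour figure can be sketched in a few lines of Python. This is only a rough illustration: it takes the upper end of each estimate quoted above, the variable names are made up for the example, and whatever remains is assumed to cover the software practice and documentation that the estimate does not itemize.
</p>

```python
# Rough time budget for one institution's round of usability studies.
# Figures are the upper-end estimates from the text; names are illustrative.
TEAM_SIZE = 2          # one facilitator plus one note-taker
STUDIES = 8            # hour-long studies (the text suggests 6 to 8)
HOURS_PER_STUDY = 1

budget = {
    "facilitating studies": STUDIES * HOURS_PER_STUDY * TEAM_SIZE,
    "evaluating results": 3 * TEAM_SIZE,
    "scheduling testers": 4,
    "coordinating with the Committee": 4,
}

itemized = sum(budget.values())
# Assumption: the rest of the 40 hours goes to practice and documentation.
remaining = 40 - itemized

for task, hours in budget.items():
    print(f"{hours:3d}  {task}")
print(f"{itemized:3d}  itemized total")
print(f"{remaining:3d}  remaining of the 40-hour estimate")
```

<p>
Changing the team size or the number of studies shows quickly how the total shifts for a larger or smaller local effort.
</p>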
<p>
Usability studies cost the time of testers. It is customary to compensate these people for their time. The amount of compensation will be guided by your local policies, but things like food, gift certificates, small spending sprees at the local bookstore, or services are all examples. You might need to allocate as much as $10-$25 per usability participant for compensation.
</p>
<p>
Computer software exists to help facilitate usability studies. At the very least such software records how the participant interacts with the system being tested. More full-featured software also records the participants&#8217; facial expressions and auditory responses.  For the Macintosh we suggest Silverback which costs about $80. A popular Windows application is called Morae Recorder and costs just less than $200. Employing software in your studies makes it easier to be more thorough in your evaluation as well as enabling one to share individual studies. At the same time, the software adds a bit of complexity and expense.
</p>
<h2><a name="work">Doing the work</a></h2>
<p>
Once the goals of usability and expectations are understood, and once the resources have been allocated, it is time to actually do the work. The process is more stratified and iterative than it is sequential. In other words, it is not always necessary to complete one step before starting the next. The steps include:
</p>
<ul>
<li>refining the usability tasks to be studied</li>
<li>practicing with the technology, optional</li>
<li>scheduling testers</li>
<li>facilitating the studies</li>
<li>evaluating the results</li>
<li>reporting on the results</li>
</ul>
<p>
The following sections elaborate on each of these items.
</p>
<h3><a name="questions">Refining the usability tasks</a></h3>
<p>
Usability studies are done in an effort to learn how systems can be made easier-to-use, free of jargon, and perceived as a time saver for the intended audience. Since a primary focus of the &#8220;Portal&#8221; is to create &#8220;access to those rare, unique and uncommon research materials&#8221;, the usability study must test how well the Portal facilitates these goals. The initial set of usability studies done at Notre Dame included the following tasks:
</p>
<ol>
<li>Identify the library or archive holding the papers of Dorothy Day.</li>
<li>Find a record whose author is Graham Greene. Create an account, then add the Graham Greene record to your favorites, tagging it as &#8220;ggreene.&#8221;</li>
<li>Locate resources, including primary resources, on the Catholic Conference for Interracial Justice.</li>
<li>Find a set of records on the topic of &#8220;Catholic social action.&#8221; Choose 1-3 from the retrieved set and email them to yourself for future reference.</li>
<li>Locate materials on the topic of sermons and the Lutheran church.</li>
<li>Who owns &#8220;Our Sunday Visitor Records&#8221;? What telephone number would you call in order to schedule a time to visit the collection?</li>
<li>Which library has the most French-language materials in the &#8220;Portal&#8221;?</li>
<li>What is the most frequently used word in the pamphlet owned by Notre Dame entitled &#8220;Pastoral instruction for the application of the Decree of the Second Vatican Ecumenical Council on the Means of Social Communication&#8221;? (hint: see the record with the call number BV 4319).</li>
</ol>
<p>
Notice how the tasks touch on many different aspects of the Portal. They focus on the finding of diverse materials, identifying where they are physically located, and actually using some of them.
</p>
<p>
Since we expect to take the results of usability studies and apply them to the Portal interface as soon as feasible, the tasks outlined above may become moot over time. Consequently you will need to see how the Portal is evolving, discuss your institutional study with the Digital Access Committee, and combine the result with your own personal experiences and skills to create a new set of questions.
</p>
<p>
You don&#8217;t want more than ten tasks in any set of usability studies. Any more than that and the tests take too long to facilitate and are difficult to evaluate. Put another way, create a list of tasks that can be studied in less than an hour.
</p>
<h3><a name="practice">Practice</a></h3>
<p>
If you opted to use computer software to help with your usability studies, then you will have to acquire it and practice with it. &#8220;Practice makes perfect,&#8221; and it makes you look good when facilitating the studies.
</p>
<h3><a name="schedule">Scheduling testers</a></h3>
<p>
Scheduling usability testers has got to be one of the more difficult steps. Not only does it take a long time, but it also requires a lot of coordination and selectivity. First and foremost, it is important to schedule testers who are representative of our target audience &#8212; scholars. The Portal is intended to support research in all things Catholic. The material in the Portal is archival in nature, leans towards primary literature, and requires deep prior knowledge in order to interpret adequately. In general, this is not necessarily a system designed for undergraduates. Please make every effort to schedule faculty and graduate students working within the Portal&#8217;s subject domain.
</p>
<p>
Schedule the testers for no more than an hour at a time. Thirty to forty minutes will be spent on doing the tasks. The balance of the time can be spent on discussion and elaboration.
</p>
<p>
Between four and eight testers is usually considered enough for usability studies, but schedule about ten with the idea that some will unexpectedly drop out. It is much easier to ask people not to come than it is to find people to participate at the last minute.
</p>
<h3><a name="do">Facilitating the study</a></h3>
<p>
You&#8217;ve created your list of tasks. You&#8217;ve practiced with the optional software. Your testers are arriving. The hard parts are now behind you, and it is time to actually do a study. The process is easy. Here&#8217;s how:
</p>
<ol>
<li>One person facilitates the study, and another person takes notes. </li>
<li>Thank the participant for their time.</li>
<li>Remind the participant that they are not being tested, but rather the Portal&#8217;s interface is being tested. There are no wrong answers.</li>
<li>Emphasize to the participant the critical importance of thinking out loud. By doing so it will be much easier for you, the facilitator, to understand what is going on, and it will be easier for the note-taker to record the results.</li>
<li>When everybody is ready, go through the tasks one by one. Try really hard not to interfere with the completion of the tasks. If the participant is really off base, then intervene but don&#8217;t do so too quickly. This part of the process is sometimes difficult to watch. Be patient. Remember, you are not being tested either, only the Portal&#8217;s interface.</li>
<li>After all of the tasks have been completed, have a discussion. Ask the participant what they liked, disliked, and thought was easy or hard. Consider asking, &#8220;If you could change one thing about the interface, then what would it be?&#8221;</li>
<li>Thank the participant for their valuable time, and don&#8217;t forget to give them their compensation.</li>
<li>After the participant has gone, you may want to discuss the study among yourselves, and it is a really good idea for the note-taker to transcribe their notes for the evaluation process.</li>
</ol>
<h3><a name="evaluate">Evaluating the results</a></h3>
<p>
Once all the studies have been facilitated it is time to evaluate the results. The goal of the evaluation process is to come up with a prioritized list of things you think need to be improved with the Portal&#8217;s interface. The key word here is &#8220;prioritized&#8221;. We are sure there are many things that can be improved, but considering limited time and resources, some things need to be more important than others.
</p>
<p>
To create the prioritized list, try this:
</p>
<ol>
<li>Read the written notes, and review the optionally used software recordings. Based on our experience, we think it is easier to read the notes &#8220;across rather than down&#8221;. In other words, we found it is easier to see patterns between the studies when we compared &#038; contrasted the responses to each question from each participant as opposed to looking at each participant as a whole. For example, we looked at all the notes for task #1, and then all the notes for task #2, and then task #3, etc.</li>
<li>Based on the notes, create a prioritized list of three to five items where each item is something you think needs to be addressed. Be prepared to cite which tester and which task demonstrates the issue you think needs to be fixed.</li>
<li>Based on your professional opinion, create a second prioritized list of three to five items where each item is something you think needs to be fixed.</li>
<li>Bring together your team of people &#8212; have a meeting.</li>
<li>Go around the room asking everybody for their prioritized items, first based on the notes and second based on professional opinion. Record everything on a whiteboard, and add tick marks to items repeated by team members.</li>
</ol>
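<p>
For teams that keep their notes electronically, reading &#8220;across rather than down&#8221; amounts to regrouping the notes by task instead of by participant. The following Python sketch shows one way to do that. The tester names and remarks are hypothetical placeholders, not results from our studies, and the function name is invented for the example.
</p>

```python
from collections import defaultdict

# Hypothetical notes: one dictionary per participant, keyed by task number.
notes_by_participant = {
    "tester A": {1: "found the finding aid quickly",
                 2: "could not locate the tag field"},
    "tester B": {1: "searched by title first",
                 2: "tagging worked after creating an account"},
}

def read_across(notes):
    """Regroup notes so each task lists every participant's remarks."""
    by_task = defaultdict(list)
    for participant, tasks in notes.items():
        for task_number, remark in tasks.items():
            by_task[task_number].append((participant, remark))
    # Return the tasks in numeric order.
    return dict(sorted(by_task.items()))

notes_by_task = read_across(notes_by_participant)
for task_number, remarks in notes_by_task.items():
    print(f"Task #{task_number}")
    for participant, remark in remarks:
        print(f"  {participant}: {remark}")
```

<p>
Printed this way, all the remarks for a given task sit next to each other, which makes it easier to spot a problem that several testers ran into.
</p>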
<p>
In the end you ought to have a list of as many as a dozen issues to be addressed, and you ought to be able to sort them by the number of tick marks each received. Here at Notre Dame we had as many as five people on our team, and our list ended up looking like this:
</p>
<ul>
<li>6 &#8211; set search filter to off by default</li>
<li>5 &#8211; enable sending of more than one email at a time</li>
<li>4 &#8211; clarify difference between canonical and remote [files]</li>
<li>3 &#8211; remove autocomplete feature</li>
<li>2 &#8211; re-do text mining language</li>
<li>1 &#8211; tweak facets to be more descriptive or complete</li>
<li>1 &#8211; retain links of original EAD file in local EAD file</li>
<li>1 &#8211; respect my browser preferences</li>
<li>1 &#8211; remember [search] results after creating account</li>
<li>1 &#8211; make local EAD file the default</li>
<li>1 &#8211; implement authority control (cross-reference) functionality</li>
<li>1 &#8211; highlight search words in result [list]</li>
<li>1 &#8211; explain what facets are</li>
<li>1 &#8211; enable further search [refinements] after selecting &#8220;archival records&#8221;</li>
<li>1 &#8211; confirm adding to favorites</li>
<li>1 &#8211; add addresses and phone numbers to records</li>
</ul>
<p>
Based on these sorts of lists it is easy to see what are priorities and what are not. &#8220;Librarians love lists.&#8221;
</p>
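<p>
If a whiteboard is not handy, the tick-mark tally can be approximated with a short Python script. This is a minimal sketch: the item wording is borrowed from the list above, but who nominated what is invented for illustration, and the function name is hypothetical.
</p>

```python
from collections import Counter

# Hypothetical input: one list of prioritized items per team member.
nominations = [
    ["set search filter to off by default", "remove autocomplete feature"],
    ["set search filter to off by default", "explain what facets are"],
    ["remove autocomplete feature", "set search filter to off by default"],
]

def tally(lists):
    """Count one tick mark per mention and sort by ticks, most first."""
    ticks = Counter(item for items in lists for item in items)
    return ticks.most_common()

for item, count in tally(nominations):
    print(f"{count} - {item}")
```

<p>
Items with the same number of tick marks stay in the order they were first mentioned, so the team still needs to discuss how to break ties.
</p>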
<h3><a name="return">Reporting on the results</a></h3>
<p>
This is the last step. Simply document what you discovered &#8212; most importantly your prioritized list of issues &#8212; and share it with the Digital Access Committee. As the results come in, a concerted effort will be made to address them as soon as feasible, or at least before the next round of usability studies to be conducted by other institutional members.
</p>
<h2><a name="summary">Summary and conclusion</a></h2>
<p>
Usability studies are an effective way of learning how software interfaces can be improved. They do not need to be expensive in terms of money, but they do require time and effort. Usability studies, like software, are never done. There are always things that can be improved. Consequently, usability studies are iterative. Implement something. Test it. Apply the results. Repeat. By working together we &#8212; the Catholic Research Resources Alliance &#8212; can share the load, draw from a wide variety of experiences, and ultimately create a better Portal interface. On our mark. Get set. Go!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/03/doing-usability-against-the-catholic-portal/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Usability results from Team Catholic Portal</title>
		<link>http://www.catholicresearch.net/blog/2011/03/usability-results-from-team-catholic-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2011/03/usability-results-from-team-catholic-portal/#comments</comments>
		<pubDate>Mon, 28 Mar 2011 18:12:31 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=301</guid>
		<description><![CDATA[This posting lists the results of a usability study done against the &#8220;Catholic Portal&#8221;. In a previous posting called &#8220;Usability testing&#8221; (dated February 14, 2011) a set of eight usability questions was outlined. Since then Team Catholic Portal here at Notre Dame facilitated six usability studies made up of five graduate students and one faculty [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting lists the results of a usability study done against the &#8220;Catholic Portal&#8221;.
</p>
<p>
In a previous posting called &#8220;<a href="http://www.catholicresearch.net/blog/2011/02/usability-testing/">Usability testing</a>&#8221; (dated February 14, 2011) a set of eight usability questions was outlined. Since then Team Catholic Portal here at Notre Dame facilitated six usability studies made up of five graduate students and one faculty member. These participants were scholars in philosophy and theology. We used the simple facilitator/note-taker approach. We employed usability software (<a href="http://silverbackapp.com/">Silverback</a>), but didn&#8217;t use it to evaluate our results. Using our notes as well as professional judgement, we evaluated the results and came up with the following prioritized list of things to be addressed with the Portal&#8217;s interface:
</p>
<ul>
<li>6 &#8211; set search filter to off by default</li>
<li>5 &#8211; enable sending of more than one email at a time</li>
<li>4 &#8211; clarify difference between canonical and remote [files]</li>
<li>3 &#8211; remove autocomplete feature</li>
<li>2 &#8211; re-do text mining language</li>
<li>1 &#8211; tweak facets to be more descriptive or complete</li>
<li>1 &#8211; retain links of original EAD file in local EAD file</li>
<li>1 &#8211; respect my browser preferences</li>
<li>1 &#8211; remember [search] results after creating account</li>
<li>1 &#8211; make local EAD file the default</li>
<li>1 &#8211; implement authority control (cross-reference) functionality</li>
<li>1 &#8211; highlight search words in result [list]</li>
<li>1 &#8211; explain what facets are</li>
<li>1 &#8211; enable further search [refinements] after selecting &#8220;archival records&#8221;</li>
<li>1 &#8211; confirm adding to favorites</li>
<li>1 &#8211; add addresses and phone numbers to records</li>
</ul>
<p>
Once we have finished migrating our existing &#8220;sandbox&#8221; implementation of the Portal to production hardware, I will see about implementing the changes. Some of them require changes to the underlying VuFind software. Some of them require changes in wording.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/03/usability-results-from-team-catholic-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Goals for 2011/2012 draft</title>
		<link>http://www.catholicresearch.net/blog/2011/03/goals-for-20112012-draft/</link>
		<comments>http://www.catholicresearch.net/blog/2011/03/goals-for-20112012-draft/#comments</comments>
		<pubDate>Wed, 23 Mar 2011 19:56:47 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=297</guid>
		<description><![CDATA[At our March 30 All-members meeting in Philadelphia http://bit.ly/CRRA_SJU , we will take a look at where we&#8217;ve been and where we are going.  The strategic draft plan:  Goals for 2011/2012 will guide our discussion.  You can have a sneak preview here:  http://bit.ly/DraftPlan2012. &#8211;Pat &#160; &#160;]]></description>
			<content:encoded><![CDATA[<p>At our March 30 <em>All-members</em> meeting in Philadelphia (<a title="http://bit.ly/CRRA_SJU" href="http://bit.ly/CRRA_SJU">http://bit.ly/CRRA_SJU</a>), we will take a look at where we&#8217;ve been and where we are going.  The draft strategic plan, Goals for 2011/2012, will guide our discussion.  You can have a sneak preview here:  <a title="http://bit.ly/DraftPlan2012" href="http://bit.ly/DraftPlan2012">http://bit.ly/DraftPlan2012</a>.</p>
<p>&#8211;Pat</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/03/goals-for-20112012-draft/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>February 2011 Update</title>
		<link>http://www.catholicresearch.net/blog/2011/03/february-2011-update/</link>
		<comments>http://www.catholicresearch.net/blog/2011/03/february-2011-update/#comments</comments>
		<pubDate>Tue, 08 Mar 2011 23:38:47 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=284</guid>
		<description><![CDATA[Please mark your calendars for the March 30 All-Members Meeting and March 29 Pre-meeting Events Tuesday, March 29 – Villanova University, Philadelphia 1-4:30 Events at Falvey Memorial Library 6:30  Dinner with CRRA colleagues, place TBA Wednesday, March 30 – Saint Joseph’s University, Philadelphia 10:00-2:00 CRRA All Members Meeting and more In this update … Welcome [...]]]></description>
			<content:encoded><![CDATA[<p>Please mark your calendars for the</p>
<p><strong>March 30 <em>All-Members Meeting </em></strong>and</p>
<p><strong>March 29 Pre-meeting Events </strong></p>
<p><strong><br />
</strong><span style="text-decoration: underline;">Tuesday, March 29 – Villanova University, Philadelphia</span><strong> </strong></p>
<p>1-4:30 Events at Falvey Memorial Library</p>
<p>6:30  Dinner with CRRA colleagues, place TBA</p>
<p><span style="text-decoration: underline;">Wednesday, March 30 – Saint Joseph’s University, Philadelphia</span></p>
<p>10:00-2:00 <strong>CRRA</strong> <strong>All Members Meeting</strong></p>
<p><em>and <a href="http://bit.ly/CRRA2011">more</a></em></p>
<p><strong>In this update …</strong></p>
<ul>
<li>Welcome to <strong><a href="http://reinert.creighton.edu/">Creighton University</a>, our newest member</strong>.  Watch for more information about Creighton in the <em>March 2011</em> <em>Update</em>.</li>
<li>Congratulations to <strong>Duquesne’s Gumberg Library for winning an</strong> <strong>LSTA Grant Award</strong> to Digitize the <em>Pittsburgh Catholic Newspaper</em>!</li>
<li>Collections Committee Update:  Portal Enhancements and more …</li>
<li>CRRA at Upcoming Conferences and Outreach to Catholic Colleges and Universities, by Jennifer Younger</li>
<li>CRRA Colleagues from the <strong>University of Dayton,</strong> <strong>Marquette University</strong>, and <strong>Dominican University</strong> braved the February weather for two days of tours, presentations, workshops, and conversation at the ND/CRRA Digital Humanities Forum and Workshops at Notre Dame.</li>
<li><strong>Mark your calendars!</strong> <a href="http://bit.ly/CRRA2011">CRRA in Philadelphia</a> (March 2011). To facilitate reservations and help our hosts plan refreshments, we would appreciate a count of expected attendees.  Please email Pat Lawton at <a href="mailto:plawton@nd.edu">plawton@nd.edu</a> if you will be joining us for any or all of these events.</li>
</ul>
<div>
<hr size="2" />
</div>
<p><strong> </strong></p>
<p><strong>Congratulations to Laverna Saunders and Team at Duquesne’s Gumberg Library</strong></p>
<p>for winning a $20,000 LSTA Grant Award to digitize additional years of the <em>Pittsburgh Catholic Newspaper</em>.  The Pittsburgh Catholic, America’s oldest Catholic newspaper, has been in continuous publication since March 16, 1844.</p>
<p>Through a pilot project, Gumberg Library completed digitization of the 1844-1864 issues, and seeks to complete digitization through the year 1900. “Duquesne University has continuously microfilmed the Pittsburgh Catholic since its inception in 1844. To convert the Pittsburgh Catholic to digital format, Gumberg Library started with volume 1 issue 1 and will continue to digitize the newspaper from the oldest volumes to the newest volumes.” Grant funds will allow the Gumberg to digitize the issues from February 27, 1864 through January 3, 1900. “Adding this content to our digital library will support our service to the local Catholic community and increase the depth of our unique content to be available through the CRRA (Catholic Research portal).”</p>
<p>Catholic newspapers were frequently cited as a source of great interest by participants in focus groups conducted last year at member institutions. We look forward to making the <em>Pittsburgh Catholic</em> available to portal users and encourage all to consider adding records and/or full text items for locally held Catholic newspapers, cited by scholars as an invaluable resource.  Kudos to Duquesne!</p>
<p>&nbsp;</p>
<div>
<hr size="2" />
</div>
<p><strong>The Collections Committee Update: A View from the Collections Committee </strong></p>
<p>Earlier this month, we looked at the enhanced portal functionality under development at <a href="http://vufind.library.nd.edu/">http://vufind.library.nd.edu/</a>. You may remember that one goal in the <em>Strategic Plan 2010-11 </em>is to index the full content of EAD (Encoded Archival Description) finding aids for discovery.  With the EAD indexer and viewer in place, discoverability has increased significantly.  A search for “Dorothy Day” now retrieves items not previously found in the “non-EAD indexer” portal.   For example, this record <a href="http://vufind.library.nd.edu/Record/cuaead_id2796180">http://vufind.library.nd.edu/Record/cuaead_id2796180</a> was retrieved from the series level within the George Gilmary Higgins Papers. This exciting development underscores the value of deep indexing of archival finding aids in the portal.</p>
<p>It is also interesting to see areas in which a critical mass of items is developing along with growing member participation.  We searched four themes with the following results:</p>
<ul>
<li>Catholic social action – 1,203 responses in five institutions (and suggested topics such as “Church and social problems” and “Catholic Worker Movement” leading you to more)</li>
<li>Vatican II – 1,270 responses in ten institutions</li>
<li>Women religious – 491 responses in nine institutions</li>
<li>Catholic education – 1,626 responses in eleven institutions</li>
</ul>
<p>If you have rare or uncommon materials in these subject areas, these would be especially welcome additions to the portal. Of course, contributions relating to any of the <a href="../../About/CRRA#portal">twelve primary collecting themes</a> are helpful and highly desirable. If you have any questions about finding aids or priorities, please contact <a href="mailto:plawton@nd.edu">Pat Lawton</a> or any member of the <a href="../../About/Contact#collections">Collections Committee</a>.</p>
<div>
<hr size="2" />
</div>
<p><strong>CRRA Events at Upcoming Conferences</strong><br />
By Jennifer Younger, chair, CRRA Board of Directors</p>
<p><strong> </strong>At the CRRA meeting at Georgetown University, we decided to hold the annual CRRA membership meeting in conjunction with the ACRL Conference in Philadelphia. We will not be holding a formal event during the ALA Annual Conference in New Orleans, June 2011.  Just as is the case at other conferences, however, there may be informal get-togethers with other CRRA members or prospective members.</p>
<p>There are CRRA participants at a number of different conferences, including those of the Catholic Library Association (CLA), April 26-28, in New Orleans; the American Theological Library Association (ATLA), June 8-11, in Chicago; and the Society of American Archivists (SAA), August 22-27, in Chicago and possibly at the Association of Catholic Diocesan Archivists meeting during the SAA conference on August 24, in Chicago.  [All events are posted to the <a href="http://tiny.cc/Calendar798">CRRA calendar</a>.]</p>
<p>At the CRRA meeting in San Diego, a couple of people asked about setting up a local team.  Despite not planning formal meetings, we hope we can always find opportunities to get together, share best practices and ideas for carrying out the CRRA mission.</p>
<div>
<hr size="2" />
</div>
<p><strong>Outreach to Catholic Colleges and Universities</strong><br />
By Jennifer Younger, chair, CRRA Board of Directors</p>
<p>Our membership outreach to Catholic colleges and universities is underway.  The <a href="../../About/Organizations">directory</a> on our home page lists the Catholic colleges, universities and seminaries and the directors of their libraries and/or archives. I have sent the directors the <em>About CRRA </em>brochure as well as information on the collections policy, the benefits and principles of membership, and making content available. About 200 of the Catholic colleges, universities and seminaries in the United States belong to the Association of Catholic Colleges and Universities (ACCU).</p>
<p>Several directors responded immediately. Michael Lacroix, Creighton University, was the first to ask for an invitation and Creighton is joining as the 20<sup>th</sup> CRRA member. Others said they want to pursue membership. Still others said they have been carefully following previous news items since the CRRA began and are excited to hear from us. One director replied with information about the special collections at his institution, which I have sent to Bob O’Neill and the Collections Committee. With assistance from a good number of you, I am following up in conversations with all interested directors.  In addition, with directors and/or librarians from Marquette, St. Edward’s and Seton Hall, I have had the distinct pleasure of meeting directors, librarians and/or archivists at prospective member campuses.</p>
<p>We are on target to meet, possibly even to exceed, the goal in our strategic plan of 20 new members for next year, thanks to your good work. Your conversations with colleagues are the best way of reaching out to prospective members. Some of you will be attending the ACRL meeting in Philadelphia which follows the CRRA meetings on March 29-30.  Others of you are likely to be attending the Catholic Library Association meeting in New Orleans (April 26-28), the American Theological Library Association in Chicago (June 8-11) and/or the Society of American Archivists in Chicago (August 22-27). If there is anything that Pat or I can do to help you in reaching out to your colleagues, please do call on us.</p>
<p>&nbsp;</p>
<div>
<hr size="2" />
</div>
<p><strong>February 24-25 <a href="http://bit.ly/hWMU5j">ND/CRRA Digital Humanities Forum and Workshops at Notre Dame</a></strong></p>
<p><strong>Ann Hanlon</strong>, <strong>Rosemary Del Toro</strong>, <strong>John Jentz</strong> (Marquette University); <strong>Jillian Slater</strong>, <strong>Fred Jenkins</strong>, <strong>Fran Rice, Colleen Mahoney</strong> (University of Dayton) and <strong>Steven Szegedi</strong> of Dominican University braved the February winter weather and joined us for two days of tours, discussion, and conversation.  In addition to the planned events, Reference Librarian Bob Hohl provided us with a special look at the St. John’s Bible held by St. Mary’s College.</p>
<p>Thanks to all who made the trip!  This was our first meeting with many of you, and the event proved to be a great opportunity to learn about one another and our work. We learned about text mining and visualization techniques and how they are being implemented in the portal.  This ability to search across a large body of documents and “do stuff with them” (Eric Morgan) promises to be an important benefit for scholars interested in Catholic scholarship and to the CRRA mission.  As Bob O’Neill observed in a recent Collections Committee meeting, “text mining distinguishes the portal from other projects.”</p>
<p>*    *   *   *</p>
<div>
<hr size="2" />
</div>
<p><em>All<strong> CRRA events</strong></em><strong> </strong>and events of possible interest to members are posted to the <a href="http://tiny.cc/Calendar798">CRRA calendar</a>; please bookmark this link for future reference!</p>
<p>Check our progress and news on the <strong><em>CRRA blog</em></strong>: <a href="../">http://www.catholicresearch.net/blog/</a>.</p>
<div>
<hr size="2" />
</div>
<p><em>CRRA Update</em> is an electronic newsletter distributed via email January-November to provide members with an update of CRRA activities.  Please contact us at 574.631.1324 or email <a href="mailto:plawton@nd.edu">plawton@nd.edu</a> with your questions, comments, or news to share.</p>
<p>&#8212;&#8212;&#8212;<br />
CRRA Calendar: <a href="http://tiny.cc/Calendar798">http://tiny.cc/Calendar798</a><br />
CRRA 2011 Philadelphia All-Events: <a href="http://bit.ly/CRRA2011">http://bit.ly/CRRA2011</a><br />
CRRA 2011 All-Members Meeting Saint Joseph’s: <span style="text-decoration: underline;"><a href="http://bit.ly/CRRA_SJU">http://bit.ly/CRRA_SJU</a></span></p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/03/february-2011-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Join virtually the &#8220;CRRA/ND Digital Humanities Forum&#8221; Thursday Feb. 24</title>
		<link>http://www.catholicresearch.net/blog/2011/02/join-virtually-the-crrand-digital-humanities-forum-thursday-feb-24/</link>
		<comments>http://www.catholicresearch.net/blog/2011/02/join-virtually-the-crrand-digital-humanities-forum-thursday-feb-24/#comments</comments>
		<pubDate>Mon, 21 Feb 2011 21:54:06 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=276</guid>
		<description><![CDATA[Dear CRRA members and friends, Are you interested in learning more about Catholic digital scholarship and innovations in text mining and visualization to facilitate knowledge discovery within the Catholic portal?  Please join us virtually or in person this Thursday, Feb. 24 from 1:00 to 5:00 pm EST (noon Central, 10am Pacific).  For event details, see [...]]]></description>
			<content:encoded><![CDATA[<p>Dear CRRA members and friends,</p>
<p>Are you interested in learning more about Catholic digital scholarship and innovations in text mining and visualization to facilitate knowledge discovery within the Catholic portal?  Please join us virtually or in person this Thursday, Feb. 24 from 1:00 to 5:00 pm EST (noon Central, 10am Pacific).  For event details, see <a href="http://bit.ly/hWMU5j">http://bit.ly/hWMU5j</a>.</p>
<p>I am pleased to announce we are able to share this event with all of you via WebEx audio, video, and document-sharing technologies. Log-in at any time during the event and stay as long as you like.</p>
<p>For log-in information, just send an email to me at plawton@nd.edu.</p>
<p>We look forward to seeing you Thursday! &#8211;pat</p>
<p><strong> </strong><br />
&#8212;&#8211;<br />
Pat Lawton<br />
Digital Projects Librarian<br />
Catholic Research Resources Alliance<br />
574.631.1324 (office)<br />
608.698.2519 (cell)<br />
<a href="mailto:plawton@nd.edu">plawton@nd.edu</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/02/join-virtually-the-crrand-digital-humanities-forum-thursday-feb-24/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Usability testing</title>
		<link>http://www.catholicresearch.net/blog/2011/02/usability-testing/</link>
		<comments>http://www.catholicresearch.net/blog/2011/02/usability-testing/#comments</comments>
		<pubDate>Mon, 14 Feb 2011 13:16:57 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=271</guid>
		<description><![CDATA[As we move the &#8220;Portal&#8217;s&#8221; sandbox implementation into production we plan on doing some usability testing. Below are the questions we will be asking: Identify the library or archive holding the papers of Dorothy Day. Find a record whose author is Graham Greene. Create an account, then add the Graham Greene record to your favorites, [...]]]></description>
			<content:encoded><![CDATA[<p>
As we move the &#8220;Portal&#8217;s&#8221; sandbox implementation into production we plan on doing some usability testing. Below are the questions we will be asking:
</p>
<ol>
<li>Identify the library or archive holding the papers of Dorothy Day.</li>
<li>Find a record whose author is Graham Greene. Create an account, then add the Graham Greene record to your favorites, tagging it as &#8220;ggreene.&#8221;</li>
<li>Locate resources, including primary resources, on the Catholic Conference for Interracial Justice.</li>
<li>Find a set of records on the topic of &#8220;Catholic social action.&#8221; Choose 1-3 from the retrieved set and email them to yourself for future reference.</li>
<li>Locate materials on the topic of sermons and the Lutheran church.</li>
<li>Who owns &#8220;Our Sunday Visitor Records&#8221;? What telephone number would you call in order to schedule a time to visit the collection?</li>
<li>Which library has the most French-language materials in the &#8220;Portal&#8221;?</li>
<li>What is the most frequently used word in the pamphlet owned by Notre Dame entitled &#8220;Pastoral instruction for the application of the Decree of the Second Vatican Ecumenical Council on the Means of Social Communication&#8221;? (hint: see the record with the call number BV 4319).</li>
<li>How would you describe the overall scope of the collection?</li>
</ol>
<p>
Wish us luck.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/02/usability-testing/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Visitors&#8217; Info &#8211; February Digital Humanities Forum</title>
		<link>http://www.catholicresearch.net/blog/2011/02/visitors-info-february-digital-humanities-forum/</link>
		<comments>http://www.catholicresearch.net/blog/2011/02/visitors-info-february-digital-humanities-forum/#comments</comments>
		<pubDate>Thu, 03 Feb 2011 18:11:22 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=260</guid>
		<description><![CDATA[Visitor Information February 24-25, 2011 ND/CRRA Digital Humanities Forum University of Notre Dame Please note that we have reserved blocks of rooms at the Inn at St. Mary’s and the Morris Inn.  If you will be staying at either of these locations, please make your reservations by February 11, 2011. Lodging Inn at St. Mary’s [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Visitor Information </strong></p>
<p><strong><a href="http://www.catholicresearch.net/blog/?s=digital+humanities">February 24-25, 2011 ND/CRRA Digital Humanities Forum</a> </strong></p>
<p><strong>University of Notre Dame</strong></p>
<p>Please note that we have reserved blocks of rooms at the <em>Inn at St. Mary’s</em> and the <em>Morris Inn</em>.  If you will be staying at either of these locations, please make your reservations by February 11, 2011.</p>
<hr size="2" /><strong>Lodging<br />
</strong><a href="http://innatsaintmarys-px.trvlclick.com/index.cfm">Inn at St. Mary’s</a> is about a one-mile walk from the Notre Dame campus.  The Inn provides shuttle service to and from the airport, and to the Notre Dame campus.  For those driving, there is free parking. It is very convenient to the I-80 exit/entrance, literally around the corner.</p>
<p>We have reserved a block of rooms at the special rate of $107 per night. To make your reservation, call 1-877-567-1438 and mention you are part of the “Hesburgh Library block.”  Rooms will be held through February 11; we encourage you to make your reservations now!</p>
<p>&#8212;-</p>
<p>The <a href="http://morrisinn.nd.edu/">Morris Inn</a> is just across the street from the Eck Visitors Center.  For those staying at the Morris Inn, there is free parking.</p>
<p>We have reserved a block of rooms at the special rate of $132 per night, which includes full breakfast. To make your reservation, call 574.631.2000 and mention you are part of the “Hesburgh Library block.”</p>
<p>&#8212;-</p>
<p><a href="http://www.microtelinn.com/MicrotelInn/control/Booking/property_info?propertyId=32033&amp;cid=carat_search-Microtel&amp;gclid=CJ-Ssb7G4KYCFUdrKgodDRdO2A">Microtel South Bend Notre Dame University</a> is about one mile from campus.  The hotel is brand new with a very convenient location and free Internet in room.</p>
<hr size="2" /><strong>Directions to Notre Dame</strong></p>
<p><strong>Arrival by air</strong></p>
<p>The South Bend Regional Airport is about 15 minutes by car from the Notre Dame campus (flights should be booked to South Bend, Indiana &#8212; airport code SBN). From the airport, go east on Lincolnway West (left out of the airport) to downtown South Bend. Turn left on Indiana 933 (Michigan Street) and proceed about two miles to Angela Boulevard. Turn right onto Angela, and then turn left at the second stoplight (Eddy Street). Follow signs to visitor parking.</p>
<p>Visitors also can fly to Chicago and drive or take a bus to Notre Dame. The University is about two hours by car from Chicago’s O&#8217;Hare International Airport and about 90 minutes from Midway International Airport. From O&#8217;Hare, take I-190 east out of the airport, merge onto I-90 east (the Kennedy Expressway) toward downtown Chicago and merge with I-94 south (the Dan Ryan Expressway). Take the Skyway exit off the Dan Ryan and remain on I-90 to the Indiana Toll Road, which merges with I-80. From the Illinois border, it is about 75 miles to Exit 77 (the South Bend/Notre Dame exit).</p>
<p><strong>Arrival by car</strong></p>
<p>From the north: The University is located just south of the Indiana Toll Road (Interstate 80/90). Exit I-80/90 at Exit 77 and turn right onto Indiana 933. Make a left at the fourth stop light (Angela Boulevard), then turn left at the second stoplight (Eddy Street). Follow signs to visitor parking.</p>
<p>From the south: Take U.S. 31 north which becomes Indiana 933 just south of the city of South Bend. Stay on Indiana 933 through downtown South Bend to Angela Boulevard. Turn right onto Angela, and then turn left at the second stoplight (Eddy Street). Follow signs to visitor parking.</p>
<p><strong>Arrival by train</strong></p>
<p>The South Shore Line trains run directly from the Chicago Loop (corner of Michigan and Randolph) to South Bend Regional Airport in South Bend (about a two-hour trip). From the airport, the Notre Dame campus is approximately a 15-minute ride by car. Various transportation methods are available (e.g. taxi, rental car, limo).</p>
<hr size="2" /><strong>Parking </strong></p>
<p>Parking is available just west of the Hammes Bookstore, which is across the street from the Eck Visitors Center, or in the lot just east of the DeBartolo Performing Arts Center.   See <strong><a href="http://map.nd.edu/#/placemarks/1158/zoom/16/lat/41.696444434643446/lon/-86.23906016349792">Map.nd.edu</a></strong>.</p>
<hr size="2" /><strong>Campus maps: <a href="http://map.nd.edu/#/placemarks/1158/zoom/16/lat/41.696444434643446/lon/-86.23906016349792">Map.nd.edu</a></strong></p>
<p><a href="http://nd.edu/visitors/">Notre Dame Visitors Information</a></p>
<p><em> </em></p>
<p><em>Please send your questions about this event or RSVP (not essential but helpful in our planning) to<br />
Pat Lawton at <a href="mailto:plawton@nd.edu">plawton@nd.edu</a>.</em></p>
<p>We look forward to seeing you in February!</p>
<p><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/02/forum-visualization-bw21.jpeg"><img class="aligncenter size-medium wp-image-263" title="forum visualization bw2" src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/02/forum-visualization-bw21-300x207.jpg" alt="" width="300" height="207" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/02/visitors-info-february-digital-humanities-forum/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data warehousing Web server log files</title>
		<link>http://www.catholicresearch.net/blog/2011/01/data-warehousing-web-server-log-files/</link>
		<comments>http://www.catholicresearch.net/blog/2011/01/data-warehousing-web-server-log-files/#comments</comments>
		<pubDate>Thu, 27 Jan 2011 22:09:20 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=253</guid>
		<description><![CDATA[I have begun to create a data warehouse for CRRA (VuFind) Web server log files. This posting introduces the topic. The problem There is an understandable need/desire to know how well the &#8220;Catholic Portal&#8221; is operating. But for the life of me I was not able to enumerate metrics defining success. On the other hand, [...]]]></description>
			<content:encoded><![CDATA[<p>
I have begun to create a data warehouse for CRRA (VuFind) Web server log files. This posting introduces the topic.
</p>
<h2>The problem</h2>
<p>
There is an understandable need/desire to know how well the &#8220;Catholic Portal&#8221; is operating. But for the life of me I was not able to enumerate metrics defining success. On the other hand, <strong>Pat Lawton</strong> had no problem listing quite a few. Here are most of her suggestions:
</p>
<ul>
<li>Are users looking at records?</li>
<li>Are users searching in English? Other languages?</li>
<li>Are users using field searches?</li>
<li>Can we get a sense of the number of records viewed per search?</li>
<li>Do we know how many searches resulted in zero hits?</li>
<li>How many hits came from a Google search result? Or from another search engine?</li>
<li>How many hits per day?</li>
<li>How many times were each institution’s records viewed?</li>
<li>How many times were the Web 2.0 things used?</li>
<li>How many users set up an account?</li>
<li>How often were the tabs at the top clicked on?</li>
<li>What percentage of searches led to records being looked at?</li>
<li>What is the average number of hits retrieved per search?</li>
<li>What percentage of queries resulted in an error message?</li>
<li>What sorts of search strings are entered?</li>
<li>When are the peak periods of use? Is there a pattern?</li>
<li>Where are users coming from?</li>
<li>Which geographic locations and types of institutions?</li>
</ul>
<p>
If you know about Web (Apache) server log files, you know that answers to many of these questions can be found there, sort of. If you are a Web server administrator who deals with these log files, then you probably know about Analog, Webalizer, and Google Analytics. These tools can answer many of the questions above, but the information would need to be &#8220;gleaned&#8221; from the reports. That is time-consuming at best, and rather frustrating.
</p>
<p>
So the problem is, how do I generate regular or on-demand reports answering the questions listed above?
</p>
<h2>The solution</h2>
<p>
My initial idea was to write a computer program that regularly read the log files and output the desired answers. Upon reflection, this would be tedious because the business logic (the questions needing answers) would either be hard-coded into the program, or the program would require an abundance of command line switches. That would be complicated and not very flexible. Remember, good computer programs are programs that do one thing and do it well: the &#8220;Unix Way&#8221;.
</p>
<p>
Instead, the solution will be to first create a database (a &#8220;data warehouse&#8221;) containing log file content, and second, to provide a front-end to the database enabling people to query it. With this approach, counting the number of times anything occurs could be as easy as a single SQL (Structured Query Language) query, as opposed to tabulating tens of thousands of log file entries.
</p>
<p>
To date, the database is simple and defined by the following MySQL-specific SQL statement:
</p>
<pre><code>  CREATE TABLE IF NOT EXISTS `crra_logs`.`transactions` (
	`id`          INT           NOT NULL  AUTO_INCREMENT PRIMARY KEY,
	`host`        VARCHAR(128)  NOT NULL,
	`username`    VARCHAR(16)   NOT NULL,
	`password`    VARCHAR(16)   NOT NULL,
	`datetime`    DATETIME      NOT NULL,
	`timezone`    VARCHAR(8)    NOT NULL,
	`method`      VARCHAR(8)    NOT NULL,
	`request`     VARCHAR(1024) NOT NULL,
	`protocol`    VARCHAR(8)    NOT NULL,
	`statuscode`  VARCHAR(8)    NOT NULL,
	`bytessent`   INT           NOT NULL,
	`referrer`    VARCHAR(1024) NOT NULL,
	`useragent`   VARCHAR(1024) NOT NULL,
	`hosttype`    VARCHAR(16)   DEFAULT 'unknown',
	`requesttype` VARCHAR(16)   DEFAULT 'unknown'
  );</code></pre>
<p>
The astute Web server administrator will notice how the database&#8217;s structure mirrors almost exactly an Apache &#8220;combined&#8221; log file, with the following exceptions:
</p>
<ul>
<li><code>id</code> is a unique key</li>
<li><code>datetime</code> is a reformulation of the time stamp found in Apache&#8217;s logs</li>
<li><code>hosttype</code> and <code>requesttype</code> are fields used to classify transactions, explained below</li>
</ul>
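<p>
As a rough illustration of the <code>datetime</code> reformulation, here is a minimal Python sketch. It is not taken from the loader used for the Portal; the function name <code>split_timestamp</code> is hypothetical, and the sketch assumes the usual Apache time stamp layout with English month abbreviations.
</p>

```python
from datetime import datetime

def split_timestamp(stamp):
    """Split an Apache time stamp into a DATETIME-style string and a zone offset.

    Hypothetical helper for illustration; assumes the usual layout,
    e.g. [13/Dec/2010:10:15:32 -0500], with English month names.
    """
    parsed = datetime.strptime(stamp.strip("[]"), "%d/%b/%Y:%H:%M:%S %z")
    # The first value suits a DATETIME column, the second a short text column.
    return parsed.strftime("%Y-%m-%d %H:%M:%S"), parsed.strftime("%z")

print(split_timestamp("[13/Dec/2010:10:15:32 -0500]"))
```

<p>
Storing the offset separately, as the <code>timezone</code> column does, keeps the local time as logged while still allowing date-based queries.
</p>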
<p>
I then wrote <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/load.pl">a Perl script</a> to read log files, parse each entry into fields, and stuff the result into the database. (&#8220;Thank goodness for regular expressions!&#8221;) Once this is done it is almost trivial to answer questions like this:
</p>
<ul>
<li>How many different computers from the University of Notre Dame used the &#8220;Portal&#8221;? &#8211; <code>SELECT COUNT(host) AS c, host FROM transactions WHERE host LIKE '%.nd.edu' GROUP BY host ORDER BY c DESC</code></li>
<li>What are the 100 most popular requests sent to the server? &#8211; <code>SELECT COUNT(request) AS c, request FROM transactions GROUP BY request ORDER BY c DESC LIMIT 100</code></li>
<li>My computer&#8217;s address is lib-1234.library.nd.edu. What requests did I make against the Portal on December 13, 2010, and in what order? &#8211; <code>SELECT datetime, request FROM transactions WHERE host = 'lib-1234.library.nd.edu' AND datetime LIKE '2010-12-13%' ORDER BY datetime ASC</code></li>
</ul>
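<p>
The parsing step can be sketched roughly as follows. This is a Python illustration, not the linked Perl script. It assumes the standard Apache &#8220;combined&#8221; layout, and it assumes that the second and third fields of each entry land in the table&#8217;s <code>username</code> and <code>password</code> columns. The names <code>COMBINED</code>, <code>FIELDS</code>, and <code>parse_entry</code> are made up for this example.
</p>
<pre><code>import re
from datetime import datetime

# Apache "combined" layout:
# host ident user [time zone] "method request protocol" status bytes "referrer" "useragent"
COMBINED = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^ \]]+) ([^\]]+)\] '
    r'"(\S+) (\S+) (\S+)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$'
)

# Column names from the transactions table, in log-file order (assumed mapping)
FIELDS = ("host", "username", "password", "datetime", "timezone", "method",
          "request", "protocol", "statuscode", "bytessent", "referrer", "useragent")

def parse_entry(line):
    """Return a dict keyed by column name, or None if the line does not match."""
    match = COMBINED.match(line.strip())
    if match is None:
        return None
    row = dict(zip(FIELDS, match.groups()))
    # Reformulate 13/Dec/2010:10:15:32 as 2010-12-13 10:15:32
    stamp = datetime.strptime(row["datetime"], "%d/%b/%Y:%H:%M:%S")
    row["datetime"] = stamp.strftime("%Y-%m-%d %H:%M:%S")
    # Apache writes "-" when no bytes were sent
    row["bytessent"] = 0 if row["bytessent"] == "-" else int(row["bytessent"])
    return row</code></pre>
<p>
Each dictionary returned by <code>parse_entry</code> has one key per column, so it can be handed to a parameterized INSERT statement. Lines that do not fit the pattern come back as <code>None</code> and can be logged or skipped.
</p>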
<p>
Unfortunately, without some extra knowledge, answering Pat&#8217;s questions is still problematic. For example, how does one count &#8220;hits&#8221; against the Portal when requests from Internet robots and spiders bloat the input? How does one accurately count searches for content and record views when so many of the requests include calls for images, JavaScript files, and cascading stylesheets?
</p>
<p>
The answers lie in the use of classification as well as the <code>hosttype</code> and <code>requesttype</code> fields. Many (most) of the &#8220;hits&#8221; on the Portal come from computers in the googlebot.com domain. I know these are robots, and I can flag database records accordingly with the following SQL &#8212; <code>UPDATE transactions SET hosttype = 'robot' WHERE host LIKE '%.googlebot.com'</code>. Once I do this for all the robots hitting the Portal, I can accurately answer the question, &#8220;What computers operated by humans use the Portal the most?&#8221; &#8212; <code>SELECT COUNT(host) AS c, host FROM transactions WHERE hosttype <> 'robot' GROUP BY host ORDER BY c DESC LIMIT 100</code>.
</p>
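<p>
Flagging robots can be automated by keeping the host-name patterns in a list and looping over them. The sketch below is only an illustration: it uses Python&#8217;s built-in <code>sqlite3</code> module instead of MySQL, and the names <code>ROBOT_PATTERNS</code> and <code>flag_robots</code> are hypothetical. Only the googlebot.com pattern comes from the text above.
</p>
<pre><code>import sqlite3

# Host-name patterns for known robots; extend this list as new robots are spotted
ROBOT_PATTERNS = ["%.googlebot.com"]

def flag_robots(connection, patterns=ROBOT_PATTERNS):
    """Set hosttype to 'robot' for hosts matching any pattern; return the number of updates."""
    changed = 0
    for pattern in patterns:
        cursor = connection.execute(
            "UPDATE transactions SET hosttype = 'robot' WHERE host LIKE ?",
            (pattern,),
        )
        changed += cursor.rowcount
    connection.commit()
    return changed</code></pre>
<p>
Passing the pattern as a bound parameter keeps the SQL statement fixed while the list of robots grows.
</p>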
<p>
Because VuFind uses HTTP GET methods almost exclusively, all transactions are saved in the Web server log files. These transactions have patterns. Searches contain the string &#8220;?lookfor=&#8221;. Record views all start with &#8220;/Record/&#8221;. Requests for supporting content contain things like &#8220;.gif&#8221;, &#8220;.css&#8221;, &#8220;.js&#8221;, etc. Consequently it is easy to classify the requests with SQL statements like this &#8212; <code>UPDATE transactions SET requesttype = 'record' WHERE request LIKE '/Record/%'</code>. Now it is really easy to count the most frequent record views by humans &#8212; <code>SELECT COUNT(request) AS c, request FROM transactions WHERE hosttype <> 'robot' AND requesttype = 'record' GROUP BY request ORDER BY c DESC LIMIT 100</code>.
</p>
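<p>
The same patterns can be applied before the data ever reaches the database. What follows is a small Python sketch under a few assumptions: the labels <code>search</code> and <code>support</code> are my own (only <code>record</code> appears above), and the suffixes beyond .gif, .css, and .js are guesses at what else a Web server typically delivers.
</p>
<pre><code># File suffixes treated as supporting content (the last three are assumptions)
SUPPORT_SUFFIXES = (".gif", ".css", ".js", ".png", ".jpg", ".ico")

def classify_request(request):
    """Return 'search', 'record', 'support', or 'unknown' for a request path."""
    if "?lookfor=" in request:
        return "search"
    if request.startswith("/Record/"):
        return "record"
    # Ignore any query string when checking the file suffix
    path = request.split("?", 1)[0].lower()
    if path.endswith(SUPPORT_SUFFIXES):
        return "support"
    return "unknown"</code></pre>
<p>
The order of the tests matters: a request is checked for a search first, then for a record view, and only then for supporting content.
</p>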
<p>
Much of the work described above has been implemented in a handful of files &#8212; <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/warehouse-2011-01-27.zip">4 SQL files and 1 Perl script</a> &#8212; available for downloading. More classification work needs to be done, but the foundation has been laid. The next big steps include automating the ingestion of new log file content and building a user interface to query the database.
</p>
<h2>Summary</h2>
<p>
Log file analysis will be greatly simplified through the use of data warehousing techniques, and the consistently structured requests implemented by VuFind will make it much easier to learn who is using the Portal and how.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/01/data-warehousing-web-server-log-files/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Really simple movies</title>
		<link>http://www.catholicresearch.net/blog/2011/01/really-simple-movies/</link>
		<comments>http://www.catholicresearch.net/blog/2011/01/really-simple-movies/#comments</comments>
		<pubDate>Wed, 26 Jan 2011 21:27:37 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=246</guid>
		<description><![CDATA[I have created a set of really simple movies demonstrating the features and functions of the &#8220;Catholic Portal&#8221; &#8212; http://bit.ly/eCls8b Enjoy!]]></description>
			<content:encoded><![CDATA[<p>I have created a set of really simple movies demonstrating the features and functions of the &#8220;Catholic Portal&#8221; &#8212; <a href="http://bit.ly/eCls8b">http://bit.ly/eCls8b</a> Enjoy!</p>
<p><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/introduction.swf"><img class="aligncenter size-medium wp-image-248" title="Introduction" src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/introduction-300x192.png" alt="" width="300" height="192" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/01/really-simple-movies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA/ND Digital Humanities Forum, February 24-25, 2011</title>
		<link>http://www.catholicresearch.net/blog/2011/01/crrand-digital-humanities-forum-february-24-25-2011/</link>
		<comments>http://www.catholicresearch.net/blog/2011/01/crrand-digital-humanities-forum-february-24-25-2011/#comments</comments>
		<pubDate>Tue, 25 Jan 2011 19:31:57 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=243</guid>
		<description><![CDATA[Digital Humanities Forum and Workshops February 24-25, 2011 sponsored by Hesburgh Libraries, the Center for Research Computing (CRC), and the Catholic Research Resources Alliance (CRRA) SCHEDULE OF EVENTS Thursday, February 24, 2011 10:30               Optional tours of the Notre Dame campus and libraries (Meet at Eck Center) 11:30               Optional lunch at Legends on [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Digital Humanities Forum<br />
</strong><strong>and Workshops</strong><strong><br />
</strong><strong>February 24-25, 2011</strong></p>
<p><em>sponsored by</em><br />
Hesburgh Libraries, the Center for Research Computing (CRC), and the<br />
Catholic Research Resources Alliance (CRRA)</p>
<p><strong>SCHEDULE OF EVENTS</strong></p>
<p><strong>Thursday, February 24, 2011<br />
</strong>10:30               Optional tours of the Notre Dame campus and libraries (Meet at <a href="http://tour.nd.edu/locations/eck-visitors-center/">Eck Center</a>)<br />
11:30               Optional lunch at Legends on the ND campus</p>
<p><strong>Forum draft Agenda                                                Eck Center Auditorium<br />
</strong>1-1:30              Welcome, overview, context, and demo of the Catholic portal (Eric Morgan)<br />
1:45 &#8211; 2:45       Crivella West<br />
2:45 &#8211; 3:15       Break<br />
3:15 &#8211; 4:15       Ron Snyder, JSTOR<br />
4:15 &#8211; 5:00       Discussion, Q&amp;A<br />
5:00 &#8211; ?             Participants are invited to continue the discussion over drinks and/or dinner at <a href="http://www.kildaresirishpub.com/">Kildare’s Irish Pub</a>, within walking distance of the meeting site.  Shuttles also available.</p>
<p><strong>Friday, February 25, 2011<br />
Workshops</strong> <strong>with Ron Snyder,  JSTOR                             Hesburgh Library, Room 248 </strong><br />
<strong><br />
Draft Agenda</strong><br />
9:00                 Welcome, introductions (Eric Morgan)<br />
9:15                 The first workshop is a &#8220;bibliographic instruction&#8221; type session where participants will learn how to use JSTOR&#8217;s Data For Research interface (<a href="http://dfr.jstor.org/">http://dfr.jstor.org/</a>).<br />
10:15               Break<br />
10:45               Using JSTOR datasets.  Intended for computer programmers.  Ron Snyder will walk participants through the use of the raw datasets extracted from searches.<br />
12:00               Adjourn</p>
<p><strong>About the presenters:</strong><br />
<strong>Crivella West</strong> (<a href="http://crivellawest.com/">crivellawest.com</a>) &#8211; Working closely with St. Michael&#8217;s College of the University of Toronto, Crivella West is applying text mining computing techniques to the Cardinal Newman archives for the purposes of providing enhanced understanding of Newman&#8217;s writings and thought. Representatives from Crivella West will describe and demonstrate these techniques.</p>
<p><strong>Ronald Snyder, JSTOR</strong> (<a href="http://dfr.jstor.org/">dfr.jstor.org</a>) &#8211; Snyder is a driving force behind JSTOR&#8217;s research and development efforts. As a technologist and data miner, he will discuss DfR (Data For Research) which allows one to search JSTOR, illustrate results with charts and graphs, and download resulting datasets for further analysis. Snyder will be discussing DfR as a content discovery tool supporting research generally, and the use of DfR for obtaining datasets, both in bulk form and programmatically.</p>
<p><strong>Eric Morgan</strong> (University of Notre Dame) – Eric will provide an overview and demo of the <a href="../../">Catholic portal</a>. The Catholic portal is a project of the Catholic Research Resources Alliance whose mission is to provide enduring global access to Catholic scholarly materials.</p>
<p>Participants are welcome to come and go at any time!</p>
<p>Refreshments will be served at all events.</p>
<p><em>Please send your questions about this event or RSVP (not essential but helpful in our planning) to Pat Lawton at <a href="mailto:plawton@nd.edu">plawton@nd.edu</a>.</em></p>
<p>We look forward to seeing you in February!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/01/crrand-digital-humanities-forum-february-24-25-2011/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>VuFind, OAI-PMH, and the &#8220;Catholic Portal&#8221;</title>
		<link>http://www.catholicresearch.net/blog/2011/01/vufind-oai-pmh-and-the-catholic-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2011/01/vufind-oai-pmh-and-the-catholic-portal/#comments</comments>
		<pubDate>Fri, 14 Jan 2011 15:50:33 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=240</guid>
		<description><![CDATA[Without undue difficulty I have been able to harvest metadata from a ContentDM site via OAI-PMH, index the data in Solr, and successfully search &#38; retrieve this metadata in VuFind all for the &#8220;Catholic Portal&#8221;. This posting outlines how I did this and why it is important. Background The content of the &#8220;Portal&#8221; is expected [...]]]></description>
			<content:encoded><![CDATA[<p>
Without undue difficulty I have been able to harvest metadata from a ContentDM site via OAI-PMH, index the data in Solr, and successfully search &amp; retrieve this metadata in VuFind all for the &#8220;Catholic Portal&#8221;. This posting outlines how I did this and why it is important.
</p>
<h2>Background</h2>
<p>
The content of the &#8220;Portal&#8221; is expected to be rare, infrequently held, and uncommon. More often than not, this type of material is held in library special collections and archives. Increasingly, this same material is digitized and stored in some sort of digital repository. Any repository worth its salt supports some sort of API (application programming interface) allowing computer programs to harvest and use the underlying metadata. ContentDM is one such repository application, and OAI-PMH (Open Archives Initiative &#8211; Protocol for Metadata Harvesting) is one such API.
</p>
<p>
Duquesne University recently became a member of the CRRA (Catholic Research Resources Alliance), and it is my job to make their metadata stored in ContentDM a part of the Portal. The balance of this posting describes how I did that.
</p>
<h2>Implementation</h2>
<p>
The latest and greatest version of VuFind comes with a PHP-based utility to harvest content from OAI-PMH data providers. Using it is simple enough. Edit a configuration file. Run a program. Metadata (XML files) appear in a local directory. The utility is smart enough to keep track of harvest dates so OAI-PMH deletes and updates can be managed easily. For more detail see the <a href="http://vufind.org/wiki/importing_records#oai-pmh_harvesting" target="_blank">section on OAI harvesting</a> on the VuFind wiki.
</p>
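<p>
Keeping track of harvest dates pays off because OAI-PMH allows a harvester to ask only for records changed since a given date. The following Python sketch builds such a request. It is not VuFind&#8217;s harvester, just an illustration of the protocol; the function name <code>list_records_url</code> and its parameters are hypothetical, while <code>verb</code>, <code>metadataPrefix</code>, <code>from</code>, and <code>set</code> are arguments defined by OAI-PMH itself.
</p>
<pre><code>from urllib.parse import urlencode

def list_records_url(base_url, last_harvest=None, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request, optionally limited to changes since last_harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if last_harvest is not None:
        params["from"] = last_harvest  # a UTC date such as 2011-01-14
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)</code></pre>
<p>
Calling <code>list_records_url</code> with a repository&#8217;s base URL and the date of the previous harvest yields a request for just the new and changed records. Repositories that track deletions mark those records with a status of &#8220;deleted&#8221; in the record header.
</p>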
<p>
VuFind comes with a second PHP-based utility to index the harvested metadata. Using it requires the developer to write XSLT files, edit another configuration file, and run the program. But since my PHP skills are not nearly as strong as my Perl skills, and since I had previously indexed other XML files in a different manner, I decided not to use the PHP indexer.
</p>
<p>
My implementation is based on my previously written EAD indexing routines, described in &#8220;<a href="http://www.catholicresearch.net/blog/2010/10/indexing-ead/" target="_blank">Indexing MARC and EAD in VUFind with Solr for the CRRA</a>&#8221;. In a nutshell, the script:
</p>
<ul>
<li>reads each harvested metadata file</li>
<li>maps the Dublin Core metadata to VuFind/Solr schema fields</li>
<li>feeds the metadata to Solr</li>
</ul>
<p>
More specifically, I mapped the following Dublin Core elements to Solr schema fields like this:
</p>
<ul>
<li>contributor -&gt; author2</li>
<li>creator -&gt; author, author_letter</li>
<li>date -&gt; publishDate</li>
<li>description -&gt; description</li>
<li>format -&gt; format</li>
<li>language -&gt; language</li>
<li>publisher -&gt; publisher</li>
<li>subject -&gt; topic</li>
<li>title -&gt; title, title_auth, title_full, title_fullStr, title_full_unstemmed, title_short, title_sort</li>
<li>type -&gt; type</li>
</ul>
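<p>
The mapping above is essentially a lookup table. Here is a sketch of it in Python (the real script is written in Perl). It assumes each harvested record has already been parsed into a dictionary of element names and lists of values, and that every Solr field can hold a list, which may not be true of single-valued fields such as <code>title_sort</code>. The names <code>DC_TO_SOLR</code> and <code>map_record</code> are made up for the example.
</p>
<pre><code># Dublin Core element to Solr schema fields, as listed above
DC_TO_SOLR = {
    "contributor": ["author2"],
    "creator": ["author", "author_letter"],
    "date": ["publishDate"],
    "description": ["description"],
    "format": ["format"],
    "language": ["language"],
    "publisher": ["publisher"],
    "subject": ["topic"],
    "title": ["title", "title_auth", "title_full", "title_fullStr",
              "title_full_unstemmed", "title_short", "title_sort"],
    "type": ["type"],
}

def map_record(dublin_core):
    """Copy each Dublin Core value into every Solr field mapped to its element."""
    solr_doc = {}
    for element, values in dublin_core.items():
        for field in DC_TO_SOLR.get(element, []):
            solr_doc.setdefault(field, []).extend(values)
    return solr_doc</code></pre>
<p>
Elements without an entry in the table, such as identifier, are simply skipped by <code>map_record</code> and have to be handled separately.
</p>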
<p>
I populated additional Solr schema fields in different ways. Allfields is a concatenation of all the Dublin Core metadata elements. Fullrecord is a tiny XML file of my own design, similar to the one I created in the EAD implementation. Institution and building are presently hard-coded into the script but will later be pulled from a database containing all CRRA members. RecordType was filled with &#8220;oaidc&#8221;. Finally, the location of the remote digital object (dc:identifier) is inserted into the URL element of the fullrecord field.
</p>
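<p>
These derived fields might be filled along the following lines. Again this is a Python sketch and not the Perl script itself. The lowercase field names and the shape of the fullrecord XML (a <code>record</code> element holding <code>url</code> elements) are assumptions made for illustration; the actual fullrecord design is not shown in this posting.
</p>
<pre><code>import xml.etree.ElementTree as ET

def add_derived_fields(solr_doc, dublin_core, institution, building):
    """Add the fields that are not straight copies of a Dublin Core element."""
    # allfields: every Dublin Core value, joined with spaces
    values = [v for element in sorted(dublin_core) for v in dublin_core[element]]
    solr_doc["allfields"] = " ".join(values)
    solr_doc["institution"] = institution
    solr_doc["building"] = building
    solr_doc["recordtype"] = "oaidc"
    # fullrecord: a small XML document; these element names are hypothetical
    record = ET.Element("record")
    for url in dublin_core.get("identifier", []):
        ET.SubElement(record, "url").text = url
    solr_doc["fullrecord"] = ET.tostring(record, encoding="unicode")
    return solr_doc</code></pre>
<p>
Passing <code>institution</code> and <code>building</code> as arguments, instead of hard-coding them, is one way to prepare for pulling those values from a database of CRRA members.
</p>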
<p>
Once the mapping is done, a Perl WebService::Solr document object is created, filled with the metadata, and posted to Solr. The script is called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/oai_dc-index.pl" target="_blank">oai_dc-index.pl</a> and is available for your perusal.
</p>
<p>
The final step was to write a VuFind record driver for the new record type &#8212; oaidc. The coding for this was trivial since much of the work had been done for the EAD files. I copied EadRecord.php to <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/OaidcRecord.txt" target="_blank">OaidcRecord.php</a>, and changed the names of a couple of classes. These minor tweaks enable me to display Duquesne&#8217;s name, library, and URLs in VuFind.
</p>
<p>
The end result is a set of <a href="http://bit.ly/dJ0PCB" target="_blank">five additional records in the Portal</a>, all pointing to and providing access to digitized content from Duquesne University&#8217;s Gumberg Library.
</p>
<h2>Evaluation</h2>
<p>
The implementation is not perfect.
</p>
<p>
First of all, each of the five digitized items in Duquesne&#8217;s ContentDM implementation is a book. All of the pages in the books are accessible individually, and each page has metadata associated with it. Unfortunately, this page-level metadata is meager. Consequently I needed to delete hundreds of metadata records from the OAI-PMH harvest and retain only the book-level metadata.
</p>
<p>
Second, the script currently includes a number of hard-coded characteristics, but when other OAI-PMH data repositories become available these hard-coded characteristics will be generalized.
</p>
<p>
Why is this important? There are a number of reasons. I believe a few of our CRRA members have ContentDM implementations. Harvesting and indexing their metadata will not only make the Portal richer, but it will also make it easier for students, teachers, and researchers to access the full text of the materials online.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/01/vufind-oai-pmh-and-the-catholic-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA in San Diego January 6, 2011</title>
		<link>http://www.catholicresearch.net/blog/2011/01/crra-in-san-diego-january-6-2011/</link>
		<comments>http://www.catholicresearch.net/blog/2011/01/crra-in-san-diego-january-6-2011/#comments</comments>
		<pubDate>Wed, 12 Jan 2011 14:50:04 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=233</guid>
		<description><![CDATA[From left to right: Eric Morgan (ND), Eric Frierson (St. Ed&#8217;s), Marta Deyrup (Seton Hall), Clay Stalls (Loyola Marymount), Kris Brancolini (Loyola Marymount), Jennifer Younger (CRRA), Tyrone Cannon (Univ of San Francisco), Janice Welburn (Marquette), Jean Zanoni (Marquette), Pat Lawton (CRRA), Alma Ortega (Univ of San Diego), Theresa Byrd (Univ of San Diego), Susan Ohmer [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/crra-san-diego-2011.jpg"><img class="size-medium wp-image-234 aligncenter" title="crra san diego 2011" src="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/crra-san-diego-2011-300x199.jpg" alt="" width="300" height="199" /></a></p>
<p style="text-align: left;">From left to right: Eric Morgan (ND), Eric Frierson (St. Ed&#8217;s), Marta Deyrup (Seton Hall), Clay Stalls (Loyola Marymount), Kris Brancolini (Loyola Marymount), Jennifer Younger (CRRA), Tyrone Cannon (Univ of San Francisco), Janice Welburn (Marquette), Jean Zanoni (Marquette), Pat Lawton (CRRA), Alma Ortega (Univ of San Diego), Theresa Byrd (Univ of San Diego), Susan Ohmer (Notre Dame), Laverna Saunders (Duquesne), Diane Maher (U San Diego), Ed Starkey (U San Diego)</p>
<p style="text-align: left;">The San Diego meeting provided an opportunity for new and continuing CRRA members and friends to look at the enhanced portal, discuss future directions for the CRRA,  and last but not least,  to get to know one another.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/01/crra-in-san-diego-january-6-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA in San Diego Jan. 6, 2011</title>
		<link>http://www.catholicresearch.net/blog/2011/01/crra-in-san-diego-jan-6-2011/</link>
		<comments>http://www.catholicresearch.net/blog/2011/01/crra-in-san-diego-jan-6-2011/#comments</comments>
		<pubDate>Tue, 04 Jan 2011 23:26:06 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=227</guid>
		<description><![CDATA[We look forward to seeing many of you in San Diego for our upcoming meeting.  Full details follow and are on the web at http://tinyurl.com/crra-jan2011. Portal development is a focal point for this meeting.  Many milestones have been met and Eric will demonstrate new portal functionality including Web 2.0 features of VuFind, an EAD indexing [...]]]></description>
			<content:encoded><![CDATA[<p>We look forward to seeing many of you in San Diego for our upcoming meeting.  Full details follow and are on the web at <a href="http://tinyurl.com/crra-jan2011" target="_blank">http://tinyurl.com/crra-jan2011</a>.</p>
<p>Portal development is a focal point for this meeting.  Many milestones have been met and Eric will demonstrate new portal functionality including Web 2.0 features of VuFind, an EAD indexing and display tool, and text mining techniques to facilitate discovery and creation of new knowledge.</p>
<p>For those of you unable to join us on-site, please join via the live webcast.  You may virtually join the meeting at any time, simply by clicking on this link: <a href="http://connectpro87278527.adobeconnect.com/crra/" target="_blank">http://connectpro87278527.adobeconnect.com/crra/</a>.</p>
<p>Jeff Rach (<a href="mailto:jrach@sandiego.edu">jrach@sandiego.edu</a>) is helping us with all things technical, and has made the whole process quite seamless.  When you log in, you will see documents we are viewing, hear the discussion via audio streaming, and join in the conversation via text chat.</p>
<p>If you have any questions, please contact me via email or my cell: 608.698.2519.</p>
<p>Best wishes, safe travels, and see you online &#8211;pat</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8211;</p>
<p>University of San Diego<br />
Thursday, January 6, 2011<br />
Agenda and details<br />
5:30-8:00 pm (Eastern) 4:30-7:00 pm (Central) 2:30-5:00 pm (Pacific)</p>
<p><strong>On-site</strong><br />
Joan B. Kroc Institute for Peace and Justice (IPJ)<br />
Across from the Copley Library, University of San Diego<br />
Campus map at <a href="http://www.sandiego.edu/maps/pdf_print.php#">http://www.sandiego.edu/maps/pdf_print.php#</a></p>
<p><strong>Webcast</strong><br />
To join the meeting, follow the link:<br />
<a href="http://connectpro87278527.adobeconnect.com/crra/">http://connectpro87278527.adobeconnect.com/crra/</a><br />
Invited By: Jeff Rach (jrach@sandiego.edu)<br />
Contact Pat via cell: 608.698.2519</p>
<p><strong>Agenda</strong><br />
2:00 Refreshments<br />
2:30 Welcome, introductions (Jennifer)<br />
2:45 Steps in building the portal (Pat)<br />
3:00 Portal demo (Eric)<br />
○ EAD viewer<br />
○ Website navigability<br />
○ Concordance demonstration<br />
○ Use statistics<br />
3:45 break<br />
4:05 Discussion (Pat)<br />
○ Hosting usability studies at member institutions<br />
○ How CRRA can help members to make records available, i.e., how the<br />
process works or may be facilitated for institutions<br />
○ The NEH Challenge grant, future directions<br />
4:45 Wrap up, closing remarks (Jennifer)<br />
5:00 Adjourn</p>
<p><strong>For those who will be in San Diego</strong><br />
In addition to the meeting at 2:30, tours of the University of San Diego (USD) campus will be<br />
available to all CRRA attendees. Thanks to the generosity of Theresa Byrd, Director of the<br />
Copley Library, lunch will also be provided. Events begin at 10:00 &#8211; feel free to attend one or all<br />
events.</p>
<p>A map of campus is here: <a href="http://www.sandiego.edu/maps/pdf_print.php#">http://www.sandiego.edu/maps/pdf_print.php#</a></p>
<p>10:00 Tour of Copley Library and Archives<br />
Meet at Copley Library<br />
11:30 Lunch<br />
Held at the Joan B. Kroc Institute for Peace and Justice (IPJ) 165 Conference Room F<br />
(Board members will be meeting in IPJ 164 Conference Room G)<br />
12:30 Walking tour of campus that will include a visit to a current exhibit on campus titled<br />
Dreams &amp; Diversions: 250 Years of Japanese Woodblock Prints: http://www.signonsandiego.com/news/2010/sep/18/exhibit-of-japanese-woodblock-prints-includes/<br />
Meet at Joan B. Kroc Institute for Peace and Justice (IPJ) 165 Conference Room F<br />
2:00 Refreshments<br />
Kroc IPJ 164 Conference Room G<br />
2:30 Open Forum &#8211; CRRA all members meeting<br />
Kroc IPJ 164 Conference Room G<br />
5:00 Adjourn<br />
6:00 Dinner (on your own) at C Level Restaurant</p>
<p><strong>TRAVEL INFO</strong><br />
AIRPORT<br />
The campus is a 10-minute cab ride (approximately $15.00).<br />
Super Shuttle: http://www.supershuttle.com/</p>
<p>USD TRAM<br />
Listed below is information about USD’s tram service which runs from the campus to the Old<br />
Town Transit Center every half hour between 6:45am-10:15am and then from 3:00pm-7:30pm.<br />
Conference participants can take the San Diego Trolley from downtown to the Old Town Transit<br />
Center. Here is the link to the San Diego Trolley: http://www.sdmts.com/trolley/trolley.asp<br />
Due to the completion of finals and classes not being held Wednesday, December 22, 2010, the Tram<br />
Services Department will be running on a modified schedule. Please note the number of trams in service<br />
will be reduced and the hours of operation will be limited to the following times:<br />
Campus Routes (reduced service)<br />
6:30 a.m. to 7:30 p.m.<br />
Old Town (regular schedule)<br />
6:45 a.m. to 10:15 a.m.<br />
3:00 p.m. to 7:30 p.m.<br />
Also due to the extended holiday period the Tram Services Department will be shut down completely from<br />
December 23, 2010 through January 2, 2011. There will be no trams servicing the campus, and no trips to<br />
and from the Old Town.<br />
December 23, 2010 &#8211; January 2, 2011<br />
No tram service due to the campus closure for the holidays.<br />
<strong>For those participating via the Web</strong><br />
The San Diego meeting will be available to all members and friends via the Web using the user-friendly<br />
application, Adobe Connect. Those joining the meeting via the Web will see and hear<br />
meeting attendees in San Diego via video and audio streaming, will see shared documents,<br />
and will be able to post comments and ask questions via text chat. Please join in for as much of the<br />
meeting as you are able!<br />
For those interested in the meeting, but unable to attend in-person or via the web, the session<br />
will be archived.<br />
To join the webcast:<br />
Meeting Name: CRRA Meeting<br />
Invited By: Jeff Rach (jrach@sandiego.edu)<br />
To join the meeting:</p>
<p>http://connectpro87278527.adobeconnect.com/crra/</p>
<p>&#8212;&#8212;&#8212;&#8212;<br />
If you have never attended an Adobe Connect meeting before:<br />
Test your connection: http://connectpro87278527.adobeconnect.com/common/help/en/support/meeting_test.htm<br />
Get a quick overview: http://www.adobe.com/go/connectpro_overview<br />
&#8212;&#8212;-<br />
Jeff Rach<br />
Senior AudioVisual Technician<br />
Joan B. Kroc School of Peace Studies<br />
University of San Diego<br />
5998 Alcalá Park-KIPJ 134<br />
San Diego, CA 92110-2492<br />
Phone: (619) 260-7810<br />
Fax: (619) 260-7809<br />
jrach@sandiego.edu</p>
<p><strong>CONTACT INFO</strong><br />
Whether you are in San Diego or joining via the webcast, if you have any questions, please<br />
contact me (Pat) via my cell at any time: 608.698.2519.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/01/crra-in-san-diego-jan-6-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA in San Diego</title>
		<link>http://www.catholicresearch.net/blog/2011/01/crra-in-san-diego/</link>
		<comments>http://www.catholicresearch.net/blog/2011/01/crra-in-san-diego/#comments</comments>
		<pubDate>Tue, 04 Jan 2011 21:02:14 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=219</guid>
		<description><![CDATA[This is a simple annotated list of links used as an outline for a presentation to the CRRA in San Diego: CRRA website &#8211; The good ol&#8217; look &#038; feel but wrapped around new content and functionality. (&#8220;Thank you, Eric Frierson!&#8221;) Web 2.0 &#8211; All the Web 2.0 links (cite this, email this, favorite this) [...]]]></description>
			<content:encoded><![CDATA[<p>
This is a simple annotated list of links used as an outline for a presentation to the CRRA in San Diego:
</p>
<ol>
<li><a href="http://vufind.library.nd.edu/" target="_blank">CRRA website</a> &#8211; The good ol&#8217; look &#038; feel but wrapped around new content and functionality. (&#8220;Thank you, Eric Frierson!&#8221;)</li>
<li><a href="http://vufind.library.nd.edu/Record/marmarc_ocm53369060" target="_blank">Web 2.0</a> &#8211; All the Web 2.0 links (cite this, email this, favorite this) that did not work previously now function correctly.</li>
<li><a href="http://vufind.library.nd.edu/Record/unaead_id2671385" target="_blank">EAD viewer</a> &#8211; It is now possible to view EAD files locally or from the originating institution.</li>
<li><a href="http://vufind.library.nd.edu/Search/Results?lookfor=Carolus+Rossatius+Dei&amp;type=AllFields&amp;submit=Find" target="_blank">Item-level indexing</a> &#8211; The content of EAD files is indexed at the item level making for finer-grained searching.</li>
<li>PDF display &#8211; Records linking to digitized versions of books now enable a person to get the full text. Examples include content from <a href="http://vufind.library.nd.edu/Record/tormarc_tractatusdeactib00peri" target="_blank">St. Michael&#8217;s</a> and the <a href="http://vufind.library.nd.edu/Record/undmarc_000941495" target="_blank">University of Notre Dame</a>.</li>
<li><a href="http://vufind.library.nd.edu/Record/undmarc_000841024" target="_blank">Text mining</a> &#8211; After extracting the full text from the PDF documents, it is possible to apply concordancing techniques to the full text for analysis.</li>
<li><a href="http://www.catholicresearch.net/blog/2010/10/indexing-ead/" target="_blank">Automated updating</a> &#8211; The &#8220;Portal&#8221; can be updated automatically by harvesting metadata from member institutions, massaging it for the Portal, and re-indexing it on a regular basis.</li>
<li><a href="https://www.catholicresearch.net/admin/" target="_blank">Use statistics</a> &#8211; Rudimentary Web server log file analysis as well as Google Analytics reports illustrate how the Portal is being used.</li>
<li><a href="http://www.catholicresearch.net/blog/" target="_blank">Blog</a> &#8211; A running commentary on what&#8217;s happening with Portal development.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/01/crra-in-san-diego/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Simple log file analysis</title>
		<link>http://www.catholicresearch.net/blog/2011/01/simple-log-file-analysis/</link>
		<comments>http://www.catholicresearch.net/blog/2011/01/simple-log-file-analysis/#comments</comments>
		<pubDate>Mon, 03 Jan 2011 21:23:36 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[log file analysis]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=215</guid>
		<description><![CDATA[Today I did a bit of simple log file analysis against the Portal&#8217;s Apache log file. Specifically, I wanted to extract the queries people have been using. Naturally, I wrote a program to do this work &#8212; parse.pl. It is rather brain-dead and certainly not 100 percent accurate, but it does generate a report of [...]]]></description>
			<content:encoded><![CDATA[<p>
Today I did a bit of simple log file analysis against the Portal&#8217;s Apache log file. Specifically, I wanted to extract the queries people have been using.
</p>
<p>
Naturally, I wrote a program to do this work &#8212; <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/parse.pl">parse.pl</a>. It is rather brain-dead and certainly not 100 percent accurate, but it does generate a report of some value.
</p>
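<p>
The gist of such a program can be sketched in a few lines of Python. This is not parse.pl, which is written in Perl. It assumes the request paths have already been pulled out of the log file, and that VuFind carries the query in a parameter named <code>lookfor</code>, as the Portal&#8217;s search URLs do. The function name <code>count_queries</code> is hypothetical.
</p>
<pre><code>from collections import Counter
from urllib.parse import urlsplit, parse_qs

def count_queries(requests):
    """Tally the lookfor parameter of each request path; return a Counter."""
    tally = Counter()
    for request in requests:
        # parse_qs decodes plus signs and percent-escapes in the query string
        params = parse_qs(urlsplit(request).query)
        for query in params.get("lookfor", []):
            tally[query.strip()] += 1
    return tally</code></pre>
<p>
Calling <code>most_common(100)</code> on the result gives a top-100 list. Note that a tally like this counts every page of a result set as another query, since each page is a separate request carrying the same <code>lookfor</code> value.
</p>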
<p>
In the end, the Portal was queried approximately 18,000 times from September to December in 2010. The <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2011/01/report.txt">report</a> itself lists the top 100 queries and the number of times they were searched. The top 5 and the number of searches are:
</p>
<ol>
<li>Meditations (1462)</li>
<li>Cardinal virtues (918)</li>
<li>Newman, John Henry 1801-1890 (349)</li>
<li>Apostles (192)</li>
<li>Theological virtues (184)</li>
</ol>
<p>
The report also lists each query searched only once. Here&#8217;s a random sample:
</p>
<blockquote>
<p>
&#8220;Christian saints Algeria Hippo (Extinct city) Biography.&#8221; * &#8220;Christopher Hollis&#8221; * &#8220;De rege et regis institutione&#8221; * &#8220;DeAndreis, John A. 1920-1979&#8221; * &#8220;John Pearson (bishop)&#8221; * &#8220;John Pearson (cricketer)&#8221; * &#8220;John R. Cavanaugh&#8221; * &#8220;John R. Ryan&#8221; * &#8220;John Richard Parker&#8221; * &#8220;John Robert * Church year sermons Early works to 1800 * Church year sermons Early works to 1800 Indexes. * Self-esteem * Self-evaluation. * Seminary * Senigallia * Sermons, Chinese * Sermons, English * Sermons, German Early works to 1800 * pontificalia * portavoz * portrait * postmodernity * worldview * wrestling * yellow fever * younger * zill
</p>
</blockquote>
<p>
I think the value of 18,000 queries is high. I will have to investigate that. Based on the queries, I believe most people are browsing the system and not necessarily entering specific queries. Why do I think this? Well, who puts in all of that syntax when searching?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2011/01/simple-log-file-analysis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ND/CRRA Forum on Digital Humanities</title>
		<link>http://www.catholicresearch.net/blog/2010/12/ndcrra-forum-on-digital-humanities/</link>
		<comments>http://www.catholicresearch.net/blog/2010/12/ndcrra-forum-on-digital-humanities/#comments</comments>
		<pubDate>Thu, 16 Dec 2010 14:39:35 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=212</guid>
		<description><![CDATA[This message outlines an upcoming event tentatively called the Notre Dame/CRRA Forum on Digital Humanities: Who: Anybody and everybody across the University What: A set of presentations and workshops on digital humanities When: Thursday afternoon (February 24) and Friday morning (February 25) Where: (probably) Geddes Hall Why: Because it is about more than find and [...]]]></description>
			<content:encoded><![CDATA[<p>
This message outlines an upcoming event tentatively called the Notre Dame/CRRA Forum on Digital Humanities:
</p>
<pre>
    Who: Anybody and everybody across the University
   What: A set of presentations and workshops on
         digital humanities
   When: Thursday afternoon (February 24) and Friday
         morning (February 25)
  Where: (probably) Geddes Hall
    Why: Because it is about more than find and
         access, it is also about use and
         understanding
</pre>
<p>
The Hesburgh Libraries, the Center for Research Computing (CRC), and the Catholic Research Resources Alliance (CRRA) are jointly sponsoring a set of presentations and workshops on the digital humanities Thursday afternoon (February 24) and Friday morning (February 25). While all of the details have yet to be ironed out, we expect there to be at least two presenters on Thursday:
</p>
<ol>
<li><a href="http://crivellawest.com/">Crivella West</a> &#8211; Working closely with St. Michael&#8217;s College of the University of Toronto, Crivella West is applying text mining computing techniques to the Cardinal Newman archives for the purposes of providing enhanced understanding of Newman&#8217;s writings and thought. We expect Crivella West to describe these techniques during the Forum.</li>
<li><a href="http://dfr.jstor.org/">Ron Snyder</a> &#8211; Snyder is a driving force behind some of JSTOR&#8217;s research &amp; development efforts. He will be discussing the digital humanities in general as well as demonstrating JSTOR&#8217;s Data For Research interface which allows one to search JSTOR, illustrate results with charts &amp; graphs, and download resulting datasets for further analysis.</li>
</ol>
<p>
On Friday morning we hope to facilitate two hands-on workshops with Snyder. The first will be akin to a traditional &#8220;bibliographic instruction&#8221; session where participants will learn in detail how to use JSTOR&#8217;s Data For Research interface. This workshop is intended for scholars, researchers, and librarians. The second workshop is intended for computer programmers, and it will deal with the ins and outs of using the raw datasets extracted from searches.
</p>
<p>
In the end, the Libraries, the CRC, and the CRRA hope to raise awareness of digital humanities computing techniques. With the advent of so much full text, the Internet, and ubiquitous computing horsepower, new methods for understanding the written word are manifesting themselves. The Forum will make these ideas concrete.
</p>
<p>
&#8216;More later, but mark your calendars now.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/12/ndcrra-forum-on-digital-humanities/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Blogpost about CRRA &#8211; DePaul Univ Law Library</title>
		<link>http://www.catholicresearch.net/blog/2010/12/blogpost-about-crra-depaul-univ-law-library/</link>
		<comments>http://www.catholicresearch.net/blog/2010/12/blogpost-about-crra-depaul-univ-law-library/#comments</comments>
		<pubDate>Fri, 10 Dec 2010 19:03:36 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=204</guid>
		<description><![CDATA[CRRA is getting some press &#8230;  DePaul University Rinn Law Library for their recent blogpost “Catholic Research Resources Alliance Helps Locate Canon Law Titles” http://depaullaw.typepad.com/library/2010/09/catholic-research-resources-alliance-helps-locate-canon-law-titles.html DePaul is the CRRA&#8217;s newest member and we welcome and thank you!]]></description>
			<content:encoded><![CDATA[<p>CRRA is getting some press &#8230;  DePaul University Rinn Law Library for their recent blogpost “Catholic Research Resources Alliance Helps Locate Canon Law Titles” <a href="http://depaullaw.typepad.com/library/2010/09/catholic-research-resources-alliance-helps-locate-canon-law-titles.html">http://depaullaw.typepad.com/library/2010/09/catholic-research-resources-alliance-helps-locate-canon-law-titles.html</a></p>
<p>DePaul is the CRRA&#8217;s newest member and we welcome and thank you!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/12/blogpost-about-crra-depaul-univ-law-library/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Catholic Portal look &amp; feel</title>
		<link>http://www.catholicresearch.net/blog/2010/12/catholic-portal-look-feel/</link>
		<comments>http://www.catholicresearch.net/blog/2010/12/catholic-portal-look-feel/#comments</comments>
		<pubDate>Thu, 02 Dec 2010 19:00:39 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=200</guid>
		<description><![CDATA[Thanks to the good work done by Eric Frierson of St. Edwards University, the &#8220;sandbox&#8221; of &#8220;Catholic Portal&#8221; now sports the look &#038; feel of our public view: Moreover, since the &#8220;sandbox&#8221; runs version 1.0 of VUFind, many of the Web 2.0 links work correctly. In other words, things like emailing, tagging, citing, reviewing, [...]]]></description>
			<content:encoded><![CDATA[<p>
Thanks to the good work done by <strong>Eric Frierson</strong> of <strong>St. Edwards University</strong>, the <a href="http://vufind.library.nd.edu/">&#8220;sandbox&#8221; of &#8220;Catholic Portal&#8221;</a> now sports the look &#038; feel of our public view:
</p>
<p style='text-align: center'><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2010/12/screenshot.png" alt="screen shot"></p>
<p>
Moreover, since the &#8220;sandbox&#8221; runs version 1.0 of VUFind, many of the Web 2.0 links work correctly. In other words, things like emailing, tagging, citing, reviewing, etc. function correctly.
</p>
<p>
While we could move this whole thing into production, it may behoove us to associate hyperlinks with each found item to point back to hosting institutions to facilitate access. What do you think?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/12/catholic-portal-look-feel/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Catholic pamphlets and the &#8220;Catholic Portal&#8221;</title>
		<link>http://www.catholicresearch.net/blog/2010/11/catholic-pamphlets-and-the-catholic-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2010/11/catholic-pamphlets-and-the-catholic-portal/#comments</comments>
		<pubDate>Thu, 18 Nov 2010 17:21:35 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=195</guid>
		<description><![CDATA[This posting outlines a possible workflow for getting digitized versions of Notre Dame&#8217;s Catholic pamphlets into the &#8220;Catholic Portal&#8221;. The problem The University of Notre Dame owns a significant number of Catholic pamphlets. These materials have been cataloged and denoted as destined for the &#8220;Portal&#8221; in their MARC records with the letters &#8220;CRRA&#8221; in field [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting outlines a possible workflow for getting digitized versions of Notre Dame&#8217;s Catholic pamphlets into the &#8220;Catholic Portal&#8221;.
</p>
<h2>The problem</h2>
<p>
The University of Notre Dame owns a significant number of Catholic pamphlets. These materials have been cataloged and denoted as destined for the &#8220;Portal&#8221; in their MARC records with the letters &#8220;CRRA&#8221; in field 590$u.
</p>
<p>
The University&#8217;s library wants to digitize these materials, make the resulting PDF files freely available on the Web, apply optical character recognition against the PDF files, and support a text mining interface against the result. Bits and pieces of this work have already been done. The problem is gluing them together into functional workflow.
</p>
<h2>The solution</h2>
<p>
Here is an outline of a proposed solution:
</p>
<ol>
<li>scan the documents</li>
<li>convert them into PDF files</li>
<li>give them meaningful names</li>
<li>store them in a Web-accessible location</li>
<li>update their corresponding MARC records</li>
</ol>
<p>
Each of the sections below elaborates on the steps above.
</p>
<h3>Scan documents</h3>
<p>
Scanning the documents is the actual process of digitizing them. It can be done in-house with brute force using something like the Bookeye hardware. Or it could be outsourced. The technicalities of the process (dpi &#8212; dots per inch, color versus black &#038; white, TIFF versus JPEG versus PNG, etc.) will be driven by collection development policies and the intended use of the materials. Personally, I think the pamphlets ought to be scanned as black &#038; white images at 300 dpi and saved as uncompressed TIFF images. Compared to original manuscripts or colorful out-of-copyright materials, I do not think the pamphlets warrant anything more substantial than that.
</p>
<h3>Convert to PDF</h3>
<p>
The original TIFF images function as archival surrogates. PDF files are intended for use. The next step is to concatenate the TIFF images representing a single pamphlet into a single PDF file. During this process OCR (optical character recognition) should be applied to the images and the resulting text saved inside the PDF files. This process can be done using our existing in-house facilities or an outside jobber. A word of caution: if the scanned images include two pages, then care will need to be taken when doing the OCR. Specifically, the OCR process needs to not go all the way across the image, but rather all the way down one page first and then down the next. Otherwise the resulting plain text will not be ordered correctly.
</p>
<h3>Give the files meaningful names</h3>
<p>
There is plenty of room for interpretation in the previous two steps, but there is little or no room for interpretation in this step. Save the TIFF and PDF files with names corresponding to the pamphlets&#8217; MARC record field 001. The 001 field is a unique value across the library&#8217;s collection, and by using this value it will be easy to match a file with its description. For example, suppose the 001 field of a given pamphlet has a value of 00023459. Then assign the TIFF files names such as 00023459-001.tiff, 00023459-002.tiff, 00023459-003.tiff where everything before the dash (-) is the 001 value, and everything after the dash is a page number. Similarly, and most importantly, save the PDF files with names such as 00023459.pdf. One could save files in a single directory with a name corresponding to the 001 value, but the principle is the same. Explicitly associate the saved files with a MARC record. If the 001 field is not used during the file naming process, then updating the MARC records with URIs, below, will be considerably more expensive.
</p>
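<p>
The naming convention can be sketched in a few lines of Python. This is only an illustration of the scheme described above. The function names are hypothetical, and the sketch assumes the 001 value itself contains neither a dash nor a period.
</p>

```python
def tiff_name(control_number, page):
    """Name one page image: 001 value, a dash, then a zero-padded page number."""
    return f"{control_number}-{page:03d}.tiff"


def pdf_name(control_number):
    """Name the PDF for a whole pamphlet after its 001 value."""
    return f"{control_number}.pdf"


def control_number_from(filename):
    """Recover the 001 value from either kind of file name."""
    stem = filename.rsplit(".", 1)[0]   # drop the extension
    return stem.split("-", 1)[0]        # drop the page number, if any


print(tiff_name("00023459", 1))
print(pdf_name("00023459"))
print(control_number_from("00023459-003.tiff"))
```

<p>
Because the 001 value can be recovered from every file name, matching a file to its MARC record requires no lookup table.
</p>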
<h3>Store files</h3>
<p>
The next step is to store the files in a Web-accessible location. This could be as simple as putting them on a computer&#8217;s hard disk and providing access via the Web. Unfortunately, this process is not very scalable after tens of thousands of files. Consequently, it may be better to save the files in a repository-like application such as Fedora. The technology behind storing the files is not as important as the resulting URI/URL. It is very important to make sure the URL is constant and immutable. If it is not, then the risk of an ongoing maintenance nightmare is dramatically increased. It is also better if the URLs are shorter rather than longer. Put in the language of the Web, follow the principles of &#8220;<a href="http://www.w3.org/TR/cooluris/">cool URLs</a>&#8221; when making the digitized content Web-accessible. [1]
</p>
<h3>Update MARC</h3>
<p>
Links now need to be made between the description of the pamphlets and their location on the Web. To do this one loops through all of the digitized pamphlets and updates the 856 fields of the corresponding MARC records with the URI/URL created in the previous step. This can be done manually, but it can be done programmatically if the files have been saved with values based on MARC field 001 values. Once this process is completed it will be possible for the patron to search the library&#8217;s catalog or &#8220;discovery system&#8221;, identify a Catholic pamphlet of interest, and then choose to retrieve it from Special Collections or download it from the Web.
</p>
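<p>
The programmatic route might begin with something like the following Python sketch. It only pairs each 001 value with the URL of its PDF file. Writing that URL into the 856 field of the matching record is left to whatever MARC toolkit is at hand. The base URL is hypothetical, and the sketch assumes the PDF files are named after their 001 values.
</p>

```python
from pathlib import PurePosixPath

# Hypothetical Web-accessible location of the PDF files.
BASE_URL = "http://example.org/pamphlets/"


def links_for(filenames, base_url=BASE_URL):
    """Map each MARC 001 value to the URL of its PDF, skipping other files."""
    links = {}
    for name in filenames:
        path = PurePosixPath(name)
        if path.suffix.lower() != ".pdf":
            continue  # TIFF page images and other files get no link
        links[path.stem] = base_url + path.name
    return links


listing = ["00023459.pdf", "00023459-001.tiff", "00023460.pdf"]
for control_number, url in links_for(listing).items():
    # Each pair says: put this URL in the 856 field of the record with this 001 value.
    print(control_number, url)
```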
<h3>Optionally, update MARC again</h3>
<p>
The process outlined above provides access to the materials, but if we want to make it easier for the patron to use and manipulate the materials, then an additional URI/URL may need to be added to the MARC record. Specifically, an additional URI/URL in field 856 may need to be included pointing to a text mining interface. Just like the URI/URL pointing to the PDF file, this URI/URL needs to be as constant and immutable as possible. If this additional step is done, then not only will the patron have access to the materials in physical and digital form, but they will also be able to perform various text mining functions against them. Examples include: listing all the words starting with the letter z and the number of times they occur, listing the 50 most frequently used words, listing the 10 most frequently used two-word phrases, and concordancing the results of the previous examples, thus displaying them in context.
</p>
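<p>
To make the examples above concrete, here is a rough Python sketch of each function. It is not the text mining interface itself. The tokenizer is deliberately naive (lowercased runs of letters and apostrophes), and the sample sentence is made up for illustration.
</p>

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words; a naive tokenizer for illustration."""
    return re.findall(r"[a-z']+", text.lower())


def words_starting_with(tokens, letter):
    """Count each word beginning with the given letter."""
    return Counter(t for t in tokens if t.startswith(letter))


def most_frequent_words(tokens, n=50):
    """List the n most frequently used words."""
    return Counter(tokens).most_common(n)


def most_frequent_bigrams(tokens, n=10):
    """List the n most frequently used two-word phrases."""
    return Counter(zip(tokens, tokens[1:])).most_common(n)


def concordance(tokens, word, width=3):
    """Show every occurrence of word with `width` words of context on each side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == word:
            start = max(0, i - width)
            lines.append(" ".join(tokens[start:i + width + 1]))
    return lines


tokens = tokenize("Zeal for the faith is zeal for the truth.")
print(words_starting_with(tokens, "z"))
print(most_frequent_words(tokens, 3))
print(most_frequent_bigrams(tokens, 2))
print(concordance(tokens, "faith", width=2))
```

<p>
The plain text extracted from a PDF file could be fed to <code>tokenize</code> in the same way as the sample sentence.
</p>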
<h2>Summary</h2>
<p>
If the process outlined above is implemented, then the &#8220;Catholic Portal&#8221; software will be able to regularly and systematically harvest the Catholic pamphlet MARC records and integrate them into the CRRA in exactly the same way all of the other CRRA content is harvested.
</p>
<p>
Providing access and use of Catholic pamphlets is of interest to both the University of Notre Dame as well as the Catholic Research Resources Alliance (CRRA). Digitizing the materials, making them Web-accessible, and integrating their locations into our collection is a way of exploiting the current technological environment as well as meeting patron expectations. By providing text mining functionality we &#8212; the Libraries and the CRRA &#8212; will be exemplifying leadership in the wider community.
</p>
<h2>Notes</h2>
<p>
[1] &#8220;Cool URIs for the Semantic Web&#8221; &#8212; <a href="http://www.w3.org/TR/cooluris/">http://www.w3.org/TR/cooluris/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/11/catholic-pamphlets-and-the-catholic-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Text mining Catholic pamphlets</title>
		<link>http://www.catholicresearch.net/blog/2010/11/text-mining-catholic-pamphlets/</link>
		<comments>http://www.catholicresearch.net/blog/2010/11/text-mining-catholic-pamphlets/#comments</comments>
		<pubDate>Tue, 16 Nov 2010 22:08:55 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=190</guid>
		<description><![CDATA[This is the quickest of blog postings outlining how I am initially providing a text mining interface to digitized Catholic pamphlets. Jean McManus used a scanner to create PDF versions of a few Catholic pamphlets. Along the way, she also had the software do a bit of OCR. She then gave the PDF documents to [...]]]></description>
			<content:encoded><![CDATA[<p>
This is the quickest of blog postings outlining how I am initially providing a text mining interface to digitized Catholic pamphlets.
</p>
<p>
<strong>Jean McManus</strong> used a scanner to create PDF versions of a few Catholic pamphlets. Along the way, she also had the software do a bit of OCR. She then gave the PDF documents to me with filenames matching MARC 001 fields.
</p>
<p>
I saved these files to our local file system and used the venerable pdftotext application to extract the plain text. I then hacked my locally harvested MARC records describing the given pamphlets with two additional URLs. One pointing to the local PDF file. Another pointing to a rudimentary text mining interface. Finally, I reindexed the MARC records making the URLs visible. There were only three edited records, and you can see the fruits of these labors here:
</p>
<ul>
<li><a href="http://vufind.library.nd.edu/Record/undmarc_000537132">http://vufind.library.nd.edu/Record/undmarc_000537132</a></li>
<li><a href="http://vufind.library.nd.edu/Record/undmarc_000841024">http://vufind.library.nd.edu/Record/undmarc_000841024</a></li>
<li><a href="http://vufind.library.nd.edu/Record/undmarc_000941495">http://vufind.library.nd.edu/Record/undmarc_000941495</a></li>
</ul>
<p>
There are many things wrong with the implementation. The text mining interface points to invalid catalog records because they are hard-coded for University of Toronto content. The titles of the content include MARC field 245$c, but the older text mining interface did not expect this. Consequently, the title information for these newly added records is invalid. The PDF documents were scanned two pages at a time. This probably causes the extracted text to span both pages and thus invalidate every sentence. We will need to scan only one page per image to circumvent this problem.
</p>
<p>
Despite these difficulties, it is possible now to do a bit of analysis against the pamphlet, but there are many avenues for improvement. &#8220;Software is never done.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/11/text-mining-catholic-pamphlets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VUFind record drivers and templates</title>
		<link>http://www.catholicresearch.net/blog/2010/11/vufind-record-drivers-and-templates/</link>
		<comments>http://www.catholicresearch.net/blog/2010/11/vufind-record-drivers-and-templates/#comments</comments>
		<pubDate>Thu, 11 Nov 2010 02:13:26 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=177</guid>
		<description><![CDATA[This posting documents how I wrote and edited a couple of VUFind record drivers and Smarty templates for the &#8220;Portal&#8221; of the Catholic Research Resources Alliance. In writing this posting I hope to support any developer coming behind me as well as inform the wider open source community on how VUFind works. The Problem The [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting documents how I wrote and edited a couple of VUFind record drivers and Smarty templates for the &#8220;Portal&#8221; of the Catholic Research Resources Alliance. In writing this posting I hope to support any developer coming behind me as well as inform the wider open source community on how VUFind works.
</p>
<h2>The Problem</h2>
<p>
The heart of my problem is that the Portal is essentially a union catalog &#8212; an aggregation of metadata from a number of archives and libraries. Search results do not come from a single institution but a multitude. Search results need to display specific information about locations &#8212; information describing what library owns each item, what institution hosts the library, and the call number of selected items. Out-of-the-box VUFind is designed to dynamically query a library&#8217;s integrated library system (ILS), but I deemed this too complicated for our purposes. Too many different systems. Too expensive in both time and energy.
</p>
<p>
Consequently, my problem is, &#8220;How do I display location information in a multi-library environment?&#8221;
</p>
<h2>The Solution</h2>
<p>
The solution includes three parts: 1) exploiting the indexer (Solr) supporting VUFind, 2) creating and editing VUFind record drivers, and 3) editing Smarty templates.
</p>
<h3>Exploiting the indexer</h3>
<p>
The <a href="http://www.catholicresearch.net/blog/2010/10/indexing-ead/">first part of the solution was documented in a previous posting</a> where I described how metadata in the form of MARC records and EAD files is being indexed with Solr. To summarize, as each record is fed to Solr, three bits of information are saved in the index: 1) the type of data being indexed, 2) the name of the library holding the index item, and 3) the name of the institution hosting the library. This part works well and is pretty much divorced from the internal workings of VUFind.
</p>
<h3>Creating and editing record drivers</h3>
<p>
The second part of the solution was about creating and editing VUFind record drivers. VUFind is designed to branch to alternative sets of code based on the value of a Solr field called record type. In the case of our Portal, a record type may be &#8220;marc&#8221; or &#8220;ead&#8221;. VUFind handles MARC out-of-the-box. As documented in the VUFind wiki, to handle things other than MARC <a href="http://vufind.org/wiki/other_than_marc">a record driver needed to be created</a>. This is what I did, and in our case the driver is called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/11/EadRecord-php.txt">EadRecord.php</a>.
</p>
<p>
The astute PHP programmer will see that the driver inherits the vast majority of its functionality from the IndexRecord.php; EadRecord.php only overrides two methods. The first (<code>getURLs</code>) reads the value of the fullrecord field, uses an XPath expression to extract all the URLs from the fullrecord field, and returns the URLs as an associative array. The second method (<code>getHoldings</code>) is almost an exact duplicate of getHoldings from IndexRecord.php, but creates two new Smarty tokens (CRRALibrary and CRRAInstitution). Like the URLs, the values for these tokens are pulled directly from the fullrecord field.
</p>
<p>
Next I needed to edit the MarcRecord.php and IndexRecord.php record drivers. Editing MarcRecord.php was trivial. All I had to do was turn off a Boolean value. Specifically, I had to set the value of <code>summAjaxStatus</code> to false because I did not want VUFind to try to query a remote ILS for holdings information. (I think this sort of thing ought to be a configuration setting in config.ini, but that is for another time.)
</p>
<p>
Editing IndexRecord.php was almost as easy. Just like EadRecord.php, I needed to create two new Smarty tokens (CRRALibrary and CRRAInstitution) and stuff them into templates. This was done in the <code>getHoldings</code> and <code>getSearchResult</code> methods. Additionally, in <code>getHoldings</code>, I needed to create a third token &#8212; summCallNo to hold call numbers. Again, just like EadRecord.php, this was done by pulling values out of the fullrecord field. Very, very nice.
</p>
<h3>Editing templates</h3>
<p>
The last part of the solution was editing the Smarty templates. This was done by adding my newly created tokens (CRRALibrary, CRRAInstitution, and summCallNo) in both result.tpl and holdings.tpl. I find Smarty to be a bit obtuse because it requires the developer to understand YASS (&#8220;Yet Another Scripting Syntax&#8221;). Personally, I don&#8217;t think logic such as if-then statements should exist in template files. Logic belongs in code.
</p>
<h2>Observations</h2>
<p>
In retrospect, the whole process from indexing to display was pretty efficient. Gather metadata. Parse it. Feed it to indexer. Search. Get results. Parse (something else). Fill templates. Display. In order for this process to be maintainable, I need to hope things like IndexRecord.php and MarcRecord.php do not change over upgrades. As new versions of VUFind come along I will replace the IndexRecord.php and MarcRecord.php with my versions as well as drop in EadRecord.php. Unfortunately, I don&#8217;t think I can guarantee zero changes in IndexRecord.php and MarcRecord.php. These are gotchas I will need to expect. Similarly, I added new logic to my template files. Another thing to keep in mind but not as detrimental as changes to the record drivers.
</p>
<p>
Finally, I want to publicly acknowledge Demian Katz&#8217;s assistance. He outlined what needed to be done clearly and succinctly, both in writing as well as verbally. His description of VUFind&#8217;s architecture was very understandable, especially for this Perl programmer. He knows the application very well. <em>Thank you.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/11/vufind-record-drivers-and-templates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Internet Archive content, VUFind (Solr), and text mining</title>
		<link>http://www.catholicresearch.net/blog/2010/10/ia2vufind/</link>
		<comments>http://www.catholicresearch.net/blog/2010/10/ia2vufind/#comments</comments>
		<pubDate>Tue, 26 Oct 2010 20:06:12 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=171</guid>
		<description><![CDATA[The posting outlines how I have: 1) mirrored metadata and full text content from the Internet Archive, 2) made the mirrored content accessible through VUFind, and 3) implemented a rudimentary text mining interface against the mirror. Background The &#8220;Catholic Portal&#8221; is intended to be a research tool centered around &#8220;rare, unique, and uncommon&#8221; materials of [...]]]></description>
			<content:encoded><![CDATA[<div>
<p>The posting outlines how I have: 1) mirrored metadata and full text content from the Internet Archive, 2) made the mirrored content accessible through VUFind, and 3) implemented a rudimentary text mining interface against the mirror.</p>
<h2>Background</h2>
<p>The &#8220;Catholic Portal&#8221; is intended to be a research tool centered around &#8220;rare, unique, and uncommon&#8221; materials of a Catholic nature. Many of these sorts of things are older as opposed to newer, and therefore, many of these things are out of copyright. Projects such as Google Books and the Open Content Alliance specialize in the mass digitization of out of copyright materials. By extension we can hope some of the things apropos to the Portal have been digitized by one or more of these projects.</p>
<p>Very recently St. Michael&#8217;s College in the University of Toronto has become a member of the Catholic Research Resources Alliance, and consequently, they desire to contribute to the Portal. As it just so happens, the University of Toronto has been a big proponent of mass digitization. They have been working with the Open Content Alliance for quite a while. Much of their content, including content from St. Michael&#8217;s, has been digitized. Complete with MARC records, PDF files, and plain text, these digital artifacts are freely available for downloading. Moreover, the availability of full text content opens up the doors to all sorts of text mining and digital humanities computing techniques in library &#8220;discovery systems&#8221;. Collocations. Word clouds. Graphing and mapping. Concordancing. Etc. As an example of one such discovery system, the Portal not only provides access to the content, but it can also make the content useful.</p>
<p>With input from <strong>Dave Hagelaar</strong>, <strong>Pat Lawton</strong>, and <strong>Remi Pulwer</strong> I implemented all of the things above, to some degree. The balance of this posting describes how.</p>
<h2>The Process</h2>
<p>Dave Hagelaar from St. Michael&#8217;s College sent me a set of around 600 Internet Archive unique identifiers from their collection representing &#8220;rare, unique, and uncommon&#8221; materials. Based on <a href="http://bit.ly/aTPkPJ">previous work</a>, I was able to harvest the metadata, mirror the content, and integrate the whole into our VUFind interface. The process included the following steps:</p>
<ol>
<li><strong>Convert identifiers</strong> &#8211; Each of the Internet Archive identifiers (keys) represents a Web page complete with metadata and links to digital content. The identifiers look something like this: delancienneetdel00rich. Given this information, sets of URLs can be constructed pointing to locations at the Archive. Creating a set of URLs based on the list of keys was done with a trivial Perl script called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/harvesting/keys2urls.pl">keys2urls.pl</a>. The resulting URLs look like this:
<ul>
<li>MARC &#8211; <a href="http://www.archive.org/download/delancienneetdel00rich/delancienneetdel00rich_meta.mrc">http://www.archive.org/download/delancienneetdel00rich/delancienneetdel00rich_meta.mrc</a></li>
<li>PDF &#8211; <a href="http://www.archive.org/download/delancienneetdel00rich/delancienneetdel00rich.pdf">http://www.archive.org/download/delancienneetdel00rich/delancienneetdel00rich.pdf</a></li>
<li>plain text &#8211; <a href="http://www.archive.org/download/delancienneetdel00rich/delancienneetdel00rich_djvu.txt">http://www.archive.org/download/delancienneetdel00rich/delancienneetdel00rich_djvu.txt</a></li>
</ul>
</li>
<li><strong>Mirror content</strong> &#8211; The next step was to copy the remote data locally &#8212; mirror it. This was done using the venerable <a href="http://www.gnu.org/software/wget/">wget</a> program. Essentially, wget is called with a very long set of parameters as well as the output from Step #1. The result is a local cache of MARC, PDF, and plain text files. Since these files were saved in their own directory on an HTTP file system, each file has its own URL. To make life easier, the running of wget with all of its parameters was implemented as a simple shell script &#8212; <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/harvesting/mirror.sh">mirror.sh</a></li>
<li><strong>Enhance MARC records</strong> &#8211; Given the additional locations of the mirrored content, the MARC records harvested from the Internet Archive were not complete. They did not include URLs pointing to the Internet Archive, nor did they include the URLs pointing to the local cache. Consequently the next step was to enhance the MARC records. This was done with a second Perl script called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/harvesting/updatemarc.pl">updatemarc.pl</a>, but the script does more. Since we hoped to provide text mining services against the full text, a third URL needed to be included in the MARC pointing to the text mining interface. Finally, since the text mining application needs a bit of metadata itself, a rudimentary database listing the full text items is created along the way. This entire subprocess was complicated by the fact that not all of the harvested MARC records were valid. Because of character encoding issues, some of them were not readable by my MARC record parser (MARC::Batch). Some of the records were structurally incorrect, with invalid leaders and misplaced record/field/subfield delimiters. Finally, some of the records apparently included invalid values for various indicators. To make sure the database was as clean as possible, any record generating any sort of error was not included in the final processing. This left approximately 400 of the original 600 records.</li>
<li><strong>Index MARC records</strong> &#8211; The next step was to ingest the MARC records into VUFind&#8217;s underlying Solr index. This was done with a Perl script called <a href="http://bit.ly/a3MeKE">marc-index.pl</a> and described in a <a href="http://bit.ly/cIu0lG">previous posting</a>. With the completion of this step, the content provided by St. Michael&#8217;s College became available in the Portal. Search or browse the Portal for records. Find items from St. Michael&#8217;s. Click on a link to get the content from the Internet Archive. Click on another link to retrieve it from the local cache. For example, see the <a href="http://bit.ly/ahjLf2">record for <cite>Letters of an Irish Catholic layman</cite></a>.</li>
<li><strong>Support text mining</strong> &#8211; The final step in the process deserves a blog posting in its own right, and thus only a summary will be provided here. At its foundation, text mining revolves around the process of counting ngrams, whether they be single letters, single syllables, multiple syllables, individual words, multi-word phrases, sentences, etc. Once these things are counted they can be measured. Once they are measured, patterns can be sought, and if patterns are found, then overarching descriptions can be articulated resulting in the creation of new knowledge or an increase in understanding. When coupled with concordances, ngrams can be placed within the context of the larger work to learn how they were used. Using two Perl modules (<a href="http://search.cpan.org/dist/Lingua-EN-Ngram/">Lingua::EN::Ngram</a> and <a href="http://search.cpan.org/dist/Lingua-Concordance/">Lingua::Concordance</a>) a simple Web-based interface was written allowing the scholar to list the most frequent ngrams in a text, map their relative locations in it, and read snippets of text surrounding them. Using this technique it is possible to quickly and easily get an overview of the content of a document. The text mining application I created is initialized with an Internet Archive identifier. The application reads the identifier, looks up the location of the locally cached plain text file, reads it into memory, and allows the researcher to do &#8220;distant reading&#8221; against it. Unfortunately Lingua::Concordance only works sporadically against non-English files, but you can still see how the system works by using the <a href="http://bit.ly/bOXc2c">concordance against <cite>Letters of an Irish Catholic layman</cite></a>.</li>
</ol>
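<p>
For readers who would rather not read Perl, Step #1 can be sketched in a few lines of Python. This is not keys2urls.pl itself, only an illustration of the idea. The function name and the dictionary layout are my own inventions; the URL pattern and the three suffixes come straight from the example URLs listed above.
</p>

```python
# Sketch: build Internet Archive download URLs from identifiers (keys).
# The base address and suffixes mirror the example URLs in the posting;
# the function name and return structure are hypothetical.
BASE = "http://www.archive.org/download"

SUFFIXES = {
    "marc": "_meta.mrc",
    "pdf": ".pdf",
    "text": "_djvu.txt",
}


def keys_to_urls(keys):
    """Return a dict mapping each key to its MARC, PDF, and plain text URLs."""
    urls = {}
    for key in keys:
        key = key.strip()
        if not key:
            continue  # skip blank lines in the list of keys
        urls[key] = {
            kind: "{0}/{1}/{1}{2}".format(BASE, key, suffix)
            for kind, suffix in SUFFIXES.items()
        }
    return urls


if __name__ == "__main__":
    for key, links in keys_to_urls(["delancienneetdel00rich"]).items():
        for kind in ("marc", "pdf", "text"):
            print(kind, links[kind])
```

<p>
The output of something like this is what gets handed to wget in Step #2.
</p>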
<h2>Summary</h2>
<p>The process outlined above described how full text content can be harvested from the &#8216;Net and integrated into the VUFind &#8220;discovery system&#8221;. The key to doing this easily was the existence of metadata (MARC records) describing the harvested items. Without this metadata the process would have been too laborious. The process also outlined how the harvested full text can be put to greater use through a simple text mining interface.</p>
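<p>
To make the counting idea concrete, here is a minimal sketch in Python. It is not how Lingua::EN::Ngram or Lingua::Concordance work internally; the tokenizer is deliberately naive, and all of the function names are assumptions made for the sake of illustration.
</p>

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (naive, English-only)."""
    return re.findall(r"[a-z']+", text.lower())


def count_ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def concordance(text, phrase, width=30):
    """Return snippets of text surrounding each occurrence of phrase."""
    snippets = []
    for match in re.finditer(re.escape(phrase), text, re.IGNORECASE):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append(text[start:end])
    return snippets


if __name__ == "__main__":
    sample = "The cat and the hat and the cat"
    print(count_ngrams(tokenize(sample), 2).most_common(3))
    print(concordance(sample, "hat", width=8))
```

<p>
Counting gives the list of frequent ngrams, and the concordance puts any one of them back into context. A tokenizer this simple is also a reminder of why non-English texts cause trouble.
</p>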
<p>Software is never done. If it were, then it would be called hardware. Consequently, there are many ways the process can be improved. Examples include figuring out ways to repair broken MARC records, and updating Lingua::Concordance to work correctly with foreign language materials. Maybe I should call this job security.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/10/ia2vufind/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Names &amp; addresses</title>
		<link>http://www.catholicresearch.net/blog/2010/10/names-addresses/</link>
		<comments>http://www.catholicresearch.net/blog/2010/10/names-addresses/#comments</comments>
		<pubDate>Thu, 14 Oct 2010 14:20:17 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=162</guid>
		<description><![CDATA[This posting outlines how the names &#38; addresses of the &#8220;Catholic Portal&#8221; are made available. The purpose of this posting is mostly documentation. Documentation for myself, since I always forget. And documentation so somebody else can do the work after I win the lottery and move to the beach to drink cocktails with umbrellas in [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting outlines how the names &amp; addresses of the &#8220;Catholic Portal&#8221; are made available. The purpose of this posting is mostly documentation. Documentation for myself, since I always forget. And documentation so somebody else can do the work after I win the lottery and move to the beach to drink cocktails with umbrellas in them.
</p>
<p>
Here goes:
</p>
<ol>
<li><strong>Extract data</strong> &#8211; Open the spreadsheet. Activate the ACCU tab. Copy all of the data sans the &#8220;cool&#8221; data entry macros. Create a new spreadsheet. Paste all of the previously copied data into the new spreadsheet. Save the new spreadsheet for future reference with the name catholic_libraries.xls.</li>
<li><strong>Extract more data</strong> &#8211; Repeat Step #1 for the tab labeled Tab 2, but save the newly created spreadsheet with the name atla.xls.</li>
<li><strong>Clean</strong> &#8211; Open catholic_libraries.xls and delete columns so the only ones remaining are: last name, first name, school, address, city, state, zip code, and email address. Make sure the remaining columns are in the order listed above. During this process the data may need further cleaning. For example, curly quotes need to be straightened. Carriage returns inside cells need to be removed. Make sure city and state values contain only&#8230; city and state values. No countries.</li>
<li><strong>Sort</strong> &#8211; Sort catholic_libraries.xls in ascending order by school.</li>
<li><strong>Save</strong> &#8211; Save the cleaned and sorted data as a tab-delimited text file with the name catholic_libraries.db. Make sure the resulting text file is Unix-based and not DOS- or Macintosh-based. Additionally, Excel often tries to do you a favor by surrounding fields containing commas with quotes. Remove the quotes.</li>
<li><strong>Go to Step #3</strong> &#8211; Repeat the process for the file named atla.xls, but include only the last name, first name, school, city, state, and email address in the saved data, and call the result atla.db.</li>
<li><strong>Mount</strong> &#8211; Mount the saved database files (catholic_libraries.db and atla.db) by saving them to the Portal. They are expected to live in Y:\data\vufind\web\etc.</li>
</ol>
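<p>
Much of the cleaning, sorting, and saving could probably be scripted. The following Python sketch assumes the spreadsheet has already been exported as tab-delimited text with the columns in the order given in the Clean step. The function names are hypothetical, and nothing like this is part of the current process.
</p>

```python
import csv
import io

# Column order from the Clean step; the position of "school" drives the sort.
COLUMNS = ["last name", "first name", "school", "address",
           "city", "state", "zip code", "email address"]
SCHOOL = COLUMNS.index("school")

# Curly quotes and their straight equivalents.
QUOTES = {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}


def read_export(text):
    """Parse tab-delimited text; csv.reader drops the quotes Excel wraps around fields."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    return [row for row in reader if row]


def clean_cell(value):
    """Straighten curly quotes and remove line breaks inside a cell."""
    for curly, straight in QUOTES.items():
        value = value.replace(curly, straight)
    return " ".join(value.split())


def clean_rows(rows):
    """Clean every cell, then sort the rows in ascending order by school."""
    cleaned = [[clean_cell(cell) for cell in row] for row in rows]
    return sorted(cleaned, key=lambda row: row[SCHOOL])


def to_tab_delimited(rows):
    """Serialize rows as tab-delimited text with Unix line endings and no quoting."""
    return "".join("\t".join(row) + "\n" for row in rows)
```

<p>
Checking that city and state values contain only city and state values still requires a pair of human eyes.
</p>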
<p>
The result of this work should then be visible under the <a href="http://www.catholicresearch.net/About/Organizations">Directory tab</a>.
</p>
<p>
This process is a bit tedious, but since the directory does not change very often, and now that I have documented the process, the next time the directory needs updating things will be easier. On the other hand, as the Portal grows, there will be a need for a real database, and it will be able to support additional functions, such as document delivery. Stay tuned.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/10/names-addresses/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Digital Access Committee (DAC) Meeting</title>
		<link>http://www.catholicresearch.net/blog/2010/10/digital-access-committee-dac-meeting/</link>
		<comments>http://www.catholicresearch.net/blog/2010/10/digital-access-committee-dac-meeting/#comments</comments>
		<pubDate>Tue, 12 Oct 2010 19:29:54 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=157</guid>
		<description><![CDATA[Today we had a CRRA Digital Access Committee (DAC) meeting via the telephone. Attendees included: Ann Hanlon Demian Katz Eric Frierson Eric Morgan Kevin Cawley Pat Lawton Susan Leister Thomas Leonhardt I did a bit of &#8220;Portal&#8221; show &#38; tell demonstrating the work done to date on indexing EAD files. (See the previous blog posting.) [...]]]></description>
			<content:encoded><![CDATA[<p>
Today we had a CRRA Digital Access Committee (DAC) meeting via the telephone. Attendees included:
</p>
<ul>
<li>Ann Hanlon</li>
<li>Demian Katz</li>
<li>Eric Frierson</li>
<li>Eric Morgan</li>
<li>Kevin Cawley</li>
<li>Pat Lawton</li>
<li>Susan Leister</li>
<li>Thomas Leonhardt</li>
</ul>
<p>
I did a bit of &#8220;Portal&#8221; show &amp; tell demonstrating the work done to date on indexing EAD files. (See the <a href="http://www.catholicresearch.net/blog/2010/10/indexing-ead/">previous blog posting</a>.) We then discussed ways the indexing/display could be improved. Suggestions included:
</p>
<ul>
<li>putting the words &#8220;Archival material&#8221; into the format field of the Solr index thus allowing better faceting</li>
<li>reading the value of langmaterials and using it as the value for Solr&#8217;s language fields, again allowing for better faceting</li>
<li>reading all of the fields associated with a given container-level element and putting them into Solr&#8217;s allfields field to improve indexing</li>
<li>extracting the last value of our current &#8220;title&#8221;, using it as our title, and using the remaining values as some sort of supplemental description or alternatively, simply reversing the &#8220;title&#8221; string</li>
</ul>
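<p>
The last suggestion is easy to prototype. The sketch below is in Python and assumes the compound &#8220;title&#8221; is a single string whose parts are joined by &#8221; -- &#8220;, as they are in the sample record shown in the previous posting. The function names are made up for illustration.
</p>

```python
def _parts(title, separator):
    """Normalize whitespace, then split the compound title into its parts."""
    normalized = " ".join(title.split())
    return [part.strip() for part in normalized.split(separator) if part.strip()]


def split_title(title, separator=" -- "):
    """Return (short title, supplemental description) from a compound title."""
    parts = _parts(title, separator)
    if not parts:
        return "", ""
    return parts[-1], separator.join(parts[:-1])


def reverse_title(title, separator=" -- "):
    """Return the compound title with its parts in reverse order."""
    return separator.join(reversed(_parts(title, separator)))


if __name__ == "__main__":
    sample = "Collection -- Manuscripts -- Letters"
    print(split_title(sample))
    print(reverse_title(sample))
```

<p>
Either approach puts the most specific part of the title first, which is what a person scanning search results sees.
</p>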
<p>
We then brainstormed ways to resolve character encoding issues, the feasibility of making our metadata available via Web servers, and the status of the metadata guidelines.
</p>
<p>
We felt we had discussed it all, so the meeting was over.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/10/digital-access-committee-dac-meeting/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Indexing MARC and EAD in VUFind with Solr for the CRRA</title>
		<link>http://www.catholicresearch.net/blog/2010/10/indexing-ead/</link>
		<comments>http://www.catholicresearch.net/blog/2010/10/indexing-ead/#comments</comments>
		<pubDate>Tue, 12 Oct 2010 17:31:24 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=148</guid>
		<description><![CDATA[This posting outlines how I am currently indexing MARC and EAD files in VUFind with Solr for the CRRA. (Boy, there are a lot of acronyms in that sentence!) Background The Catholic Research Resources Alliance (CRRA) is a member-driven organization with the purpose of making available &#8220;rare, unique, and uncommon&#8221; research materials for Catholic scholarship. [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting outlines how I am currently indexing MARC and EAD files in VUFind with Solr for the CRRA. (Boy, there are a lot of acronyms in that sentence!)
</p>
<h2>Background</h2>
<p>
The Catholic Research Resources Alliance (CRRA) is a member-driven organization with the purpose of making available &#8220;rare, unique, and uncommon&#8221; research materials for Catholic scholarship. Presently the membership is primarily made up of libraries and archives who pool together their metadata records, have them indexed, and provide access to the index. My responsibility is to build and maintain the technical infrastructure supporting this endeavor.
</p>
<p>
A couple of years ago much of the CRRA metadata was manifested as MARC, and at that time <a href="http://vufind.org/">VUFind</a> was selected as the tool we would use to index, search, and display this content. About six months ago the Alliance realized the growing necessity of including EAD files as well. At the same time, the ability to accommodate non-MARC metadata was increasingly becoming a VUFind reality. New ground still had to be broken; processes needed to be implemented allowing VUFind (and the underlying <a href="http://lucene.apache.org/solr/">Solr</a> indexer) to understand how to work with materials which were not book-like.
</p>
<p>
The balance of this posting describes in greater detail how I am beginning to accommodate MARC as well as EAD metadata in VUFind&#8217;s interface with Solr.
</p>
<h2>Assumptions</h2>
<p>
The system runs on a number of assumptions. First, it is assumed it is the members&#8217; responsibility to create and maintain their metadata. Second, it is my responsibility to index it and make it available for display. Moreover, it is assumed each metadata record includes at least three values: 1) a unique identifier, 2) a human-readable description of an item, and 3) an address pointing to the location of the item. For MARC records, these things reside in the 001, 245, and 099 fields. For EAD files, they have been designated as the id attribute of unitid elements, the content of unittitle elements, and the url attribute of the eadid element, which leads to the location of the item.
</p>
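<p>
As an illustration of the EAD half of that assumption, the following Python sketch pulls the three values out of a tiny, in-memory, EAD-like tree. The tree is built in code only so the example is self-contained; the sample values, the example.org address, and the function names are all hypothetical. Real finding aids are larger, and some use XML namespaces, which this sketch ignores.
</p>

```python
import xml.etree.ElementTree as ET


def build_sample_ead():
    """Build a tiny EAD-like tree in memory; it stands in for a real finding aid."""
    ead = ET.Element("ead")
    header = ET.SubElement(ead, "eadheader")
    ET.SubElement(header, "eadid", url="http://example.org/findaids/sample.xml")
    dsc = ET.SubElement(ET.SubElement(ead, "archdesc"), "dsc")
    for number, title in enumerate(["Letters", "Photographs"], start=1):
        did = ET.SubElement(ET.SubElement(dsc, "c01"), "did")
        ET.SubElement(did, "unitid", id="id{0}".format(number))
        ET.SubElement(did, "unittitle").text = title
    return ead


def extract_records(ead):
    """Yield (identifier, title, url) for every did element that has all three values."""
    eadid = ead.find(".//eadid")
    url = eadid.get("url") if eadid is not None else None
    for did in ead.iter("did"):
        unitid = did.find("unitid")
        unittitle = did.find("unittitle")
        identifier = unitid.get("id") if unitid is not None else None
        title = "".join(unittitle.itertext()).strip() if unittitle is not None else ""
        if identifier and title and url:
            yield identifier, title, url


if __name__ == "__main__":
    for record in extract_records(build_sample_ead()):
        print(record)
```

<p>
Any did element missing one of the three values is silently skipped here. In practice it would be better to report it back to the member.
</p>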
<p>
Additionally, it is assumed all metadata records, whether MARC or EAD, are available for harvesting from a Web server. In other words, each member who wants to have their MARC records available in the CRRA needs to export their records to a single file and make them accessible via a URL. Similarly, all EAD files which are intended to be indexed need to be in a single Web-accessible directory and the URL of the directory needs to be known. Making member metadata accessible via a Web server has three benefits: 1) it facilitates automation, 2) it distributes the responsibility of archiving metadata across the membership, 3) it enables the metadata to be harvested by other applications and used for other things. &#8220;Can you say &#8216;linked data?&#8217;&#8221;
</p>
<h2>
Files and Perl scripts</h2>
<p>
Given these assumptions, the following sets of files and Perl scripts are used to do the work. The first set is core to both of the other two:
</p>
<ul>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/libraries.db">libraries.db</a> &#8211; A &#8220;database&#8221; of CRRA participants consisting of their names, libraries, and URLs where their metadata records can be found. This file is used by just about every other script in the system.</li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/subroutines.pl">subroutines.pl</a> &#8211; A tiny library of Perl subroutines, mostly to read the contents of libraries.db.</li>
</ul>
<p>
This second set is used to index MARC metadata:
</p>
<ul>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/marc-harvest.pl">marc-harvest.pl</a> &#8211; Copies (mirrors) remote MARC files locally</li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/marc-add-code.pl">marc-add-code.pl</a> &#8211; Validates and updates the values of MARC 001 fields making sure they exist and are unique</li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/marc-index.pl">marc-index.pl</a> &#8211; Slurps up a Solr marc.properties template (template.txt), makes the appropriate substitutions, and indexes the MARC records associated with a given library</li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/marc-build.sh">marc-build.sh</a> &#8211; A shell script used to run all of the MARC-based scripts. One ring to rule them all.</li>
</ul>
<p>
The third is used to index EAD files:
</p>
<ul>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/ead-harvest.pl">ead-harvest.pl</a> &#8211; Copies (mirrors) remote XML files locally</li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/ead-validate.pl">ead-validate.pl</a> &#8211; Makes sure the mirrored XML files are well-formed, conform to the EAD DTD, and include an eadid url attribute (done with a stupid stylesheet called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/geturl.xsl">geturl.xsl</a>)</li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/ead-transform.pl">ead-transform.pl</a> &#8211; Makes sure each EAD container-level element includes a unitid with a unique id attribute, saves the result to a local cache, and transforms these same files into HTML. The first process is done with a stylesheet called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/addunitid.xsl">addunitid.xsl</a>. The second process is done with another stylesheet called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/ead2html.xsl">ead2html.xsl</a>.</li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/ead-index.pl">ead-index.pl</a> &#8211; Indexes all the cached/transformed EAD files by parsing out container-level elements, creating an XML stream of records of my own design, parsing the result, and passing each record on to Solr. The heart of this script is a fourth stylesheet &#8212; <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/ead2solr.xsl">ead2solr.xsl</a></li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/ead-build.sh">ead-build.sh</a> &#8211; A shell script used to run all of the EAD-based scripts. Another ring to rule them all.</li>
</ul>
<p>
The &#8220;secret&#8221; to indexing EAD files is really no secret. I simply followed <a href="http://vufind.org/wiki/other_than_marc">Demian Katz&#8217;s instructions</a>. In a nutshell, to index non-MARC content the developer needs to:
</p>
<ul>
<li>Parse the given metadata into records. I do this with ead2solr.xsl.</li>
<li>Map each of the record&#8217;s values to as many of the underlying Solr fields as possible. Presently I only have titles and I do this through ead2solr.xsl as well.</li>
<li>Create an XML snippet representing each record and map it to the Solr fullrecord field, described below.</li>
<li>Denote a record type. I call mine ead.</li>
<li>Save the whole thing to Solr, done with ead-index.pl.</li>
</ul>
<p>
Currently, my XML snippet (Item #3) looks like this:
</p>
<pre>
  &lt;record&gt;
    &lt;id&gt;unaead_id2635150&lt;/id&gt;
    &lt;title&gt;Catholic Church. Archdiocese of Detroit (Mich.)
      Collection -- Catholic Church. Archdiocese of
      Detroit (Mich.): Manuscripts -- Letters -- Bp.
      Baraga to his sister Amalia
    &lt;/title&gt;
    &lt;date&gt;1836/1203&lt;/date&gt;
    &lt;url description='View remote, canonical version of EAD'&gt;

http://archives.nd.edu/findaids/ead/xml/det.xml

    &lt;/url&gt;
    &lt;url description='View local version of EAD file'&gt;

http://zoia.library.nd.edu/sandbox/crra-data/ead/una-det.html#id2635150

    &lt;/url&gt;
  &lt;/record&gt;
</pre>
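<p>
A snippet of this shape is simple to generate. The sketch below uses Python&#8217;s standard library rather than the XSLT I actually use, so treat it only as an illustration. The element names follow the snippet above; the function name and argument layout are assumptions.
</p>

```python
import xml.etree.ElementTree as ET


def build_record(identifier, title, date, urls):
    """Return a record snippet as a string; urls is a list of (description, address) pairs."""
    record = ET.Element("record")
    ET.SubElement(record, "id").text = identifier
    ET.SubElement(record, "title").text = title
    ET.SubElement(record, "date").text = date
    for description, address in urls:
        ET.SubElement(record, "url", description=description).text = address
    # ElementTree escapes special characters in text and attribute values.
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(build_record(
        "unaead_id2635150",
        "Letters -- Bp. Baraga to his sister Amalia",
        "1836/1203",
        [("View remote, canonical version of EAD",
          "http://archives.nd.edu/findaids/ead/xml/det.xml")],
    ))
```

<p>
Building the snippet with an XML library instead of string concatenation means ampersands and angle brackets in titles get escaped for free.
</p>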
<p>
The VUFind application provides seamless access to the index through its search box, but a bit of work needs to be done to display search results. Specifically a &#8220;record driver&#8221; needs to be written to accommodate new record types (Item #4, above). This driver inherits methods from a parent driver, IndexRecord.php, and the developer needs to override some of the methods found there with methods considering the content of the fullrecord field. Presently, the only thing I have in my record driver (<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/indexing/EadRecord.txt">EadRecord.php</a>) is a method to extract URLs. In the future I will need to include methods to extract names of CRRA members, names of their libraries, and additional descriptive metadata.
</p>
<p>
You can see the fruits of these efforts in the <a href="http://vufind.library.nd.edu/">CRRA &#8220;sandbox&#8221;</a> &#8212; something we are affectionately calling &#8220;The Green Interface&#8221;.
</p>
<h2>Issues</h2>
<p>
The whole process functions and could be run automatically from cron on a daily basis, but there is plenty of room for improvement. Issues include:
</p>
<ul>
<li><strong>speed</strong> &#8211; The indexing process is slower than I&#8217;d like. I think throwing more hardware at the problem will make things faster. </li>
<li><strong>invalid data and stale URLs</strong> &#8211; A small percentage of the MARC and EAD files do not include the required metadata values. No unique identifiers. Malformed MARC leaders. Non-validating EAD files and/or eadid url attributes pointing to broken locations. This is where metadata maintenance comes in.</li>
<li><strong>character encoding</strong> &#8211; This is one of the bigger problems. Trying to figure out whether or not a MARC record has been exported as UTF-8 is difficult. Solr assumes UTF-8 and I don&#8217;t think it even knows about MARC-8. When MARC data is not encoded as UTF-8, search results look really ugly. Similarly, some of the EAD files, because of similar issues, really display poorly after they have been transformed, indexed, searched, and displayed. </li>
</ul>
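<p>
One partial check for the encoding problem is possible without any special libraries. In MARC 21, position 09 of the leader declares the character coding scheme: the letter a means UCS/Unicode and a blank means MARC-8. The Python sketch below compares that declaration against the bytes themselves. The function names are hypothetical, and the check cannot flag a record that is declared MARC-8 but is really something else.
</p>

```python
def declared_encoding(raw_record):
    """Read Leader/09 of a raw MARC 21 record: 'a' is UCS/Unicode, blank is MARC-8."""
    flag = raw_record[9:10]
    if flag == b"a":
        return "utf-8"
    if flag == b" ":
        return "marc-8"
    return "unknown"


def looks_like_utf8(raw_record):
    """Return True when the bytes decode cleanly as UTF-8."""
    try:
        raw_record.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def encoding_problem(raw_record):
    """Describe a mismatch between declared and actual encoding, or return None."""
    declared = declared_encoding(raw_record)
    if declared == "unknown":
        return "leader position 09 holds an unexpected value"
    if declared == "utf-8" and not looks_like_utf8(raw_record):
        return "leader says Unicode but the bytes are not valid UTF-8"
    return None
```

<p>
Records that pass could go straight to the indexer. Records declared as MARC-8 would need to be converted first, and records that fail could be listed in a report for the member.
</p>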
<p>
None of these things are insurmountable. They will be addressed.
</p>
<h2>Next steps</h2>
<p>
My immediate next steps focus on richer search results. I need to extract additional information from the EAD files to supplement the content of my fullrecord field. After that I will explore the creation of &#8220;collection-level&#8221; records by indexing the headers of EAD files. These records will be fuller because they will have things like controlled vocabularies, scope notes/abstracts, and biographies from which to draw. Once the fullrecord fields are enhanced, I will need to go back to EadRecord.php and enhance its functionality. After that I will see about creating reports listing errors in metadata files. These reports will be designed to share with members making it easier for them to maintain their content.
</p>
<p>
All of that sounds like plenty to me. Wish me luck.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/10/indexing-ead/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Very satisfying!</title>
		<link>http://www.catholicresearch.net/blog/2010/10/very-satisfying/</link>
		<comments>http://www.catholicresearch.net/blog/2010/10/very-satisfying/#comments</comments>
		<pubDate>Wed, 06 Oct 2010 20:48:23 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=142</guid>
		<description><![CDATA[I have made significant progress in the process of harvesting EAD files and preparing them for ingestion into the &#8220;Catholic Portal&#8221;. This posting outlines the successes. Assuming a Catholic Research Resources Alliance members place their EAD files in a HTTP-accessible directory, and those files have a .xml extension, then the following Perl scripts enable me [...]]]></description>
			<content:encoded><![CDATA[<p>
I have made significant progress in the process of harvesting EAD files and preparing them for ingestion into the &#8220;Catholic Portal&#8221;. This posting outlines the successes.
</p>
<p>
Assuming <a href="http://www.catholicresearch.net/">Catholic Research Resources Alliance</a> members place their EAD files in an HTTP-accessible directory, and those files have a .xml extension, then the following Perl scripts enable me to harvest and prepare them for indexing:
</p>
<ul>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/harvest-ead.pl">harvest-ead.pl</a> &#8211; reads remote HTTP-accessible directories and copies all of the .xml files found there to a local cache</li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/validate.pl">validate.pl</a> &#8211; makes sure the cached XML files are well-formed and conform to the EAD DTD, and if not, moves the files to a different directory</li>
<li><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/transform.pl">transform.pl</a> &#8211; reads the validated XML files, adds id attributes to all unitid elements through the use of a stylesheet (<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/addunitid.xsl">addunitid.xsl</a>), transforms the resulting XML into HTML using another stylesheet (<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/ead2html.xsl">ead2html.xsl</a>), and saves the result to an HTTP-accessible directory</li>
</ul>
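<p>
The validation step can be approximated in Python, as sketched below. The sketch only tests for well-formedness, because the standard library does not validate against a DTD, so it does less than validate.pl does. The directory arguments and function names are assumptions.
</p>

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def is_well_formed(path):
    """Return True when the file parses as XML; no DTD validation is attempted."""
    try:
        ET.parse(str(path))
    except ET.ParseError:
        return False
    return True


def quarantine_broken(cache_dir, reject_dir):
    """Move files that fail to parse from cache_dir to reject_dir; return their names."""
    cache, rejects = Path(cache_dir), Path(reject_dir)
    rejects.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(cache.glob("*.xml")):
        if not is_well_formed(path):
            shutil.move(str(path), str(rejects / path.name))
            moved.append(path.name)
    return moved
```

<p>
Keeping the rejected files in their own directory, rather than deleting them, makes it possible to send them back to their owners for repair.
</p>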
<p>
What was really cool and a huge time-saver was the use of ead2html.xsl. Originally named <a href="http://www.archivists.org/saagroups/ead/stylesheets/AAAv2002-HTML.xsl">AAAv2002-HTML.xsl</a>, found on a page called <a href="http://www.archivists.org/saagroups/ead/stylesheets.html">User Contributed Stylesheets</a>, and submitted by <strong>Stephanie Ashley</strong>, this stylesheet took my id attributes and automatically made named anchors for me. Boy, did I get lucky. &#8220;Thank you, Stephanie!&#8221;
</p>
<p>
My next step is to revisit my indexing routines.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/10/very-satisfying/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>EAD @ Marquette 4 CRRA</title>
		<link>http://www.catholicresearch.net/blog/2010/10/ead-marquette-4-crra/</link>
		<comments>http://www.catholicresearch.net/blog/2010/10/ead-marquette-4-crra/#comments</comments>
		<pubDate>Sun, 03 Oct 2010 20:22:46 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=137</guid>
		<description><![CDATA[This is the briefest of travelogues reporting on a meeting about EAD files at Marquette University for the Catholic Research Resources Alliance on September 20, 2010. A few members of the Alliance were previously awarded a CLIR grant to catalog previously uncataloged special collections items. These members are now doing the work using EAD (Encoded [...]]]></description>
			<content:encoded><![CDATA[<p>
This is the briefest of travelogues reporting on a meeting about EAD files at Marquette University for the <a href="http://www.catholicresearch.net/">Catholic Research Resources Alliance</a> on September 20, 2010.
</p>
<p><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/crra-in-marquette.gif"><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2010/10/crra-in-marquette.gif" alt="marquette sights" title="Marquette sights" width="320" height="240" class="aligncenter size-full wp-image-138" /></a></p>
<p>
A few members of the Alliance were previously awarded a CLIR grant to catalog previously uncataloged special collections items. These members are now doing the work using EAD (Encoded Archival Description) with the intent of sharing the resulting metadata with the &#8220;Catholic Portal&#8221;. The purposes of the meeting were to build relationships between these particular Alliance members and to discuss progress on the grant. In attendance were people from St. Catherine University (<strong>Deborah Kloiber</strong> and <strong>Emily Asch</strong>), Marquette University (<strong>Matt Blessing</strong>, <strong>Ann Hanlon</strong>, <strong>Bill Fliss</strong>, and <strong>Jean Zanoni</strong>), and the University of Notre Dame (<strong>Pat Lawton</strong>, <strong>Kevin Cawley</strong>, and <strong>Eric Lease Morgan</strong>).
</p>
<p>
Of primary concern was the particular way people were using EAD and whether or not it would lend itself to indexing by the &#8220;Portal&#8221; software. Consequently, I spent a lot of the time describing the technical infrastructure of VUFind and how it interfaced with Solr, the underlying indexer/search engine. In short, I enumerated the absolute need for unique identifiers, human-readable descriptions of items, and location codes. The former two can be garnered from the unitid and unittitle elements of an EAD file&#8217;s did elements. The latter can be gotten from the url attribute of the eadid element. Everybody was confident their EAD files would contain these values.
</p>
<p>
We then went around the table doing a bit of show &#038; tell against our EAD. The folks of St. Catherine&#8217;s were using the Archivists&#8217; Toolkit to &#8220;catalog&#8221; their Ade Bethune collection. Marquette University was using a Microsoft Access database to &#8220;catalog&#8221; Dorothy Day content.
</p>
<p>
Timetables were then outlined. The whole CLIR project is expected to be finished by December of 2011. Participants in attendance thought their work would be done by the end of Spring 2011, and the remaining time would be spent on putting the content onto the &#8220;Portal&#8221; as well as doing various types of publicity (conference presentations, etc.).
</p>
<p>
The meeting was over around noon, and we all retired to the faculty club for lunch. (&#8220;Thank you, Marquette!&#8221;)
</p>
<p>
In retrospect, there may be two additional issues needing to be addressed. First, I originally planned to assign or replace unitid values with locally generated, &#8220;Catholic Portal&#8221; specific values, but I have since learned that unitid information is often times used as a sort of call number and therefore necessary for location. Replacing (removing) such values from the EAD files may make work down the line more difficult. Maybe I should be getting the unique values from an id attribute of the unitid element instead?
</p>
<p>
Second, as a group we may need to decide how to encode dates. Dates can be nested within unittitle elements as well as free-standing elements in the did. Just as importantly, they can take all sorts of forms. In order to make sorting and faceting feasible, the Alliance may need to figure out ways to standardize and normalize dates.</p>
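<p>
To give a sense of what normalizing might involve, here is a rough Python sketch that reduces a free-form date to a pair of years suitable for sorting or faceting. It is only one possible approach. It assumes years fall between 1000 and 2099, and it assumes a value shaped like 1836/1203 means a year followed by a month and day. Both assumptions, and the function name, are mine and not a decision of the Alliance.
</p>

```python
import re

# A four-digit year between 1000 and 2099 that is not part of a longer number.
YEAR = re.compile(r"(?:^|\D)(1[0-9]{3}|20[0-9]{2})(?=\D|$)")

# Assumed compact form: year, slash, two-digit month, two-digit day (e.g. 1836/1203).
COMPACT = re.compile(r"^(\d{4})/(\d{2})(\d{2})$")


def normalize_date(value):
    """Return (first year, last year) found in a free-form date, or None."""
    value = value.strip()
    compact = COMPACT.match(value)
    if compact:
        year, month, day = (int(part) for part in compact.groups())
        if month in range(1, 13) and day in range(1, 32):
            return year, year
    years = [int(year) for year in YEAR.findall(value)]
    if not years:
        return None
    return min(years), max(years)


if __name__ == "__main__":
    for sample in ["1836/1203", "ca. 1920-1935", "March 3, 1901", "undated"]:
        print(sample, normalize_date(sample))
```

<p>
Anything the function cannot interpret comes back as None, which at least produces a list of dates needing human attention.
</p>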
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/10/ead-marquette-4-crra/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>index-ead.pl</title>
		<link>http://www.catholicresearch.net/blog/2010/10/index-ead-pl/</link>
		<comments>http://www.catholicresearch.net/blog/2010/10/index-ead-pl/#comments</comments>
		<pubDate>Thu, 30 Sep 2010 20:56:12 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=132</guid>
		<description><![CDATA[Today I indexed some of the metadata I extracted yesterday using a script called index-ead.pl. Of all the scripts I&#8217;ve written so far, this one is the most straight-forward. Read locally-developed XML file. Extract the unique identifier, title, and date. Associate each with VUFind/Solr fields. Commit. You can (temporarily) see the fruits of these labors [...]]]></description>
			<content:encoded><![CDATA[<p>
Today I indexed some of the metadata I extracted yesterday using a script called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/09/index-ead.pl">index-ead.pl</a>. Of all the scripts I&#8217;ve written so far, this one is the most straightforward. Read the locally developed XML file. Extract the unique identifier, title, and date. Associate each with VUFind/Solr fields. Commit.
</p>
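<p>
For readers who have not seen a Solr update, the following Python sketch shows the general shape of the &#8220;associate each with VUFind/Solr fields&#8221; step. It is not a translation of index-ead.pl. The field names (<code>id</code>, <code>title</code>, <code>publishDate</code>), the sample record, and the function name <code>solr_add_message</code> are assumptions made for illustration. The only thing taken as given is Solr&#8217;s XML update format of <code>add</code>, <code>doc</code>, and <code>field</code> elements.
</p>

```python
import xml.etree.ElementTree as ET


def solr_add_message(records):
    """Build a Solr XML <add> message from a list of {field name: value} dicts."""
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


# Invented sample record; the field names are assumed, not taken from the script.
records = [{"id": "crra-0001", "title": "Correspondence, 1933", "publishDate": "1933"}]
message = solr_add_message(records)

if __name__ == "__main__":
    print(message)
```

<p>
The resulting message would then be posted to the Solr update handler and followed by a commit, which is the last step in the list above.
</p>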
<p>
You can (temporarily) see the fruits of these labors because all of the records have been associated with the <a href="http://tinyurl.com/26jugpv">Eric Lease Morgan Foo Bar Library</a>. The result is a list of container-level records with very little additional information.
</p>
<p>
By the way, as of today I am running a version of VUFind as retrieved from the development trunk, specifically, revision 3029. When upgrading from revision to revision, it is important to retain one&#8217;s config.ini file and reindex. The process is not painful, if done infrequently. As time goes on I will also need to retain locally developed hacks, such as the ones I need to write below.
</p>
<p>
The next steps are to write the MARC record driver so it does not attempt to do automatic look-ups for call numbers, but rather extracts such information from the local index. A second next step is to write an EAD record driver to accommodate the special cases of&#8230; EAD records.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/10/index-ead-pl/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>September Update</title>
		<link>http://www.catholicresearch.net/blog/2010/09/125/</link>
		<comments>http://www.catholicresearch.net/blog/2010/09/125/#comments</comments>
		<pubDate>Wed, 29 Sep 2010 16:26:58 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=125</guid>
		<description><![CDATA[CRRA Update SEPTEMBER 2010 In this update … CRRA Welcomes Three New Members Duquesne University, Laverna Saunders Loyola Marymount University, Kristine Brancolini University of St. Michael&#8217;s College in the University of Toronto &#38; Pontifical Institute of Mediaeval Studies, Jonathan Bengtson Digital Access Committee (DAC) News Member meetings in January and March – Mark your calendars! [...]]]></description>
			<content:encoded><![CDATA[<p><strong>CRRA Update<br />
SEPTEMBER 2010</strong></p>
<p><strong> </strong></p>
<p><strong>In this update …</strong></p>
<p><strong> </strong></p>
<ul>
<li>CRRA Welcomes Three New Members
<ul>
<li>Duquesne University, Laverna Saunders</li>
<li>Loyola Marymount University, Kristine Brancolini</li>
<li>University of St. Michael&#8217;s College in       the University of Toronto &amp; Pontifical Institute of Mediaeval Studies,       Jonathan Bengtson</li>
</ul>
</li>
<li>Digital Access Committee (DAC) News</li>
<li>Member meetings in January and March – Mark your      calendars!</li>
</ul>
<p><strong> </strong></p>
<hr size="2" /><strong> </strong></p>
<p><strong>CRRA Welcomes Duquesne University, Loyola Marymount University, and University of St. Michael&#8217;s College in the University of Toronto &amp; Pontifical Institute of Mediaeval Studies!</strong></p>
<p>We are pleased to announce the addition of three new members.   Following is brief information about our newest members, their collections and leadership.   A warm welcome to all!</p>
<p><em>[Watch for the welcome to Dominican University and University of San Francisco in October.]</em><em> </em></p>
<p><strong>New Member Highlights</strong></p>
<p>Our new members bring a rich array of rare and unique resources to the CRRA.  Collection highlights and introductions to their member library deans/directors follow.  A warm welcome to all; we look forward to working with you!</p>
<p><strong>The <a href="http://www.duq.edu/library/">Gumberg Library</a> at Duquesne University</strong> <strong>(Pittsburgh, PA)</strong></p>
<ul>
<li>Spiritan Collection <span style="text-decoration: underline;"><a href="http://digital.library.duq.edu/cdm-spiritan/">http://digital.library.duq.edu/cdm-spiritan/</a></span>.<br />
Many of the primary and secondary writings of the Congregation of the Holy Spirit.</li>
<li>Papers of Cardinal John J. Wright and Vatican II<strong><span style="text-decoration: underline;"> </span></strong><span style="text-decoration: underline;"><a href="http://www.duq.edu/archives/index.cfm">http://www.duq.edu/archives/index.cfm</a></span><br />
Includes the Cardinal’s addresses, papers, sermons, writings, and personal library, with substantial material from Vatican Council II.</li>
<li>Pittsburgh Catholic Newspaper <a href="http://digital.library.duq.edu/cdm-pc/">http://digital.library.duq.edu/cdm-pc/</a><strong> .</strong><br />
Duquesne University has continuously microfilmed the Pittsburgh Catholic since its inception in 1844. To convert the Pittsburgh Catholic to digital format, Gumberg Library started with volume 1 issue 1 and will continue to digitize the newspaper from the oldest volumes to the newest volumes. Currently, the first 20 years covering from March 16, 1844 through 1864 are available by browsing or searching the full-text.</li>
</ul>
<p><strong>Laverna Saunders</strong> has served as <em>University Librarian</em> at <em>Duquesne University</em> since 2002 and worked at Salem State College, UNLV, DePauw University, Union College and Drew University over the course of her career.</p>
<p>She is a member of the PALCI Board and currently chairs LLAMA BES.  She has served on various ACRL committees over the years. She is a book reviewer for <em>Technicalities</em>, serves on the editorial board of <em>Technical Services Quarterly</em>, and has written articles and edited three books on the evolution of the virtual library. Her email address is <a href="mailto:lsaunders@duq.edu">lsaunders@duq.edu</a>.<br />
<strong><br />
</strong></p>
<hr size="2" /><strong><a href="http://library.lmu.edu/">Hannon Library</a>, Loyola Marymount University (</strong><strong>Los Angeles, CA</strong><strong>)</strong></p>
<ul>
<li><a href="http://library.lmu.edu/specialcollections/Rare_Books/List_of_Collections/1481_Dante_s_Divine_Comedy.htm">1481 printing of Dante&#8217;s Divine Comedy</a><strong> </strong>illustrated by Botticelli</li>
<li><a href="http://library.lmu.edu/specialcollections/Rare_Books/List_of_Collections.htm">Rare Book Collection</a> including Biblia Sacra Vulgatae editionis Sixti V. Pont. Max. (1603) and a late 15th-century example of a Book of Hours, probably printed in Paris.</li>
<li><a href="http://linus.lmu.edu/search%7ES1?/aJesuit+collection./ajesuit+collection/-3%2C-1%2C0%2CB/exact&amp;FF=ajesuit+collection&amp;1%2C518%2C">Jesuit Collection</a>: 518 rare books by and about Jesuits</li>
<li><a href="http://lmulibrary.typepad.com/lmu-library-news/2010/05/celebrating-our-first-year-with-the-gutenberg-bible-leaf.html">Leaf from the Gutenberg Bible</a></li>
<li><span style="text-decoration: underline;">Catholic Human <a href="http://library.lmu.edu/specialcollections/CSLA_Research_Collection/Catholic_Human_Relations_Council.htm">Relations</a> Council Collection, 1958-1992 </span></li>
<li>Prominent <a href="http://library.lmu.edu/specialcollections/CSLA_Research_Collection.htm#cath">Roman Catholic families</a> in Los Angeles</li>
<li><a href="http://library.lmu.edu/specialcollections/Manuscripts/Ryan_Catholic_Authors_Collections.htm">Reverend Harold F. Ryan, S. J., Catholic Authors Collection</a></li>
<li><a href="http://library.lmu.edu/specialcollections/Rare_Books/List_of_Collections.htm#longstaff">Saint Thomas More Collection</a></li>
<li><a href="http://library.lmu.edu/specialcollections/Manuscripts.htm">Thomas G. Hanrahan Jesuit Drama</a> Collection</li>
<li><a href="http://library.lmu.edu/specialcollections/Rare_Books/List_of_Collections.htm#valle">Ygnacio del Valle Family Collection</a>: Eighteenth and nineteenth century Spanish-language devotional works, novels, and instructional works<em> </em></li>
<li><a href="http://library.lmu.edu/specialcollections/artifacts/Art___Artifacts__Collection_List.htm#vestments">Early California Mission Vestments,</a> liturgical clothing in use in California Catholic churches and missions during the early nineteenth century</li>
</ul>
<p><strong>Kristine R. Brancolini </strong>is Dean of the Library at Loyola Marymount University (LMU) in Los Angeles. Prior to her arrival at LMU in July 2006, she had been a librarian at Indiana University in Bloomington for more than twenty years, where she held a number of positions.  From 1998–2006, she was the Director of the Digital Library Program (<a title="http://www.dlib.indiana.edu/" href="http://www.dlib.indiana.edu/">www.dlib.indiana.edu</a>); during that time she was principal investigator on numerous digitization projects with funding from the Institute of Museum and Library Services (IMLS), the National Endowment for the Humanities (NEH), and the U.S. Department of Education.  In August 2009, LMU opened a new library, the William H. Hannon Library, located on a bluff overlooking Marina del Rey and the Pacific Ocean.  The library has recently launched a number of new initiatives, including a Digital Library Program and a series of nearly 50 public programs in the library and elsewhere on campus.</p>
<hr size="2" /><strong><a href="http://www.utoronto.ca/stmikes/kelly/index.html">University of St. Michael&#8217;s College in the University of Toronto</a> &amp; <a href="http://www.pims.ca/">Pontifical Institute of Mediaeval Studies</a></strong></p>
<p><strong> </strong></p>
<ul>
<li>The <a href="http://www.utoronto.ca/stmikes/kelly/special_collections/chesterton.html">Chesterton Collection</a> includes over 3000 volumes connected with the life and work of the English journalist, G.K. Chesterton (1874-1936). It embraces virtually all the works of Chesterton. The collection also includes original sketches, complete microfiche of his personal papers, and the papers of John O&#8217;Connor (1870 &#8211; 1952), the Catholic priest who became the model for Chesterton&#8217;s character &#8220;Father Brown&#8221;.</li>
<li><a href="http://www.utoronto.ca/stmikes/kelly/nouwen/index.html">Henri J.M. Nouwen Archives and Research Collection</a> documents the life and work of Henri J.M. Nouwen (1932-1996) and includes the vast majority of Nouwen&#8217;s manuscripts and published works, as well as secondary material in all formats about Nouwen.</li>
<li>Archives of the <a href="http://www.utoronto.ca/stmikes/kelly/special_collections/faith_sharing.html">Faith and Sharing Federation / Foi et Partage</a>, a bilingual Catholic organization with a mandate to deepen and foster the experience of Christian community through week long retreats.</li>
<li><a href="http://www.utoronto.ca/stmikes/kelly/special_collections/counter_ref.html">The Counter-Reformation Collection</a> includes over 3500 volumes of primary source materials showing the Catholic response to the Protestant Reformation up to the time of the French Revolution.</li>
<li><a href="http://www.pims.ca/">The Pontifical Institute of Mediaeval Studies</a> library has one of the largest primary and secondary source collections of medieval documentation in North America.</li>
</ul>
<p>The library is also a lead partner in the development of the <a href="http://www.crivellawest.com/research.html">Humanities Knowledge Kiosk</a> . The software has been applied to Newman’s works and is in development for Lonergan, Nouwen and Chesterton.</p>
<p><strong>Jonathan Bengtson</strong> is the <em>Director of Library and Archives for the University of St. Michael&#8217;s College at the University of Toronto and the Pontifical Institute of Mediaeval Studies, and a Fellow of the College and the Institute</em>.  Educated in California, Oxford and London, he has held various senior positions in academic, research, and nonprofit libraries in Canada, the United States, and the United Kingdom—including Executive Director of the Providence Athenaeum (founded in 1753) in Providence, Rhode Island; Head Librarian of the Queen’s College, Oxford (founded in 1341); and, Associate University Librarian for Scholarly Resources at the University of Toronto. He is currently on the Board of Directors of the Society for the History of Authorship, Reading, and Publishing and the Multicultural History Society of Ontario. Since 2004, Jonathan has been the coordinator for the University of Toronto’s partnership with the Open Content Alliance and Microsoft Live Books mass digitization projects. He is actively involved, and one of the key collaborators, in a partnership with Crivella West Inc. of Pittsburgh to apply advanced textual linguistic analysis to public domain and in-copyright digital texts.  He is also writing a book for University of Toronto Press on an introduction to medieval manuscripts.</p>
<hr size="2" /><strong>Digital Access Committee News</strong></p>
<p>The Digital Access Committee met on September 2 to discuss topics 3 and (data input and ingestion; search functionality and display) from the <em>CRRA Strategic Plan Draft: Goals for 2010/2011</em>.  (Minutes from CRRA meetings are available to all members.  Contact Pat for login and password to CRRA documents.)<a href="https://www.catholicresearch.net/admin/docs/Digital%20Access%20Committee/DAC%202010/DAC%20Minutes%2009-02-10.pdf"></a></p>
<p><strong> </strong></p>
<p><strong>Eric Frierson (St. Edward’s) and Wei Zhang (Georgetown) Join the CRRA Digital Access Committee (DAC)</strong></p>
<p>The Digital Access Committee (DAC), under the leadership of Tom Leonhardt, welcomes two new members.  Eric Frierson of St. Edward’s University and Wei Zhang of Georgetown University bring technological expertise that will help us to move forward with development of the Catholic portal. Welcome, Eric and Wei!</p>
<hr size="2" /><strong>Mark your calendars …<br />
</strong>We have tentatively set the dates for a <strong>CRRA reunion and meetings</strong> during ALA Midwinter in San Diego and for our Annual All-members Meeting during ACRL in Philadelphia.</p>
<p><em>Please reserve space on your calendars now</em> – <strong>Thursday, January 6, 2011</strong> and <strong>Tuesday, March 29, 2011</strong>. For the March 29, 2011 meeting, plan to join your CRRA colleagues for dinner on Monday evening and, for all who can, dinner on Tuesday as well.  We will organize the day to include both separate Board and committee retreats and plenary sessions for all.</p>
<p>Further details will be distributed in future Updates, the CRRA Blog, and email.</p>
<p><strong><em> </em></strong></p>
<hr size="2" /><em>All CRRA events</em> and events of possible interest to members are posted to the CRRA calendar, available at <a href="http://tiny.cc/Calendar798">http://tiny.cc/Calendar798</a> and also accessible from the Admin area of the CRRA website.</p>
<p>Check our progress and news on the <em>CRRA blog</em>: <a href="../">http://www.catholicresearch.net/blog/</a>.</p>
<hr size="2" /><em>CRRA Update</em> is an electronic newsletter distributed via email each month to provide members with an update of CRRA activities.  Please contact us at 575.631.1324 or email <a href="mailto:plawton@nd.edu">plawton@nd.edu</a> with your questions, comments, or news to share.</p>
<p><strong><em> </em></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/09/125/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preparing EAD files for indexing</title>
		<link>http://www.catholicresearch.net/blog/2010/09/preparing-ead-files-for-indexing/</link>
		<comments>http://www.catholicresearch.net/blog/2010/09/preparing-ead-files-for-indexing/#comments</comments>
		<pubDate>Wed, 29 Sep 2010 14:49:37 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=114</guid>
		<description><![CDATA[This posting outlines how I plan to prepare EAD files for indexing with Solr, the underlying indexing technology of VUFind. The problem I am aggregating sets of EAD files from Catholic Research Resource Alliance members. I am expected to index these files at the most granular level possible &#8212; meaning at the did level. In [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting outlines how I plan to prepare EAD files for indexing with Solr, the underlying indexing technology of VUFind.
</p>
<h2>The problem</h2>
<p>
I am aggregating sets of EAD files from <a href="http://www.catholicresearch.net/">Catholic Research Resource Alliance</a> members. I am expected to index these files at the most granular level possible &#8212; meaning at the <code>did</code> level. In order to satisfy both human and computer requirements, each indexed record needs at least a unique identifier, a human-readable descriptor, and a location code. The unique identifier can be gotten from the <code>unitid</code> element. The human-readable descriptor can come from the <code>unittitle</code>. The location code can be inferred from the url attribute of the <code>eadid</code> element.
</p>
<p>
Unfortunately, not all of the aggregated EAD files include a <code>unitid</code>, and when they do, the values are not always unique. Additionally, the hierarchical nature of EAD files makes the values extracted from <code>unittitle</code> elements almost meaningless unless they are placed within the context of their parent <code>unittitle</code> values. In short, indexing EAD files without some preprocessing makes the indexing process all but useless. What to do?
</p>
<h2>The solution</h2>
<p>
The solution includes: 1) adding and/or normalizing the <code>unitid</code> values, 2) constructing a more complete &#8220;title&#8221; based on previously enumerated <code>unittitle</code> values, and 3) outputting the whole thing to an XML stream easily indexable by Solr.
</p>
<p>
Adding and/or normalizing the <code>unitid</code> values (Step #1) can be accomplished with a stylesheet called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/09/addunitid.xsl">addunitid.xsl</a>. Essentially an identity transformation, the stylesheet loops through an EAD file using the <code>generate-id()</code> function to create or replace <code>unitid</code> values. The result is an enhanced EAD file.
</p>
<p>
Constructing more complete &#8220;titles&#8221; and outputting XML streams (Steps #2 and #3) is done by looping through each <code>did</code> element, extracting the necessary metadata, creating a record describing each <code>did</code>-level element, and sending to <code>STDOUT</code> a rudimentary XML stream of my own design. The heart of this second stylesheet (<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/09/ead2solr.xsl">ead2solr.xsl</a>) is the <code>ancestor::*/did/unittitle</code> selector used to find all the parent <code>unittitle</code> values of a given <code>did</code>.
</p>
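<p>
The same idea can be sketched outside of XSLT. The Python below is an illustration, not the contents of ead2solr.xsl. It assumes an EAD file without XML namespaces, the function name <code>full_titles</code> and the &#8220; / &#8221; separator are hypothetical, and the sample data is invented. It walks the tree and carries the ancestors&#8217; <code>unittitle</code> values down to each <code>did</code>, which has the same effect as the <code>ancestor::*/did/unittitle</code> selector.
</p>

```python
import xml.etree.ElementTree as ET


def full_titles(element, inherited=()):
    """Yield one title per did, prefixed with the unittitle values of its ancestors."""
    titles = inherited
    did = element.find("did")  # direct child only
    if did is not None:
        unittitle = did.find("unittitle")
        if unittitle is not None:
            # Collapse whitespace and include text of nested elements such as unitdate.
            text = " ".join("".join(unittitle.itertext()).split())
            if text:
                titles = inherited + (text,)
        yield " / ".join(titles)
    for child in element:
        if child.tag != "did":
            yield from full_titles(child, titles)


# Invented sample data, used only to exercise the function.
SAMPLE = """
<archdesc>
  <did><unittitle>Sample Collection</unittitle></did>
  <dsc>
    <c01>
      <did><unittitle>Series 1: Correspondence</unittitle></did>
      <c02><did><unittitle>Folder 3</unittitle></did></c02>
    </c01>
  </dsc>
</archdesc>
"""

titles = list(full_titles(ET.fromstring(SAMPLE)))

if __name__ == "__main__":
    for title in titles:
        print(title)
```

<p>
On its own, &#8220;Folder 3&#8221; says almost nothing. With its parents attached it becomes &#8220;Sample Collection / Series 1: Correspondence / Folder 3&#8221;, which is something a person can recognize in a list of search results.
</p>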
<p>
Finally, a simple shell script was written (<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/09/clean.sh">clean.sh</a>) making it easy to do the above transformations from the command line.
</p>
<p>(I would not have been able to do this work if it weren&#8217;t for the XML4Lib mailing list and a few fine respondents to my pleas for help. Thanks go to MJ Suhonos, Tod Olson, Stefan Krause, and Alexander Johannesen. &#8220;Thank you!&#8221;)</p>
<h2>Next steps</h2>
<p>
Software is never done. If it were, then it would be called hardware. Therefore next steps include:
</p>
<ul>
<li>automatically adding the modified EAD files (the output of the first stylesheet) to Archon</li>
<li>enhancing the output of the second stylesheet with scope notes, abstracts, etc.</li>
<li>indexing the output of the second stylesheet</li>
</ul>
<p>
Fun with XSLT?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/09/preparing-ead-files-for-indexing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Adding unitid elements to did elements</title>
		<link>http://www.catholicresearch.net/blog/2010/09/adding-unitid-elements-to-did-elements/</link>
		<comments>http://www.catholicresearch.net/blog/2010/09/adding-unitid-elements-to-did-elements/#comments</comments>
		<pubDate>Mon, 27 Sep 2010 20:59:01 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=110</guid>
		<description><![CDATA[This posting outlines how I believe I will add unitid elements to did elements of EAD files. The problem As the CRRA matures, I expect a greater amount of the metadata ingested into the &#8220;portal&#8221; will come from EAD files. In order to index EAD files meaningfully, I need to extract unique identifiers from each [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting outlines how I believe I will add unitid elements to did elements of EAD files.
</p>
<h2>The problem</h2>
<p>
As the CRRA matures, I expect a greater amount of the metadata ingested into the &#8220;portal&#8221; will come from EAD files. In order to index EAD files meaningfully, I need to extract unique identifiers from each container-level element, a human-readable description of the container, and a location code. The identifier and human-readable description can easily come from unitid and unittitle elements of did elements.
</p>
<p>
Unfortunately, unitid (and maybe unittitle) are not required elements of did elements. While the CRRA could mandate the creation of such elements, it turns out to be almost as easy to create them on-the-fly.
</p>
<h2>The solution</h2>
<p>
The good folks on the XML4Lib mailing list provided me with my solution &#8212; an XSLT stylesheet, below:
</p>
<pre><code>&lt;xsl:stylesheet
  xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
  version='1.0'&gt;

  &lt;!-- match everything and copy it --&gt;
  &lt;xsl:template match="node()|@*"&gt;
    &lt;xsl:copy&gt;&lt;xsl:apply-templates select="@*|node()" /&gt;&lt;/xsl:copy&gt;
  &lt;/xsl:template&gt;

  &lt;!-- special case; match dids with no unitid --&gt;
  &lt;xsl:template match="//did[not(unitid)]"&gt;
    &lt;xsl:copy&gt;
      &lt;!-- copy attributes first; they must precede any child element --&gt;
      &lt;xsl:apply-templates select="@*" /&gt;
      &lt;!-- add a unit id --&gt;
      &lt;unitid&gt;&lt;xsl:value-of select="generate-id()"/&gt;&lt;/unitid&gt;
      &lt;!-- continue copying the child nodes --&gt;
      &lt;xsl:apply-templates select="node()" /&gt;
    &lt;/xsl:copy&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;</code></pre>
<p>
While not perfect, it certainly is a step in the right direction. Short and elegant. The next step will be to include some sort of parameter as input or to generate some EAD-specific identifier so each unitid value is unique across the corpus. (Actually, that is another issue I need to address.)
</p>
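<p>
One way the &#8220;unique across the corpus&#8221; step might look is sketched below in Python rather than XSLT. It is an assumption-laden illustration, not the next version of the stylesheet. It assumes each file carries a distinct <code>eadid</code> that can serve as a prefix, it assumes no XML namespaces, and the function name <code>add_unitids</code>, the sample data, and the <code>prefix-0001</code> pattern are all hypothetical.
</p>

```python
import xml.etree.ElementTree as ET


def add_unitids(root, prefix):
    """Give every did lacking a unitid a new one of the form prefix-NNNN; return how many were added."""
    added = 0
    # Take a snapshot first, because the tree is modified inside the loop.
    for did in list(root.iter("did")):
        if did.find("unitid") is None:
            added += 1
            unitid = ET.Element("unitid")
            unitid.text = f"{prefix}-{added:04d}"
            did.insert(0, unitid)
    return added


# Invented sample data; the eadid value is a placeholder.
SAMPLE = """
<ead>
  <eadheader><eadid>sample001</eadid></eadheader>
  <archdesc>
    <did><unitid>MS 12</unitid><unittitle>Sample Collection</unittitle></did>
    <dsc><c01><did><unittitle>Folder 1</unittitle></did></c01></dsc>
  </archdesc>
</ead>
"""

root = ET.fromstring(SAMPLE)
prefix = root.findtext("eadheader/eadid").strip()
added = add_unitids(root, prefix)
new_id = root.findtext("archdesc/dsc/c01/did/unitid")

if __name__ == "__main__":
    print(added, new_id)
```

<p>
Unlike <code>generate-id()</code>, whose values are not guaranteed to be the same from one transformation to the next, a prefix plus a counter stays stable as long as the order of the did elements in the file does not change.
</p>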
<p>
Thanks go to MJ Suhonos for the cool //did[not(unitid)] expression, Tod Olson for the idea of identity transformation (copying), and Stefan Krause for the use of generate-id.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/09/adding-unitid-elements-to-did-elements/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VuFind 2.0 Conference</title>
		<link>http://www.catholicresearch.net/blog/2010/09/vufind-2-0-conference/</link>
		<comments>http://www.catholicresearch.net/blog/2010/09/vufind-2-0-conference/#comments</comments>
		<pubDate>Thu, 23 Sep 2010 12:24:09 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=102</guid>
		<description><![CDATA[VUFind is the technical backbone of the &#8220;Catholic Portal&#8221;, and this posting documents my experiences at the VuFind 2.0 Conference held at the Villanova Conference Center on September 15 &#38; 16, 2010. In short, it provided an opportunity for the community to share successes, challenges, and visions for the future. Day #1 The Conference was [...]]]></description>
			<content:encoded><![CDATA[<p>
VUFind is the technical backbone of the &#8220;Catholic Portal&#8221;, and this posting documents my experiences at the VuFind 2.0 Conference held at the Villanova Conference Center on September 15 &amp; 16, 2010. In short, it provided an opportunity for the community to share successes, challenges, and visions for the future.
</p>
<p style='text-align: center'>
<img src="http://www.catholicresearch.net/blog/wp-content/uploads/2010/09/vufind.gif"></p>
<h2>Day #1</h2>
<p>
The Conference was divided into a number of presentations, group discussions, and informal social events. <strong>Joe Lucia</strong> (Villanova University) facilitated and opened the meeting with a number of general remarks surrounding libraries and the current environment:
</p>
<blockquote>
<p>
The question is, &#8220;Who will fulfill the social mission of libraries in the future?&#8221; If libraries don&#8217;t do it, then some other institution will. Libraries represent a locus of knowledge for our communities and a place for cultural conversation. Open source software is rooted in this same social mission and congruent with the mission of libraries&#8230; Are Google Books and the HathiTrust a new form of the &#8220;Information Commons&#8221;? Maybe, but maybe not&#8230; Cloud computing is a trend towards aggregation, concentration, and commercialization, but is that the best solution, since it too is not immune to proprietary lock-in&#8230; Software as service is also a current trend and we must ask ourselves, &#8220;Why not just build something based on the WorldCat APIs?&#8221; Public libraries are pointing a way towards the creation of knowledge spaces &#8212; a possible lead for academic libraries. Seen in this light, libraries may be new cathedrals.
</p>
</blockquote>
<p>
<strong>Demian Katz</strong> (Villanova University) then shared how he has integrated VUFind with Serials Solutions&#8217; Summon. After considering a number of options, he decided to go with a single search and a two-column display. Do a search. Query the local VUFind (Solr) index. Simultaneously query the remote Summon index. Display both results in a common window with VUFind on one side and Summon on the other. Essentially, this means books are on the left and articles are on the right. &#8220;You can&#8217;t modify the Summon relevancy ranking, and thus you get a lot of noise. Merging the indexed content often places local materials lower in the relevancy ranked output.&#8221; There are a few things on Katz&#8217;s to-do list: the addition of social features, the highlighting of query terms, advanced faceting options, and a mobile interface. You can try this VUFind/Summon combination at <a href="http://library.villanova.edu/Find">library.villanova.edu/Find</a>.
</p>
<p>
A similar presentation was given by <strong>Chris Spalding</strong> (Ex Libris) in his description of how VUFind can be integrated with Primo Central. &#8220;Through an API, access to Primo Central content can be integrated with VUFind. We do two searches, get results, re-rank, and display. The key to the solution is the PC (Primo Central) add-on. We hope to do more collaboration and be as open as possible&#8230; We use AWS (Amazon Web Services) to host our content&#8230; We hope to share the code as soon as the end of the year, and we are sincerely trying to bootstrap the process of combining VUFind with Primo Central.&#8221; The approach described by Spalding is the approach I expected Katz to implement with Summon. Apparently there are problematic issues with both techniques.
</p>
<p>
<strong>Greg Pendlebury</strong> (University of Southern Queensland) then demonstrated a portable Javascript library called Anotar which is integrated with Fascinator (<a href="http://fascinator.usq.edu.au/">fascinator.usq.edu.au</a>). Using CouchDB for a foundation, Anotar is intended to support the sharing of annotations across systems. Add comments to a Web page and have those comments syndicated across the &#8216;Net and accessible in other applications. The point for the community present was, &#8220;Maybe this sort of thing could be integrated into VUFind.&#8221;
</p>
<p>
Name &amp; title authorities as well as controlled vocabularies were the focus of the next presentation, given by <strong>Katz</strong>. He first described how he experimented with prototypical Perl &#8220;hacks&#8221; found in a recent issue of Code4Lib Journal. These hacks exploit the WorldCat API to list authoritative names and subjects. He described another experiment where he integrated locally created authority content with the local VUFind (Solr) index. Finally he described a third possible solution taking advantage of the linked data provided by the Library of Congress. His next experiment will surround the use of the eXtensible Catalog Metadata Services Toolkit to munge and use authority records. &#8220;The use of authority lists makes it possible for a person to browse against the &#8216;catalog&#8217;.&#8221;
</p>
<p>
The group then broke into two or three smaller groups to discuss &#8220;birds-of-a-feather&#8221; sorts of ideas &#8212; breakout sessions. Because of my interest in archival materials and EAD files, I went with the group called Beyond MARC. There we discussed, among other things, the indexing of many different kinds of content: websites, EAD files, METS records, and full text. We also discussed the challenges of indexing hierarchical data, the content of boutique collections, and the provision of non-bibliographic services against metadata. In the end, we advocated for the greater use of VUFind record drivers, making it easier to support local customizations, and figuring out how to handle hierarchies.
</p>
<h2>Day #2</h2>
<p>
Working on a project called SWWHEP, <strong>Luke O&#8217;Sullivan</strong> (Swansea University) described how he hacked VUFind to work in a multi-ILS environment with the ultimate goal of providing reciprocal borrowing. Calling himself a &#8220;shambrarian&#8221; he described MARC as the &#8220;Dark Side of open source&#8221;. After being given sets of MARC records whose 001 fields had been modified for uniqueness, O&#8217;Sullivan essentially created a multitude of configuration files associated with each library system under his charge. When records were returned from searches his code looked at the 001 values and branched accordingly. Of all the implementations described during the Conference, O&#8217;Sullivan&#8217;s hack was the &#8220;kewlest&#8221;. See his good work at <a href="http://ifind.swwhep.ac.uk/">ifind.swwhep.ac.uk</a>.
</p>
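<p>
The branching O&#8217;Sullivan described can be pictured with a small Python sketch. Everything specific in it is hypothetical: the presentation summary does not say how the 001 values were modified, so the prefixes, the separator, the configuration file names, and the function name <code>config_for</code> are invented for illustration. The only idea taken from the talk is that the record identifier determines which library system&#8217;s configuration gets used.
</p>

```python
# Hypothetical mapping from an 001 prefix to per-library settings.
ILS_CONFIGS = {
    "liba": {"config_file": "library-a.ini"},
    "libb": {"config_file": "library-b.ini"},
}


def config_for(record_id, configs=ILS_CONFIGS, separator="-"):
    """Return the settings for the library whose prefix starts the given 001 value."""
    prefix, found, _ = record_id.partition(separator)
    if not found or prefix not in configs:
        raise KeyError(f"no library configured for record id {record_id!r}")
    return configs[prefix]


if __name__ == "__main__":
    print(config_for("liba-000123"))
    print(config_for("libb-98765"))
```

<p>
Keeping the mapping in data rather than in a chain of if statements means adding another library is a matter of adding another entry, which fits the &#8220;multitude of configuration files&#8221; approach described above.
</p>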
<p>
<strong>Birong Ho</strong> (Western Michigan University) was second up on the second day with a description of how she and her team exploited the use of Web Services computing techniques to communicate between VUFind and their local Voyager system. She uses these services to support holds, renews, etc.
</p>
<p>
I was then given the chance to describe a future for &#8220;next generation library catalogs&#8221;, a thing I call services against texts. In a nutshell, I advocated for discovery systems to go beyond find and move towards use, and with the increasing availability of full text content such a prospect is increasingly possible. &#8220;Quantitative metadata &#8212; as opposed to qualitative metadata &#8212; makes it easier to compare, contrast, and analyze individual items in collections or collections as a whole.&#8221; I then demonstrated how digital humanities computing techniques can be applied to full text content to discover underlying patterns.
</p>
<p>
We broke into small groups again &#8212; table talks &#8212; and brainstormed visions for VUFind 2.0. Some of the things we came up with at our table included: relevancy ranking based on social networking data, full text indexing, including content beyond books, personalization based on patrons&#8217; characteristics or history, hooks to download full text from places like the Open Archives, the sharing of social data between VUFind implementations a la Ex Libris&#8217;s bX, tighter integration with Open Library, and the integration of VUFind into other applications through APIs.
</p>
<h2>Juicy quotes</h2>
<p>
Here is a short list of some juicy quotes I picked up from some of the attendees:
</p>
<ul>
<li>&#8220;A plug-in architecture may be a good idea.&#8221; &#8211;<strong>Kun Lin</strong></li>
<li>&#8220;Consider bringing different views into VUFind instead of shelling out.&#8221; &#8211;<strong>Eoghan &Oacute; Carrag&aacute;in</strong></li>
<li>&#8220;Full text indexing is easily implementable as long as you tweak the boosting factor.&#8221; &#8211;<strong>Till Kinstler</strong></li>
<li>&#8220;Maybe part of the solution is to stop giving content to the vendor.&#8221; &#8211;<strong>Greg Pendlebury</strong></li>
<li>&#8220;Remember to exploit the record drivers in order to provide different services and views of content.&#8221; &#8211;<strong>David Lacy</strong></li>
<li>&#8220;Solr&#8217;s VUFind schema is currently flat but maybe the data model needs to be more flexible and maybe hierarchical.&#8221; &#8211;<strong>Till Kinstler</strong></li>
<li>&#8220;We are never going to have &#8216;one bucket&#8217; searching.&#8221; &#8211;<strong>Joe Lucia</strong></li>
</ul>
<h2>Observations and summary</h2>
<p>
The Conference was well-organized and provided a forum for plenty of discussion and idea generation. The setting was very nice and the food was plentiful. Everybody was able to participate. I heard a number of people say they were either implementing or toying with the idea of implementing Evergreen as their &#8220;catalog&#8221; and using VUFind as their &#8220;discovery layer&#8221;. I had not thought of this. Interesting. I appreciated the active participation of Chris Spalding. He was candid and sincere. It was very nice to put faces to the names from the mailing lists, and the crowd was international. Blacklight was compared &amp; contrasted with VUFind a number of times throughout the meeting. I believe both communities have something to learn from the other.
</p>
<p>
Alas, I was unable to stay for the fourth quarter of the event. I had a plane to catch, and I had made my reservations under the assumption the Conference would conclude at noon. I was wrong. Consequently I missed the last part of the meeting where next steps were to be articulated. If I had my druthers, two things would happen. First, I hope the development process becomes a bit more structured, complete with regular conference calls and software regression testing. Second, and along similar lines, I hope some entrepreneurial organization comes forward to provide commercial support for VUFind. Such a thing would make it more attractive to the libraries without local technical (computer) expertise.
</p>
<p>
Finally, I bounced my ideas regarding the indexing of EAD files off of as many people as I could. I think I am on the right track, even though few had experience with the same problem. Wish me luck.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/09/vufind-2-0-conference/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>CRRA in San Diego, January 2011</title>
		<link>http://www.catholicresearch.net/blog/2010/09/crra-in-san-diego-january-2011/</link>
		<comments>http://www.catholicresearch.net/blog/2010/09/crra-in-san-diego-january-2011/#comments</comments>
		<pubDate>Tue, 14 Sep 2010 17:32:33 +0000</pubDate>
		<dc:creator>plawton</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=94</guid>
		<description><![CDATA[We invite you to attend the CRRA reunion and discussions in San Diego on Thursday afternoon, January 6, 2011. We are scheduling this meeting before the ALA Midwinter Meeting meetings begin on Friday in hopes that many of you who are attending the ALA meetings will be able to join in the CRRA discussions as [...]]]></description>
			<content:encoded><![CDATA[<p>We invite you to attend the CRRA reunion and discussions in San Diego on Thursday afternoon, January 6, 2011.  We are scheduling this meeting before the ALA Midwinter Meeting begins on Friday in hopes that many of you who are attending the ALA meetings will be able to join in the CRRA discussions as well.</p>
<p>At this time, we are putting together what promises to be a set of lively and informative discussions.  This will be an opportunity to talk about CRRA activities taking place at your library, to discuss progress to date on the 2010/11 goals in the strategic plan, and to explore our readiness to promote the Catholic portal to librarians and scholars.  VuFind 1.0 will be very near to being ready for implementation and this will be an opportunity to explore its functionality.  Also, we will take a look at how the contents on the portal are growing particularly in regard to adding rare, unique and uncommon archival collections and other materials. The outlines of the proposal to be submitted to the NEH Challenge Grant will be ready for discussion.  And, we want to hear from everyone – new and continuing members – how things are going at your library. Very importantly, this is an occasion to network and socialize with your CRRA colleagues.</p>
<p>Thursday, January 6, 2011, Copley Library, University of San Diego<br />
•	Noon to 2 p.m.  Board of Directors (with both onsite and call-in participation for Board members)<br />
•	Noon to 2 p.m.  Campus and library tours to be arranged<br />
•	2:30 – 5 p.m.  Open forum for all participants with refreshments provided by the Copley Library<br />
•	5:30    Dinner for all participants at Le Gran Terraza which offers a fine dining experience on campus (your own treat)</p>
<p>Theresa Byrd, University Librarian, has graciously volunteered to host our group on campus at the University of San Diego.   The University campus is situated on a mesa overlooking San Diego Bay. The Spanish Renaissance architecture and breathtaking views of Mission Bay, the Pacific Ocean, the community of Linda Vista and Tecolote Canyon make the campus a not to be missed destination in San Diego. The campus is also conveniently located near downtown San Diego. In addition to cabs (about $15), a regularly scheduled campus van to and from the sightseeing destination of Old Town San Diego offers an easy option for travel to campus.  More information on location and travel will be sent at a later date.<br />
<a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/09/JenniferSignature.jpg"><img class="alignnone size-medium wp-image-96" title="JenniferSignature" src="http://www.catholicresearch.net/blog/wp-content/uploads/2010/09/JenniferSignature-300x264.jpg" alt="" width="155" height="137" /></a></p>
<p>Jennifer Younger, Chair, CRRA Board of Directors</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/09/crra-in-san-diego-january-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VUFind &#8220;Midwest&#8221; User&#8217;s Group Meeting</title>
		<link>http://www.catholicresearch.net/blog/2010/09/vufind-midwest-users-group-meeting/</link>
		<comments>http://www.catholicresearch.net/blog/2010/09/vufind-midwest-users-group-meeting/#comments</comments>
		<pubDate>Sat, 04 Sep 2010 03:29:09 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=81</guid>
		<description><![CDATA[An inaugural VUFind &#8220;Midwest&#8221; User&#8217;s Group Meeting was held Friday, September 3, and this posting outlines my perceptions of what happened there. The &#8220;Catholic Portal&#8221; uses VUFind as its &#8220;discovery interface&#8221; and sometimes I feel starved for people with whom to discuss issues surrounding the application. I then got wind of VUFind&#8217;s use at Western [...]]]></description>
			<content:encoded><![CDATA[<p>
An inaugural VUFind &#8220;Midwest&#8221; User&#8217;s Group Meeting was held Friday, September 3, and this posting outlines my perceptions of what happened there.
</p>
<p style='text-align: center'><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2010/09/meeting.gif" /></p>
<p>
The &#8220;<a href="http://www.catholicresearch.net/">Catholic Portal</a>&#8221; uses VUFind as its &#8220;discovery interface&#8221; and sometimes I feel starved for people with whom to discuss issues surrounding the application. I then got wind of VUFind&#8217;s use at Western Michigan University (WMU) as well as the University of Michigan at Ann Arbor (U of M). Since WMU is halfway between me and U of M, I thought a &#8220;user&#8217;s group meeting&#8221; might be in order. A few calls were made, a few postings to a couple of mailing lists were written, and the meeting came to fruition.
</p>
<p>
There were nine of us in attendance:
</p>
<ol>
<li><strong>Bill Dueber</strong> (University of Michigan) </li>
<li><strong>Birong Ho</strong> (Western Michigan University)</li>
<li><strong>Dean Lingley</strong> (Purdue University) </li>
<li><strong>Eric Lease Morgan</strong> (University of Notre Dame)</li>
<li><strong>Keith Kelley</strong> (Western Michigan University)</li>
<li><strong>Matthew Riehle</strong> (Purdue University) </li>
<li><strong>Roy Zimmer</strong> (Western Michigan University)</li>
<li><strong>Scott Garrison</strong> (Western Michigan University)</li>
<li><strong>Tod Olson</strong> (University of Chicago)</li>
</ol>
<p>
The good folks from Purdue suffered through the entire 3 1/2 hour event via Skype. &#8220;Kudos to Dean and Matthew.&#8221;
</p>
<p>
As a group we discussed quite a number of things, listed here in more or less chronological order:
</p>
<ol>
<li><strong>straying from the code base</strong> &#8211; The hottest topic surrounded the difficulty of implementing VUFind version 1.0 given the fact that at least a couple of us have modified (&#8220;hacked&#8221;) previous versions to such a degree that implementing 1.0 was almost too much of a challenge. As one person said, &#8220;It might be easier to start all over with Blacklight rather than migrate my changes.&#8221; This does not mean anybody was dissatisfied with VUFind&#8217;s performance or many of its features. Record display is good. There is a distinct separation of inventory and OPAC. VUFind offers great flexibility, and public services staff seem very happy with the ease with which patron interfaces can be customized.</li>
<li><strong>Blacklight</strong> &#8211; Given that, the discussion turned to a comparison between VUFind and Blacklight. While the group seemed to have minimal experience with Blacklight, a number of things were definitely seen in Blacklight&#8217;s favor, such as: a more disciplined community complete with project management, the insistence on regression testing before code submissions were included into the base, and regular conference calls. Much of this was summed up as the &#8220;open source conundrum&#8221; &#8212; the differences between free software, open source software, and community source.</li>
<li><strong>Solr</strong> &#8211; We then turned to a discussion of Solr since we all understood that VUFind and Blacklight were essentially client interfaces to the increasingly popular indexer/search engine. A number of us believed it was absolutely necessary to modify the underlying Solr schema in order to satisfy local needs. These modifications ran the gamut from what fields exist to how those fields are defined and filtered. We compared &amp; contrasted the use of the stock query interface and the use of the Dismax handler. The indexing of data then led to a discussion of how to handle diacritics, dates, and date ranges.</li>
<li><strong>miscellaneous</strong> &#8211; As the discussion wound down we talked about various things such as systems administration tasks, and whether or not to move the Solr indexer to another host or implement it under a servlet container other than Jetty.</li>
</ol>
<p>
I told the group I was going to attend the <a href="http://vufind.org/wiki/vufind_2.0_conference">VUFind User&#8217;s Group Meeting</a> taking place at Villanova in a couple of weeks, and I asked for a short list of things I ought to share there &#8212; take aways:
</p>
<ul>
<li><strong>governance</strong> &#8211; the VUFind community could use a bit more structure and the application of project management</li>
<li><strong>patches</strong> &#8211; member-submitted patches need to be incorporated to the code to a greater degree; a couple of us felt our contributions were not accepted</li>
<li><strong>authorities</strong> &#8211; a greater emphasis needs to be placed on integrating the profession&#8217;s good work done in regards to named authorities</li>
<li><strong>local customizations</strong> &#8211; a possible solution to the &#8220;straying&#8221; issue may be the implementation of some sort of local code base, something Blacklight apparently has</li>
<li><strong>&#8220;light&#8221; flavor</strong> &#8211; given the spectrum of programming skills available in libraries, some thought a VUFind &#8220;Light&#8221; may be in order</li>
<li><strong>repository</strong> &#8211; there is a need for a central place for the community to share local hacks, normalization routines, changes to the Solr indexer, etc.</li>
</ul>
<p>
In summary, the meeting was definitely a success. Discussion was thorough and focused. I believe we used our time wisely, and no one went away thinking it had been wasted. I do not think the group was representative of the whole VUFind community. We were more skilled than most. We agreed that VUFind is not broken, but we did outline a number of ways it could be improved. We all agreed that the implementation of VUFind in our institutions represents a giant step forward compared to where we were at least a few years ago. <code>oss++</code></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/09/vufind-midwest-users-group-meeting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Collection Policy Statement for the Catholic Portal</title>
		<link>http://www.catholicresearch.net/blog/2010/09/collection-policy-statement-for-the-catholic-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2010/09/collection-policy-statement-for-the-catholic-portal/#comments</comments>
		<pubDate>Thu, 02 Sep 2010 17:50:58 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Collection policy]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=76</guid>
		<description><![CDATA[(The following is the current collection policy for the Catholic Portal.) Collection Policy Statement for the Catholic Portal The purpose of the Catholic Research Portal is to provide global access to the wealth of research resources relating to the Catholic experience. Of primary interest are rare, unique and uncommon Catholic research materials. Because these resources [...]]]></description>
			<content:encoded><![CDATA[<p>
(The following is the current collection policy for the Catholic Portal.)
</p>
<p>
Collection Policy Statement for the Catholic Portal</p>
<p>
The purpose of the Catholic Research Portal is to provide global access to the wealth of research resources relating to the Catholic experience. Of primary interest are rare, unique and uncommon Catholic research materials. Because these resources are often uncataloged and little known outside their institutional repositories, the Portal seeks to encourage broad participation and to provide support to libraries, archives, and other institutions that wish to participate in this project but lack the resources to do so. The Portal will ultimately facilitate and assist researchers and students in identifying Catholic research resources and make Catholic scholarship more productive. In doing so, the Catholic Research Portal will contribute substantially to the generation of new knowledge.
</p>
<p>
The Portal will be the modern day bibliography of research resources providing access through a number of approaches, including author, title, subject, keyword, format, and holding institution. Resources will remain under the care of the owning institution. The Portal will identify the owning institution for non-digital resources and, where the resources exist in a digital format, it will link directly to the digital resource. Using international standards, the Portal will collect metadata from participating special and archival collections.</p>
<p>
The Catholic Research Resources Alliance has identified twelve collecting themes:</p>
<ul>
<li>Catholic education</li>
<li>Catholic intellectual life</li>
<li>Catholic literary figures</li>
<li>Catholic liturgy and devotion</li>
<li> Catholic missions</li>
<li>Catholic social action</li>
<li>Diocesan collections, including papers of Bishops</li>
<li>Men’s religious orders</li>
<li>Peace-building</li>
<li>Religion and citizenship</li>
<li>Vatican II</li>
<li>Women’s religious orders</li>
</ul>
<p>
These themes are intended to encourage the consideration and classification of institutional resources which may be suitable for the Portal. It is expected that the Portal will feature an initial emphasis on the above-named topics and that they will produce an early “critical mass” of research content collections. All contributors’ collections will be accepted for inclusion, however, provided that they are relevant to the study of Catholicism and can be deemed rare, unique or uncommon. All formats, including manuscripts, books, ephemera, photographs, and artifacts which meet these criteria, are of interest to the Portal.
</p>
<p>This effort is being sponsored by the Catholic Research Resources Alliance (CRRA). Member institutions currently include: Boston College, The Catholic University of America, Georgetown University, Loyola University of Chicago, Marquette University, University of Notre Dame, St. Catherine University, St. Edward&#8217;s University, University of San Diego, Seton Hall University, and Villanova University. As the Portal develops and expands, all Catholic colleges, universities, seminaries and archives in North America will be welcome to participate in this effort. Non-Catholic institutions with holdings of Catholic interest will also be welcome to contribute records.</p>
<p>
Drafted by the Collections Committee of the CRRA, September 2009.
</p>
<p>Approved by the CRRA Board of Directors on November 19, 2009.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/09/collection-policy-statement-for-the-catholic-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Where in the world is the CRRA?</title>
		<link>http://www.catholicresearch.net/blog/2010/08/where-in-the-world-is-the-crra/</link>
		<comments>http://www.catholicresearch.net/blog/2010/08/where-in-the-world-is-the-crra/#comments</comments>
		<pubDate>Mon, 30 Aug 2010 20:51:55 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=70</guid>
		<description><![CDATA[Pat and I are in the process of mapping the locations of CRRA members, below: View CRRA Members in a larger map]]></description>
			<content:encoded><![CDATA[<p>Pat and I are in the process of mapping the locations of CRRA members, below:</p>
<p><iframe width="425" height="350" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps/ms?ie=UTF8&amp;hl=en&amp;msa=0&amp;msid=107706967661070486888.00048f0e1a9ec58795598&amp;ll=37.857507,-94.21875&amp;spn=47.935945,74.707031&amp;z=3&amp;output=embed"></iframe><br /><small>View <a href="http://maps.google.com/maps/ms?ie=UTF8&amp;hl=en&amp;msa=0&amp;msid=107706967661070486888.00048f0e1a9ec58795598&amp;ll=37.857507,-94.21875&amp;spn=47.935945,74.707031&amp;z=3&amp;source=embed" style="color:#0000FF;text-align:left">CRRA Members</a> in a larger map</small></p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/08/where-in-the-world-is-the-crra/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Harvesting, updating, and re-indexing</title>
		<link>http://www.catholicresearch.net/blog/2010/08/harvesting-updating-and-re-indexing/</link>
		<comments>http://www.catholicresearch.net/blog/2010/08/harvesting-updating-and-re-indexing/#comments</comments>
		<pubDate>Mon, 30 Aug 2010 17:46:58 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=67</guid>
		<description><![CDATA[This posting describes the automated process I am currently using to harvest, update, and re-index the MARC records of the &#8220;Catholic Portal&#8220;. Step #1 &#8211; Make a list Librarians love lists, and I am no exception. The process begins with a list (databases) of CRRA members who have MARC metadata to share. Each item in [...]]]></description>
			<content:encoded><![CDATA[<p>
This posting describes the automated process I am currently using to harvest, update, and re-index the MARC records of the &#8220;<a href="http://www.catholicresearch.net/">Catholic Portal</a>&#8220;.
</p>
<h2>Step #1 &#8211; Make a list</h2>
<p>
Librarians love lists, and I am no exception. The process begins with a list (databases) of CRRA members who have MARC metadata to share. Each item in the list includes the following fields:
</p>
<ol>
<li>code &#8211; a unique three-letter identifier</li>
<li>institution &#8211; the name of the CRRA member</li>
<li>library &#8211; the name of the member&#8217;s library</li>
<li>URL &#8211; the location of the member&#8217;s MARC records</li>
</ol>
<p>
Right now, the name of this list is <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/08/libraries.db">libraries.db</a>. It is created by hand.
</p>
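<p>
For readers who want to try something similar, here is a minimal sketch of reading such a list. It is written in Python rather than taken from the linked files, and it assumes a tab-delimited layout with one member per line; the actual layout of libraries.db may differ. The names <code>Member</code> and <code>read_members</code> are hypothetical.
</p>

```python
import csv
import io
from typing import NamedTuple


class Member(NamedTuple):
    """One row of the member list (field names follow the list above)."""
    code: str         # unique three-letter identifier
    institution: str  # name of the CRRA member
    library: str      # name of the member's library
    url: str          # location of the member's MARC records


def read_members(text: str) -> list[Member]:
    """Parse tab-delimited text (an assumed layout) into Member tuples."""
    members = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) != 4:  # skip blank or malformed lines
            continue
        members.append(Member(*(field.strip() for field in row)))
    return members
```

<p>
Keeping the list in a plain delimited file means it can still be created and corrected by hand.
</p>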
<h2>Step #2 &#8211; Harvest</h2>
<p>
The second step is to harvest content from each member library. This is done by looping through the list, extracting the URLs, and copying the remote MARC data sets to a local file system. This process is done with a script called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/08/harvest.pl">harvest.pl</a>.
</p>
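<p>
The harvesting loop might be sketched in Python as follows. This is an illustration under assumptions, not a transcription of harvest.pl: the function names, the <code>fetcher</code> parameter, and the CODE.mrc file naming are all hypothetical.
</p>

```python
import urllib.request
from pathlib import Path


def fetch(url: str) -> bytes:
    """Download one remote file and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def harvest(members, directory, fetcher=fetch):
    """Copy each member's MARC data set to directory/CODE.mrc.

    members is an iterable of (code, url) pairs; the file naming scheme
    is an assumption made for this sketch. Returns the saved paths.
    """
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for code, url in members:
        path = target / f"{code}.mrc"
        path.write_bytes(fetcher(url))
        saved.append(path)
    return saved
```

<p>
Passing the download function in as a parameter makes it possible to exercise the loop without touching the network.
</p>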
<h2>Step #3 &#8211; Update</h2>
<p>
Because each record in the underlying Solr index must have a unique identifier, it is necessary for me to make each 001 value in each MARC record unique. To do this I loop through each of the harvested MARC records and prepend the three-letter institution code to each 001 field. This is done with a script called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/08/add-code.pl">add-code.pl</a>.
</p>
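<p>
The idea can be illustrated with a small Python sketch. To keep it self-contained, each record is simplified to a dictionary keyed by MARC tag instead of a real MARC record, and the function name <code>add_code</code> is borrowed from the script name purely for illustration; add-code.pl itself may work differently.
</p>

```python
def add_code(code: str, records: list[dict]) -> list[dict]:
    """Return copies of records whose 001 value is prefixed with code.

    Records are plain dictionaries keyed by MARC tag, a simplification
    assumed for this sketch.
    """
    updated = []
    for record in records:
        copy = dict(record)  # leave the harvested record untouched
        copy["001"] = code + copy["001"].strip()
        updated.append(copy)
    return updated
```

<p>
If two institutions both use 000123 as a 001 value, the prefixed values (abc000123 and xyz000123, say) no longer collide in the index.
</p>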
<h2>Step #4 &#8211; Re-index</h2>
<p>
The last step is to re-index the MARC records making sure the Solr index is as current as possible. This is done with a script called <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/08/re-index.pl">re-index.pl</a>. It is the most complicated. This process is done by, again, looping through the database of CRRA members reading the institution code. The script then deletes all of the records from the index whose identifier begins with the institutional code. (&#8220;Thanks WebService::Solr!&#8221;). Each of the records from each of the institutions&#8217; metadata files is then fed to Solr. Using this re-indexing process it is not necessary for me to manage overlays, duplicates, or deleted records. The whole index is wiped clean and refreshed anew.
</p>
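<p>
The delete-then-add logic can be sketched in Python. In this illustration an ordinary dictionary stands in for the Solr index, so no Solr calls are shown, and the function name <code>reindex</code> is hypothetical.
</p>

```python
def reindex(index: dict, code: str, records: list[dict]) -> dict:
    """Replace all of one institution's entries in the index.

    index is a plain dictionary standing in for Solr (an assumption of
    this sketch), keyed by the prefixed 001 identifier.
    """
    # Delete every entry whose identifier begins with the institution code.
    for key in [k for k in index if k.startswith(code)]:
        del index[key]
    # Feed the freshly harvested records back in.
    for record in records:
        index[record["001"]] = record
    return index
```

<p>
Because an institution&#8217;s old entries are removed before the new ones are added, a record dropped from a member&#8217;s file simply does not come back, and other members&#8217; records are left alone.
</p>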
<h2>VUFind</h2>
<p>
The whole process works pretty well, and because the Catholic Portal is based on <a href="http://vufind.org/">VUFind</a>, the whole process is extraordinarily flexible. VUFind cares not about the ingestion process allowing me to handle it in the manner I feel most useful. Let&#8217;s hear it for open source software!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/08/harvesting-updating-and-re-indexing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Making your content available</title>
		<link>http://www.catholicresearch.net/blog/2010/08/making-your-content-available/</link>
		<comments>http://www.catholicresearch.net/blog/2010/08/making-your-content-available/#comments</comments>
		<pubDate>Thu, 26 Aug 2010 17:13:00 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=51</guid>
		<description><![CDATA[Pat Lawton and I created this (updated) outline &#8212; a recipe &#8212; for getting CRRA member metadata records into the &#8220;Catholic Portal.&#8221; It is also available as a PDF document designed for printing. Identify specialists &#8211; It takes many people with many skills to extract content for the Portal. It requires bibliographers (subject specialists) who [...]]]></description>
			<content:encoded><![CDATA[<div>
<p>Pat Lawton and I created this (updated) outline &#8212; a recipe &#8212; for getting CRRA member metadata records into the &#8220;Catholic Portal.&#8221; It is also <a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/08/making.pdf">available as a PDF document</a> designed for printing.</p>
<ol>
<li style='margin-bottom: 1em'>
    <strong>Identify specialists</strong> &#8211; It takes many people with many skills to extract content for the Portal. It requires bibliographers (subject specialists) who know which materials located in your local institution fit the scope of the project. It requires catalogers (metadata specialists) who know how the local materials are described. It requires systems librarians (database administrators) who can extract metadata records from the underlying system(s).</li>
<li style='margin-bottom: 1em'>
    <strong>Have a meeting</strong> &#8211; Bring together all the specialists from Step #1 and discuss Steps #3 through #10.</li>
<li style='margin-bottom: 1em'>
    <strong>Understand the scope of the Portal</strong> &#8211; This is akin to understanding the purpose of the Portal, who is its intended audience, and what is its collection policy. In short, the Portal is intended to contain rare, unique, and/or uncommon materials, in all formats, useful for scholarly Catholic research.</li>
<li style='margin-bottom: 1em'>
    <strong>Identify your resources and collections</strong> &#8211; List the resources and collections in your institution which fall into the scope of the Portal. Examples might include manuscripts, rare books, digitized images, sound recordings, the papers of famous individuals, the archives of leading organizations, pamphlets, newspapers, etc. This work will probably be led by bibliographers.</li>
<li style='margin-bottom: 1em'>
    <strong>Articulate how your resources and collections are described</strong> &#8211; For each of your resources and collections identified in Step #4, determine which ones have metadata and which ones don&#8217;t. For those items which do have metadata, how are they denoted in your various computer systems? Are they all in a particular call number range? Do they comprise the totality of items in your &#8220;special collections&#8221; department? Are they all of the things encoded as EAD files? Do they all have some specific local note in your library catalog? Are they all saved in a particular local spreadsheet or database? etc. This work will probably be led by catalogers.</li>
<li style='margin-bottom: 1em'>
    <strong>Flag records as &#8220;CRRA&#8221;</strong> &#8211; Once you have identified records appropriate for inclusion in the Portal, specifically denote them as such. For example, if your records are in MARC, then insert something like &#8220;crra&#8221; into a local note such as 590 subfield a. If your records are EAD files, you may want to insert &#8220;crra&#8221; into the &lt;notestmt&gt; within the &lt;filedesc&gt; element. This may be the work of both catalogers and systems librarians.</li>
<li style='margin-bottom: 1em'>
    <strong>Validate records</strong> &#8211; Each and every record destined for the Portal must have three metadata characteristics. First, they must have a unique identifier. For MARC records this is the 001 field. For EAD files, this is a <code>&lt;unitid&gt;</code> element inside the <code>&lt;did&gt;</code> element. These unique identifiers are used by the Portal software as database keys.</p>
<p>Second, each record must have some sort of descriptive title element. For MARC records this is usually 245 subfield a. For EAD files this is usually the <code>&lt;unittitle&gt;</code> element inside the <code>&lt;did&gt;</code> element. These descriptive title elements provide a means for searching and put the object in context for the patron.</p>
<p>Finally, every record must include some sort of location code or address pointing to the described object. For MARC records, this is often a call number in 099 or a URL in 856. For EAD files, this may be anything from a <code>&lt;note&gt;</code> denoting the postal address of your institution placed in the <code>&lt;did&gt;</code> element to URLs inserted into &lt;extref&gt; elements within <code>&lt;physloc&gt;</code> elements inside <code>&lt;did&gt;</code> elements. This may be the work of both catalogers and systems librarians.</p>
<p>Please refer to the &#8220;CRRA Metadata Guidelines&#8221; for further guidance on requirements and best practices for maximizing the discoverability of your metadata records in the Portal.</li>
<li style='margin-bottom: 1em'>
    <strong>Extract metadata records</strong> &#8211; Run a report against your computer system searching for all the records denoted by Step #6. Save the output to one or more files on a Web server, and tell us at Notre Dame the resulting URL. This process makes your metadata available for harvesting. For example, if your metadata records are in MARC, then query your integrated library system for &#8220;crra&#8221; in field 590 subfield a, and save the result as a single file of MARC records to an HTTP file system. If your metadata is stored as EAD, then find all the Portal-related EAD files and save them in a Web-accessible directory. In both cases, make sure your exported data is character encoded as UTF-8 and not MARC-8. This is the work of systems librarians.</li>
<li style='margin-bottom: 1em'>
    <strong>Create a workflow</strong> &#8211; To ensure your records are continually added to the Portal it is necessary to repeat this process on a regular basis. For example, as new items are selected or come into your institution, bibliographers will need to immediately denote items destined for the Portal. You may do this by adding a special note to the acquisitions record. As the acquisitions are completed, the cataloger will need to immediately update the record(s) with &#8220;crra&#8221; flags. The systems librarian will need to extract the metadata on a regular basis and may consider writing a script that runs every night at midnight.</li>
<li style='margin-bottom: 1em'>
    <strong>Repeat</strong> &#8211; This sort of work is never done. Go to Step #3 about twice a year, and go to Step #1 about once a year.</li>
</ol>
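<p>As an illustration of the character-encoding check mentioned in the &#8220;Extract metadata records&#8221; step, the following is a minimal Python sketch, not part of the Portal&#8217;s actual tooling. It assumes your export is a single file of binary MARC (ISO 2709) records, and it relies on two facts about that format: each record ends with the byte 0x1D, and position 9 of each record&#8217;s leader is &#8220;a&#8221; for Unicode (UTF-8) and blank for MARC-8. The function name <code>check_marc_encoding</code> and the file name in the usage note are made up for the example.</p>
<pre>
```python
# Hypothetical helper: flag exported MARC records that do not look like UTF-8.
# Assumes the input is a bytes object of concatenated binary MARC records.

RECORD_TERMINATOR = b"\x1d"  # ISO 2709 end-of-record byte


def check_marc_encoding(data):
    """Return a list of (record_number, problem) tuples; empty if all looks fine."""
    problems = []
    # Split the file into records and drop the empty chunk after the last terminator.
    records = [chunk for chunk in data.split(RECORD_TERMINATOR) if chunk.strip()]
    for number, record in enumerate(records, start=1):
        # Leader position 9 is "a" for Unicode (UTF-8) and blank for MARC-8.
        if record[9:10] != b"a":
            problems.append((number, "leader/09 is not 'a'"))
            continue
        # Even with the right flag, the bytes themselves must decode as UTF-8.
        try:
            record.decode("utf-8")
        except UnicodeDecodeError:
            problems.append((number, "bytes are not valid UTF-8"))
    return problems
```
</pre>
<p>For example, a nightly extraction script might read the exported file with <code>open("crra.mrc", "rb").read()</code>, pass the bytes to <code>check_marc_encoding</code>, and hold back the file if the returned list is not empty. A clean result only means the records look like UTF-8; it does not prove that every character was converted correctly.</p>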
<p>Finally, this &#8220;recipe,&#8221; like any good recipe, is only an outline of what needs to be done. There will surely be variations along the way, but based on our experience, this outline represents a good way to get started.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/08/making-your-content-available/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Web 2.0 features</title>
		<link>http://www.catholicresearch.net/blog/2010/08/web-2-0-features/</link>
		<comments>http://www.catholicresearch.net/blog/2010/08/web-2-0-features/#comments</comments>
		<pubDate>Thu, 26 Aug 2010 15:37:41 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=45</guid>
		<description><![CDATA[After tweaking with VUFind&#8217;s configuration files, our &#8220;sandbox&#8221; implementation of the &#8220;Catholic Portal&#8221; now supports many (if not all) of VUFind&#8217;s Web 2.0 features &#8212; faceted browse, favorites, cover art, reviews, author blurbs, etc. Please give them a whirl. Create an account for yourself and add some items to your Favorites. NTS (&#8220;note to self&#8221;), [...]]]></description>
			<content:encoded><![CDATA[<p>
After tweaking VUFind&#8217;s configuration files, our &#8220;sandbox&#8221; implementation of the &#8220;Catholic Portal&#8221; now supports many (if not all) of VUFind&#8217;s Web 2.0 features &#8212; faceted browse, favorites, cover art, reviews, author blurbs, etc. Please give them a whirl. <a href="http://vufind.library.nd.edu/index.php?module=MyResearch&amp;action=Account">Create an account</a> for yourself and add some items to your Favorites.
</p>
<p>
NTS (&#8220;note to self&#8221;): the account creation process did not work until I changed the value of <code>RewriteBase</code> in my <code>httpd.conf</code> file from <code>/vufind</code> to <code>/</code>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/08/web-2-0-features/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Help wanted</title>
		<link>http://www.catholicresearch.net/blog/2010/08/help-wanted/</link>
		<comments>http://www.catholicresearch.net/blog/2010/08/help-wanted/#comments</comments>
		<pubDate>Wed, 25 Aug 2010 12:47:21 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=30</guid>
		<description><![CDATA[Help wanted. Is there anybody out there in CRRA Land who can help me customize the look &#038; feel of our &#8220;Catholic Portal&#8221;? For better or for worse, people increasingly rank their likes or dislikes of a website based on its graphic design. Certainly, usability, functionality, and completeness are very important qualities but so are [...]]]></description>
			<content:encoded><![CDATA[<p>
Help wanted. Is there anybody out there in CRRA Land who can help me customize the look &#038; feel of our &#8220;Catholic Portal&#8221;?
</p>
<p>
For better or for worse, people increasingly rank their likes or dislikes of a website based on its graphic design. Certainly, usability, functionality, and completeness are very important qualities, but so are aesthetics.
</p>
<p>
To create aesthetically pleasing websites, a person today requires two specialized skills: 1) a formal understanding of graphic design (color, layout, typography, etc.), and 2) the ability to manifest graphic design using HTML and cascading style sheet (CSS) technology. The former is embodied as an artistic flair. The latter is akin to the painter&#8217;s brush or the bluesman&#8217;s guitar.
</p>
<p>
Alas, I do not have the necessary skills. After all, I think working in terminal mode and at the command line is beautiful.
</p>
<table align="center">
<tr align="center">
<td><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/08/sandbox.png"><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2010/08/sandbox.png" alt="screenshot" title="sandbox" width="200" /></a><br /><a href="http://vufind.library.nd.edu/">&#8220;sandbox&#8221; implementation</a></td>
<td><a href="http://www.catholicresearch.net/blog/wp-content/uploads/2010/08/public.png"><img src="http://www.catholicresearch.net/blog/wp-content/uploads/2010/08/public.png" alt="screenshot" title="public" width="200" /></a><br /><a href="http://www.catholicresearch.net/">public interface</a></td>
</tr>
</table>
<p>
So, along with bibliographers who identify materials for the Catholic Portal, and along with catalogers who know how it has been organized, and along with systems librarians who know how to extract our metadata from our catalogs, we need competent Web designers.
</p>
<p>
Do you know of anybody in your institution who can help with this work? Like the other tasks surrounding the Portal, the work is ongoing but not constant. The short-term goal is to give our current <a href="http://vufind.library.nd.edu/">&#8220;sandbox&#8221; implementation</a> of VUFind the look &#038; feel of our <a href="http://www.catholicresearch.net/">&#8220;public&#8221; interface</a>. Once that is completed, we will be able to move the &#8220;sandbox&#8221; to production and dramatically increase the functionality of our interface.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/08/help-wanted/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VUFind (version 1.0.1)</title>
		<link>http://www.catholicresearch.net/blog/2010/08/vufind-version-1-0-1/</link>
		<comments>http://www.catholicresearch.net/blog/2010/08/vufind-version-1-0-1/#comments</comments>
		<pubDate>Tue, 24 Aug 2010 21:51:49 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.catholicresearch.net/blog/?p=25</guid>
		<description><![CDATA[I have installed VUFind (version 1.0.1) on our development server, and the address is http://vufind.library.nd.edu/. At the present time you won&#8217;t find very much there except our indexed metadata records &#8212; about 60,000 of them. The next steps are to edit some of the underlying configurations to enable bits of functionality (call number displays, cover [...]]]></description>
			<content:encoded><![CDATA[<p>
I have installed VUFind (version 1.0.1) on our development server, and the address is <a href="http://vufind.library.nd.edu/">http://vufind.library.nd.edu/</a>.
</p>
<p>
At the present time you won&#8217;t find very much there except our indexed metadata records &#8212; about 60,000 of them. The next steps are to edit some of the underlying configurations to enable bits of functionality (call number displays, cover art, user reviews, etc.). The bigger issues to be resolved include: 1) giving our implementation the CRRA look &#038; feel, and 2) indexing <code>&lt;did&gt;</code>-level data found in EAD files. More on these things later.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/08/vufind-version-1-0-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA December 2009 Update</title>
		<link>http://www.catholicresearch.net/blog/2010/01/crra-december-2009-update/</link>
		<comments>http://www.catholicresearch.net/blog/2010/01/crra-december-2009-update/#comments</comments>
		<pubDate>Mon, 11 Jan 2010 22:03:12 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://vufind.library.nd.edu/blog/?p=42</guid>
		<description><![CDATA[Following is the CRRA update of activities during the month of December, 2009.  This update includes news about: CLIR grant announcements Focus group data analysis The CRRA website Kim Kelley Steve Connaghan Upcoming events Enjoy and Happy New Year! CRRA Update December 2009 December was a month of &#8220;hidden labor.&#8221;  We made good progress on [...]]]></description>
			<content:encoded><![CDATA[<p>
Following is the CRRA update of activities during the month of December, 2009.  This update includes news about:
</p>
<ul>
<li>CLIR grant announcements</li>
<li>Focus group data analysis</li>
<li>The CRRA website</li>
<li>Kim Kelley</li>
<li>Steve Connaghan</li>
<li>Upcoming events</li>
</ul>
<p>
Enjoy and Happy New Year!
</p>
<h2>CRRA Update December 2009</h2>
<p>
December was a month of &#8220;hidden labor.&#8221;  We made good progress on a number of fronts (in addition to cookies, eggnog, and snow-shoveling), but most of our activity was behind the scenes. We made great strides on the following, yet the fruits of our labors may be made manifest only in the weeks and months to come.
</p>
<p>
CLIR grant announcements were sent to a number of venues. To date, the announcement has appeared in: Library Journal &lt;<a href="http://www.libraryjournal.com/article/CA6709666.html">http://www.libraryjournal.com/article/CA6709666.html</a>&gt;, ACCU&#8217;s Winter 2009 Newsletter &lt;<a href="http://www.accunet.org/files/public/Winter_09_newsletter.pdf">http://www.accunet.org/files/public/Winter_09_newsletter.pdf</a>&gt;, Marquette University&#8217;s website, and St. Catherine University&#8217;s website.
</p>
<p>
Look for future announcements in: Catholic Library World&#8217;s News Notes (March 2010), Midwest Archives (MAC) Newsletter (April 2010), ATLA-RC list and Newsletter, H-Catholic list, Cushwa Center Newsletter, Archival Outlook, and the Mid-Atlantic Archivist.
</p>
<p>
Focus group data analysis has begun. To date, four institutional members have conducted focus groups at their institutions, with a total of 34 participants. We hope to have data from at least two more institutions between now and Feb. 1.   The data so far has been rich with several discrete, emergent themes.
</p>
<p>
The <a href="http://www.catholicresearch.net">CRRA website</a> is being updated to include new members, participants, and an updated directory of library, seminary and archive directors at Catholic institutions in the United States and Canada.
</p>
<p>
Records from the VuFind test site have been moved into the CRRA website, bringing us to a total of 33,000+ records and counting.
</p>
<p>
Kim Kelley, Associate Provost for University Libraries and Dean, School of Library and Information Science, at the Catholic University of America announced that she would be leaving the CRRA Board of Directors, effective Jan. 1, 2010. Kim has been a great supporter of the CRRA and her calm and thoughtful input will be deeply missed.  We look forward to engaging Kim in future CRRA activities and wish her well in her new Wisconsin adventures!
</p>
<p>
Steve Connaghan, Acting Director of Libraries, will succeed Kim on the CRRA Board of Directors.  A warm welcome to Steve!
</p>
<h2>Upcoming events</h2>
<p>
ACCU 2010 Annual Meeting: Strategic Issues for Catholic Higher Education will be held at the Mandarin Oriental Hotel, Washington, DC, January 30 to February 1, 2010. For more information, visit: <a href="https://www.accunet.org/i4a/ams/publicLogin.cfm">https://www.accunet.org/i4a/ams/publicLogin.cfm</a>.
</p>
<p>
All CRRA events and events of possible interest to members are posted to the CRRA calendar, available at <a href="http://tiny.cc/Calendar798">http://tiny.cc/Calendar798</a> and also accessible to members (login and password required) from the Admin area of the CRRA website.
</p>
<p>
CRRA Update is an electronic newsletter distributed via email the first Friday of each month to provide members with an update on CRRA activities. The Update is also posted to the CRRA Blog. Please contact us at 574.631.1324 or email <a href="mailto:plawton@nd.edu">plawton@nd.edu</a> with your questions, comments, or news to share.
</p>
<p>
Happy New Year to all!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2010/01/crra-december-2009-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>November Update</title>
		<link>http://www.catholicresearch.net/blog/2009/12/november-update/</link>
		<comments>http://www.catholicresearch.net/blog/2009/12/november-update/#comments</comments>
		<pubDate>Fri, 04 Dec 2009 19:38:21 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://vufind.library.nd.edu/blog/?p=39</guid>
		<description><![CDATA[Following is the CRRA Update for November, 2009.  This update includes: -    CRRA member news -    Focus groups report -    Progress report on our strategic plan -    The CRRA Collection Policy Statement -    News about the CLIR grant award -    Upcoming events I would like to include more news items by and about members in [...]]]></description>
			<content:encoded><![CDATA[<p>Following is the CRRA Update for November, 2009.  This update includes:</p>
<p>-    CRRA member news</p>
<p>-    Focus groups report</p>
<p>-    Progress report on our strategic plan</p>
<p>-    The CRRA <em>Collection Policy Statement</em></p>
<p>-    News about the CLIR grant award</p>
<p>-    Upcoming events</p>
<p>I would like to include more news items by and about members in future updates, so if you have news items to share, please pass them on.</p>
<p>Pat Lawton<br />
Digital Project Librarian<br />
plawton@nd.edu</p>
<p align="center"><strong>CRRA Update</strong></p>
<p align="center"><strong>NOVEMBER </strong><strong>2009</strong></p>
<p><strong>Focus Groups at Member Institutions<br />
</strong>One of the goals from our annual meeting in July 2009 was to hold focus groups at all member institutions, to gather user feedback concerning the content of the portal.  In September, Notre Dame hosted four focus groups and documented their processes, providing a model for other institutions. As of November 30, Marquette University Libraries and Seton Hall University Libraries have completed their focus groups. With uniformity in approach and questions, we can realize a sizable dataset that promises to yield rich data from future portal users coast to coast.   Thank you to all who are in the planning stages or have completed their focus group interviews!</p>
<p>Once all member institutions&#8217; focus group data is reported, data across institutions will be analyzed, summarized, and disseminated. The cross-institutional data gathered from the focus groups will serve to build a portal that aligns with users&#8217; wants and needs.</p>
<p><strong>Ed Starkey </strong><br />
CRRA Board Member Ed Starkey, University Librarian of the University of San Diego, announced that he will be retiring at the end of December 2009.</p>
<p>Ed, together with Charlotte Ames, retired Notre Dame Catholic Studies Librarian, was the inspiring figure for the entire CRRA project.  Ed’s vision was to bring Catholic librarians and libraries together to advance Catholic scholarship and Catholic intellectual life around the globe; to encompass and create a place where scholars might find all materials related to the Catholic intellectual tradition in order to realize new knowledge.</p>
<p>Ed provided the inspiration, encouragement and knowledge that was so essential to the founding of the CRRA. We wish much joy to Ed in his retirement days.  He will be sorely missed and we will continue our work to realize the vision he has inspired.</p>
<p><strong>Collection Policy Statement Adopted by CRRA Board of Directors</strong><br />
At the November 19 board meeting, Bob O&#8217;Neill, Chair of the Collections Committee (Boston College), moved to adopt the proposed collection policy statement with a change from “rare, unique and infrequently held&#8221; to “rare, unique and uncommon.”  The change was approved.</p>
<p>The newly adopted Collection Policy Statement is archived and available to CRRA members from the Admin area of the website and from this link: <span style="text-decoration: underline;"><a href="http://tiny.cc/Collection">http://tiny.cc/Collection</a>.</span> (If you need a reminder of the password and login, send Pat an email at <a href="mailto:plawton@nd.edu">plawton@nd.edu</a>.)</p>
<p><strong>New Portal Records Now Accessible at Catholicresearch.net </strong><br />
We are happy to announce that the goal to incorporate the VuFind interface and pilot project records  into our home site has been met, and well ahead of schedule!  At <a href="http://www.catholicresearch.net/">http://www.catholicresearch.net</a> you will find the records that were formerly available only at the VuFind test site.</p>
<p>We welcome your comments about this update.  Pages are under construction to make them more inviting, more easily navigable, up to date, and informative.  Please send your comments or suggestions to Pat at <a href="mailto:plawton@nd.edu">plawton@nd.edu</a> or Eric at <a href="mailto:emorgan@nd.edu">emorgan@nd.edu</a>.</p>
<p><strong>Progress Report on the Strategic Plan</strong><br />
Thanks to our members’ hard work, we have made great progress on the objectives charted at our July 2009 Loyola meeting.  This link <a href="http://tiny.cc/LoyolaUpdate">http://tiny.cc/LoyolaUpdate</a> takes you to the &#8220;Loyola Plan&#8221; document with completed tasks highlighted.  We are well on track and even ahead of schedule in some areas.</p>
<ul>
<li>The pilot project goal of ingesting 20,000+ records has been surpassed.</li>
<li>Focus groups have been completed at Notre Dame, Marquette, and Seton Hall. Four other institutional members have plans to host focus groups.</li>
<li>Three CRRA member institutions collaborated to submit a successful CLIR grant proposal.</li>
<li>We are working with new members to add their content to the portal and orient them to the CRRA.</li>
<li>The board is working to identify new members and to develop policies and procedures to support a growing membership.</li>
</ul>
<p>At the November 19, 2009 board meeting, Tom Leonhardt proposed that we establish CRRA Policies and Procedures to address a variety of issues from committee member roles and expectations to contributor expectations, rights and limitations, etc.  The board enthusiastically approved.  CRRA Committees will be involved in the drafting of policies and procedures.  We welcome your input!</p>
<p>In summary, we have made great strides in the four short months since our July meeting.  This is amazing progress for a largely volunteer organization.  Our congratulations and heartfelt thanks to all!</p>
<p>If there are aspects of the CRRA&#8217;s work in which you would like to have greater involvement, please contact a CRRA board member or Pat Lawton.</p>
<p><strong>CLIR Grant Awarded to CRRA Member Institutions</strong><br />
<em>CLIR Grant Awarded for the Catholic Social Action Access Project:  A Collaborative Project of Marquette University, Catholic University of America, St. Catherine University and the Catholic Research Resources Alliance</em></p>
<p>With funding from the Andrew W. Mellon Foundation, the Council on Library and Information Resources (CLIR) has awarded Marquette University, Catholic University of America, St. Catherine University and the Catholic Research Resources Alliance (CRRA) a <em>Cataloging Hidden Special Collections and Archives</em> grant in the amount of $149,964.  The grant will support the <em>Catholic Social Action Access Project,</em> one of only 14 selected from a total of 91 applications. More information about the 2009 CLIR awards is here: <a href="http://www.clir.org/hiddencollections/awards/index2009.html">http://www.clir.org/hiddencollections/awards/index2009.html</a>.</p>
<p>This collaborative project brings together three significant collections documenting US Catholic social action in the 20th century. St. Catherine University&#8217;s <a href="http://library.stkate.edu/spcoll/bethune.html">Ade Bethune Collection</a> documents the career of a world-renowned liturgical artist and social activist who helped found the Church Community Housing Corporation to develop affordable housing in Newport County. The <a href="http://libraries.cua.edu/achrcua/manuA-K.html">Catholic Charities, DC records</a> (CCDC), held by the Catholic University of America, document the CCDC&#8217;s leadership and support of progressive social legislation. Marquette University&#8217;s <a href="http://www.marquette.edu/library/collections/archives/day.html">Dorothy Day-Catholic Worker Collection</a> includes audio recordings of the voices of the most influential Catholic social activists of the 20th century.</p>
<p>Jean Zanoni, Associate Dean of Marquette Libraries and Matt Blessing, Head of Special Collections and Archives, are Co-Principal Investigators.  Project collaborators are CRRA members and descriptions of project materials will be collocated within the CRRA’s Catholic Portal.</p>
<p>The award is effective January 1, 2010, and project activities will be completed by December 31, 2011.</p>
<p><em>The grant announcement above</em> will be published in the Winter issue of <em>ACCU&#8217;s newsletter</em>. Other planned venues for the announcement include: CLA&#8217;s <em>Catholic Library World</em>; the <em>ATLA-RC listserv</em> and <em>newsletter</em>; <em>ACRL&#8217;s C&amp;RL News</em>; <em>the SAA Newsletter</em>; <em>Midwest Archives Conference newsletter</em>; and <em>Mid-Atlantic Archivist newsletter</em>.</p>
<p>We encourage you to spread the news among your local communities.  If you would like more information, contact Jean Zanoni, Matt Blessing, or Pat Lawton.  If you send an announcement about the grant, please let us know.  For purposes of grant reporting, we would like to track to whom announcements are made.</p>
<p>Thank you and congratulations to all grant participants!</p>
<p><strong><em>Mark your calendars  &#8230;</em></strong><br />
Kevin Cawley (Notre Dame) will present &#8220;Investigating Change through the Catholic Research Resources Alliance&#8221; as a panelist at the <em>Conference on the History of Women Religious</em>, to be held at the University of Scranton June 27-30, 2010.</p>
<p>Of possible interest:  The American Catholic Historical Association will hold its ninetieth annual meeting from January 7 through January 10, 2010 at the Manchester Grand Hyatt San Diego.  More information about the meeting is here: <a href="http://research.cua.edu/acha/meetings.cfm">http://research.cua.edu/acha/meetings.cfm</a>.</p>
<p><em>All CRRA events</em> and events of possible interest to members are posted to the CRRA calendar, available at <a href="http://tiny.cc/Calendar798">http://tiny.cc/Calendar798</a> and also accessible from the Admin area of the website.</p>
<hr size="2" /><p><em>CRRA Update</em> is an electronic newsletter distributed via email the first Friday of each month to provide members with an update on CRRA activities. The Update is also posted to the CRRA <a href="../">Blog</a>.  Please contact us at 574.631.1324 or email <a href="mailto:plawton@nd.edu">plawton@nd.edu</a> with your questions, comments, or news to share.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2009/12/november-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA October Update</title>
		<link>http://www.catholicresearch.net/blog/2009/11/crra-october-update/</link>
		<comments>http://www.catholicresearch.net/blog/2009/11/crra-october-update/#comments</comments>
		<pubDate>Fri, 06 Nov 2009 16:50:33 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://vufind.library.nd.edu/blog/?p=28</guid>
		<description><![CDATA[Following is the update of CRRA activities and news during the month of October, 2009. This update includes: -News about CRRA members -An update on the focus groups at member institutions - Notes from the Digital Access Committee meeting - Future happenings - Suggested readings I would like to include more news items by and [...]]]></description>
			<content:encoded><![CDATA[<p>Following is the update of CRRA activities and news during the month of October, 2009.   This update includes:</p>
<p>- News about CRRA members<br />
- An update on the focus groups at member institutions<br />
- Notes from the Digital Access Committee meeting<br />
- Future happenings<br />
- Suggested readings</p>
<p>I would like to include more news items by and about members in future updates, so if you have news items to share, please pass them on.</p>
<p>Happy reading!</p>
<p>Pat Lawton<br />
CRRA Digital Project Librarian<br />
plawton@nd.edu</p>
<p style="text-align: center; "><strong>CRRA Update<br />
October 2009</strong></p>
<p><em>Member News</em></p>
<p><strong>CRRA welcomes Loyola University Chicago and St. Catherine University!</strong><br />
The CRRA Board of Directors welcomed Loyola University Chicago and St. Catherine University as the newest members and participants in the CRRA.  Bob Seal, Dean of Libraries, Loyola and Carol P. Johnson, Director of Libraries, Media Services, and Archives at St. Catherine accepted with pleasure the invitation to join the CRRA this year (2009/10).</p>
<p>Carol Johnson has been actively involved with the CRRA through her service on the Collections Committee and in partnering with us in authoring the CLIR grant proposal.  Deborah Kloiber, Curator of the Ade Bethune Collection, played a significant role in the planning and drafting of the CLIR grant and has expressed her continued interest in CRRA activities.  St. Kate&#8217;s houses a number of important print and digital collections.  One highlight is their collection of the papers and art works of the liturgical artist, Ade Bethune.  For more about St. Kate&#8217;s library and collections, see  <a href="http://library.stkate.edu">http://library.stkate.edu</a>.<br />
Bob Seal, Director of Loyola University Libraries, graciously hosted our July meeting.  We enjoyed glorious Lake Michigan views and a tour of the newly completed Klarchek Information Commons. Kathy Young, University Archivist and Curator of Rare Books, participated in our July meeting and shared news of her work with a Chicago-based collaboration among libraries, universities, and archives, the Black Metropolis Research Consortium (BMRC). Loyola has a number of resources of interest to the CRRA, including the Jesuitica collection and the Women &amp; Leadership Archives (WLA).  Many of the WLA collections relate to Catholic women leaders; it will be a great resource for the CRRA.  Dr. Beth Myers, Loyola&#8217;s Director for Women &amp; Leadership Archives, will also participate in CRRA activities.  For more about the library and collections at Loyola see <a href="http://libraries.luc.edu">http://libraries.luc.edu</a>.</p>
<p><strong>Ed Starkey Announces Retirement</strong><br />
CRRA Board Member Ed Starkey, University Librarian of the University of San Diego, announced that he will be retiring at the end of December 2009.  Ed provided the inspiration, encouragement and knowledge that were so essential to the founding of the CRRA. Ed will be sorely missed!</p>
<p><strong>ARL/CNI Forum on Special Collections in Washington DC</strong><br />
Bob O’Neill (Boston), Jennifer Younger (Notre Dame), John Buchtel (Georgetown), and Pat Lawton attended the ARL/CNI Forum “An Age of Discovery: Distinctive Collections in the Digital Age” in Washington, DC.  The forum included a wide array of speakers, including such notables as G. Wayne Clough, Secretary of the Smithsonian Institution; Don Waters, Program Officer for Scholarly Communication, The Andrew W. Mellon Foundation; and Ian E. Wilson, Librarian and Archivist of Canada Emeritus, President of the International Council on Archives, and Strategic Advisor, University of Waterloo. Two themes dominated the forum:  special collections are for use, and “[The] Online [Environment] enables the past to be put back together once again” (Ian Wilson).  For more on the forum, see the <a href="http://www.arl.org/resources/pubs/fallforumproceedings/forum09proceedings.shtml">Proceedings</a> page with selected presentations.</p>
<p><em>CRRA News</em></p>
<p><strong>Focus Groups to be Held at Member Institutions in November and December</strong><br />
Member institutions including Georgetown, Catholic, Villanova, Marquette and Seton Hall University are preparing for focus groups to be held in November/December. From this data, aggregate summaries of findings will be created and shared with all institutions. Focus group data will provide a systematic overview from potential portal users and may serve as a guide for future portal directions.  Thank you to all who are gathering data!  We look forward to your findings.</p>
<p>If you have yet to begin plans for focus groups at your institution, it’s not too late!  To begin the process, call Pat Lawton and/or consult the documents from the focus groups held at Notre Dame.  Documents include steps in planning, sample email announcements for focus group participants, key questions, and moderator notes.  All documents are posted to the Admin area of the CRRA website.</p>
<p><strong>Digital Access Committee (DAC) Meeting</strong><br />
The Digital Access Committee (DAC), chaired by Tom Leonhardt, met on October 12.  Members identified the following as important next steps:  create a data input form enabling members with no metadata records for particular collections to easily submit collection-level descriptions to the portal; identify existing common errors in the portal and gather suggestions for further improvements.</p>
<p>Meeting minutes are posted to the Portal admin area.</p>
<p><em>Suggested Reading</em><br />
Two reports of possible interest to CRRA members were released this month.</p>
<p><strong>SPARC Report on Income Models for Open Access</strong><br />
SPARC (the Scholarly Publishing and Academic Resources Coalition) examines the issue of sustainability for current and prospective open-access publishers in “Income models for Open Access: An overview of current practice,” by Raym Crow (<a href="http://www.arl.org/sparc/publisher/incomemodels/imguide.shtml">http://www.arl.org/sparc/publisher/incomemodels/imguide.shtml</a>). Although the report examines online journals in particular, by way of extrapolation, it is useful in thinking about models of sustainability for the portal.  Read with an eye toward the CRRA, it presents a host of possible revenue streams for the portal, including sponsors, advertisers, and value-added services.</p>
<p><strong>JISC Final Report on: “Digitisation of Special Collections: mapping, assessment, prioritization”</strong><br />
<a href="http://www.jisc.ac.uk/media/documents/programmes/digitisation/discmap_final_report_211009_final.pdf"> http://www.jisc.ac.uk/media/documents/programmes/digitisation/discmap_final_report_211009_final.pdf</a><br />
This report may be useful in planning the process to identify collections of interest and prioritize the need for digitization.  The methodologies look rather robust and the study provides a useful framework for identifying variables of interest to CRRA such as Curatorial environment, Age of collections, and Subject area.  Also note that they surveyed intermediaries (librarians, archivists, etc.) as well as end users.</p>
<p>Interesting results include:<br />
•	Articulation of a method of user-driven prioritization<br />
•	Identification and articulation of the need to have collections accessible in one place (a comprehensive collection description and finding utility to support resource discovery)<br />
•	Recommendation that a “standard approach to collection description be adopted where the relationships between a collection and its ‘super-collections’ and ‘sub-collections’ are clearly presented”<br />
•	Finding that high quality metadata is essential for discovery</p>
<p><em>Looking ahead …</em><br />
Eric Lease Morgan will be in Chicago November 14-16 for the Digital Humanities and Computer Science Conference.</p>
<p>Sister Jean Bostley of the Catholic Library Association and Malachy McCarthy of the Claretian Missionaries Archives, USA, have invited Eric and Pat to participate in a panel discussion at ALA in June 2010.  The session is entitled “Planning, Building, and Using Religious Archive Digital Sites: The USC Internet Mission Photography Archive and the Catholic Research Resources Alliance.”  Date and time to be announced.</p>
<p>Please send your news items for future CRRA Updates to Pat Lawton at plawton@nd.edu.</p>
<p style="text-align: center;">A Happy and Blessed Thanksgiving to all!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2009/11/crra-october-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRRA Monthly Updates &#8211; Sept09</title>
		<link>http://www.catholicresearch.net/blog/2009/10/crra-monthly-updates-sept09/</link>
		<comments>http://www.catholicresearch.net/blog/2009/10/crra-monthly-updates-sept09/#comments</comments>
		<pubDate>Mon, 05 Oct 2009 16:28:57 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://vufind.library.nd.edu/blog/?p=22</guid>
		<description><![CDATA[To help keep you informed of the CRRA’s activities, I would like to introduce the first installment of the “CRRA Update.”  CRRA Updates will be distributed monthly to all CRRA members and posted here, for your viewing pleasure.  Enjoy! Please send suggestions for future updates to Pat Lawton at plawton@nd.edu. Thank you!]]></description>
			<content:encoded><![CDATA[<p>To help keep you informed of the CRRA’s activities, I would like to introduce the first installment of the “<a href="https://catholic-portal.library.nd.edu/admin/docs/About%20CRRA/Updates/CRRA%20Update_Sept09_blog.pdf">CRRA Update</a>.”   CRRA Updates will be distributed monthly to all CRRA members and posted here, for your viewing pleasure.  Enjoy!</p>
<p>Please send suggestions for future updates to Pat Lawton at plawton@nd.edu.  Thank you!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2009/10/crra-monthly-updates-sept09/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Seven Simple Steps to Making Your Content Available in the CRRA Portal</title>
		<link>http://www.catholicresearch.net/blog/2009/08/seven-simple-steps-to-making-your-content-available-in-the-crra-portal/</link>
		<comments>http://www.catholicresearch.net/blog/2009/08/seven-simple-steps-to-making-your-content-available-in-the-crra-portal/#comments</comments>
		<pubDate>Wed, 12 Aug 2009 18:50:05 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://vufind.library.nd.edu/blog/?p=13</guid>
		<description><![CDATA[This is an outline &#8211; a recipe &#8211; for getting your metadata records into the &#8220;Catholic Portal.&#8221; Identify specialists &#8211; It takes many people with many skills to extract content for the Portal. It requires bibliographers (subject specialists) who know which materials located in their local library fit the scope of the project. It requires [...]]]></description>
			<content:encoded><![CDATA[<p>This is an outline &#8211; a recipe &#8211; for getting your metadata records into the &#8220;Catholic Portal.&#8221;</p>
<ol>
<li>
<p>
      <strong>Identify specialists</strong> &#8211; It takes many people with many skills to extract content for the Portal. It requires bibliographers (subject specialists) who know which materials located in their local library fit the scope of the project. It requires catalogers (metadata specialists) who know how the local materials are described. It requires systems librarians (database administrators) who can extract metadata records from underlying system(s).</p>
</li>
<li>
<p>
      <strong>Have a meeting</strong> &#8211; Bring all the specialists together to discuss Steps #3-7.</p>
</li>
<li>
<p>
      <strong>Understand the scope of the Portal</strong> &#8211; This is akin to understanding the purpose of the Portal, who is its intended audience, and what is its collection policy. In short, the Portal is intended to contain rare, unique, and/or infrequently held materials useful for scholarly Catholic research.</p>
</li>
<li>
<p>
      <strong>Identify resources/collections</strong> &#8211; List the resources/collections in the library which fall into the scope of the Portal. Examples might include rare books &amp; manuscripts, digitized images, sound recordings, the papers of famous individuals, the archives of leading organizations, pamphlets, newspapers, etc.</p>
</li>
<li>
<p>
      <strong>Articulate how identified resources/collections are described</strong> &#8211; For each of the resources/collections identified in Step #4, determine which ones have metadata and which ones don&#8217;t.</p>
<p>For those items which <em>do</em> have metadata, list how items in the collection are denoted in your various computer systems. Are they all in a particular call number range? Are they the totality of items in your &#8220;special collections&#8221; department and/or encoded as EAD files? Have they all been cataloged with a local note in your integrated library system (ILS)? Are they all or a subset of items saved in a local spreadsheet or database? Etc.</p>
<p>For &#8220;extra credit,&#8221; discuss ways the items which don&#8217;t have metadata can get some in the future.</p>
</li>
<li>
<p>
      <strong>Extract metadata records</strong> &#8211; Given the things discussed in Step #5, collect the metadata records from your system(s). For example, some sort of search might be done to extract all identified MARC records from an ILS. All EAD files describing materials apropos to the Portal might be saved to a directory. A report might be written against a database to create a tab-delimited text file. Etc.</p>
<p>There are three things to remember when extracting the metadata. The first is something we are calling &#8220;<em>MARC-ability</em>&#8221;. For better or for worse, VuFind only accepts MARC records as input, and consequently, all metadata received for ingestion must be translated into MARC. Thus, &#8220;real&#8221; MARC records are easily accepted, but &#8220;tagged&#8221; MARC records are not. EAD files can be cross-walked to MARC and are thus easily accepted. Delimited files (CSV, tab, etc.) work well because they are easily parsed. HTML files are poorly structured, making any mapping process very difficult. The same goes for any word-processed file (Word, WordPerfect, etc.). In short, MARC, any flavor of XML, and delimited files work best.</p>
<p>Second, each metadata record requires a number of <em>specific fields</em>. Each record requires a unique identifier. For MARC records this is a value in the 001 field. For EAD files, this is denoted by the identifier attribute in the eadid element. Next, each record requires a pointer to where the described item can be found. Generally speaking, these are either call numbers or URLs saved in the appropriate fields. The last requirement is not really a field but a matter of formatting. All data must be saved using the UTF-8 character encoding. Other encodings (like MARC-8) are not readable. If the data is not saved in plain ASCII or UTF-8, diacritics display incorrectly and confuse the VuFind indexer.</p>
<p>The third and final thing to remember concerns four levels of <em>data integrity</em>. The first level speaks to the way your data is structured. For XML files this means they are well-formed. For MARC records, it means the leader is 24 bytes long, the value of the first 5 characters of the leader equals the length of the record, fields are delimited with the appropriate ASCII characters, etc. The second level speaks to validity. For XML files it means the data conforms to a DTD or schema. For MARC records it means authors are in 1xx fields, the title is in 245, notes are in 5xx, etc. The third level of integrity is correctness. &#8220;To what degree is the value in 245 the title of the item? To what degree are the URLs not broken? Etc.&#8221; The last level of integrity concerns completeness. A metadata record&#8217;s completeness is directly proportional to its findability. The first two levels of integrity can be validated through computer technology. The last two levels are the domain of librarianship.</p>
</li>
<li>
    <strong>Send records to Notre Dame</strong> &#8211; After the records have been exported, email them to emorgan@nd.edu, and they will be ingested into VuFind.</li>
</ol>
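<p>To make the first level of data integrity from Step #6 more concrete, here is a minimal sketch in Python of the structural checks described there for raw MARC records. It is only an illustration, not the code used to ingest records at Notre Dame. The function names (<code>split_records</code> and <code>check_marc_structure</code>) are hypothetical, and the checks cover only the leader length, the declared record length, and the standard field and record terminators.</p>

```python
# Hypothetical pre-flight check for raw (binary) MARC records.
# Assumption: records follow the usual MARC transmission layout, with a
# 24-byte leader, 0x1E after each field, and 0x1D at the end of the record.

RECORD_TERMINATOR = b"\x1d"
FIELD_TERMINATOR = b"\x1e"
LEADER_LENGTH = 24


def split_records(data: bytes) -> list[bytes]:
    """Split a file of concatenated MARC records, keeping each terminator."""
    chunks = data.split(RECORD_TERMINATOR)
    return [chunk + RECORD_TERMINATOR for chunk in chunks if chunk]


def check_marc_structure(record: bytes) -> list[str]:
    """Return a list of structural problems found in one raw MARC record."""
    leader = record[:LEADER_LENGTH]
    if len(leader) != LEADER_LENGTH:
        return ["record is shorter than the 24-byte leader"]

    problems = []

    # The first 5 characters of the leader must equal the record length.
    declared = leader[:5]
    if not declared.isdigit():
        problems.append("leader positions 00-04 are not digits")
    elif int(declared) != len(record):
        problems.append(
            f"leader declares {int(declared)} bytes, "
            f"but the record has {len(record)}"
        )

    # Fields and records must be delimited with the expected ASCII characters.
    if not record.endswith(RECORD_TERMINATOR):
        problems.append("record does not end with the record terminator (0x1D)")
    if FIELD_TERMINATOR not in record:
        problems.append("no field terminator (0x1E) found")

    return problems
```

<p>Running <code>split_records</code> over the bytes of an exported file and then <code>check_marc_structure</code> over each record returns an empty list for every record that passes. Anything else is worth a second look before the file is sent.</p>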
<p>You&#8217;re done! We will notify you via email when your records are available for viewing, giving you the opportunity to validate the process and examine the fruits of your labors.</p>
<p>Finally, this &#8220;recipe,&#8221; like any good recipe, is only an outline of what needs to be done. There will surely be variations along the way, but based on our experience, this outline represents a good way to get started.</p>
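<p>As one example of such a variation, a similar pre-flight check can be sketched for EAD files. The Python below is a rough illustration under a few assumptions, not a description of how the Portal actually processes EAD: the function name <code>check_ead</code> is hypothetical, and it tests only the three things Step #6 asks for that a computer can verify easily, namely UTF-8 encoding, well-formed XML, and an identifier attribute on the eadid element.</p>

```python
import xml.etree.ElementTree as ET


def check_ead(raw: bytes) -> list[str]:
    """Return a list of problems found in the bytes of one EAD file."""
    # Requirement: the data must be saved as UTF-8 (plain ASCII also passes).
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as error:
        return [f"not valid UTF-8: {error}"]

    # First level of integrity: the XML must be well-formed.
    try:
        root = ET.fromstring(raw)
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]

    # Requirement: a unique identifier in the eadid element.
    # Assumption: any namespace prefix on the tag name can be ignored.
    eadid = None
    for element in root.iter():
        if element.tag.rsplit("}", 1)[-1] == "eadid":
            eadid = element
            break

    problems = []
    if eadid is None:
        problems.append("no eadid element found")
    elif not eadid.get("identifier"):
        problems.append("eadid has no identifier attribute")
    return problems
```

<p>An empty list means the file passed these particular checks. It says nothing about validity against the EAD DTD, nor about correctness or completeness, which remain the domain of librarianship.</p>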
<h2>Contact information</h2>
<p>If you have questions along the way, don&#8217;t hesitate to contact Eric or Pat:<br />Eric Lease Morgan emorgan@nd.edu 574.631.8604<br />Pat Lawton plawton@nd.edu 574.631.1324</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2009/08/seven-simple-steps-to-making-your-content-available-in-the-crra-portal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CLIR pre-proposal approved</title>
		<link>http://www.catholicresearch.net/blog/2009/07/clir-pre-proposal-approved/</link>
		<comments>http://www.catholicresearch.net/blog/2009/07/clir-pre-proposal-approved/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 15:02:50 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://vufind.library.nd.edu/blog/?p=6</guid>
		<description><![CDATA[Good news! Together, the CRRA, Catholic University of America, Marquette University, and St. Catherine University libraries submitted a pre-proposal for CLIR’s “Cataloging Special Hidden Collections and Archives” grant program. Our pre-proposal was accepted! We may now submit a final proposal, due Sept. 4. The focus of this funding opportunity is on cataloging. Proposed collections (or [...]]]></description>
			<content:encoded><![CDATA[<p>Good news!</p>
<p>Together, the CRRA, Catholic University of America, Marquette University, and St. Catherine University libraries submitted a pre-proposal for CLIR’s “Cataloging Special Hidden Collections and Archives” grant program. Our pre-proposal was accepted! We may now submit a final proposal, due Sept. 4.</p>
<p>The focus of this funding opportunity is on cataloging.  Proposed collections (or portions thereof) for cataloging include: Ade Bethune (St. Catherine), Dorothy Day (Marquette), and Catholic Charities USA (Catholic).  One of the innovative aspects of our proposal is that the cataloged collections will all be accessible via the Catholic Portal.</p>
<p>For further information about the program, please visit CLIR’s website<br />
at <a href="http://www.clir.org/hiddencollections/index.html">http://www.clir.org/hiddencollections/index.html</a>.</p>
<p>We still have a way to go, but we made it through the first round.</p>
<p>Congratulations to all collaborators!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2009/07/clir-pre-proposal-approved/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>About</title>
		<link>http://www.catholicresearch.net/blog/2009/06/about-2/</link>
		<comments>http://www.catholicresearch.net/blog/2009/06/about-2/#comments</comments>
		<pubDate>Tue, 30 Jun 2009 18:34:56 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
		
		<guid isPermaLink="false">http://vufind.library.nd.edu/blog/?page_id=2</guid>
		<description><![CDATA[This is an example of a WordPress page, you could edit this to put information about yourself or your site so readers know where you are coming from. You can create as many pages like this one or sub-pages as you like and manage all of your content inside of WordPress.]]></description>
			<content:encoded><![CDATA[<p>This is an example of a WordPress page, you could edit this to put information about yourself or your site so readers know where you are coming from. You can create as many pages like this one or sub-pages as you like and manage all of your content inside of WordPress.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2009/06/about-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hello world!</title>
		<link>http://www.catholicresearch.net/blog/2009/06/hello-world-2/</link>
		<comments>http://www.catholicresearch.net/blog/2009/06/hello-world-2/#comments</comments>
		<pubDate>Tue, 30 Jun 2009 18:34:56 +0000</pubDate>
		<dc:creator>Eric Lease Morgan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://vufind.library.nd.edu/blog/?p=1</guid>
		<description><![CDATA[Welcome to the Catholic Research Resources Alliance (CRRA) blog! We are a collaborative effort initiated by eight  Catholic colleges and universities to share their resources electronically with librarians, archivists, researchers, scholars, and the general public interested in the Catholic experience. For more about the CRRA, see: http://www.catholicresearch.net. The CRRA is currently engaged in building The [...]]]></description>
			<content:encoded><![CDATA[<p>Welcome to the Catholic Research Resources Alliance (CRRA) blog!</p>
<p>We are a collaborative effort initiated by eight Catholic colleges and universities to share their resources electronically with librarians, archivists, researchers, scholars, and the general public interested in the Catholic experience.</p>
<p>For more about the CRRA, see: <a href="http://www.catholicresearch.net">http://www.catholicresearch.net</a>.</p>
<p>The CRRA is currently engaged in building The Catholic Research Resources Portal.  The Catholic Portal provides access to rare, unique or infrequently held materials in academic libraries and seminaries&#8217; special collections and archives. By electronically bringing together access to resources in many collections, the Portal will create easy, effective and global discovery of Catholic research resources.</p>
<p>We are currently in the midst of a pilot project to add member content into the VuFind interface.  The test database is here: <a href="../../">http://vufind.library.nd.edu/</a>.  Have a look around, and let us know what you think.</p>
<p>If you have any questions or comments about the CRRA, post them here or send them to me at plawton@nd.edu.</p>
<p>Welcome!</p>
<p>Pat Lawton<br />
Digital Project Librarian for the CRRA</p>
]]></content:encoded>
			<wfw:commentRss>http://www.catholicresearch.net/blog/2009/06/hello-world-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

