Harvesting, updating, and re-indexing

This posting describes the automated process I am currently using to harvest, update, and re-index the MARC records of the “Catholic Portal”.

Step #1 – Make a list

Librarians love lists, and I am no exception. The process begins with a list (a database) of CRRA members who have MARC metadata to share. Each item in the list includes the following fields:

  1. code – a unique three-letter identifier
  2. institution – the name of the CRRA member
  3. library – the name of the member’s library
  4. URL – the location of the member’s MARC records

Right now, the name of this list is libraries.db. It is created by hand.
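
As a rough illustration of the four fields above, here is a minimal Python sketch that reads such a list into records. The posting does not say how libraries.db is formatted, so the tab-delimited layout, the "#" comment convention, the Member class, the read_members function, and the sample row are all assumptions made for this example, not a description of the actual file.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Member:
    """One entry in the member list (hypothetical structure)."""
    code: str         # unique three-letter identifier
    institution: str  # name of the CRRA member
    library: str      # name of the member's library
    url: str          # location of the member's MARC records


def read_members(handle):
    """Parse tab-delimited lines into Member records.

    Assumes one member per line with exactly four fields; blank lines
    and lines starting with '#' are skipped.
    """
    members = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        code, institution, library, url = (field.strip() for field in row)
        if len(code) != 3 or not code.isalpha():
            raise ValueError(f"code must be three letters: {code!r}")
        members.append(Member(code, institution, library, url))

    # The code field is described as unique, so reject duplicates.
    codes = [member.code for member in members]
    if len(codes) != len(set(codes)):
        raise ValueError("duplicate member code")
    return members


# Made-up sample data; the institution, library, and URL are placeholders.
SAMPLE = (
    "# code\tinstitution\tlibrary\turl\n"
    "abc\tExample University\tExample Library\thttp://example.org/marc/abc.mrc\n"
)

members = read_members(io.StringIO(SAMPLE))
for member in members:
    print(member.code, member.url)
```

Because the list is created by hand, checking the code length and uniqueness at load time is a cheap way to catch typing mistakes before any harvesting starts.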


Web 2.0 features

After tweaking VUFind’s configuration files, our “sandbox” implementation of the “Catholic Portal” now supports many (if not all) of VUFind’s Web 2.0 features: faceted browse, favorites, cover art, reviews, author blurbs, etc. Please give them a whirl. Create an account for yourself and add some items to your Favorites.

NTS (“note to self”): the account creation process did not work until I changed the value of RewriteBase in my httpd.conf file from /vufind to /.

VUFind (version 1.0.1)

I have installed VUFind (version 1.0.1) on our development server, and the address is http://vufind.library.nd.edu/.

At the present time you won’t find very much there except our indexed metadata records, about 60,000 of them. The next steps are to edit some of the underlying configurations to enable bits of functionality (call number displays, cover art, user reviews, etc.). The bigger issues to be resolved include: 1) giving our implementation the CRRA look & feel, and 2) indexing did-level data found in EAD files. More on these things later.

CRRA Update July 2010


  • Highlights from the June 29, 2010 CRRA All-Members Meeting at Georgetown
  • Next steps
  • Member News
  • Might be of interest …
  • Mark your calendars … for two upcoming CRRA gatherings: January 6, 2011 (San Diego) and March 29, 2011 (Philadelphia)

Highlights from the June 29, 2010 CRRA All-Members Meeting at Georgetown

Almost thirty members met in a top floor conference room at Lauinger Library. Despite the beautiful view of the Potomac River and the Washington Monument, the discussion was lively and productive. A draft of the minutes is posted at http://tiny.cc/pwegm (user: catholic; pswd: portal). If you were at the meeting, please review the draft and let Pat know of any changes to be made. If you were unable to attend, we hope you will read the minutes for a fuller account of the meeting, particularly the additions to be put into the strategic plan for this coming year.

The clear emphasis was on content: building content, enhancing access, and digitizing content to support digital access and research. Suggestions for building content included reaffirming the focus on rare, unique, and uncommon materials rather than books likely to be owned by many libraries; adding records on an ongoing basis; setting priorities for themes so as to build larger collections in some areas quickly and better support scholarly research; and associating portal records with themes. In regard to access, participants applauded the current search functionality and are excited about the work this coming year to index EAD (Encoded Archival Description) files at a more granular level. That work creates the potential for a scholar to locate a set of papers that is not called the Graham Greene Collection of Papers but is instead located in folders in a collection called “Papers of Catholic authors.”

Everyone agreed that making more full text available via the portal is very important. We need a vision for digital access and research, and we need to use our cooperation to facilitate digitization in a variety of ways: identifying blockbuster collections as high priorities, seeking digitization grants, and using our alliance to facilitate member digitization initiatives. Developing a users’ wiki would not only allow users to help improve usability but also afford opportunities for creating online communities and doing digital research. The Monday presentation by Art Crivella of Crivella West on the Humanities Knowledge Kiosk showed how a digital library could support interactive use and digital research.

Discussion then turned to how we can support our mission of providing global, enduring access to resources about the Catholic experience and facilitating scholarship. Membership dues are one component of financial support. Everyone is encouraged to identify and cultivate prospective members, and to refer them to Pat or Jennifer for further discussions. In addition, we (the CRRA) have received permission from the NEH to apply for a Challenge grant as an independent entity. NEH challenge grants are capacity-building grants, intended to help institutions and organizations secure long-term improvements in and support for their humanities programs and resources. The Challenge grant would provide funds to build capacity and support an endowment for digitization, a graduate research fellowship, scholarships for CRRA workshops, enhanced public outreach, and other costs of maintaining and sustaining the Catholic portal. We will submit a proposal for the next round, due May 2011. – Jennifer Younger and Pat Lawton

Next steps

In July, CRRA committees will meet to develop priorities, assignments, and timelines, as noted in the process for developing the CRRA strategic plan that was distributed as part of the June 29 agenda. The Board would then be the appropriate entity to adopt the strategic plan.

Member News

Congratulations to Steve Connaghan (Member, CRRA Board of Directors) who was promoted in May 2010 to the position of Director of Libraries at Catholic University. A Catholic University alumnus (B.A. 1991, M.S.L.S. 1994), Steve began working part time in the University Libraries during his junior year, started working professionally there in 1993, and became acting director on January 1, 2010. Congratulations, Steve, and best wishes in your new position!

Might be of interest …

Thoughts from Melissa Terras’ Closing Plenary Speech at the Digital Humanities Conference 2010

At our recent Georgetown meeting, Joe Lucia offered a number of ideas regarding how the CRRA might work together to transcribe and digitize our collective resources. At the DH2010 Conference in London (at which Eric Morgan, ND, was present), Melissa Terras describes this phenomenon as “crowdsourcing” and offers the Bentham Project as an exemplar upon which to reflect on the progress of Digital Humanities. As the Portal is firmly situated within this Digital Humanities world, many of you may find this of interest.

“Crowdsourcing – the harnessing of online activity to aid in large scale projects that require human cognition – is becoming of interest to those in the library, museum and cultural heritage industry, as institutions seek ways to publically engage their online communities, as well as aid in creating useful and usable digital resources. As one of the first cultural and heritage projects to apply crowdsourcing to a non-trivial task, UCL’s Bentham Project has recently set up the “Transcribe Bentham” initiative; an ambitious, open source, participatory online environment which is being developed to aid in transcribing 10,000 folios of [Jeremy] Bentham’s handwritten documents.”

“The Bentham Project, and the Transcribe Bentham initiative … demonstrates neatly the progression of Digital Humanities in historical manuscript based projects. The Bentham Project has been primarily occupied with print output, gaining a web presence in the mid 1990s, then an online database of the Bentham archive in the early 20th Century, and is now carrying out a moderately large scale digitisation project to scan in Bentham’s writings for Transcribe Bentham. In addition, the Bentham Project has gone from a simple web page, to interactive Web 2.0 environment, from MS Word to TEI encoded XML texts, and from relatively inward looking academic project to an outward facing, community- building exercise. We can peer at Digital Humanities through this one project, and see the transformative aspects that technologies have had on our working practices, and the practices of those working in the historical domain.”

Terras ends with a list of key issues that may resonate with us, as we move forward to develop the CRRA community and the Catholic portal:

  1. Our Dependence on Primary Sources, Our Dependence on Modern Technology
  2. Legacy Data
  3. Sustainability
  4. Digital Identity
  5. Embracing the Random, Embracing the Open
  6. Impact
  7. Routes to Jobs
  8. Young Scholars
  9. Economic Downturn
  10. Money, The Humanities, and Job Security
  11. Fears for the Future

The full blog post is at http://melissaterras.blogspot.com/2010/07/dh2010-plenary-present-not-voting.html.

Mark your calendars …

For two upcoming CRRA gatherings: January 6, 2011 (San Diego) and March 29, 2011 (Philadelphia)

We have tentatively set the dates for a CRRA reunion and meetings during ALA Midwinter in San Diego and for our Annual All-Members Meeting during ACRL in Philadelphia.

Please reserve space on your calendars now – Thursday, January 6, 2011 and Tuesday, March 29, 2011. For the March 29, 2011 meeting, plan to join your CRRA colleagues for dinner on Monday evening and, for all who can, dinner on Tuesday as well. We will organize the day to include both separate Board and committee retreats and plenary sessions for all.

Further details will be distributed in future Updates and CRRA emails.

All CRRA events and events of possible interest to members are posted to the CRRA calendar, available at http://tiny.cc/Calendar798 and also accessible from the Admin area of the CRRA website.