VUFind is the technical backbone of the “Catholic Portal”, and this posting documents my experiences at the VuFind 2.0 Conference held at the Villanova Conference Center on September 15 & 16, 2010. In short, it provided an opportunity for the community to share successes, challenges, and visions for the future.
The Conference was divided into a number of presentations, group discussions, and informal social events. Joe Lucia (Villanova University) facilitated and opened the meeting with a number of general remarks surrounding libraries and the current environment:
The question is, “Who will fulfill the social mission of libraries in the future?” If libraries don’t do it, then some other institution will. Libraries represent a locus of knowledge for our communities and a place for cultural conversation. Open source software is rooted in this same social mission and congruent with the mission of libraries… Are Google Books and the HathiTrust a new form of the “Information Commons”? Maybe, but maybe not… Cloud computing is a trend towards aggregation, concentration, and commercialization, but is that the best solution, since it too is not immune to proprietary lock-in?… Software as a service is also a current trend and we must ask ourselves, “Why not just build something based on the WorldCat APIs?” Public libraries are pointing a way towards the creation of knowledge spaces, a possible lead for academic libraries. Seen in this light, libraries may be new cathedrals.
Demian Katz (Villanova University) then shared how he has integrated VUFind with Serials Solutions’ Summon. After considering a number of options, he decided to go with a single search and a two-column display. Do a search. Query the local VUFind (Solr) index. Simultaneously query the remote Summon index. Display both results in a common window with VUFind on one side and Summon on the other. Essentially this means books are on the left and articles are on the right. “You can’t modify the Summon relevancy ranking, and thus you get a lot of noise. Merging the indexed content often places local materials lower in the relevancy ranked output.” There are a few things on Katz’s to-do list: the addition of social features, the highlighting of query terms, advanced faceting options, and a mobile interface. You can try this VUFind/Summon combination at library.villanova.edu/Find.
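The single-search, two-column idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not Katz’s code: the two search functions and their tiny in-memory "indexes" are hypothetical stand-ins for a local Solr query and a remote Summon query.

```python
from concurrent.futures import ThreadPoolExecutor


def search_local(query):
    """Hypothetical stand-in for a query against the local VUFind (Solr) index."""
    catalog = ["Confessions of Saint Augustine", "Augustine of Hippo: a biography"]
    return [title for title in catalog if query.lower() in title.lower()]


def search_remote(query):
    """Hypothetical stand-in for a query against a remote article index."""
    articles = ["Augustine on memory", "Grace in Augustine's sermons", "Aquinas on law"]
    return [title for title in articles if query.lower() in title.lower()]


def two_column_search(query):
    """Send one query to both indexes at once and keep the results apart."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        local = pool.submit(search_local, query)
        remote = pool.submit(search_remote, query)
        # Each column keeps the ranking of its own index; nothing is merged.
        return {"books": local.result(), "articles": remote.result()}


results = two_column_search("augustine")
for column, titles in results.items():
    print(column, titles)
```

Keeping the two result lists separate sidesteps the problem Katz mentioned: neither relevancy ranking has to be reconciled with the other.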
A similar presentation was given by Chris Spalding (Ex Libris) in his description of how VUFind can be integrated with Primo Central. “Through an API access to Primo Central content can be integrated with VUFind. We do two searches, get results, re-rank, and display. The key to the solution is the PC (Primo Central) add-on. We hope to do more collaboration and be as open as possible… We use AWS (Amazon Web Services) to host our content… We hope to share the code as soon as the end of the year, and we are sincerely trying to bootstrap the process of combining VUFind with Primo Central.” The approach described by Spalding is the approach I expected Katz to implement with Summon. Apparently there are problematic issues with both techniques.
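To make "two searches, get results, re-rank, and display" concrete, here is a minimal sketch of one possible way to do it. Spalding did not describe the actual re-ranking, so everything here is an assumption: the function names, the scaling of each index’s scores against its own top hit, and the optional `local_boost` are all hypothetical.

```python
def normalize(hits):
    """Rescale raw scores to the range 0..1 so two indexes can be compared."""
    if not hits:
        return []
    top = max(score for _, score in hits)
    return [(title, score / top if top else 0.0) for title, score in hits]


def merge_and_rerank(local_hits, remote_hits, local_boost=1.0):
    """Combine two result lists into one list sorted by adjusted score."""
    merged = [(title, score * local_boost, "local") for title, score in normalize(local_hits)]
    merged += [(title, score, "remote") for title, score in normalize(remote_hits)]
    return sorted(merged, key=lambda hit: hit[1], reverse=True)


# Made-up scores: the two indexes use very different scales.
local = [("Book A", 4.0), ("Book B", 2.0)]
remote = [("Article X", 90.0), ("Article Y", 72.0)]

ranked = merge_and_rerank(local, remote, local_boost=1.2)
for title, score, source in ranked:
    print(f"{score:.2f}  {source:6}  {title}")
```

Without some normalization, the index with the larger raw scores would dominate the merged list, which may be one reason merged output tends to push local materials down.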
Name & title authorities as well as controlled vocabularies were the focus of the next presentation, given by Katz. He first described how he experimented with prototypical Perl “hacks” found in a recent issue of Code4Lib Journal. These hacks exploit the WorldCat API to list authoritative names and subjects. He described another experiment where he integrated locally created authority content with the local VUFind (Solr) index. Finally he described a third possible solution taking advantage of the linked data provided by the Library of Congress. His next experiment will surround the use of the eXtensible Catalog Metadata Services Toolkit to munge and use authority records. “The use of authority lists make it possible for a person to do browse against the ‘catalog’.”
The group then broke into two or three smaller groups (breakout sessions) to discuss “birds-of-a-feather” sorts of ideas. Because of my interest in archival materials and EAD files, I went with the group called Beyond MARC. There we discussed the indexing of many different things, including but not limited to websites, EAD files, METS records, and full text. We also discussed the challenges of indexing hierarchical data, the content of boutique collections, and the provision of non-bibliographic services against metadata. In the end, we advocated for the greater use of VUFind record drivers, making it easier to support local customizations, and figuring out how to handle hierarchies.
Working on a project called SWWHEP, Luke O’Sullivan (Swansea University) described how he hacked VUFind to work in a multi-ILS environment with the ultimate goal of providing reciprocal borrowing. Calling himself a “shambrarian” he described MARC as the “Dark Side of open source”. After being given sets of MARC records whose 001 fields had been modified for uniqueness, O’Sullivan essentially created a multitude of configuration files associated with each library system under his charge. When records were returned from searches his code looked at the 001 values and branched accordingly. Of all the implementations described during the Conference, O’Sullivan’s hack was the “kewlest”. See his good work at ifind.swwhep.ac.uk.
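The branching O’Sullivan described might look something like the following. This is a sketch under assumptions, not his implementation: I am guessing that the 001 values were made unique with a per-library prefix, and the prefixes, driver names, and hosts below are all hypothetical.

```python
# Hypothetical per-library settings, keyed by the prefix assumed to be
# in each record's 001 field. None of these values come from SWWHEP.
ILS_CONFIGS = {
    "liba": {"driver": "DriverA", "host": "ils-a.example.org"},
    "libb": {"driver": "DriverB", "host": "ils-b.example.org"},
}


def config_for(record_id):
    """Split a prefixed 001 value and return (ILS settings, local record id)."""
    prefix, separator, local_id = record_id.partition("-")
    if not separator or prefix not in ILS_CONFIGS:
        raise KeyError(f"no ILS configured for record {record_id!r}")
    return ILS_CONFIGS[prefix], local_id


settings, local_id = config_for("libb-000123")
print(settings["driver"], settings["host"], local_id)
```

With a lookup like this, the search interface stays the same for every library, and only the holdings and borrowing calls change according to where the record came from.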
Birong Ho (Western Michigan University) was second up on the second day with a description of how she and her team used Web Services techniques to communicate between VUFind and their local Voyager system. She uses these services to support holds, renewals, etc.
I was then given the chance to describe a future for “next generation library catalogs”, a thing I call services against texts. In a nutshell, I advocated for discovery systems to go beyond find and move towards use, and with the increasing availability of full text content such a prospect is increasingly possible. “Quantitative metadata — as opposed to qualitative metadata — makes it easier to compare, contrast, and analyze individual items in collections or collections as a whole.” I then demonstrated how digital humanities computing techniques can be applied to full text content to discover underlying patterns.
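As a small illustration of what I mean by quantitative metadata, here is a minimal sketch that measures a text instead of describing it. The particular measures (word count, unique words, lexical diversity, average word length) and the function name are my choices for this example, not a fixed list.

```python
import re


def measure(text):
    """Return a few simple numeric descriptions of a full text document."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    unique = len(set(words))
    return {
        "words": total,
        "unique_words": unique,
        # Share of distinct words; higher suggests a more varied vocabulary.
        "lexical_diversity": unique / total if total else 0.0,
        "average_word_length": sum(len(w) for w in words) / total if total else 0.0,
    }


sample = measure("The cat and the hat")
print(sample)
```

Because every item gets the same numbers, two books (or two whole collections) can be sorted, compared, and plotted against each other in a way that subject headings alone do not allow.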
We broke into small groups again (table talks) and brainstormed visions for VUFind 2.0. Some of the things we came up with at our table included: relevancy ranking based on social networking data, full text indexing, the inclusion of content beyond books, personalization based on patrons’ characteristics or history, hooks to download full text from places like the Open Archives, the sharing of social data between VUFind implementations a la Ex Libris’s bX, tighter integration with Open Library, and the integration of VUFind into other applications through APIs.
Here is a short list of some juicy quotes I picked up from some of the attendees:
- “A plug-in architecture may be a good idea.” —Kun Lin
- “Consider bringing different views into VUFind instead of shelling out.” —Eoghan Ó Carragáin
- “Full text indexing is easily implementable as long as you tweak the boosting factor.” —Till Kinstler
- “Maybe part of the solution is to stop giving content to the vendor.” —Greg Pendlebury
- “Remember to exploit the record drivers in order to provide different services and views of content.” —David Lacy
- “Solr’s VUFind schema is currently flat but maybe the data model needs to be more flexible and maybe hierarchical.” —Till Kinstler
- “We are never going to have ‘one bucket’ searching.” —Joe Lucia
Observations and summary
The Conference was well-organized and provided a forum for plenty of discussion and idea generation. The setting was very nice and the food was plentiful. Everybody was able to participate. I heard a number of people say they were either implementing or toying with the idea of implementing Evergreen as their “catalog” and using VUFind as their “discovery layer”. I had not thought of this. Interesting. I appreciated the active participation of Chris Spalding. He was candid and sincere. It was very nice to put faces to the names from the mailing lists, and the crowd was international. Blacklight was compared & contrasted with VUFind a number of times throughout the meeting. I believe both communities have something to learn from the other.
Alas, I was unable to stay for the fourth quarter of the event. I had a plane to catch, and I had made my reservations under the assumption the Conference would conclude at noon. I was wrong. Consequently I missed the last part of the meeting where next steps were to be articulated. If I had my druthers, two things would happen. First, I hope the development process becomes a bit more structured, complete with regular conference calls and software regression testing. Second, and along similar lines, I hope some entrepreneurial organization comes forward to provide commercial support for VUFind. Such a thing would make VUFind more attractive to libraries without local technical (computer) expertise.
Finally, I bounced my ideas regarding the indexing of EAD files off of as many people as I could. I think I am on the right track, even though few had experience with the same problem. Wish me luck.