How to Build a National Union Catalogue utilizing Big Data techniques


Names: Dimitrios Kouis, George Veranis, Vasilios Kalavrouziotis, Athanasios Naskos

Audience: Software developers and librarians interested in the application of open-source technologies in the library domain (estimated time: 4 hours)

Maximum number of participants: 20-25

Short description:

Union catalogues provide a single point of entry to a large number of physical and virtual collections by integrating their contents, essentially serving as a “one-stop shop” for (inter)library loan and document delivery services. Besides offering a single point of access to libraries and acting as an information portal for users, a union catalogue can also serve as a tool for improving the bibliographic quality of member libraries’ records and can facilitate the work of librarians, who can reuse the work of other cataloguers, thereby saving time and increasing productivity. With the increasing availability of data access interfaces and protocols in the majority of modern integrated library systems and the omnipresence of Big Data tools and frameworks, the process of building a union catalogue that integrates heterogeneous library catalogues now seems, more than ever, like a well-trodden path. Nevertheless, several challenges still need to be addressed in order to truly cater for end users, who are often baffled by the inner complexities of bibliographic standards, and to overcome the semantic interoperability problem that arises from the adoption of different cataloguing standards and policies.
In this workshop, we will share experiences and insights gained during our ongoing effort to build an online union catalogue for Hellenic libraries. We will go through all the steps needed for the development and structuring of a union catalogue, focusing on the technological aspects of such a project while also highlighting the importance of organizational issues.

Workshop outcomes:

○ An overview of the design and implementation phases of a real-world union catalogue
○ Insight and a set of best practices for tasks such as:
    ○ incremental harvesting of records from library systems
    ○ MARC-independent distributed storage and management of records using NoSQL solutions
    ○ error and inconsistency detection among records
    ○ efficient record linkage among distinct library systems in a Big Data context
    ○ middleware development for programmatic data access
    ○ development of search-oriented interfaces for effective resource discovery by end users