Title: Automatic Heritage Metadata Enrichment with Historic Events
Authors: Marieke van Erp, Johan Oomen, Roxane Segers, Chiel van den Akker, Lora Aroyo, Geertje Jacobs, Susan Legêne, Lourens van der Meij, Jacco van Ossenbruggen, Guus Schreiber
Publication: MW2011: Museums and the Web 2011

Most digitised objects that GLAMs (Galleries, Libraries, Archives, Museums) make available online can be browsed through a predefined set of formal metadata, such as an object's creator, year of creation, and type of material. Standards for metadata management and exchange have matured and are being widely adopted. They enable search and exploration within a collection, and are also the main drivers of cross-domain and cross-boundary access to collections.

However, formal metadata often do not give access to information about the content of an object, such as its topic or what it depicts. This information is usually given in textual descriptions, most of which are accessible only through keyword search. Keyword search is limited: it does not support sorting, nor does it retrieve objects whose descriptions contain synonyms of the search term.

This paper presents results of Agora, an interdisciplinary research project that takes collection access one step further by enabling users to search and browse museum collections in a structured way through the content descriptions of objects. The three-year Agora project is funded by the Netherlands Organisation for Scientific Research and brings together computer scientists, cultural heritage experts, and humanities researchers.

We present advances in collection access through automatic data enrichment and linking to thesauri and external resources, using a combination of state-of-the-art information extraction and semantic web technology. We show results on the enrichment, linking, and integration of two collections: those of the Netherlands Institute for Sound and Vision (NISV) and the Rijksmuseum Amsterdam (RMA). Our first contribution to the state of the art in collection access enables users to browse and search descriptive metadata elements in a structured manner. Our focus here is on automatically linking objects to historical events, thus presenting collection objects to users in context. The figure shows a screenshot of an object from the NISV collection, along with its automatically enriched description field. The entities and events identified in the description field are linked to instances in the event thesaurus. These instances are multifaceted objects modelled through an event model, and they provide the historical context for an object.
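As a rough illustration of linking a description field to event-thesaurus instances, the following minimal sketch matches event labels against an object description. It is not the Agora pipeline: the `Event` class, its facets, the identifiers, and the sample description are all hypothetical, and plain substring matching stands in for the information extraction used in the project.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical event-thesaurus instance with a few illustrative facets."""
    event_id: str
    label: str
    alt_labels: List[str] = field(default_factory=list)  # synonyms, other languages
    actors: List[str] = field(default_factory=list)
    place: str = ""
    year: Optional[int] = None


# Toy thesaurus; identifiers and facet values are made up for this sketch.
THESAURUS = [
    Event("ev:flood-1953", "North Sea flood of 1953",
          alt_labels=["Watersnoodramp"], place="Zeeland", year=1953),
    Event("ev:inauguration-1948", "Inauguration of Queen Juliana",
          alt_labels=["Inhuldiging Juliana"], actors=["Juliana"],
          place="Amsterdam", year=1948),
]


def link_events(description: str, thesaurus: List[Event]) -> List[Event]:
    """Return events whose preferred or alternative label occurs in the text."""
    text = description.lower()
    matches = []
    for event in thesaurus:
        labels = [event.label] + event.alt_labels
        # Case-insensitive substring match: a stand-in for real entity
        # and event extraction, which this sketch does not attempt.
        if any(label.lower() in text for label in labels):
            matches.append(event)
    return matches


if __name__ == "__main__":
    description = "Newsreel footage of the Watersnoodramp showing flooded villages."
    for event in link_events(description, THESAURUS):
        print(event.event_id, event.label, event.place, event.year)
```

Because the match runs over alternative labels as well as the preferred label, a description that mentions only a synonym is still linked to the event, and the event's facets (actors, place, year) then become available as context for the object.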

This new dimension added to the collection information has profound implications for visualisation: a list or facet view does not do justice to the richness that events provide. Our second contribution is a method for presenting enriched collection objects through an interface based on the same event model that underlies the event thesaurus. This interface explicitly shows the event links that connect collection objects and supports user interactions, which are subsequently used to enrich the knowledge of the collections.

The first evaluations of our approach indicate that event-driven collection access offers users a more meaningful and richer experience. As information extraction and semantic web technologies are advancing at a rapid pace, we foresee that event-driven access as proposed by Agora will be adopted widely in the cultural heritage domain in the near future.