A case study on using voice technology to assist the museum visitor

In 1952, Bell Laboratories developed what is considered the first speech recognition system: Audrey (Pinola, 2011). It could understand only numbers, and it required the speaker to pause between words. Other systems followed, each with a growing vocabulary but each with its own limitations on usage. It was only in the 2000s that speech recognition made a significant leap forward, first with the Google Voice Search application and Google Now, and then very closely followed by Apple's Siri. Mobile devices offered an ideal platform for practical uses of hands-free speech recognition tools. Both the Google tools and Siri relied heavily on the power of cloud-based computing, the ability to draw upon a massive amount of data on speech search queries, and an ever-evolving, constantly learning analysis of a user's unique speech patterns, all of which helped these tools produce the best possible response to a query. While not yet mainstream, the improved accuracy, as well as the quirkiness, of these tools helped increase overall awareness of voice-activated technologies for answering basic user questions.

As voice-activated smart devices become more mainstream with consumers, we at The Museum of Modern Art (MoMA) in New York City have taken steps toward analyzing their potential uses, on both a consumer and an enterprise level, to assist our Museum community. To do this, we focused on the Amazon Echo. In this How-To session we will cover how we integrated the Amazon Echo with the MoMA Collections database and built a new "skill" for answering queries related to modern and contemporary art at MoMA. The purpose of this session is to introduce voice-activated technologies with a concrete, functioning prototype. We will walk through the technical implementation details of our prototype and discuss what MoMA considers the potential use cases at our Museum.
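To give a flavor of what building such a skill involves, the sketch below shows the general shape of a handler for a custom Alexa skill, which receives a JSON request from the Alexa service and returns a JSON response containing the speech to be read aloud. This is a minimal illustration, not MoMA's actual implementation: the intent name `ArtistQueryIntent`, the `Artist` slot, and the `lookup_artist` stub are hypothetical stand-ins for the real intents and for a real query against the MoMA Collections database.

```python
# Minimal sketch of a Lambda-style handler for a custom Alexa skill.
# The intent name, slot name, and lookup function are hypothetical;
# a real skill would query the collections database instead.

def lookup_artist(name):
    """Hypothetical stand-in for a MoMA Collections database query."""
    return "I found works by " + name + " in the collection."

def handle_request(event):
    """Dispatch an incoming Alexa request and build a spoken response."""
    request = event["request"]
    if request["type"] == "LaunchRequest":
        # User opened the skill without asking a question yet.
        text = "Welcome. Ask me about an artist in the collection."
    elif (request["type"] == "IntentRequest"
          and request["intent"]["name"] == "ArtistQueryIntent"):
        # Pull the artist name out of the slot filled by Alexa.
        artist = request["intent"]["slots"]["Artist"]["value"]
        text = lookup_artist(artist)
    else:
        text = "Sorry, I didn't understand that."
    # Standard Alexa response envelope: plain-text speech, end the session.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }
```

In a deployed skill, this function would run in the cloud (for example as an AWS Lambda function registered as the skill's endpoint), with the intents and slots defined in the skill's interaction model.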