What is Crowdsourcing and How Does it Apply to History?

One of the challenges of public and digital history projects is deciding what content will be included and what will be excluded.  Often, a topic is so scarcely documented that its inclusion is not possible.  We came across this dilemma more than once during last summer’s work on the Euclid Corridor Project.  One of the staff’s favorite historical topics – Leo’s Casino – was so lacking in digital assets that we nearly gave up the search.  We had scoured local libraries, archives and historical societies, and put in many calls to former interview subjects, but nobody had any images relating to the site.  We had even flooded Google with every query we could imagine, but nothing useful was coming up.  The minimum requirement for including any historic site was two audio clips, six images, and 250 words of text, and we had failed to find a single image.

Then someone, by accident or genius, found a whole slew of images on Flickr.  They weren’t photos.  They were scanned copies of newspaper ads.  Not ideal, but they were good enough that we now had the bare minimum of information required to tell the story of Leo’s Casino.  Our saviors were a local group of Cleveland boosters known as Cleveland SGS, who had amassed quite a collection of digital images relating to Cleveland history.  They had done for free in their spare time what projects like ours, along with librarians and archivists, typically spend untold amounts of money and labor to achieve.  Of course, academics and librarians add value to the process, providing analysis, context and meaningful organization, but clearly there is a benefit to what Cleveland SGS and countless other “amateur” projects do.  In fact, the only downside is that they are so hard to find.

I tell this story to introduce the notion of crowdsourcing, particularly as it applies to historical research and public and digital history projects.  Cleveland SGS isn’t the only group willing to do for free what academics and other professionals do for money.  We see it in nearly every successful mainstream website.  Large companies provide the framework at their own cost and users provide the content.  YouTube, Facebook, Wikipedia, and scores more rely entirely on user-generated content.

There are two lessons here for teachers, researchers and digital historians.  One is that the Web has clearly created new resources that, though often problematic, cannot be ignored.  Verifiability is still required, as is the case with any historic document.  This can pose a serious challenge on the Web, where things tend to be anonymous, poorly documented, or both.  Students should avail themselves of the large body of literature on this topic; teachers should stop resisting and instead aim for balance between traditional and Web resources.

The second, more exciting lesson is that the public wants to be involved in content creation, not only with YouTube videos and personal blogs, but with real research.  They want to add their voices to the dialogue on our shared history.  We experienced this at the Center during the Euclid Corridor Project, not only in our search for images of Leo’s Casino, but in the willingness of participants to volunteer their time and emotion in sitting for oral history interviews.  Oral history is, by its nature, a form of crowdsourcing.  The public provides the content; the scholars provide the framework for collection and dissemination.  The Euclid Corridor Project, like other oral history and public history projects, was still fairly well-regulated in terms of control.  Other projects have opened themselves completely, allowing unfiltered user contributions.  Omeka has been a major contributor to this shift toward crowdsourcing.  For two of the best examples, check out the Hurricane Digital Memory Bank and the September 11th Digital Archive (the latter was technically a prototype for Omeka).  The Library of Congress has also employed the “wisdom of crowds” for help with image description, adding a vast collection to the Flickr Commons for user annotation.  On a much smaller scale, we are facilitating the use of Omeka (see: Teaching and Learning Cleveland) by undergraduate students and public school teachers to transform their routine research into an archival processing supplement for the university’s Cleveland Memory archive.

Obviously, trained educators and researchers remain at the center of historical studies and production.  But we are seeing increasing amounts of public involvement in the creation of the historical record.  Though this can lead to additional questions about bias and accuracy, the discipline is more than up to the challenge, and richer for it.

Erin Bell (M.L.I.S.) is Project Coordinator and Technology Director at the Center for Public History + Digital Humanities at Cleveland State University and lead developer for Curatescape, a web and mobile app framework for publishing location-based humanities content. In addition to managing a variety of oral history, digital humanities and educational technology initiatives, he has spoken to audiences of librarians, scholars, and technologists on best practices in web development and publishing.