The Crowd and The Library

Libraries, archives and museums have a long history of participation and engagement with members of the public. In a series of blog posts I am going to work to connect these traditions with current discussions of crowdsourcing. Crowdsourcing is a bit of a vague term, one that comes with potentially exploitative ideas related to uncompensated or undercompensated labor. In this series I’ll try to put together a set of related concepts (human computation, the wisdom of crowds, thinking of tools and software as scaffolding, and understanding and respecting end users’ motivation) that can help clarify what crowdsourcing can do for cultural heritage organizations while also laying out a clearly ethical approach to inviting the public to help in the collection, description, presentation, and use of the cultural record.

This series of posts started out as a talk I gave at the International Internet Preservation Consortium’s meeting earlier this month. I am sharing these ideas here in the hope that I can get some feedback on this line of thinking.

The Two Problems with Crowdsourcing: Crowd and Sourcing

There are two primary problems with bringing the idea of crowdsourcing into cultural heritage organizations. Both the idea of the crowd and the notion of sourcing are terrible terms for folks working as stewards of our cultural heritage. Many of the projects that end up falling under the heading of crowdsourcing in libraries, archives and museums have not involved massive crowds, and they have very little to do with outsourcing labor.

Most successful crowdsourcing projects are not about large anonymous masses of people. They are not about crowds. They are about inviting participation from interested and engaged members of the public. These projects can continue a longstanding tradition of volunteerism and involvement of citizens in the creation and continued development of public goods.

For example, the New York Public Library’s menu transcription project, What’s on the Menu?, invites members of the public to help transcribe the names and costs of menu items from digitized copies of menus from New York restaurants. Anyone who wants to can visit the project website and start transcribing the menus. However, in practice it is a dedicated community of foodies, New York history buffs, chefs, and otherwise self-motivated individuals who are excited about offering their time and energy to help contribute, as volunteers, to improving the public library’s resource for others to use.

Not Crowds but Engaged Enthusiast Volunteers

Far from a break with the past, this is a clear continuation of a longstanding tradition of inviting members of the public in to help refine, enhance, and support resources like this collection. In the case of the menus, years ago, it was actually volunteers who sat at a desk in the reading room to catalog the original collection. In short, crowdsourcing the transcription of the menus is not about crowds at all. It is about using digital tools to invite members of the public to volunteer in much the same way members of the public have volunteered to help organize and add value to the collection in the past.

Not Sourcing Labor but an Invitation to Meaningful Work

The problem with the term sourcing is its association with labor. Wikipedia’s definition of crowdsourcing helps further clarify this relationship: “Crowdsourcing is a process that involves outsourcing tasks to a distributed group of people.” The keyword in that definition is outsourcing. Crowdsourcing is a concept that was invented and defined in the business world, and it is important that we recast it and think through what changes when we bring it into cultural heritage. Cultural heritage institutions do not care about profit or revenue; they care about making the best use of their limited resources to act as stewards and storehouses of culture.

At this point, we need to think for a moment about what we mean by terms like work and labor. While it might be ok for commercial entities to coax or trick individuals to provide free labor, the ethical implications of such trickery should give pause to cultural heritage organizations. It is critical to pause here and unpack some of the different meanings we ascribe to the term work. When we use the term “a day’s work” we are directly referring to labor, to the kinds of work that one engages in as a financial transaction for pay. In contrast, when we use the term work to refer to someone’s “life’s work” we are referring to something that is significantly different. The former is about acquiring the resources one needs to survive. The latter is about the activities that we engage in that give our lives meaning. In cultural heritage we have clear values and missions and we are in an opportune position to invite the public to participate. However, when we do so we should not treat them as a crowd, and we should not attempt to source labor from them. When we invite the public we should do so under a different set of terms, one that is focused on providing meaningful ways for the public to interact with, explore, and understand the past.

Citizen Scientists, Archivists and the Meaning of Amateur

Some of the projects that fit under the heading of crowdsourcing have chosen very different kinds of terms to describe themselves. For example, the Galaxy Zoo project, which invites anyone interested in astronomy to help catalog a million images of stellar objects, refers to its users as citizen scientists. Similarly, the United States National Archives and Records Administration’s recently launched crowdsourcing project, the Citizen Archivist Dashboard, invites citizens, not members of some anonymous crowd, to participate. The names of these projects highlight the extent to which they invite participation from members of the public who identify with the characteristics and ways of thinking of particular professional occupations. While these citizen archivists and scientists are not professional, in the sense that they are unpaid, they connect with something a bit different than volunteerism. They are amateurs in the best possible sense of the term.

Amateurs have a long and vibrant history as contributors to the public good. Coming to English from French, the term amateur means a “lover of.” The primarily negative connotations we place on the term are a relatively recent development. In other eras, the term amateur simply meant that someone was not a professional, that is, they were not paid for these particular labors of love. Charles Darwin, Gregor Mendel, and many others who made significant contributions to the sciences did so as amateurs. As a continuation of this line of thinking, the various Zooniverse projects see the amateurs who participate as peers, in many cases listing them as co-authors of academic papers published as a result of their work. I suggest that we think of crowdsourcing not as extracting labor from a crowd, but as a way for us to invite the participation of amateurs (in the non-derogatory sense of the word) in the creation, development and further refinement of public goods.

Toward a better, more nuanced, notion of Crowdsourcing

With all this said, fighting against a word is rarely a successful project. From here on out I will continue to use and refine a definition for crowdsourcing that I think works for the cultural heritage sector. In the remainder of this series of posts I will explain what I think of as the four key components of this ethical crowdsourcing, a crowdsourcing that invites members of the public to participate as amateurs in the production, development and refinement of public goods. For me these fall into the following four considerations, each of which suggests a series of questions to ask of any cultural heritage crowdsourcing project. The four concepts are:

  1. Thinking in terms of Human Computation
  2. Understanding that the Wisdom of Crowds is Why Wasn’t I Consulted
  3. Thinking of Tools and Software as Scaffolding
  4. A Holistic Understanding of Human Motivation

Together, I believe these four concepts provide us with the descriptive language to understand what it is about the web that makes crowdsourcing such a powerful tool, not only for improving and enhancing data related to cultural heritage collections, but also as a way to engage deeply with the public.

In the next three posts I will talk through and define these four concepts and offer up a series of questions to ask and consider in imagining, designing and implementing crowdsourcing projects at cultural heritage institutions.


Build Some Rep for Digital Preservation

A quick update on the digital preservation Stack Exchange site proposal. As I mentioned before, there are a series of ways you can help make this proposal a reality. At this point the big task is to get 100 people to commit who have more than 200 reputation on another Stack Exchange site. We already have 32 people who have achieved this, so we are about a third of the way there.

This will likely be a bit of a long haul, but considering that we have managed to get this far in only about a month, I think we are well on our way.

How you get reputation:

You get reputation by asking and answering questions on any of the Stack Exchange sites. I’ve pasted in a table from their guidelines on reputation below. You will notice that you really get reputation from having your answers or your questions voted up.

This can stack up very quickly. For example, I’ve asked three questions on the Academia site and answered two, but those questions and answers were pretty good, so they got voted up multiple times and I ended up getting more than enough reputation to get over 200. You can see exactly what questions I asked and answered and what points I got for them here.

answer is voted up +10
question is voted up +5
answer is accepted +15 (+2 to acceptor)
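
To get a rough feel for how quickly those points add up, here is a minimal sketch in Python. It uses only the three point values listed above; the function name and the vote counts in the example are made up for illustration, and it ignores everything else that can affect reputation on a real site.

```python
# Point values copied from the table above (as quoted in this post).
POINTS = {
    "answer_upvote": 10,
    "question_upvote": 5,
    "answer_accepted": 15,
}

# The threshold the site proposal asks committers to reach on a single site.
TARGET_REP = 200


def estimate_reputation(answer_upvotes=0, question_upvotes=0, accepted_answers=0):
    """Estimate reputation earned from upvotes and accepted answers only."""
    return (
        answer_upvotes * POINTS["answer_upvote"]
        + question_upvotes * POINTS["question_upvote"]
        + accepted_answers * POINTS["answer_accepted"]
    )


# Hypothetical tally: 12 answer upvotes, 14 question upvotes, 1 accepted answer.
total = estimate_reputation(answer_upvotes=12, question_upvotes=14, accepted_answers=1)
print(total)                # 120 + 70 + 15 = 205
print(total >= TARGET_REP)  # True
```

In other words, a handful of well-received questions and answers on one site is enough to clear the bar.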

Where to get Stack Exchange reputation

I built up my 200 reputation on the Academia site, but you can do it anywhere. The important thing is that you pick a site and get 200 rep on that site (you need 200 rep on a single site, so getting a little bit on a bunch of different sites isn’t going to cut it). The full list of sites can be a little bit intimidating, so I figured I would point folks to a few sites they could think about.

  • English Language and Usage Q&A for linguists, etymologists, and serious English language enthusiasts
  • Gaming Q&A for passionate videogamers on all platforms
  • Board and Card Games Q&A for people who like playing board games, designing board games or modifying the rules of existing board games
  • Travel Q&A for road warriors and seasoned travelers
  • Photography Q&A for professional, enthusiast and amateur photographers
  • Cooking Q&A for professional and amateur chefs

Take a few minutes and look over the unanswered questions on any of the sites you think you might be interested in. Try responding to a few. Then think up some questions you might have, search to see if they are already there, and if not, post them. In all seriousness, you can get 200 rep on one of these sites in a very short period of time, and in the process you end up getting a better understanding of how this system works.