“But That’s Not Preservation!” Notes on Preservation’s Divergent Lineages

I’ve found that interdisciplinary dialog about digital preservation often breaks down when someone protests “but that’s not preservation.”

Preservation means a lot of different things in different contexts. Each of those contexts has its own history. Those histories are tied up in the changing nature of the mediums and objects for which each conception of preservation and conservation was developed. All too often, discussions of digital preservation start by contrasting digital media to analog media. This contrast forces a series of false dichotomies. I think that better understanding the divergent lineages of preservation could help to establish the range of competing notions at play in defining what is and isn’t preservation.

I’m curious to start building out some of my understanding of the lineages of different kinds of preservation. So I would love it if folks could share any examples of writing in this area that might be helpful. I think a lot of this context looks to be in something like Preserving our Heritage: Perspectives from Antiquity to the Digital Age (which I am still digging into). However, I also think the story is even broader here, and that there is a media archaeology aspect that is missing. That is, my sense is that a series of old new media, like photography, film and recorded sound technologies, have been interacting with ideas about what preservation is or should be for more than a century.

What follows is not so much a coherent final product as it is me openly sharing some of my notes on different strands I see at play in this space.

  • The manuscript tradition: A situation where the allographic nature of a work is primarily what matters, that something is the work if it has the same spelling and where copying is the basis of preservation. In this case, something like the Evolution of Manuscript Traditions could be useful.
  • The history of archival traditions: In this case, something like What is Past is Prologue: A History of Archival Ideas Since 1898, and the Future Paradigm Shift is useful. Also relevant here is the history of publishing records in documentary editions vs. arranging and describing records, and ideally a bit on the interventions that came with microfilming. That is, while we generally think of archives as holding unique and original records, in this space there is a lengthy tradition of documentary edition work focused on publishing records and a history of photographic reproduction of records for both access and preservation purposes.
  • The history of art conservation and restoration: For example, Changing Approaches in Art Conservation: 1925 to the Present. I’ve seen a lot on the history of conservation of things like paintings. However, the history of the development of variable media art works, art installations, and works made of materials that rapidly deteriorate has resulted in very smart thinking about what it is about art works one wants to conserve. In this space, see Re-collection: Art, New Media, and Social Memory.
  • Preservation of dance and live performance: There are, at this point, long-standing traditions in how to preserve and document works of art that produce lived experience. In this space, the Dance Heritage Coalition’s Documenting Dance: A Practical Guide nicely illustrates the continuity that exists between a variety of modes of documentation technologies, from textual notation, to moving image technologies, to new digital methods like motion capture.
  • The history of conservation of living creatures: Everything from taxidermy and insect collecting to living collections like butterfly gardens and zoos, as well as things like the Svalbard Global Seed Vault. I don’t really have good resources on the history and theory here. I’m thinking about digging into some history of science journals. In any event, I think there is an interesting story about which techniques are intended for what purposes and what is significant about a living thing that must be preserved toward that particular purpose. That is, when and why do you pin and preserve butterflies as a collection, and when and why would you choose to run a butterfly garden? So I’m looking for any ideas folks might have for work in this space.
  • The development of historic preservation of the built environment: I know some good stuff here, like Giving Preservation a History: Histories of Historic Preservation in the United States. In this case, it’s interesting to me that some newer technologies like photogrammetry or 3D point cloud technologies are being explored as ways to “digitize” or create recordings to preserve and document physical spaces. I find historic preservation particularly interesting in that it often focuses on turning back the clock on a particular building to make it appear as it was at a particular moment in time. In this vein, it can involve recreation and fabrication. Similarly, historic preservation connects in interesting ways to reenactment and living history. In this space, I am a huge fan of Abraham Lincoln as Authentic Reproduction: A Critique of Postmodernism, which explores fascinating sets of issues around authenticity in the New Salem Historic reconstructed village and outdoor museum in Illinois.
  • The advent of recorded sound technology and the development of oral history: There is some good stuff on recorded sound technology in Gramophone, Film, Typewriter and MP3: The Meaning of a Format, but they aren’t really explicitly about oral history. In contrast, The History of Oral History isn’t so much focused on the role that recorded sound media have played in the history of oral history. The Media Archaeology work points to how our conceptions of “memory” have themselves been shaped by the advent of these new technologies. It was said of Edison’s phonograph that “Speech has become, as it were, immortal,” and an article on Memory and the Phonograph from 1880 would “define the brain as an infinitely perfected phonograph”.
  • The development of photography and microfilming and preservation reformatting: There is some good stuff on this in Lisa Gitelman’s Paper Knowledge: Toward a Media History of Documents. In particular, there is discussion of the work of the “Joint Committee on Enlargement, Improvement and Preservation of Data,” a joint effort of the American Council of Learned Societies and the Social Science Research Council, which ended up publishing Robert Binkley’s 1931 Manual on Methods of Reproducing Research Materials. The book is, to some extent, particularly interesting in that it is a cover page over a photo-offset printing of a typewritten manuscript. To this end, the book itself illustrates how changes in the technologies for photo-duplication of documents were affecting access to documents.
  • The history of newspaper conservation: Closely related to the last point, the push to microfilm newsprint based on some of its inherent vices. While Double Fold is over the top, it did prompt some really great reactions, like Don’t Fold Up: Responding to Nicholson Baker’s Double Fold.
  • Scientific data and records of observations: Astronomers draw on records of observations of the motion of celestial objects dating back to the ancient world. Lorraine Daston’s “Sciences of the Archives” research group has produced some fascinating work in this vein. I like how this quote from Daston’s research group captures the continuity that exists in these traditions, which bridges analog and digital practices and incorporates other new media like photography: “Since ancient times, cultures dispersed across the globe have launched monumental data-centered projects: the massive collections of astronomical observations in ancient China and Mesopotamia, the great libraries from Alexandria to Google Book Search, the vast networks of scientific surveillance of the world’s oceans and atmosphere, the mapping of every nook and cranny of heaven and earth.” They have a great 2012 paper in Osiris that works through this in more depth.

So in all these contexts, I think a few preliminary points start to emerge that I keep thinking about.

  1. Preservation’s meaning is contextual and tradition dependent: As a concept, preservation has situated meanings in particular traditions and contexts, so it’s important to really articulate what one means by the term and what traditions one is drawing on. In this vein, the different traditions have emerged in dialog with the development of media and have their own ideas of what is significant about objects for their use.
  2. Digital vs. Analog Preservation is a false dichotomy: There were already a lot of divergent ideas of what preservation meant in play before digital technology came into play. In this vein, the intervention of digital technology is just one of a series of technological interventions that have disrupted preservation practices and traditions.
  3. New media is older than digital media: Related to the last point, various media/technologies of reproduction (and their affordances) have had significant impacts on the traces of the past that can be created and our ability to preserve them. In this vein, scholarship in Media Archaeology focused on reinterpreting and understanding these old new media is likely of considerable value for unpacking those impacts.

So those are some working thoughts and rough notes. I’m curious about and interested in 1) other resources you think are relevant in some of these areas, 2) other ways of slicing and characterizing these points, and 3) other ideas about what the takeaways are.


25 Curatememes: A Curation

A few years back, Curatememe set out on a mission to create a space “Where Curators Curate Memes about Curation. Where will the absurdity of our use of the term Curation go next? This Tumblr speculates wildly.” I think it’s now time to declare curation accomplished. Now that curation means whatever, I thought I would curate the best of the best from the tumblr here in a listicle. It seemed appropriate. It will also make it easier for me to find ones I want to use sometimes.

The memes are arranged more or less reverse chronologically, which offers a sense of how they developed as a body of work over time and preserves the experience of reading back into a tumblr.

I Came. I Archived. I Curated

Stop Worrying and Respect des Fonds

Curation is as Curation Does

Be the Change you want to Curate in the World

Eat Pray Love Curate

The Only Thing we have to Curate

the only thing we have to curate is curation itself


The Point is to Curate it

There are no facts, only curations

Say Curation One More Time

I Can’t Believe I Curated the Whole Thing

Who Curates the Curators

The Things You Curate End Up Curating You


Curate the Rainbow


One Word… Curation 


He Who Curates the Spice Curates the Universe

Curate Yourselves Winter is Coming



Wait… That’s Curation

I Curated this Xhibit

I Curated You I can Deaccession You


Frankly my dear, I don’t curate a DAMS


I love it when curation means whatever

This is not Curation


Did I Cu-rate That




How Would You Teach A Digital Preservation Grad Seminar?

It is looking like I may end up teaching a graduate seminar on digital preservation for the University of Maryland’s iSchool. There is an existing syllabus, but I will have some flexibility in terms of how I shape and design the course and I am curious what thoughts different folks have for what would be the most effective way to teach a graduate seminar on the subject.

Below are a few of the big picture course design questions I am thinking through and some of my initial thoughts on them. I’m curious for any and all input folks might have.

Organizing Principles: How best to organize a digital preservation course?

  • To what extent should a grad seminar like this be about frameworks and principles vs examples and cases? I’m thinking that I should cover the frameworks, but I’m also thinking that too many of those models fail to address the idea that digital preservation is fundamentally about mitigating the risk of future loss. That is, it’s less about a process and more about how to make the best use of available resources and identifying the best opportunities to systematically work to further lessen the risk of loss. I also think that the frameworks often get in the way of first grasping a fundamental understanding of the nature and structure of digital information and digital media. So I’m entertaining the idea of getting to the frameworks at the end as a way to understand the issues but working through the core issues first.
  • How would you organize and structure such a course? If I don’t start with the frameworks, I’m thinking it makes sense to start by working through a core understanding of digital information and digital media and work from there into the various issues in the NDSA levels of digital preservation.

Particular Tools & Software: What role should they play in the course? 

  • What approach should I take toward particular tools? On the one hand, it is very pragmatic to leave a course like this understanding how to use particular tools, but at the same time, the tools are always going to be changing and everyone needs to be able to plan for how to swap in and out different tools to meet the underlying objective. In my digital history courses I have required students to each figure out how to use and then teach the class how to use particular tools and software. I like this approach, as teaching yourself how to use new software and evaluating it is an important skill in its own right. With that said, some of the digital preservation tools out there are complex enough that I’m not entirely sure this method would do them justice.
  • How much should a course like this require/push students to develop some basic command line literacy? My sense is that many students will not have this, but it is challenging to think through how to do much work in this area without that. With that said, the course isn’t about developing that command line literacy, so I’m not sure how far to delve into this kind of thing.

Kinds of assignments: What would be the most useful for the students? 

  • I’m curious for what folks think would be the most useful kinds of assignments. I’m thinking that given the context of planning for risks and the need to make such plans inside the constraints of an institution that it might make the most sense to have students serve as consultants for small cultural heritage organizations and have them develop plans for options to improve their approaches to ensure long term access to their digital content. So I think many of the assignments might be fit around that. With that said, I am curious for any other ideas for how to either improve this idea of a course project or for other kinds of assignments.

Digital Cultural Heritage DC Meetup: 4 Years in & Going Strong

Folks at one of the first DCHDC meetups, September 2012

Four years ago, some of my colleagues in the NDIIPP program thought it could be neat to try and start up a monthly meetup for digital cultural heritage professionals in DC. Butch Lazorchak found a bar in DC that would give us free space upstairs once a month and signed up for a meetup account and we were off.

I love that we ended up sparking something that has become an anchor monthly event for folks from libraries, archives, museums, universities and related non-profits to share ideas and perspectives. I know it’s been a key element in various people finding internships and jobs and for sharing ideas and approaches to working in this area. To that end, I decided it would be worth looking back and checking in with folks who have joined the group. So a few months ago I put together a survey.

4 Years, 40 Meetups, Almost 500 Members

Jamie Mears talking about personal digital archiving at DCPL at a recent meetup.

Over the last four years the Digital Cultural Heritage Meetup group has hosted more than 40 meetups. It seemed like a good point to do some legwork to figure out how the meetup is working.

The meetup continues to draw anywhere from 20 to 30-some folks a month, and I thought it would be useful to survey the 492 people who have signed up to follow the meetup. The loosely organized group of folks who organize the events are working to improve them based on the survey.

Along with that, I thought folks in other cities might be interested in the results too. For an event that makes use of free space and takes a bit of volunteer time each month from a handful of organizers, I think it has been having a rather substantial impact on the scene in DC.

Info on the survey sample

68 people responded to a survey I put together. This is about 14% of the total set of people who have signed up to the meetup, but given the way meetup works I would hazard to guess that something like 60 or 70% of the people who sign up for the meetup don’t ever end up coming. This is to say, I think responses from 68 people likely give a good view into the whole of who participates.
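For anyone who wants to check the arithmetic on the sample, here is a minimal sketch in Python. It uses only the figures given in this post (68 responses, 492 sign-ups) plus my rough guess that 60 to 70% of people who sign up never attend. The function names are hypothetical and the "active" population is an assumption, not something the survey measured.

```python
# Figures from the post: 68 survey responses out of 492 sign-ups.
RESPONSES = 68
SIGNED_UP = 492


def response_rate(responses: int, population: float) -> float:
    """Return responses as a percentage of the given population."""
    return 100 * responses / population


def active_members(signed_up: int, never_attend_share: float) -> float:
    """Estimate how many sign-ups ever attend, given an assumed no-show share."""
    return signed_up * (1 - never_attend_share)


# Share of everyone who has signed up to the meetup.
raw_rate = response_rate(RESPONSES, SIGNED_UP)
print(f"Share of all sign-ups: {raw_rate:.1f}%")

# Share of the people who plausibly attend, under the assumed no-show range.
for share in (0.60, 0.70):
    active = active_members(SIGNED_UP, share)
    rate = response_rate(RESPONSES, active)
    print(f"If {share:.0%} never attend: ~{active:.0f} active, {rate:.0f}% responded")
```

Run as is, this puts the responses at roughly 14% of all sign-ups, and somewhere around 35 to 46% of the people who plausibly attend, which is the basis for thinking the sample gives a reasonable view of participants.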

In the interest of transparency, you can see survey results (PDF), download the tabular data and see what the survey form looked like. As an aside, I would love to see other people take a look at the responses and write up their own reactions and interpretations of the survey results. Along with that, I would love to get further discussion of the results of the survey in the comments on this post.

Who participates in DCHDC and to what extent? 

Survey respondents represented a range of different profiles of DCHDC participants both in how frequently they participate and in where they are at in their careers.

In terms of the frequency of participation, they represented a range of levels of engagement.

  • 19 had participated more than six times,
  • 12 had come at least 4 times,
  • 21 had come two or three times,
  • 14 had participated once
  • 1 respondent had never participated

In terms of their stage in their careers, the survey mostly drew in folks who were either established professionals or in the first five years of their careers.

  • 3 respondents were current students,
  • 25 were in the first five years of their career and
  • 38 were established professionals who had worked in their field for at least five years

There weren’t that many students, but I think that likely represents trends in who participates. What I would note here is that this underscores how well the meetup functions as a middle ground between established and emerging professionals. I would also underscore that the students who do come have clearly gotten a ton out of being able to network with established and early career professionals. So grad students, if you’re listening, I think there is a huge opportunity here for you.

How DCHDC Matters

Across the board, respondents to the survey were largely united on the positive aspects of participating in DCHDC. For those participating, it seems clear that there is consensus that it has become a community that plays an important role in their careers.

  • 97% of respondents either agreed or strongly agreed that through DCHDC they have learned about projects and issues that are relevant to their work.

  • 97% of respondents reported that DCHDC has become a community they value participating in.

  • 95% of the respondents either agreed or strongly agreed that participating in DCHDC has expanded their professional network.

  • 80% of respondents either agreed or strongly agreed that participating in DCHDC has made them more aware of career opportunities.

Example quotes of how DCHDC has been helpful:

The free text responses that respondents provided give some of the best specifics of what has been working about the meetup. I thought I would include some of those inline here.

  • Connecting with professionals at different stages in their careers: “As I am just now beginning a career as a librarian specializing in digital preservation, having the opportunity to hear presentations on related to this area in librarianship is really helpful, as it is still evolving (and will continue to do so). Furthermore, actually having the ability to speak with individuals about their workflows, the politics of advocacy, standards, etc. has enabled me to gain a better understand my work.”
  • Finding professional opportunities: “DCHDC helped me get an internship in my field and lead to a greater understanding of what kind of jobs were out there and what direction I’d like to head in. Beyond the networking and professional advancement aspects, DCHDC has given me the opportunity to learn more about technology and aspects of cultural heritage that weren’t touched on in my program. While I have been unable to attend DCHDC in recent months I speak highly of it and recommend it often.”
  • Getting perspectives from outside a particular field: “I’ve strengthened my digital humanities network outside of the museum sector, and I’ve been able to bring a digital humanities perspective to my museum work. (I also discovered that one of my DCHDC friends was living upstairs from me. :))”
  • Personal/emotional support in career pathways: “The breadth of my knowledge has been expanded. I’ve made friends that have helped me emotionally through some hard career-related stuff. DCHDC has also helped me maintain consistent relationships with key people in the community, and I truly believe this helped me get my last job.”

Common Requests for improving the Meetup:

Along with understanding what people were getting out of DCHDC, I was also interested to learn a bit about how to improve the event.

  • Shorter talks: Originally the idea was to do 5 minute lightning talks, but over time they have become longer. So we decided to shift back to short talks with a quick bit of time for Q&A.

  • Further planned out schedule: This is an entirely volunteer run and organized event series, so planning is a bit of a challenge. That said, if we can get better at lining up a schedule then folks can make sure they plan on coming to weeks that are of particular interest. I think it will also help to bring new folks into the fold who might be drawn in by a particular topic.

  • Recaps/notes/links from talks shared online: This was a request that came through from several people. I don’t have the bandwidth for it, but if anyone wanted to take on something like this it would be welcome and appreciated.

Example Suggestions from Survey: Below are some examples from the survey responses of particular individual requests.

  • “I think it’s great as-is! I’m happy with whatever meeting time/place. earlier was mentioned last time like 6:00 and that would be great too.”

  • “Back to the short talk format. An hour lecture that starts after 7:30 pushes the event FAR TOO LATE, and removes the opportunities for networking and socializing that the above questions address. 20 minutes socializing, 20 minutes max presentation (including Q&A), a few minutes for announcements, and 20 minutes to whenever for after-socializing and closer convo with presenters would make it much more valuable.”

  • “At the beginning dchdc had *very* short presentations bookended by plenty of time to meet people and have free-ranging discussions. It seems that over time the presentations have gotten longer and more dependent on power point. Since not every presentation is relevant to everyone in the community, some might be less likely to attend based on topic, whereas before they might come just for the excellent company.”

  • “Be clearer on the MeetUp about time to socialize vs. presentation time, so that people who want to chat know they either need to come a little early or stay later. When the only information is the time and who’s talking, it makes it seem like the talk is at that time or just shortly after.”

  • “Scheduling or at least soliciting ideas for presentations a bit further in advance could be a good step. A formal call for ideas/volunteers could help bring some new faces and organizations to the fore. That said, I really have no idea how scheduling works and don’t want to mess up a good thing.”

  • “Given the locations in which we’ve met over the past couple of years, a consistent audio-visual/computer setup is key. Ad hoc talks are valuable but are almost always enhanced with graphic examples.”

  • “A Facebook group or some other way to share information about jobs, events, etc in between meetings would be a good supplement, especially for when we can’t make it to meetings in person every time.”

Distributing Credit for DCHDC

I should note that while I’ve been one of the co-organizers for the group since the beginning, I have not been one of the folks who have really carried the water on this. At various points I’ve missed big chunks of the meetups when I have had to teach classes that meet on the same night. On that front, Bill Lefurgy gets credit for scheduling and running the events for most of our run so far. This is a torch which has recently largely been passed to Atiba Pertilla. There are also several other folks who have been involved almost all the time and stepped up to run events at various points; I’m thinking of Jennifer Serventi and Patrick Murray-John. There are probably about 5-7 more folks I could list out here, but this is just to say that I think the strongest part of the group comes from a core set of folks that are incredibly generous with their time and welcoming to anyone and everyone who we can encourage to participate.

Going forward

The survey largely confirmed the things I hear from lots of folks about what is valuable and useful about this group. I don’t think we had any clear expectations of what this would be or how long it would run when we launched it. But here we are, four years out, having moved between three different venues and still going strong. I’m personally very excited to see how this keeps going into the future and always interested in talking more with folks about how it can be improved/enhanced. I’m also happy to talk with anyone who might want to set up similar meetups in other places.

The Insights Interviews: First Person Perspectives on Ensuring Long Term Access to Our Digital Heritage

Back when I was working for the Library of Congress I did, and helped coordinate, a ton of interviews with practitioners and thinkers working in digital preservation for the National Digital Stewardship Alliance’s innovation working group. At one point, there was discussion of making a book out of the then 33 interviews. As with many ideas, it stalled out at some point. In any event, I worked up an intro for that and a table of contents at one point. So I figured I would just post that here, as I think it makes the interviews a bit easier to navigate. Together they form a whole that is, I think, more useful than just looking at them as part of a serial publication. For context, I wrote the intro below in 2014 and the interviews range from 2011-2014. Since the initial draft, I have also added a set of 9 additional fantastic interviews that Julia Fernandez did as a Junior Fellow focused on understanding, documenting and preserving digital culture. I also added in links to guest posts that Sharon Leon and Mackenzie Smith wrote about approaches to developing open source software that are slightly different in that they predate the focus on interviews as an approach.

I don’t claim the credit for this massive amount of work. A ton of people did a lot of work on running, planning and coordinating these. Off the top of my head, Jane Mandelbaum, Martha Anderson, Abbey Potter, Erin Engle, Butch Lazorchak, Jefferson Bailey, Lori Emerson, Julia Fernandez, Ricky Padilla and Barbra Taranto come to mind as people who either did significant work in running or coordinating interviews. I know there are many others from the NDSA innovation working group who contributed to doing these as well.

The Insights Interviews: First Person Perspectives on Ensuring Long Term Access to Our Digital Heritage

Innovation can be a terrible buzzword. It can be a stand-in for flavors of the month, and trendy ideas on the upswing of this year’s hype scale. With that said, it remains a critical concept, particularly in a field like digital preservation where the idea of even keeping up with the scale and deluge of digital media, along with an ever changing series of new forms, formats, tools and platforms, is often dizzying and overwhelming. In some of my first conversations with Jane Mandelbaum, the Library of Congress co-chair for the National Digital Stewardship Alliance Innovation working group, we struck on the idea of focusing on how exactly people are making it work.

Across a range of disciplines and areas, an amazing set of professionals have emerged to ensure long term access to digital information and are doing amazing things with that information. When we did our first interviews for the then new Signal digital preservation blog I had no idea how useful and valuable many of them would become as touchstones for our field. Some of these interviews were topical and primarily of interest in the moment, but many of them share important and profound insights (the term the then Director of NDIIPP Martha Anderson suggested for the series). When an NDIIPP colleague approached me about helping to shape a volume out of the best of these interviews I thought it was a great idea. In reflecting on them, I think there are four particular cross cutting reasons that these interviews, organized as they are here, are particularly useful for emerging and established professionals in and around the work of digital stewardship.

First Person Perspectives from an Emerging Interdisciplinary Field

Everyone in this volume has launched or established a career in this new and interdisciplinary field of work. As digital technologies reshape work across every sector, ensuring long term access to information now touches on nearly every one of them. Our education and training systems are responding to these changes, but beyond that, it is invaluable to use these interviews as a point of entry into the work of individuals in this field. In this respect, every interview here is an opportunity to understand someone’s career trajectory and in many cases a chance to gain insight into the skills and knowledge required to take on the hybrid roles that many of these innovative individuals are engaged in. As such, the collection is of particular interest to young professionals and students looking to establish the course for their careers.

Practical Dispatches from the Front Lines of Digital Stewardship

A considerable amount of ink and pixels has been spilt over the theory of digital stewardship. Models, frameworks and certification criteria abound. These are great resources, but given the rapid pace at which technologies and systems are evolving, understanding how individuals are working to ensure long term access to digital information provides insight into how people on the ground are actually making this work happen. In this respect, each interview in this volume is an illustration of how theory comes into practice. Each interview provides a firsthand frontline narrative of how the models and frameworks of the field are calibrated into the messy realities of resource constraints and practical limitations of the world.

A Cross Section of Work and Issues Involved in Digital Stewardship

The interviews span big picture strategy, the perspectives of content specialists, issues in the design and maintenance of infrastructure and systems, and the needs and desires of researchers, scholars and other end users. While the interviews were not done to create a comprehensive picture of the field, as they accrued over time and we set about sorting the best of them into different buckets, I was thrilled to see how well they covered the waterfront of collecting, organizing, preserving and providing access to digital information.

Disaggregating Digital Stewardship and Preservation

It’s not as tidy as it would be if this all hung together from a single perspective. There is a lot of messiness in the different objectives, frameworks and perspectives from which different participants come. That is something which I think is a particular strength of the volume. The rhetoric of the digital often makes it seem like we should be moving into the clean lines and clear cut universe of a science of digital stewardship. But when we zoom in to the work at each layer of the infrastructure for digital stewardship we are building, it becomes evident that the same professional values and approaches that made for idiosyncratic visions of preservation and access in the past are just as present in our digital future as they were in our analog past. Digital stewards engage in their work toward differing objectives through differing means. For instance, considerations about the authenticity of digital artworks are not the same as concerns about the authenticity of electronic records.

Table of Contents

Chapter One: Digital Strategy

  1. Digital Strategy Catches up With the Present: An Interview with Smithsonian’s Michael Edson, August 9, 2012
  2. Open Source Software and Digital Preservation: An Interview with Bram van der Werf of the Open Planets Foundation, April 4, 2012
  3. Solving Problems and Saving Bits: An Interview with Jason Scott, August 20, 2013
  4. Digital Humanities Connections to Digital Preservation: Interview with Brett Bobley of the Office for Digital Humanities at the NEH, October 11, 2011

Chapter Two: Understanding Digital Objects

  1. BitCurator’s Open Source Approach: An Interview With Cal Lee, December 2, 2013
  2. What’s a Nice English Professor Like You Doing in a Place Like This: An Interview With Matthew Kirschenbaum, August 12, 2013
  3. Media Archaeology and Digital Stewardship: An interview with Lori Emerson, October 11, 2012
  4. Archives, Materiality and the “Agency of the Machine”: An Interview with Wolfgang Ernst, February 8, 2013
  5. Historicizing the Digital for Digital Preservation Education: An Interview with Alison Langmead and Brian Beaton, May 6, 2013

Chapter Three: The Curator’s View

  1. Web Archiving and Mainstreaming Special Collections: The Case of the Latin American Government Documents Archive, June 6, 2012
  2. Crossing the River: An Interview With W. Walker Sampson of the Mississippi Department of Archives and History, December 9, 2013
  3. ArtBase and the Conservation and Exhibition of Born Digital Art: An Interview with Ben Fino-Radin, May 1, 2012
  4. Exhibiting Video Games: An interview with Smithsonian’s Georgina Goodlander, September 25, 2012
  5. “The Digital Data Backbone for the Study of Historical Places”: An Interview with Matt Knutzen of the New York Public Library, February 27, 2013
  6. Challenges in the Curation of Time Based Media Art: An Interview with Michael Mansfield, April 9, 2013
  7. Insights Interview with Beverly Emmons, Lighting Design Preservation Innovator, February 10, 2012
  8. Born Digital Archival Materials at NYPL: An Interview with Donald Mennerich, April 22, 2013
  9. Curating Extragalactic Distances: An interview with Karl Nilsen & Robin Dasler, August 18, 2014

Chapter Four: Designing Infrastructures

  1. Engineering Digital Preservation: Interview with David Rosenthal, June 15, 2011
  2. Lessons Learned for Sustainable Open Source Software for Libraries, Archives and Museums, September 15, 2011 (From Mackenzie Smith)
  3. Hydra’s Open Source Approach: An Interview with Tom Cramer, May 13, 2013
  4. Digital Stewardship and the Digital Public Library of America’s Approach: An Interview with Emily Gore, October 28, 2013
  5. The Foundations of Emulation as a Service: An Interview with Dirk von Suchodoletz, December 11, 2012
  6. WWI Linked Open Data: An Interview with Thea Lindquist, July 29, 2013
  7. Toward a Library of Virtual Machines: Insights interview with Vasanth Bala and Mahadev Satyanarayanan, September 21, 2011
  8. Imagine What We’ll Know This Time Next Week: An Interview with Bailey Smith and Anne Wootton of Pop Up Archive, December 6, 2012

Chapter Five: Working with the Public

  1. Crowdsourcing the Civil War: Insights Interview with Nicole Saylor, December 6, 2011
  2. Understanding User Generated Tags for Digital Collections: An Interview with Jennifer Golbeck, May 1, 2013
  3. Galleries, Libraries, Archives, Museums with Wikipedia (GLAM-Wiki): Insights Interview with Lori Phillips, April 20, 2012
  4. The Metadata Games Crowdsourcing Toolset for Libraries & Archives: An Interview with Mary Flanagan, April 3, 2013

Chapter Six: Scholar and Researcher Perspectives

  1. Quest for the Critical E-dition: An interview with Leonardo Flores, March 20, 2013
  2. Machine Scale Analysis of Digital Collections: An Interview with Lisa Green of Common Crawl, January 29, 2014
  3. Sharing, Theft, and Creativity: deviantART’s Share Wars and How an Online Arts Community Thinks About Their Work, September 17, 2012
  4. Astronomical Data & Astronomical Digital Stewardship: Interview with Elizabeth Griffin, October 8, 2014

Chapter Seven: The Digital Vernacular and Digital Folklore

  1. Born Digital Folklore and the Vernacular Web: An Interview with Robert Glenn Howard, February 22, 2013
  2. Understanding Folk Culture in the Digital Age: An interview with Folklorist Trevor J. Blank, June 30, 2014
  3. LOLCats and Libraries: A Conversation with Internet Librarian Amanda Brennan, July 14, 2014
  4. Understanding the Participatory Culture of the Web: An Interview with Henry Jenkins, July 24, 2014
  5. Computational Linguistics & Social Media Data: An Interview with Bryan Routledge, August 1, 2014
  6. Networked Youth Culture Beyond Digital Natives: An Interview With danah boyd, August 11, 2014
  7. Netnography and Digital Records: An Interview with Robert Kozinets, August 13, 2014
  8. Research is Magic: An Interview with Ethnographers Jason Nguyen & Kurt Baer, August 15, 2014
  9. Studying, Teaching and Publishing on YouTube: An Interview with Alexandra Juhasz, September 5, 2014
  10. Archiving from the Bottom Up: A Conversation with Howard Besser, October 10, 2014



Digital Art Curation Seminar

Huge thanks to everyone who shared ideas about what to include in my upcoming digital art curation grad seminar. I’ve decided to use the same course blog that I’ve been using for my digital history seminars, so if you haven’t already, you can tune in to what we will work on at dighist.org.

I’ve embedded a copy of the draft syllabus below and I think I have more or less all of the readings in this Zotero collection.

Curation and Conservation of Digital Art Syllabus

Getting Out There: 2015 in Review

Showing off a red velvet cupcake with the POTUS seal on it at the White House.

Another year. Another chance to do a quick look back and make sense of what I’ve been doing and where I think it’s taking me. As I did in 2012, 2013, and 2014, I am taking a few minutes to try to sift and categorize. So if you are interested in a recap of things I’ve done this year, this post is for you. If not, I imagine you have already decided to stop reading.

Looking back, I feel like the move from the Library of Congress to IMLS has been a huge chance to better connect with and learn from the field. While NDIIPP was always outward facing, it was still inside an institution that acts as such a center of gravity that it was challenging to really be out there. In contrast, as the core role of IMLS is to serve and support libraries and museums across the nation it has been exhilarating and rewarding to be out in these communities much more.

Dropping the “IIP”: From NDIIPP to NDP

Presenting a framework for the National Digital Platform at the IMLS Focus convening at DCPL’s MLK library.

I started the New Year with a new job. I left NDIIPP to “head National Digital Platform  responsibilities across programs” at the Institute of Museum and Library Services. As NDIIPP stood for the National Digital Information Infrastructure and Preservation Program, I smile a bit thinking that even though I was changing jobs I was keeping the first two words of the program and dropping two “i’s” and a p.

My last day on the job at the Library of Congress was New Year’s Eve 2014. In the four and a half years I spent at the Library of Congress I had amazing opportunities to work and learn, and made a lot of friends and colleagues I know I will have for the rest of my life. With that said, it was just impossible for me to pass up the chance to be a part of the emerging National Digital Platform work at IMLS.

IMLS and I go back. When I started working at the Center for History and New Media in 2006 my job was, in part, funded by an IMLS national leadership grant. You can see a bit of what we were up to in this interview I did for the IMLS blog in 2007. Over the years I’ve given talks at the IMLS WebWise conferences and had the opportunity to review for the agency. In all those interactions, I was consistently impressed by all that the IMLS team could accomplish.

Cover of the National Digital Platform convening report. Read the report online (PDF).

In my first year at IMLS, I’ve had the chance to co-develop and publish a vision for the priority, shape a convening and the resulting report on the National Digital Platform priority, and support IMLS investing nearly ten million dollars in more than a dozen grants and cooperative agreements. As a push for transparency, I’m also thrilled that we were able to publish both the first and second round of funded projects proposals online.

Through all of this, I have been so lucky for guidance and leadership from my boss Maura Marx and the insights of my colleague and constant collaborator Emily Reynolds. Along with that, I’m thrilled to find myself surrounded by the dedicated and exceptional staff of the Office of Library Services and the rest of the agency. The experience has confirmed what I’d always imagined, that I really like helping people think through and refine ideas for their projects and work and thinking about how different areas of research and practice connect and add up to more than the sum of their parts. I can’t imagine any other place where I could get to do exactly that kind of work and help support all kinds of libraries across the country keep advancing in the 21st century.

Teaching Digital History & Digital Curation

Outside the office, I was thrilled to be able to continue teaching. I was able to teach a digital history graduate seminar in the Public History program at American University and as a special topics course for the University of Maryland’s iSchool’s digital curation program.  I was totally impressed by what my students were able to do on their projects over the course of a semester. I also started developing a digital art curation and conservation course, which I will be teaching at UMD in the Spring.

Digital History & Preservation: Research, Writing & Speaking

A stack of the author copies I received after my book, Designing Online Communities, was published at the beginning of the year.

My book, Designing Online Communities, dropped, and some super smart people claim I said some smart things in it. Along with that, I wrote about the history of transparent GIFs in web archives, and about the implications of distant reading for developing digital infrastructures to support computational humanities scholarship.

My essay Zombies on Flickr, Lego, Handcraft, and Costumed Zombies: What Zombies do on Flickr, was published in New Directions in Folklore. An article I contributed to exploring learning in makerspaces was published in the Harvard Educational Review. I reviewed Preserving Complex Objects for the Journal of Academic Librarianship. I drafted an essay titled Digital Sources & Digital Archives: The Evidentiary Basis of Digital History for a forthcoming Companion to Digital History.

The big talk this year was People, communities and platforms: Digital cultural heritage and the web at the National Digital Forum in New Zealand. Aside from that, I planned and ran a daylong workshop on Roles & Responsibilities for Sustaining Open Source Platforms & Tools at the International Digital Preservation conference.

Love is… …accepting he’s a zombie, featured in my Flickr Zombies article. by _Matn.

I gave a lot of shorter talks about the National Digital Platform priority at a range of conferences, including Linked Data for Libraries and Museum Computer Network. Along with that, I wrote up some of my takeaways from five conferences I participated in as posts for the IMLS Blog.

These included,

All told, it has been a really great year.

Digital Sources & Digital Archives: The Evidentiary Basis of Digital History (Draft)

Below is a draft of an essay I am contributing to a forthcoming book titled A Companion to Digital History. I have permission to share drafts on my personal website, so I thought it would be good to get this up and out there 1) for folks to be able to read it and 2) to see if I could get any substantive commentary and discussion about it to help me revise it. If you would like, you can comment directly on the draft in this Google Doc.

Digital Sources & Digital Archives: The Evidentiary Basis of Digital History

In an early draft of my undergraduate thesis I wrote that a source “spoke for itself.” My advisor crossed that out and wrote in the margin something like “sources almost never speak for themselves, you have to explicate what the source means for your argument and justify your interpretation.” I imagine this sort of experience is how many individuals learn the ropes of historical research and writing. The task of the historian is to interpret sources.

The world is full of objects, archives, records and texts which historians can study and interrogate to develop and refine our understanding of the past. These are the primary sources of history; materials, relics, and texts, that testify and provide traces of the past. Almost anything could be a primary source. The rings of a tree testify to weather conditions and changes in climate. Probate records document the material goods individuals held at the end of their lives. Court proceedings offer insight into the experiences of the oppressed through the moments they are dragged in front of the justice systems that control and marginalize them[1]. Just as any kind of physical object might serve as a source, as society increasingly produces digital relics, documents, artifacts and other objects the evidentiary basis of history will become increasingly digital.

While things like the rings of a tree have their own value as historical sources, the bulk of historical work continues to be anchored in archives. Historians’ ability to study the past is in large part indebted to archivists and the range of individuals involved in the production and management of historical records. Archives come in all shapes and sizes; massive federal agencies, small local historical societies, and manuscript collections at research libraries, to name a few examples. The same digital shift occurring in sources is occurring in archives.

At this point, historians have access to an ever-expanding wealth of digitized versions, or digital surrogates, of a selection of primary sources through online collections. At the same time, an explosion of born-digital materials is being produced and collected at unprecedented scale (websites, the contents of hard drives, collections of emails, digital video and photos, etc.). While these new forms of sources are emerging, so too are notions of digital archives. Organizations like the Internet Archive, and projects like the September 11th Digital Archive and the Rossetti Digital Archive, have emerged with the archive name attached. However, each of these varieties of digital archives represents a somewhat different vision of the nature of the concept of an archive.

So, what happens to history when the basis of its sources and evidence becomes increasingly digital? Similarly, what happens to history when its archives become digital? Backing up a bit, given how the very form of archives as institution is anchored in the management of paper documents, what does it even mean to have a “digital archive”? What follows is an attempt to identify and discuss issues in the evidentiary basis of history that arise as the materials and systems that manage those materials become digital. In looking at different kinds of sources and archives I work to suggest practical advice on the kinds of issues and questions one should ask when working to interpret, to find out what one can say, based on digital sources and digital archives.

What are Digital Sources?

When you hold a letter in your hand and read the words on it you can imagine what it was like when the recipient of that letter held it in their hands in the past. As an interpreter of the record, you can think about what it must have been like to receive it and follow a chain of correspondence to understand the exchange of thoughts and ideas. How does this interaction change when you have a digitized copy of a letter? Similarly, how does it change when you are looking at the text of an e-mail message?  

Making sense of a source and making a defensible inference based on the content of a source requires context. That is, knowing a letter was sent from one individual to another and that you found it in the papers of the recipient you can likely infer that it represents a perspective that the author wanted to communicate to the recipient and you likely have reason to assume that the recipient read it. In contrast, if a historian of the future had access to an archived copy of my Gmail account they would need to know a bit about many of the automated rules I’ve set up that “mark as read” emails from a range of individuals and organizations and in some cases are set to “skip my inbox” entirely. So without knowing about those rules one could end up making all kinds of problematic inferences about what I had or had not read based on what was in my email. Understanding my email thus requires an understanding of how people like me used email at a particular period of time and the set of features and functionality that different email clients came with.

As Martha Howell and Walter Prevenier explain in their introduction to historical analysis of sources, “to make wise choices among potential sources, historians must thus consider the ways a given source was created, why and how it was preserved, and why it has been stored in an archive, museum, library or any such research site.[2]” The same kinds of questions need to be asked of digital sources. This is particularly challenging given that the pace of change in the mediums and context of communications technologies seems to continue to accelerate. Historians need to develop an understanding of digital source criticism and provenance.

Digital Source Criticism & Provenance

Given the range of digital sources and the complexities of their production and use, the future of historiography will require a good bit of work in digital source criticism.[3] German historian Johann Gustav Droysen’s 1867 book Outline of the Principles of History explains the concept and importance of source criticism as a part of historical practice: “The task of Criticism is to determine what relation the material still before us bears to the acts of will whereof it testifies. The forms of the criticism are determined by the relation which the material to be investigated bears to those acts of will which gave it shape.[4]” That is, a key part of historical research and writing involves not simply identifying sources of history but working to understand the context in which they were produced.

To this end, working with digital sources prompts the historian to ask the same kinds of questions they have long asked of sources. What is a source’s provenance? How was it created and stored? Why does it persist today? These kinds of questions are essential for interpreting a source. This is not simply an issue for those studying society after the advent of computing technology. There are a range of key source criticism questions one should ask of both digitized primary sources and born digital sources. What follows is an exploration of some of the key issues for consideration related to both of these kinds of sources.

Digitized Primary Sources

For anyone studying the world before the emergence of digital media the primary role that digital media will play is as a transmitter of digital surrogates. Libraries, archives and museums have now been actively digitizing sources for thirty years and the result is that one can find millions of digital surrogates of books, maps, photographs and manuscripts in a range of online digital collections. In working from these sources there are a few critical questions to ask from the perspective of provenance and source criticism.

Why was this digitized and not something else?

It has always been important for historians to ask why a particular source has been preserved. It is critical to think through why we have access to some kinds of sources and not others, and this is a key part of that reasoning exercise. The same kinds of selection questions need to be asked of any digitized source.

In some cases, archives have digitized full runs of materials; in other cases they have digitized highlights or selections. Generally, libraries, archives and museums have only digitized a sliver of their entire holdings. It’s not enough to find a source. One must be able to contextualize it and understand why one has it at hand. As such, it’s important to think through the limits on the inferences one can make from a source, based on what one knows about the digitization policies of a given organization.

For example, because of copyright restrictions many institutions in the United States are focusing efforts on digitizing materials from before 1923. Or similarly, an archive might have the rights cleared to digitize one particular collection, or the writings of one person instead of another. In each of these cases, if one wants to work primarily from digitized materials it is critical to think through how the selection policies for what was digitized can shape and limit one’s ability to make inferences based on those materials.

Is this copy of sufficient quality for my purpose?

All digitized objects are surrogates for the originals. That’s fine. Historians have a long tradition of working from surrogates. In many cases, the only access historians have to extant historical materials is through copies of reprintings, and copies of copies created through the manuscript tradition. Similarly, when microfilm technology developed in the 1930s historians were thrilled with the prospect of reproductions of sources. Public historian Ian Tyrrell used the same rhetoric often used regarding digitization and the web to describe microfilm in the 1930s. In his words, microfilm “democratized access to primary sources by the 1960s and so put a premium on original research and monographic approaches.[5]” The reproduction of sources played a key part in historians’ increased focus on working from primary sources. In this vein, it’s worth remembering that the development of the technologies that provide access to sources will continue to play a role in shaping the norms and expectations of the composition of history. So, surrogates are nothing new; in many ways they are the norm for many areas of historical practice. With that said, it’s always critical to ask if the surrogate is good enough for the questions a historian is asking.

Historians often want to do straightforward things with a source. So if one wants to be able to say an individual wrote a particular thing in a particular document, then as long as one can make out the words in a digitized copy of something that is likely enough. In this case, it is worth differentiating the informational qualities of a source from its artifactual qualities[6]. The informational qualities of a source are generally the words inscribed on it. The artifactual qualities of a source can consist of any number of different features one might study. As historians have become increasingly interested in sources as part of material culture the need to consider artifactual qualities has become increasingly important. Every physical object contains a nearly infinite amount of information in its artifactual qualities. For example, beyond the legibility of words on an object, characteristics of handwriting, fingerprints, watermarks, the chemical composition of inks or of paper or vellum can all be interrogated to provide valuable information. All of that information is anchored in the artifactual qualities of the source.

As an example, you can find some rather ugly looking, but for the most part legible, copies of Hamlet in Early English Books Online. They are black and white images created from scans of old microfilm. You can also find much nicer looking copies of the same work in the Folger Shakespeare Library’s online collections. If what you care about is the text of the work, you are mostly fine in either case. With that said, researchers have used high quality full color scans, like those Folger provides, to study the placement of dirt on the margins of the page. The dirt on the pages, which comes from people handling the books, attests to the use of the books over time. That is, there are material traces of use of the books left on them that can be studied. Most interestingly, they can actually only be studied when high quality scans of the book are created. That is, aspects of the source only become available for analysis through the production of a very high quality digital surrogate. To that end, the better quality the scans the more potential there is to examine traces of other physical properties of a source[7]. The question for someone working from a digitized surrogate of a source is thus: are the significant properties of the source necessary for the sorts of questions you are interested in asking present? Similarly, it is important to consider how some aspect of the quality of a source might be obfuscated in how it was digitized or provided.

How did I find it and how does that affect what I can say about it?

At this point one can visit the Library of Congress, the Digital Public Library of America, Europeana or Google Books on the web and plug in some obscure search terms and find digital surrogates of records, artifacts and a variety of other primary sources. This is amazing. You can find things that you would never have been able to find before[8]. Searching across millions of sources at once is transforming many historians’ methods for research and scholarship[9]. At the same time, full text search presents a whole new set of challenges for reasoning from and interpreting sources.

Where in the past one would develop an explicit strategy to explore a given collection or archive, or to systematically look at all the newspapers from a given date range, search encourages researchers to stumble around and find something that looks interesting. This is all fine if all one wants to do is make an existence proof argument. That is, if one just wants to make the case that something was said at a particular point in time. However, this is a rather low bar for historical argumentation. The extent to which something is representative of a particular moment in time, or a particular community or place is tied explicitly to a range of contextual questions.

To be able to make broader claims based on a given source it is important to work to contextualize it after it is discovered through search. Feel free to search for idiosyncratic terms, to, as Stephen Ramsay suggests, “screw around” in searching through digitized sources. However, it then becomes necessary to do the legwork required to understand the original context from which that source emerged and think through the limitations that come from why that source was digitized and not something else. To do this, it is necessary to work backward from a digitized source to understand where it came from and the extent to which it is or isn’t representative of the collection it comes from.

Born Digital Sources

Born digital is the rather clumsy term we have to discuss sources that started off digital; email messages, digital photographs, websites, databases, etc. Going forward, the bulk of the primary sources historians will work with to understand the world in the 21st century are going to be things that started off digital. This is not to suggest that we will ever get away from paper sources, but it is to note that much of that paper source material will have started out as digital as well. In those cases, the paper will often be a surrogate for the digital. While archivists and historians are still only just figuring out how to collect, preserve and provide access to born digital primary sources there are already a set of emerging key questions to ask of such sources. What follows is an initial exploration of some key source criticism questions to ask of born digital sources.

What are you not seeing on the screen?

When working with digital objects it’s essential to remember that what they look like on the screen is a performance[10]. The actual digital object is a sequence of markings registered on a medium. Hard drives, CDs, flash drives, etc. are all things that register sequences of markings (bits) that are read by software to show up on a computer screen. In any digital file and any digital file system there is additional encoded information that one could be looking at and reading.

In contrast to looking at a hand written letter, where you can see how hard someone pressed and get a feel for their handwriting, when one looks at an email message on a screen all you see is the words. However, if you poke around in the email headers, or in the metadata associated with a message you can find a wealth of information that isn’t rendered on the screen. New media scholar Nick Montfort has deemed the focus on what things look like on the screen “screen essentialism” and a growing body of work is emerging to provide basic tools and approaches for getting beyond simply taking things as they appear[11]. Two examples of working with particular primary sources will help underscore what historians have to gain by getting beyond screen essentialism.
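Before turning to those two examples, here is a minimal sketch of the point about email headers, using Python’s standard library email package. The message itself is entirely made up for illustration (the names, addresses, mail software and server hostnames are all hypothetical), and which headers a given mail client actually displays is an assumption on my part.

```python
from email import message_from_string
from email.policy import default

# A made-up message; every name, address and hostname here is hypothetical.
raw_message = """\
Received: from mail.example.org (mail.example.org [192.0.2.10])
\tby mx.example.net with ESMTP id abc123;
\tMon, 4 May 2015 09:15:02 -0400
Message-ID: <20150504131500.1234@example.org>
Date: Mon, 4 May 2015 09:15:00 -0400
From: Alice Example <alice@example.org>
To: Bob Example <bob@example.net>
Subject: Draft chapter
X-Mailer: ExampleMail 2.1
Content-Type: text/plain; charset="utf-8"

Here is the draft we discussed.
"""

message = message_from_string(raw_message, policy=default)

# Roughly what a mail client renders (an assumption about typical clients).
rendered = {name: str(message[name]) for name in ("From", "To", "Subject", "Date")}
body = message.get_content().strip()

# Everything else is present in the source but usually not shown on screen.
hidden = [(name, str(value)) for name, value in message.items()
          if name not in rendered]

print("On screen:", rendered, body)
for name, value in hidden:
    print("Not on screen:", name, "=", value)
```

In this toy case the unrendered headers record which server relayed the message, when it arrived, and what software wrote it, which is exactly the kind of contextual evidence a historian would want for source criticism.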

When curator Doug Reside first opened a file he found on a floppy disk in playwright Jonathan Larson’s papers at the Library of Congress he must have been shocked. Right there on the screen was a different set of text for a famous song from one of the musicals Larson had created[12]. What was it that he was looking at? Was this an alternative version of the song? As Reside dug deeper, and came to understand the nature of the word processing software that Larson had used and the software that Reside was using to render the text, he came to understand exactly what had happened. The word processing software that Larson had used would save a record of changes in the text inside the file. So an individual word-processing file would actually contain a record of the edits to a file over time.

The only way Reside could interpret what he saw on the screen was to learn a bit more about the software that was used to write it and the software he was using to render it. Ultimately, this is a rather fascinating result; works written in this particular word-processing application have within them records of their creation and editing.

The implications of this kind of work extend beyond the structure of individual files. In working to understand the material properties of digital objects, digital humanities scholar Matthew Kirschenbaum opened up a ROM (a copy of a floppy disk) in a hex editor[13]. This ROM had a copy of an early video game called Mystery House. A hex editor displays each byte recorded on the medium in hexadecimal notation. So the hex editor showed how the information in the ROM was laid out on the original floppy disk it was saved on. As he explored the disk he found something intriguing: a sequence of text that did not appear in the game he was studying. What had he found? Was this hidden text in the game that wasn’t used? After Googling the text, he identified it as coming from a completely different game. From this, he was able to infer that the disk the ROM had been created from once held a copy of that other game, which had then been overwritten by Mystery House. Kirschenbaum downloaded a copy of the other game and was able to figure out what had been on the original disk before Mystery House was saved on it.
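To give a rough sense of what a hex editor puts on the screen, here is a minimal sketch of a hex dump in Python. It is not the tool Kirschenbaum used, and the sample bytes are invented; it only shows the general layout of offset, hexadecimal byte values, and any printable text alongside.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values, and printable ASCII (dots otherwise)."""
    pad = width * 3 - 1  # room for `width` two-digit values separated by spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


# Invented sample bytes: some non-text values around a readable string.
sample = b"\x00\x01MYSTERY HOUSE\xff\x00"
print(hex_dump(sample))
```

The right-hand column is where stray readable text, like the fragment Kirschenbaum noticed, tends to jump out at a reader.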

Understanding how this happened requires background on how floppy disks and hard drives function. When a file is deleted it generally isn’t really deleted. Instead, a computer marks the space where the file is stored as available to be overwritten. The result is that if you poke around in what is actually written on a computer disk, you will find that all sorts of areas the operating system tells you are empty actually contain readable information. As a result, as archives increasingly begin accessioning this kind of born digital material, they are deciding whether they want to create forensic copies of this kind of media (that is, copies that contain all of that information, including information that is hidden from the user) or logical copies of disks and drives that contain only what the operating system thinks is there. In either event, this suggests a whole new set of skills for interpreting primary sources that historians are going to need to become adept with. When working with born digital sources it is important to understand them beyond what they look like on the screen. It is critical to move past the performance of a file or a file system and to understand the additional information that may not be immediately revealed. The performance of digital content similarly opens a set of questions about the set of technologies used to interpret it.
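The difference between a forensic and a logical copy can be sketched with a toy example. Everything here is hypothetical: the "disk" is just a few bytes I made up, with one live file followed by leftovers from an older one, and the function simply looks for runs of printable characters in the raw bytes. Real disk images and file systems are far more complicated.

```python
import re


def printable_runs(raw: bytes, min_len: int = 6):
    """Find runs of at least min_len printable ASCII characters in raw bytes."""
    pattern = re.compile(b"[ -~]{" + str(min_len).encode("ascii") + b",}")
    return [m.group().decode("ascii") for m in pattern.finditer(raw)]


# Hypothetical contents: a current file, then remnants of an older one
# sitting in space the operating system would report as empty.
live_file = b"GAME TWO: ENTER COMMAND"
leftover = b"OLD GAME: YOU ARE IN A CAVE"

forensic_image = live_file + b"\x00" * 8 + leftover + b"\x00" * 8
logical_copy = live_file  # only what the operating system says is there

print(printable_runs(forensic_image))
print(printable_runs(logical_copy))
```

In this sketch the text from the older file turns up only in the forensic image, which is the kind of trace that is lost when an archive keeps a logical copy alone.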

What is lost in how it was/is rendered?

When files are rendered on a computer screen a user witnesses something akin to the performance of a play. The underlying data in a file is interpreted and rendered through software for a user to interact with in much the same way that the script of a play is interpreted and performed by a cast on a stage. In each case, while the underlying script or files remain the same, a given performance of a file or a play is going to look and sound different. For some kinds of research questions those differences do not matter. However, it is necessary in either case to be aware of them.

Archived websites offer a key case to explore how this plays out in the interpretation of a born digital primary source. At this point, many organizations are using a range of different tools to archive websites. They use a few different kinds of tools to harvest copies of what content was available at a particular URL at a given moment and then use another set of tools to be able to render that content for you to view. For example, you can go to the Internet Archive and type in the URL for www.loc.gov and you will find an interface that lets you see what the homepage of the Library of Congress website looked like at different points in time when the Internet Archive saved a copy of it. With that said, it is important to realize that when you look at a copy of the site in the Internet Archive’s Wayback Machine you are not really seeing what the site looked like at that point in time because a range of characteristics of the way the site looked then are not being replicated.
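For readers who would rather go straight to a capture than use the search box, here is a small sketch of how Wayback Machine addresses are put together. It assumes the commonly seen pattern of a fourteen-digit timestamp followed by the original URL; the function name is my own, and as far as I know the Wayback Machine responds with the capture nearest to the requested time rather than requiring an exact match.

```python
from datetime import datetime


def wayback_url(original_url: str, when: datetime) -> str:
    """Build a Wayback Machine address for original_url near the given time."""
    stamp = when.strftime("%Y%m%d%H%M%S")  # e.g. 20050101000000
    return f"https://web.archive.org/web/{stamp}/{original_url}"


print(wayback_url("http://www.loc.gov/", datetime(2005, 1, 1)))
```

The timestamp in the address is a reminder that what comes back is a copy tied to a particular moment of collection, not the site itself.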

One views a website through a web browser, and any given browser will render things slightly differently. This is particularly true for older sites. Similarly, websites from ten or twenty years ago were designed for computers that had lower screen resolutions, that had different processors, and that ran different operating systems. Each intermediary layer of software (the browser, the operating system, etc.) and the implied assumptions about computer hardware baked into that software (screen resolution, processor speed, etc.) function as part of the sequence of interpreters that perform a webpage.

When asking questions about what is lost in how a digital object or set of digital objects is rendered, it is important to recognize that some elements are more susceptible to problems than others. The distinctions between the informational and artifactual elements of sources previously discussed are similarly relevant in this context. For example, if all one is focused on is how something was written in text on a page, in most cases how it is rendered isn’t likely to be too much of a problem. However, in cases like the presentation of digital art created for the web, or in situations where the aesthetics, design and user experience of a web page matter, it is very likely that issues in how something is rendered will play a significant role in one’s ability to interpret it[14].

How was this created, managed and used and how does that impact what one can say about it?

To be able to accurately interpret a source it is essential to understand the context in which it was created, managed and used. This is particularly challenging in the context of born digital source materials, as there is a rapid and continual churn in the underlying technology and formats that interact with shifting behaviors and social contexts for interpreting the meaning of those behaviors.

As an example, consider what the email signature “Sent from my iPhone” at the bottom of a message communicates[15]. First off, it tells us that the sender sent the email from a mobile device, which likely explains why there might be typos or why the message might be brief, given the limits of a smaller interface. At the same time, it tells us that the user didn’t care to change the default signature that Apple added to their messages. So emails aren’t just emails. The conventions and forms of the medium have developed and changed over time, and what it means to send and receive an email has changed too. Part of understanding and interpreting a particular email is going to involve understanding the context in which it was created and the social conventions around email at a given point in time.

Continuing with the case of email, the way that individuals manage their email and how that email is acquired and processed is going to be an important part of interpreting archives of email. Some email users keep complex folder structures for managing email. In some cases organizations restrict the total size of storage space for users to keep email, so individuals end up managing their email by deleting emails to make space for new ones. At the same time, the development of services like Gmail has encouraged a different set of behaviors where individuals are increasingly keeping all of their email and simply using search to work their way through their messages[16]. To this end, developing an understanding of what an individual’s and/or an organization’s practices were around email will be a key part of making sense of any given set of emails.

To illustrate another area of born digital content that has these issues, consider the way that people take, manage and work with digital photographs. One of the primary characteristics of digital objects is that it is generally trivial to make exact, or seemingly exact, copies of them. As a result, when it comes to digital photographs, people will often have an assortment of copies of an image with varying amounts of metadata associated with them[17]. There is the original file from a camera or a phone, a copy downloaded to a hard drive that might be edited, and a range of derivative copies created for sharing on Facebook or a series of photos using different filters. While the original might be the highest resolution, the derivative files are likely seen more, and the metadata and descriptive information about each copy can be different. As a result, there isn’t really a master file or copy so much as there is a constellation of different versions of the photo, each of which can be studied to understand the personal digital media ecology of an individual or organization.

It is also worth underscoring that what a photo means in a given moment is itself historically contingent[18]. In the last few years more photographs have been taken than in the two hundred or so years since the camera was invented. At this point, there are more than 6 billion photos on Flickr, and hundreds of millions of photos on Facebook and Instagram[19]. The combination of camera phones and sites like Flickr, Instagram and Facebook has created a set of practices and social norms where all kinds of people take sequences of photos throughout their day and share them. Similarly, the fact that camera phones quickly began to have two cameras, one in the front and one in the back, illustrates the emergence of the selfie as a key use of photographs. In this vein, photos increasingly play a role in the presentation of self in everyday life.

With this noted, digital photos increasingly come with a considerable amount of technical metadata embedded inside them that will be increasingly useful for historians studying these objects. Again, what is shown on the screen is only part of the story with digital objects. With a range of simple tools, it is possible to read the text information encoded through standards like Exif, which can document when a photo was originally taken, what software has been used to edit it, and the kind of camera that was used to take it. The result is that there exist inside many digital photographs records of the provenance of their creation and management that can be used to help contextualize and understand how they were in fact created.

What role did search play in the original experience of content?

The idea of original order, that the order in which creators and managers organized materials carries important value for contextualizing records, is somewhat at odds with the basic nature of digital media[20]. From the perspective of an end user, there really isn’t a first row in a database[21]. Instead, a user enters a query and the results of the query come in their own order. As a result, when content is preserved without preserving the interfaces to that content, historians are going to be left needing to do a lot of reasoning and theorizing based on how they think those interfaces worked. This poses a key question to ask of born digital primary sources. What role did search interfaces and algorithms play in how users interacted with and made sense of content, and what limitations on interpretation does the likely absence of that information impose? A few examples will illustrate this issue.
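The point about there being no first row can be made concrete with a small sketch using Python's built-in `sqlite3` module. The table, column names and photo records are all invented for illustration; the only thing the sketch shows is that the same stored records come back in a different order depending on what the interface asks for.

```python
import sqlite3

# An in-memory database with a few made-up photo records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (title TEXT, taken TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO photos VALUES (?, ?, ?)",
    [
        ("Harbor at dusk", "2014-06-01", 120),
        ("Street market", "2012-03-15", 980),
        ("Snowy bridge", "2015-01-20", 45),
    ],
)


def titles(order_by):
    """Return photo titles in the requested order.

    order_by is placed directly in the SQL only because it comes from the
    fixed strings below, never from user input.
    """
    rows = conn.execute(f"SELECT title FROM photos ORDER BY {order_by}")
    return [row[0] for row in rows]


by_date = titles("taken DESC")   # newest first
by_views = titles("views DESC")  # most viewed first
print(by_date)
print(by_views)
```

Preserving the table alone keeps the records but not the ordering a user actually saw, since that ordering lived in the query the interface happened to run.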

One of the biggest challenges facing web archives is that it is very unlikely that anyone is going to be able to recreate the central mode through which web content is accessed and understood. It is unlikely that there will be a historical Google search. While it is possible to find archived copies of many webpages at particular moments in time, there won’t be a way to figure out what someone in Washington D.C. who Googled “Benghazi” in March of 2015 would have seen in the search results. Given that search is the primary mode through which web content is found and accessed, it won’t be easy to figure out what people are likely to have come across.

As a related example, consider someone who wants to study visual representations of a given topic in the 6 billion photos on Flickr. Even if there is an archived copy of all those photos, it would be challenging to figure out what photos someone might have seen if they searched the site at a given point in time. From that archived copy of the photos and their metadata it would be possible to study what kinds of photos people created and shared and, through the metadata, the relative popularity of given images. However, to know what someone would have found when they visited Flickr and searched for something, one would also need a copy of Flickr’s proprietary “interestingness” algorithm, which is used to sort out what photos are shown based on a series of weights assigned to different characteristics of photos[22].

Examples of the role of search in the use of digital media are everywhere. The capability of search is itself increasingly shifting how people manage their information, from a “filing” mentality to “piling,” and the result is that knowing how search worked in Gmail, or in the Mac operating system, is going to be increasingly important for making sense of born digital primary sources.

These various questions asked of digitized and born digital sources connect directly to a broader set of issues in how aggregations and collections of these materials are established and described. In this area many different kinds of projects have started to be described as digital archives. In what follows I will briefly explore some of the ways the term is used and discuss the issues that arise in terms of interpreting the various kinds of sources in these different kinds of digital archives.

What are Digital Archives?

When archivists, historians and digital humanists use the term “digital archive” they often mean different and overlapping things. I’m not so much interested in trying to decide whose use of the term is right or wrong, but in clarifying what the term means in different contexts.  In each case below, I have provided an example or two of this type of usage and worked to connect the kind of usage back to the questions one needs to ask of the digital primary sources contained in them.

Collections of Aggregated Digitized Primary Sources

When digital humanities scholars use the term digital archive, they are often describing aggregated collections of digitized primary sources. For example, the Shelley-Godwin Archive brings together digitized copies of primary source manuscript collections from a range of different archives around the world to create a single place to access the papers of a particular family.

Historian Joshua Sternfeld has suggested calling these kinds of projects a genre of “digital historical representations”.[23] Sternfeld uses that term to talk more broadly about the diverse range of products historians are now creating from digitized sources, including visualizations and databases, but includes these kinds of digital archives under this umbrella. He includes them in this category because they tend to be more expansive in what they bring together than what archives have generally focused on.

The origin of this usage is anchored in Jerome McGann’s work on the Rossetti Archive[24]. The Rossetti Archive presents a dizzying array of sources related to 19th century poet, illustrator and painter Dante Gabriel Rossetti. It contains much of what one might find in an archive, like copies of manuscripts and correspondence. However, it also includes copies of published works like books and poems as well as a range of visual works by other artists, contemporary periodicals and other related texts. The site provides a wealth of resources and a mixture of interpretation and exhibition of those sources. However, it is often challenging to parse exactly what the scope is of what one is looking at in the site.

The idea behind the Rossetti Archive, and a related idea in the William Blake Archive, was to develop a sort of ever growing hypertext aggregation of related digital copies of sources anchored around an individual[25]. In this vein, it is much more of a hybrid, combining a critical edition with the breadth of resources one might find in a literary archive.

When working with sources in this kind of digital archive it is essential to understand the context from which the original source materials were taken. In this case, the site is likely presenting materials of a range of different provenances, and as such it is important to identify where something is coming from and then, in light of the history of that source, think through the kinds of questions one asks about why a particular object persists and others don’t.

Digitized Copies of Entire Archival Collections

In some cases, the term digital archive is used to refer to a digitized copy of the entire contents of an archival collection. For example, the Clara Barton Papers at the Library of Congress are available in full online. It’s not just the contents of the collection that were digitized but the folders they are contained in as well.

Presented online according to the boxes and folders they can be found in at the physical collection in Washington D.C., this kind of presentation of sources provides transparent access to the collection as it was arranged and described by archivists. In this vein, the scope and context note in the same finding aid that one would use to contextualize sources and understand how selection and arrangement decisions were made is useful for working with the digitized collection. To this end, something like the Clara Barton Papers is functionally a digital surrogate of an entire manuscript collection.

In a case like the Barton papers, the provenance of a given collection is much clearer and easier to parse than in the case of the previously discussed aggregations of digitized sources. With that noted, it is worth considering why a particular archive is digitized and not another, as that itself represents its own selection- or appraisal-like decision. In the case of collections at most archives it will be a mixture of legal issues (generally focusing on digitizing older collections that are much less likely to involve a range of copyright and other rights issues), issues of what is thought to be most popular, and what is easiest to digitize.

As another example of where these kinds of selection issues are raised, many state archives and historical societies are entering into contracts with companies like Ancestry.com to digitize large parts of their collections. In these cases, companies are generally deciding what collections to digitize based on what they deem to be the most useful to the genealogists who are their customers[26]. To this end, it is worth considering why a particular collection is available and the extent to which the selection of that collection over another for digitization might change the direction of your research and writing. With that said, this is a much less significant issue than in cases where individual documents have been cherry picked from an archival collection and digitized, because here you have a sense of the structure and content of a whole coherent archival collection.

Aside from issues of selection, it is also important to think through considerations of the quality of a given set of reproductions of sources for your purpose. In the case of the Clara Barton papers, part of why they were digitized in full is that the entire collection was already microfilmed. So instead of doing high quality digital captures of the original documents it was much less expensive to simply digitize the black and white microfilm. For most purposes those digitized copies of the microfilm are perfectly serviceable. However, as the cases from the EEBO Shakespeare folios illustrated, higher quality color images of the documents would likely enable access to a much broader range of the potentially significant properties of those documents. So it’s still important to consider if the quality of a digital reproduction of an object is good enough for the purpose one intends to use it for.

Born Digital Archival Collections

When archives acquire born digital materials and process those collections the results are often called digital archives, or born digital archives, as well. For example, Emory University acquired Salman Rushdie’s papers, which came with a series of his laptops[27]. Disk images were created of those laptops and at this point it is possible for researchers to log in and study the contents and environment he worked in. In this case, researchers can engage directly with an emulated version of his whole computer.

In this case, the digital archive is generally a subset or a hybrid component of an analog archival collection. Often these kinds of materials are described as part of a finding aid and as such it is relatively easy to ascertain their provenance and understand why a particular set of digital objects exists and how decisions have been made in terms of their processing, arrangement and description. With that noted, the standards and practices for collecting, processing and preserving born digital archival material are still developing and evolving. So the quality and consistency of how born digital materials are described and made available varies widely across different repositories.

All of the questions and issues raised earlier about born digital primary sources are important to consider when working with these kinds of collections. In much the same way that a historian who studies 18th century documents needs to learn to read various kinds of handwriting scripts to develop an ability to read and decipher those texts, historians are going to need to develop sophisticated understandings of how digital media systems functioned at particular points in time and how different kinds of users used them. For example, understanding how different people organize their desktops, or how they name their files, and how conventions around those sorts of things have changed over time will be an important part of interpreting born digital archives.

Web Archives

Web archives represent another genre of born digital archives that is both significant and different enough to warrant its own consideration. The Internet Archive, a range of national libraries, and a host of smaller archives and libraries are engaged in work to collect and preserve websites and webpages, and these collections are going to be of critical importance for future research. With that said, web archives represent a rather different approach to collecting and organizing sources.

The various organizations that archive the web use tools like Heritrix, an open source web crawler. These crawlers are sent out to grab all of the rendered content of a webpage they can get ahold of and, within defined parameters, the other pages it links to and all their associated files. As part of this collection process, the tools log information about the date and time that the data was collected. At this point, tools store that content in WARC files, or Web Archive files, which can then be played back via tools like the Wayback Machine. So there is a lot of information in here that can be used to assert the authenticity of the data: how a particular URL presented itself to Heritrix and how Heritrix interpreted it at a particular moment in time.
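To make the idea of a WARC record a bit more tangible, here is a minimal sketch that reads the header fields from a single, simplified record. The sample record is made up (the URL, date and record ID are placeholders), and real WARC files hold many records, are usually compressed, and are better handled with a dedicated library. The sketch only shows that the capture date and the target URL are recorded alongside the content itself.

```python
# A simplified, made-up WARC response record. The URL, date and record ID
# are placeholders, not values from any real crawl.
SAMPLE_RECORD = (
    b"WARC/1.0\r\n"
    b"WARC-Type: response\r\n"
    b"WARC-Record-ID: <urn:uuid:00000000-0000-0000-0000-000000000000>\r\n"
    b"WARC-Target-URI: http://www.example.org/\r\n"
    b"WARC-Date: 2015-03-02T14:15:00Z\r\n"
    b"Content-Type: application/http; msgtype=response\r\n"
    b"Content-Length: 17\r\n"
    b"\r\n"
    b"HTTP/1.1 200 OK\r\n"
    b"\r\n\r\n"
)


def parse_warc_headers(record: bytes):
    """Split one record into its version line, header fields, and content block."""
    head, _, rest = record.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    version = lines[0]
    fields = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")  # split on the first colon only
        fields[name.strip()] = value.strip()
    length = int(fields["Content-Length"])
    return version, fields, rest[:length]


version, fields, block = parse_warc_headers(SAMPLE_RECORD)
print(fields["WARC-Target-URI"], "captured at", fields["WARC-Date"])
```

Fields like these are what let a researcher say when and from where a given archived copy was collected, rather than simply trusting how it looks on playback.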

There are a few key points for interpreting and studying web archives. First, web archives are consciously created. That is, an organization has a selection policy and works to collect sites that fit with that policy. So understanding those policies and the scope of a given collection is a key part of interpreting it. In that vein, it is also important to understand how a given repository works. Many organizations require permission from content creators to collect particular kinds of sites, so in those cases a given collection is only going to contain content from site owners who were OK with having their content collected and preserved.

Along with that, a given archived website is actually a copy of how the content of a given URL presented itself to the web crawler at a given moment in time. So, for example, if a site reconfigures how it displays itself based on the IP address of a site visitor then that will be reflected in the archived copy. There are various ways that web crawling technologies can miss some of the content provided as well. So it is important to remember that web archives are not exact and pristine copies of the content of a particular URL at a moment in time but instead copies of how that content appeared to the crawler at that point in time.

Collections of User Generated Born Digital Primary Sources

One of the biggest affordances of the World Wide Web is the ability for users to respond; to comment, to upload and “share”. This has not been lost on historians and archivists. Projects like the September 11 Digital Archive illustrate the possibility to “crowdsource” an archive and create a collection of born digital materials around a particular issue or topic.

Shortly after the September 11th attacks, the American Social History Project at the City University of New York Graduate Center and the Roy Rosenzweig Center for History and New Media launched a site that allowed anyone to upload records and reflections related to the attacks[28]. It contains copies of email messages, digital photographs, and a range of first hand accounts which a range of site visitors have provided over time. This sort of archive has been similarly developed around other incidents, like the Hurricane Digital Memory Bank, created to build a digital record of Hurricanes Katrina and Rita[29].

Where an archival collection, like the papers of an individual or the records of an organization, accrues over time and has a clear and central connection to the individual or organization as the basis of its provenance, these crowdsourced collections have a different kind of cohesion. Something like the September 11 Digital Archive can’t be understood as being a representative sample of individuals’ reactions. It is a partial collection made up of contributions from whoever decided to participate at any given time. To that end, the individual reflections and objects in the collection are invaluable as records of individual experience, but making sense of them as a whole is going to be challenging. Ideally, as researchers work with these kinds of collections in the future they will focus on understanding the kinds of voices that are represented in the collections as much as they work to interpret those voices. To that end, records of how these sites prompted users to participate, how those prompts developed and changed over time, and how decisions were made about how to set up a site are going to be invaluable for helping researchers understand the scope and content of these collections.

Going Forward

Sources don’t speak for themselves. To that end, historians have developed and deployed techniques for interrogating and understanding sources based on their properties and the context of their creation, use and management. In this essay I’ve worked to explicate some of the work necessary for historians to continue to be as rigorous in working with digital sources and archives as they have been with their analog counterparts.

The key questions of source criticism are the same irrespective of whether a source is digital or not. However, given the rapid pace of change around digital technology it is likely that historians are going to need to increasingly focus on establishing and sharing techniques for working with different kinds of digital sources. As information ecologies continually shift it is going to be critical for historians to show their work in making sense of the stratigraphy of digital sources.


[1] For examples of tree rings, see Cronon, William. Changes in the Land: Indians, Colonists, and the Ecology of New England. New York: Hill and Wang, 1983. For examples of the perpetual value of probate records see Bushman, Richard L. The Refinement of America: Persons, Houses, Cities. New York: Knopf, 1992. For examples of using court proceedings see Pagan, John Ruston. Anne Orthwood’s Bastard: Sex and Law in Early Virginia. New York: Oxford University Press, 2003.

[2] Howell, Martha C., and Walter Prevenier. From Reliable Sources: An Introduction to Historical Methods. Ithaca, N.Y: Cornell University Press, 2001, p 28.

[3] For further discussion of digital source criticism see Hering, Katharina. “Provenance Meets Source Criticism.” Journal of Digital Humanities, August 4, 2014. http://journalofdigitalhumanities.org/3-2/provenance-meets-source-criticism/.

[4] Droysen, Johann Gustav Bernhard. Outline of the Principles of History: (Grundriss Der Historik). Translated by Elisha Benjamin Andrews. Boston: Ginn & company, 1897. https://archive.org/stream/outlineofprincip00droy#page/22/mode/2up.

[5] Tyrrell, Ian R. Historians in Public: The Practice of American History, 1890-1970. Chicago: University of Chicago Press, 2005, p. 38.

[6] For further discussion of informational versus artifactual qualities of digitized sources see Fleischhauer, Carl. “Information or Artifact: Digitizing a Book, Part 1 | The Signal: Digital Preservation.” Webpage, October 17, 2011. http://blogs.loc.gov/digitalpreservation/2011/10/information-or-artifact-digitizing-a-book-part-1/

[7] For a more extensive exploration of this example, see Werner, Sarah. “Where Material Book Culture Meets Digital Humanities.” Journal of Digital Humanities 1, no. 3 (Summer 2012).

[8] For an excellent example of the way that searches for obscure terms have made it possible for historians to discover things that would have been nearly impossible in the past see Leary, Patrick. “Googling the Victorians.” Journal of Victorian Culture 10, no. 1 (Spring 2005): 72–86.

[9] For an exploration of how searching through millions of books is changing research processes in the humanities see Ramsay, Stephen. “The Hermeneutics of Screwing Around; or What You Do with a Million Books.” In Pastplay: Teaching and Learning History with Technology, edited by Kevin Kee. University of Michigan Press, 2014. http://hdl.handle.net/2027/spo.12544152.0001.001. For further exploration on the way that searching through massive amounts of sources suggests the need for changes in how historical writing is framed see Gibbs, Fred, and Trevor Owens. “The Hermeneutics of Data and Historical Writing.” In Writing History in the Digital Age, edited by Kristen Nawrotzki and Jack Dougherty. University of Michigan Press, 2013. http://hdl.handle.net/2027/spo.12230987.0001.001.

[10] For further exploration on the theme of digital objects as performance in the context of a digital art manuscript collection see Arcangel, Cory. “The Warhol Files: Andy Warhol’s Long-Lost Computer Graphics.” Artforum, Summer (2014). https://artforum.com/inprint/issue=201406&id=46874.

[11] For more on screen essentialism see Montfort, Nick. “Continuous Paper: The Early Materiality and Workings of Electronic Literature.” Philadelphia, 2004. http://nickm.com/writing/essays/continuous_paper_mla.html

[12] For further detail on Reside’s work with these files see Reside, Doug. “‘No Day But Today’: A Look at Jonathan Larson’s Word Files,” April 22, 2011. http://www.nypl.org/blog/2011/04/22/no-day-today-look-jonathan-larsons-word-files.

[13] Kirschenbaum, Matthew G. Mechanisms: New Media and the Forensic Imagination. Cambridge, Mass: MIT Press, 2008, pp. 111-159.

[14] For a series of examples of how different browser rendering can dramatically affect the appearance of a born digital work of art see Fino-Radin, Ben. “Rhizome Artbase: Preserving Born Digital Works of Art.” Washington, D.C, 2012. http://www.digitalpreservation.gov/meetings/documents/ndiipp12/DigitalCulture_fino-radin_DP12.pdf.

[15] For discussion of how email signatures like “sent from my iPhone” affect how messages are interpreted see Carr, Caleb T., and Chad Stefaniak. “Sent from My iPhone: The Medium and Message as Cues of Sender Professionalism in Mobile Telephony.” Journal of Applied Communication Research 40, no. 4 (November 1, 2012): 403–24. doi:10.1080/00909882.2012.712707.

[16] A growing body of research on how people manage digital information will likely be invaluable for future historians in contextualizing the strategies that individuals used to organize and manage their digital information. For example, see Henderson, Sarah, and Ananth Srinivasan. “Filing, Piling & Structuring: Strategies for Personal Document Management.” In System Sciences (HICSS), 2011 44th Hawaii International Conference on, 1–10. IEEE, 2011. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5718470.

[17] For an exploration of the various reasons individuals copy, edit and describe a range of derivative copies of digital photos see Marshall, Catherine C. “Digital Copies and a Distributed Notion of Reference in Personal Archives.” In Digital Media: Technological and Social Challenges of the Interactive World, edited by Megan Alicia Winget and William Aspray, 89–115. Lanham, Md: Scarecrow Press, 2011.

[18] For documentation of the historically contingent nature of photographs and an exploration of issues in interpreting photos from different historical contexts see  Trachtenberg, Alan. Reading American Photographs: Images As History, Mathew Brady to Walker Evans. 1st ed. New York, N.Y.: Hill and Wang, 1989.

[19] For an exploration of some trends in the history of numbers of photographs taken see Good, Jonathan. “How Many Photos Have Ever Been Taken?” 1000memories, September 15, 2011. https://web.archive.org/web/20120305055510/http://1000memories.com/blog/94-number-of-photos-ever-taken-digital-and-analog-in-shoebox.

[20] Bailey, Jefferson. “Disrespect Des Fonds: Rethinking Arrangement and Description in Born-Digital Archives.” Archive Journal, no. 3 (2013). http://www.archivejournal.net/issue/3/archives-remixed/disrespect-des-fonds-rethinking-arrangement-and-description-in-born-digital-archives/.

[21] For an exploration of the logic, structure and assumptions of databases see Manovich, Lev. The Language of New Media. Cambridge, Mass: MIT Press, 2002, pp. 212-236.

[22] For an example of working through a set of search results on Flickr as a primary source see Owens, Trevor. “Lego, Handcraft, and Costumed Zombies: What Zombies Do on Flickr.” New Directions in Folklore 12, no. 2 (2015): 3–25.

[23] Sternfeld, Joshua. “Archival Theory and Digital Historiography: Selection, Search, and Metadata as Archival Processes for Assessing Historical Contextualization.” The American Archivist 74, no. 2 (October 1, 2011): 544–75.

[24] McGann, Jerome J., ed. The Complete Writings and Pictures of Dante Gabriel Rossetti. Accessed August 8, 2015. http://www.rossettiarchive.org/.

[25] McGann, Jerome J. “The Rationale of Hyper Text.” Text 9 (January 1, 1996): 11–32.

[26] For a discussion of how digitization selections are made in public-private partnerships see Kriesberg, Adam M. The Changing Landscape of Digital Access: Public-Private Partnerships in US State and Territorial Archives, 2015, pp. 122-125. http://deepblue.lib.umich.edu/handle/2027.42/111584.

[27]  For further background on the Salman Rushdie digital archive see Emory University. Rushdie Researcher Workstation Tutorial, 2011. https://www.youtube.com/watch?v=oiqHv_SofNo.

[28] For further exploration of the September 11th digital archive see Rosenzweig, Roy. “Scarcity or Abundance? Preserving the Past in a Digital Era.” American Historical Review 108, no. 3 (June 2003): 735-762, as well as Haskins, E. “Between Archive and Participation: Public Memory in a Digital Age.” 37, no. 4 (Fall 2007).

[29] For more background on this see Brennan, Sheila A., and T. Mills Kelly. “Why Collecting History Online Is Web 1.5.” Case study, Center for History and New Media. http://chnm.gmu.edu/essays-on-history-new-media/essays/?essayid=47

People, communities and platforms: Digital cultural heritage and the web

I was privileged to give the opening keynote for the National Digital Forum in New Zealand a few weeks back. It was a really great conference, and I wanted to post the content of my talk before too much time went by. I was generally blown away by what the digital library, archive and museum community has been able to accomplish in New Zealand. Below is a video of my talk and a copy of my slides and notes. You can find many of the other talks on the NDF YouTube channel.

Below are the slides from the talk, and my speaker notes. I more or less said exactly what is in the notes, so if you would rather skim or search through the talk those might be useful.

Below you can find the full text of the talk.

Digital Art Curation Grad Seminar: Your Input Welcome

I really enjoyed teaching my Digital Public History Seminar for the University of Maryland’s College of Information Studies. As a follow-up, I am thrilled at the prospect of teaching another course for their Archives and Digital Curation specialization. INST 745: Introduction to Digital Arts Curation is a course that is on the books but has yet to be taught. So while the course learning objectives are fixed, I have a lot of flexibility in terms of how I go about achieving them.

In thinking through this, I’ve been toying with the following as a framework. Per the learning objectives of the course I will need to cover a little bit on digitized art. That said, the primary focus of the course is going to be born digital works. So I’m thinking about framing this largely around considering the extent to which various born digital works are best understood through a set of related but distinct perspectives.

That means understanding digital works as:

  • fixed creative works to be conserved
  • live performances to be documented and/or recorded
  • the result of a creative process of working within the constraints of a given digital medium which produces a trail of potential archival records
  • the execution of an application or source code and data which could be broadly shared for others to use/reuse

Key Conceptual Issues

Part of what I am most excited about with born digital art is that art works are some of the best places to unpack a lot of assumptions that are taken for granted in other areas of digital preservation. That is, thinking through issues in art curation/conservation/documentation/preservation has been one of the best ways I have clarified my own thinking on seemingly more mundane issues in electronic records, software preservation, etc. In that vein, so far I think the following are likely the key conceptual issues I will focus on over the course of the semester.

  • Resistance in the Materials: I love this as a place to get into the materiality of digital objects.
  • Significant Properties: I like how art pushes back hard against the idea of any kind of innate significant properties in objects.
  • Fixity: Since the allographic/autographic distinction comes from art and is itself important to understanding the identity of digital objects, this is a neat place to explore that.
  • File Formats: It’s huge for digital preservation in general, but I also like the opportunity to think through how formats themselves become materials with affordances that structure experience.
  • Emulation & Virtualization: It has been neat to see the kinds of things Rhizome is doing in this space.
  • Screen Essentialism: It’s an important concept for digital works in general but particularly important in the complexity of art.
  • Platform Studies: I’ve been itching for a chance to assign Racing the Beam, and this is going to be the moment.
  • Social Memory: I love how this perspective shifts away from the things to what the things mean.

Kinds of Art Considered:

I’m trying to think broadly in this area. I’m thinking of areas where what had been analog practices have shifted more or less entirely into digital practices (digital photography, computer aided design, music, film and video) as well as areas where digital media has enabled new kinds of works (video games, Flash interactives, chiptunes, electronic literature, animated GIFs, web comics). Along with that, I’m just as interested in looking at vernacular art (everyday people’s digital photos, memes, lolcats) as I am at fancier stuff.

Assignment Structure:

I really want to focus on having students produce pragmatic and practical work. So I’m thinking about having students create the series of documents one would create when working to justify and plan for the acquisition or documentation of a work. In that vein I’m thinking about having students pick a work from a list I provide and do the legwork to plan for its preservation. I’d love ideas about what kinds of documents and material students should create.

Here are a few things I am thinking about:

  • Proposed Collection or Object Acquisition Brief: When I was working at LC I had a few opportunities to advise on potential acquisitions and ended up working up a short format to cover how to describe the technical, legal, preservation and access issues.
  • Preservation Intent Statements: I think this work-by-work or collection-by-collection approach is a useful way to establish what matters about a given thing and how you are going to ensure that you have access to what matters about it in the future.
  • Digital Curatorial Research File: I really like how Seb Chan talked about doing this for the Planetary App, and I could imagine that serving as a viable deliverable.

Readings I’m Considering: 

Still in the early phases of this. But I figured I would share the list for folks to comment on and to spark suggestions for other things I should be considering.


  • Rinehart, R., & Ippolito, J. (2014). Re-collection: art, new media, and social memory. Cambridge, Massachusetts: The MIT Press.
  • Salter, A., & Murray, J. (2014). Flash: building the interactive web. Cambridge, Massachusetts: The MIT Press.
  • Sterne, J. (2012). MP3: the meaning of a format. Durham: Duke University Press.
  • Kirschenbaum, M. G. (2008). Mechanisms: New Media and the Forensic Imagination. Cambridge, Mass: MIT Press.
  • Montfort, N., & Bogost, I. (2009). Racing the beam: the Atari Video computer system. Cambridge, Mass: MIT Press.

Articles and Reports

Things I could use your help on:

So that is where I am at so far. I’ll likely blog about some of these elements in more detail later, but I wanted to get this up and out there to start soliciting opinions on how best to do the course.

In particular, I would be interested to know:

  • What other kinds of assignments do you think would be useful?
  • Are there any particularly good readings I should be considering?
  • Are there any key conceptual issues you think I should add or subtract?