When Going Back Is Going Forward: On Returning to the LC

I am overjoyed to announce that I will serve as the first Head of Digital Content Management in Library Services at the Library of Congress.

For the first time in my career I’m going to be an official librarian. In Gov speak, this job is a 1410. I’ve previously been an “information technology specialist” (2210) and, most recently, the most government-sounding thing imaginable: “miscellaneous administration” (0301). Of the many things I am excited about regarding the position, one is that after more than a decade of working with and supporting libraries I will now be, according to the United States federal government, a practicing librarian.

It’s been hard for me to imagine something that could call me away from the work I get to do at the Institute of Museum and Library Services. But then this came along. This post provides some context on 1) why I’m generally excited to go back 2) why I’m specifically excited to go back and 3) why it’s also so very hard to leave.

A great time to be at the LC

IMLS staff settling in to watch and cheer on Dr. Carla Hayden during her swearing in as the Librarian of Congress. I’m over in the right corner. Photo Credit Emily Reynolds.

Like librarians around the country, I cheered when Dr. Carla Hayden was nominated to be our Librarian of Congress. I cheered even louder when she was confirmed. I’ve been watching from the sidelines and talking with many of my friends at LC about exciting changes she is bringing to the institution.

It’s clearly a great time to work for the Library of Congress. I’m excited to be a part of that.

I will point out a few top-of-mind examples of exciting work afoot. Kate Zwaard and her team are charting a course for supporting the use and reuse of digital collections. Joe Puccio and his team have established and advanced a smart and bold vision for digital collecting. It’s invigorating to see the Library of Congress so clearly engaging as a partner and a leader in the national community of libraries and librarians. The friendships that came from my previous stint at LC have stuck with me and it is great to see many of those smart, thoughtful, and hardworking friends moving into positions that are enabling them to take the institution to exciting new places.

So that’s some context on general LC things that get me excited, but it’s the particulars of this job that pull me back.

What does being the Head of Digital Content Management entail?

Screenshot from the Library of Congress FY 17 budget justification which explains concepts behind the launch of the Digital Collections Management Section. Note: This document was drafted a good while ago and as such the particulars of the organizational structure and roles are likely a bit different than what is outlined here.

Before getting into this, I should kick off by saying that what follows is 1) my personal reading of 2) public information about 3) a job I’ve yet to start at 4) an organization that has changed a lot since I left it three years ago. Further, as is always the case, things on my blog are a) personal reflections, b) not in any way official in regards to any connection with organizations I am now or have ever been associated with, and c) last but not least, in no way statements on behalf of the gov. Back to our regularly scheduled blogging.

For some context on where this new Head of Digital Content Management role comes from, I recommend pages 48-49 of the Library of Congress 2017 budget justification. As a general side note, the hefty budget justification tomes are chock full of great info on the institution.

Given that most readers will likely not be cracking open that particular volume, I’m happy to walk through it a bit. On page 48 you can find a description of a request for resources to meet the need for the “establishment of a new digital content management unit … that will be responsible for collecting and managing content for the Library’s collections in digital formats.” There is a bit more context there about what exactly this entails: “The unit will codify and communicate digital content best practices, provide training to staff throughout the Library, and work with Office of the Chief Information Officer to develop the Library’s technical capacity to collect, preserve, and deliver digital collections.” Furthermore, the division “will focus on expanding the Library’s acquisition, management, and preservation of digital collections” and “assume responsibility for key born digital acquisitions programs and digital materials not supported elsewhere in the Library, including web archiving.”

In short, this involves developing, building out, and supporting a team focused on building capacity for the Library of Congress to collect, manage, and provide access to digital content. It also involves supporting the existing web archiving program, which (IMHO) is a) a world-class digital collecting program and b) one of the most exciting, dynamic, and growing programs at the Library of Congress.

In short, this is my jam.

If you want evidence of my passion for long term access to digital information look no further than the tattoo on my right arm

This feels like what I have been thinking about and writing about for close to a decade.

Subject matter aside, I’ve also become increasingly passionate about good management and administration. When I moved into a management role at IMLS I delved deep into reading, thinking, and practicing work on organizational culture, emotional intelligence, organizational design and theory, and creating effective teams. The idea of helping build out a unit of the Library of Congress anchored in an ethic of care is (irrespective of the subject) exciting in its own right.

Even more information about digital content management!

But wait… there’s more.

The job posting has, in between standard government-job-speak, some thoughtfully crafted info on how this work is being designed. Again, I’m not on the inside yet. I don’t know exactly how the lived experience of these aspects is developing and coming together. With that noted, the way this job (and the resulting organizational structure it implies) is being described and articulated makes a lot of sense to me. I’ve pasted the duties below for context. The thing I want to highlight in these is how this new Digital Content Management Section works as three different kinds of things:

  • A custodial unit that manages a set of collecting programs and content;
  • A platform for increasing capacity for digital collecting in units across the Library; and
  • A team that can share out what it learns more broadly in LC and with the field.

In my mind, these seem like three concentric circles nested in each other that can facilitate and support change in both the institution and the field. Below is the exact language from the position listing where you can see these different themes weave in and out. I’ve gone ahead and bolded some parts that connect with those three points I’ve mentioned. I like to stick to my sources and my texts, so here we go.

The Head of Digital Content Management…

  • Coordinates, determines, and manages projects within the section. Serves as a technical expert in the planning, management, and execution of digital collections projects and activities within the scope of the section. Applying broad knowledge of digital libraries and technical solutions provides expert analysis and advice and develops solutions to solve complex issues and problems associated with digital lifecycle management. Identifies and applies new analytical techniques to address situations that are unique or not previously encountered.
  • Oversees the development of requirements related to the management of digital content under care of the section. Directs studies and testing of digital library best practices and standards. Develops cost estimates and IT investment packages to support digital content acquisition and curation programs. Serves as advisor and liaison to the Chief of Digital Collection Management & Services Division (TBD) on matters pertaining to digital collections lifecycle activities. Establishes and maintains effective working relationships with Library staff at multiple levels and across service unit lines on digital collections management.
  • Coordinates digital workflow activities with specialists in curatorial units throughout Library Services, the Law Library, and the Office of Chief Information Officer. Provides training and presentations to staff in stakeholder and curatorial units. Communicates orally and in writing to both technical and non-technical staff concerning digital collections activities.
  • Attends conferences/meetings to make presentations or for professional development to keep abreast of current trends in technology. Works collaboratively inside and outside the section to facilitate and encourage the development and implementation of institution-wide and national best practices and standards.
  • Performs the administrative and human resource management functions related to the staff supervised. Establishes guidelines and performance expectations for staff, which are clearly communicated through the formal employee performance management system. Responsible for advancing the objectives of equal employment opportunity (EEO) by taking positive steps to adhere to nondiscriminatory employment practices in regard to race, color, religion, sex, national origin, age, and disability.

But now… the feels

The last few weeks I’ve felt a lot like No Face (pictured above) batted about by waves of feels.

It is really hard to leave IMLS. I love this place. I love the mission. I love the work. But far and above all of that, I love the people. No one outside the organization can ever really understand what it’s like.

The team I have been able to work with has been a gift. My colleagues and staff in the Office of Library Services are amazing. Each of them is brilliant and talented in different ways and together they are incredibly effective and sharp. It’s also a group of thoughtful and supportive people. I like to think everyone in the team knows that they are respected and they are cared for. At least that is how I know I’ve felt. It’s also something I’ve tried to communicate. So much work goes on behind the scenes with very few people. The small but mighty staff of the Office of Library Services, supported by an assortment of other great tiny offices, are who make all of it happen.

Many of the Office of Library Services team at the opening of ALA 2016

Beyond the direct team, the folks in the Office of Museum Services are a pleasure to work with and the various supporting offices are all filled with dedicated people who work to make rules and regulations designed for federal agencies with thousands of people work for one that has a tiny fraction of a fraction of that. I can honestly say I’ve never worked as hard as I’ve worked at IMLS, but the work has always been deeply rewarding. To my IMLS colleagues, I see the work you do and I understand and appreciate it. I’m going to be less than a mile away and I would love to drop just about anything to get coffee and talk whenever. I hope you’ll all still keep me in the loop re: happy hours.

OLS staff sporting their team jerseys. I’m in the back with a Zotero t-shirt on.

It’s hard to leave all that. However, as various people have come and gone in my time at IMLS, I’ve been aware of just how temporary and fleeting places and communities like this are. Reflecting on changes in the time I’ve been there I can clearly count a range of different eras and moments. We never walk into the same river twice and part of what makes the stratigraphy of our lives special and poignant in our memories requires a perpetual moving and shifting around. With all of that said, I’m joining the IMLS alumni network:  a crew of friends I already know quite well.

I am incredibly proud of the work I’ve been involved in. I count myself lucky to have the chance to develop Maura’s vision in the national digital platform framework.  Along with that, I am thrilled that the calls for proposals for FY 2018 continue to support national digital platform projects. I leave in place a team of some of the smartest and most talented people working on these issues in the country. I can’t wait to see all the amazing work that they will support. I also leave with the hope that my understanding of the functions and missions of IMLS and the Library of Congress can support future connections and collaborations between these unique institutions.

I end where I often find myself, humbled and inspired by the opportunities I have been presented with to serve, and hopeful that I can, to whatever extent possible, pay forward what has been given to me by my mentors and colleagues. As we Wisconsinites say, Forward!

Full Draft of Theory & Craft of Digital Preservation

Here it is, the book printed out for the first time. Or I suppose more accurately, a digital photo of the book printed out for the first time.

This weekend I’m submitting the full draft of the manuscript for my book The Theory and Craft of Digital Preservation to the publisher, Johns Hopkins University Press.

Update: to make it easier to read, I’ve shared a PDF preprint of the whole draft.

I’ve had a lot of fun working on this on nights and weekends over the last year. I have also learned a ton from everyone who has read drafts of the work in progress.

I’ve had a few folks reach out to me after reading parts of drafts and say things like “I’d love to read more of this. When will it be out?” I’m not sure exactly how long it will take for the next round of review and all the improvements that will come from working with a great press. With that said, drafts of the entire book are now online. Instead of having folks pick through my previous blog posts with the links, I figured I would put them all together in order in this post.

So to that end, below you can find an index to the eight chapters and the intro and conclusion. I’m going to leave these up with all the comments in them. I went through and resolved comments offline in my own copies of these but thought it would be fun to leave up the messy original drafts and a record of all the great input and ideas that folks have offered up to improve the text.

Table of Contents

Introduction Beyond Digital Hype & Digital Anxiety (7 pages)

Section One: Theory of Digital Preservation

Ch 1: Preservation’s Divergent Lineages (14 pages)

Ch 2: Understanding Digital Objects (12 pages)

Ch 3: Challenges & Opportunities for Digital Preservation  (11 pages)

Section Two: The Craft of Digital Preservation

Ch 4: The Craft of Digital Preservation (6 pages)

Ch 5: Preservation Intent & Collection Development (13 pages)

Ch 6: Managing Copies & Formats (15 pages)

Ch 7: Arranging & Describing Digital Objects (19 pages)

Ch 8: Enabling Multimodal Access & Use (18 pages)

Conclusion: Tools for Looking Forward (9 pages)

Advance Twitter Praise for the Book

I pulled out a few tweets from folks responding to the book that I thought were fun to share.

https://twitter.com/save4use/status/877128435696619520

Getting Beyond Digital Hyperbole & Tools for Looking Forward

The book is now whole! I’m going to be spending this weekend working through revisions to the last section based on all of the great comments I’ve been getting, but I’m also now excited to share both the introduction and the conclusion.

If you have any comments or suggestions on these please do go ahead and chime in on them in comments on the docs. In the intro I try to lay out a whole set of axioms for digital preservation, which I’ve gone ahead and reposted below.

Fifteen Guiding Digital Preservation Axioms

As a point of entry to the book I have distilled a set of fifteen guiding axioms. I realize that sounds a little pretentious, but it’s the right word for what these are. These axioms are points that I think should serve as the basis for digital preservation work. They are also a useful way to work out some initial points for defining what exactly digital preservation is and isn’t. Some of them are unstated assumptions that undergird orthodox digital preservation perspectives; some are at odds with that orthodoxy. These axioms are things to take forward as assumptions going into the book. Many of these are also points that I will argue for and demonstrate throughout the book.

  1. A repository is not a piece of software. Software cannot preserve anything. Software cannot be a repository in itself. A repository is the sum of financial resources, hardware, staff time, and ongoing implementation of policies and planning to ensure long-term access to content. Any software system you use to help you preserve and provide access to digital content is by necessity temporary. You need to be able to get your stuff out of it because it likely will not last forever. Similarly, there is no software that “does” digital preservation.
  2. Institutions make preservation possible. Each of us will die. Without care and management, the things that mattered to us will persist for some period of time related to the durability of their mediums. With that noted, the primary enablers of preservation for the long term are our institutions (libraries, archives, museums, families, religious organizations, governments, etc.). As such, the possibility of preservation is enabled through the design and function of those institutions. Their org charts, hiring practices, funding, credibility, etc. are all key parts of the cultural machinery that makes preservation possible.
  3. Tools can get in the way just as much as they can help. Specialized digital preservation tools and software are just as likely to get in the way of solving your digital preservation problems as they are to help. In many cases, it’s much more straightforward to start small and implement simple and discrete tools and practices to keep track of your digital information using nothing more than the file system you happen to be working in. It’s better to start simple and then introduce tools that help you improve your process than to simply buy into some complex system without having gotten your house in order first.
  4. Nothing has been preserved, there are only things being preserved. Preservation is the result of ongoing work of people and commitments of resources. The work is never finished. This is true of all forms of preservation; it’s just that the timescales for digital preservation actions are significantly shorter than they tend to be with the conservation of things like books or oil paintings. Try to avoid talking about what has been preserved; there is only what we are preserving. This has significant ramifications for how we think about staffing and resourcing preservation work. Preservation is ongoing work. It is not something that can be thought of as a one time cost.
  5. Hoarding is not preservation. It is very easy to start grabbing lots of digital objects and making copies of them. This is not preservation. To really be preserving something you need to be able to make it discoverable and accessible and that is going to require that you have a clear and coherent approach to collection development, arrangement, description and methods and approaches to provide access.
  6. Backing up data is not digital preservation. If you start talking about digital preservation and someone tells you “oh, don’t worry about it, we back everything up nightly” you need to be prepared to explain how and why that does not count as digital preservation. This book can help you to develop your explanation. Many of the aspects that go into backing up data for current use are similar to aspects of digital preservation work but the near term concerns of being able to restore data are significantly different from the long term issues related to ensuring access to content in the future.
  7. The boundaries of digital objects are fuzzy. Individual objects reference, incorporate and use aspects of other objects as part of their everyday function. You might think you have a copy of a piece of software by keeping a copy of its installer, but that installer might call a web service to start downloading files in which case you can’t install and run that software unless you have the files it depends on. You may need a set of fonts, or a particular video codec, or any number of other things to be able to use something in the future and it is challenging to articulate what is actually inside your object and what is external to it.
  8. One person’s digital collection is another’s digital object is another’s dataset. In some cases the contents of a hard drive can be managed as a single item, in others they are a collection of items. In the analog world, the boundaries of objects were a little bit more straightforward or at least taken for granted. The fuzziness of boundaries of digital objects means that the concepts of “item” and “collection” are less clear than with analog items. For example, a website might be an item in a web archive, but it is also functionally a serial publication which changes over time. A collection of web pages is itself a collection of files.
  9. Digital preservation is about making the best use of your resources to mitigate the most pressing preservation threats and risks. You are never done with digital preservation. It is not something that can be accomplished or finished. Digital preservation is a continual process of understanding the risks you face for losing content or losing the ability to render and interact with it and making use of whatever resources you have to mitigate those risks.
  10. The answer to nearly all digital preservation questions is “it depends.” In almost every case, the details matter. Deciding what matters about an object or a set of objects is largely contingent on what their future use might be. Similarly, developing a preservation approach to a massive and rapidly growing collection of high-resolution video will end up being fundamentally different from the approach an organization would take to ensuring long-term access to a collection of digitized texts.
  11. It’s long past time to start taking actions. You can read and ponder complicated data models, schemas for tracking and logging preservation actions, and a range of other complex and interesting topics for years but it’s not going to help “get the boxes off the floor.” There are practical and pragmatic things everyone can and should do now to mitigate many of the most pressing risks of loss. I tried to highlight those “get the boxes off the floor” points throughout the second half of the book. So be sure to prioritize doing those things first before delving into many of the more open ended areas of digital preservation work and research.
  12. Highly technical definitions of digital preservation are complicit in silencing the past. Much of the language and specifications of digital preservation have developed into complex sets of requirements that obfuscate many of the practical things anyone and any organization can do to increase the likelihood of access to content in the future. As such, a highly technical framing of digital preservation has resulted in many smaller and less resource rich institutions feeling like they just can’t do digital preservation, or that they need to hire consultants to tell them about complex preservation metadata standards when what they need to do first is make a copy of their files.  Along with this, digital media affords significant new opportunities for engaging communities with the development of digital collections. When digital preservationists take for granted that their job is to preserve what they are given, they fail to help an organization rethink what it is possible to collect. Digital preservation policy should be directly connected to and involved in collection development policy. That is, the affordances of what can be easily preserved should inform decisions about what an organization wants to go out and collect and preserve.
  13. Accept and embrace the archival sliver. We’ve never saved everything. We’ve never saved most things. When we start from the understanding that most things are temporary and likely to be lost to history, we can shift to focus our energy on making sure we line up the resources necessary to protect the things that matter the most. Along with that, we need to realize that there are varying levels of effort that should be put toward future proofing different kinds of material.
  14. The scale and inherent structures of digital information suggest working more with a shovel than with a tweezers.  While we need to embrace the fact that we can’t collect and preserve everything, we also need to realize that in many cases the time and resources it takes to make decisions about individual things could be better used elsewhere. It’s often best to focus digital preservation decision making at scale. This is particularly true in cases where you are dealing with content that isn’t particularly large. Similarly, in many cases it makes sense to normalize content or to process any number of kinds of derivative files from it and keep the originals. In all of these cases, the computability of digital information and the realities of digital files containing significant amounts of contextual metadata means that we can run these actions in batch and not one at a time.
  15. Doing digital preservation requires thinking like a futurist. We don’t know the tools and systems that people will have and use in the future to access digital content. So if we want to ensure long term access to digital information we need to, at least on some level, be thinking about and aware of trends in the development of digital technologies. This is a key consideration for risk mitigation. Our preservation risks and threats are based on the technology stack we currently have and the stack we will have in the future so we need to look to the future in a way that we didn’t need to with previous media and formats. 
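Axioms 3 and 14 both point toward starting with the file system and working in batch. As a rough illustration of what that can look like, here is a minimal Python sketch that walks a directory and produces a simple inventory. The function names (`inventory`, `summarize_by_extension`) and the record fields are my own hypothetical choices for illustration; they don’t come from the book or from any particular preservation tool.

```python
import os
from collections import Counter


def inventory(root):
    """Walk a directory tree and record path, extension, and size for each file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            records.append({
                # Store paths relative to the root so the inventory stays
                # meaningful if the whole directory is moved or copied.
                "path": os.path.relpath(path, root),
                "extension": os.path.splitext(name)[1].lower(),
                "bytes": os.path.getsize(path),
            })
    return records


def summarize_by_extension(records):
    """Count files and total bytes per extension (the shovel, not the tweezers)."""
    counts = Counter()
    sizes = Counter()
    for record in records:
        counts[record["extension"]] += 1
        sizes[record["extension"]] += record["bytes"]
    return {ext: {"files": counts[ext], "bytes": sizes[ext]} for ext in counts}
```

Calling `summarize_by_extension(inventory("some_folder"))` gives a quick picture of what kinds of files you are holding and how much of each, which is often enough to start making decisions at the level of the whole collection rather than file by file.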

Theory and Craft of Digital Preservation: Part Two Posted for Comment

Some class notes from Alice Rogers in my digital preservation seminar.

It took me longer than I anticipated, but I am now both excited and rather anxious to share drafts of the rest of my forthcoming book. A while back, I posted drafts of the first section of the book. The comments and responses I received on that have been fantastic. I’m now going to turn to reviewing and revising that section based on the generous wealth of feedback I’ve received.

The Craft Half of the Book

The first half of the book was the theory part, the second is the craft part. In the five chapters in this section I try to offer up a set of interrelated frames for working through the ongoing issues and challenges that make up digital preservation as a craft.

Chapter four is largely an explanation and justification for why and how I’ve set up the following four chapters. So I won’t delve too much into giving any context here as it’s better to just read the context in the chapter. With that noted, I’ve included the diagram I use in that chapter to explain how I see each of the subsequent chapters connecting with each other.

 

 

 

First 3 Chapters of Theory and Craft of Digital Preservation for Comment

As I mentioned in December, I’m working on a book called The Theory and Craft of Digital Preservation for Johns Hopkins University Press. For an overview of the book go read that post.

At this point I have a full working rough draft of the book together and I’m getting to a point where it could really benefit from readers’ input and insights. To that end, I’m posting drafts of the first three chapters up as Google Docs which you should be able to comment on and suggest edits to. When I’ve posted drafts of essays like this in the past I’ve received fantastic comments that have helped me refine both my writing and my thinking. So now we will see if the same kind of thing works for a book.

I’m interested in any and all feedback and input. However, I’m particularly interested in any suggestions for work that I should be citing from women, people of color, and people from the majority world. Much of the digital preservation and digital media studies literature I’m drawing from is (like many fields) very white, very male, and U.S./Eurocentric, and I’d like to be working against that, not reinforcing it.

So with that context, I’ve provided links to each chapter below and a bit of context for each chapter from the book proposal. My plan is to work through all the comments I get in early March.

Ch 1: Artifact, Information, or Folklore: Preservation’s Divergent Lineages

Interdisciplinary dialog about digital preservation often breaks down when an individual begins to protest “but that’s not preservation.” Preservation means a lot of different things in different contexts. Each of those contexts has a history. Those histories are tied up in the changing nature of the mediums and objects for which each conception of preservation and conservation was developed. All too often, discussions of digital preservation start by contrasting digital media to analog media. This contrast forces a series of false dichotomies. Understanding a bit about the divergent lineages of preservation helps to establish the range of competing notions at play in defining what is and isn’t preservation.

Building on work in media archeology, this chapter establishes that digital media and digital information should not be understood as a rupture with an analog past. Instead, digital media should be understood as part of a continual process of remediation embedded in the development of a range of new mediums which afford distinct communication and preservation potential. Understanding these contexts and meanings of preservation establishes a vocabulary to articulate what aspects of an object must persist into the future for a given preservation intent.

To this end, this chapter provides an overview of many of these lineages. These include: the culture of scribes and the manuscript tradition; the bureaucracy and the development of archival theory for arranging archives and publishing records; the differences between taxidermy and insect collecting in natural history collections and living collections like butterfly gardens and zoos; the development of historic preservation of the built environment; the advent of recorded sound technology and the development of oral history; and the development of photography, microfilming and preservation reformatting. Each episode and tradition offers a mental model to consider deploying in different contexts in digital preservation.

The purpose here is not a detailed history of lineages of preservation and the development of media, but instead to illustrate that many different conceptions of preservation exist and how those conceptions are anchored in different objectives. This overview provides readers with a focus on the distinct conceptions of what matters about an object and the innate material properties and affordances of different kinds of media as they relate to preservation.

Ch 2: Understanding Digital Objects

Doing digital preservation requires a foundational understanding of the structure and nature of digital information and media. This chapter works to provide such a background through three related strands of new media studies scholarship. First, all digital information is material. Second, digital information is best understood as existing in and through a nested set of platforms. Third, the database is an essential media form and metaphor for understanding the logic of digital media.

Given that digital information is always physically encoded on digital media, it is critical to recognize that the raw bit stream (the sequence of ones and zeros encoded on the original medium) has a tangible and objective ability to be recorded and copied. This provides an essential first level basis for digital preservation. It is possible to establish what the entire sequence of bits is on a given medium, or in a given file, and use techniques to create a kind of digital fingerprint for it that can then be used to verify and authenticate perfect copies.
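To make the “digital fingerprint” idea a little more concrete, here is a minimal Python sketch using a cryptographic hash. SHA-256 is my choice for illustration (the chapter summary doesn’t name a specific algorithm), and the function names and sample bytes are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a sequence of bytes."""
    return hashlib.sha256(data).hexdigest()


def fingerprint_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Fingerprint a file by reading it in chunks, so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_perfect_copy(original: bytes, copy: bytes) -> bool:
    """Two bit streams count as identical copies if their fingerprints match."""
    return fingerprint(original) == fingerprint(copy)


# Hypothetical stand-ins for the bit stream read off an original medium.
original = b"\x00\x01 raw bit stream from the original medium"
good_copy = bytes(original)           # a bit-for-bit copy
bad_copy = original[:-1] + b"X"       # a copy with one byte changed
```

Here `is_perfect_copy(original, good_copy)` is true and `is_perfect_copy(original, bad_copy)` is false: changing even a single byte produces a different fingerprint, which is what makes this kind of check useful for verifying copies over time.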

With that noted, those bit streams are animated, rendered, and made usable through nested layers of platforms. In interacting with a digital object, computing devices interact with the structures of file systems, file formats, and various additional layers of software, protocols, and drivers. Drawing on examples from net art, video games, and born-digital drafts of literary works, I explore multiple ways to approach these objects anchored in different layers of their digital platforms. The experience of the performance of an object on a particular screen, like playing a video game or reading a document, can itself obscure many important aspects of digital objects that are much less readily visible, like how the rules of a video game actually function or deleted text in a document that still exists but isn’t rendered on the screen.

As a result of this nested platform nature, the boundaries of digital objects are often completely dependent on what layer one considers to be the most significant for a given purpose. In this context, digital form and format must be understood as existing as a kind of content. Across these platform layers digital objects are always a multiplicity of things. For example, an Atari video game is a tangible object you can hold, a binary sequence of information encoded on that medium identical to all the other copies of that game, source code authored as a creative work, a packaged commodity sold and marketed to an audience, and a signifier of a particular historical moment. Each of these objects can coexist in the platform layers of a tangible object, but depending on which is significant for a particular purpose one should develop a different preservation approach.

Lastly, where the index or the codex can provide a valuable metaphor for the order and structure of a book, new media studies scholarship has suggested that the database is and should be approached as the foundational metaphor for digital media. From this perspective, there is no “first row” in a database, but instead the presentation and sorting of digital information is based on the query posed to the data. Given that libraries and archives have long based their conceptions of order on properties of books and paper, embracing this database logic will have significant implications for making digital material available for the long term.
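The point that a database has no "first row" can be made concrete with a small sketch. The table, its column names, and its three rows below are hypothetical and exist only to show that the order of presentation comes from the query rather than from the stored data.

```python
import sqlite3

# An in-memory database holding three hypothetical items.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("letter", 1989), ("draft", 2001), ("memo", 1994)],
)


def titles(order_by, descending=False):
    """Return item titles in whatever order the query asks for."""
    # Only known column names are allowed, so interpolating them is safe here.
    if order_by not in {"title", "year"}:
        raise ValueError(f"unknown column: {order_by}")
    direction = "DESC" if descending else "ASC"
    rows = conn.execute(f"SELECT title FROM items ORDER BY {order_by} {direction}")
    return [row[0] for row in rows]


by_title = titles("title")                    # ['draft', 'letter', 'memo']
by_year = titles("year")                      # ['letter', 'memo', 'draft']
newest_first = titles("year", descending=True)  # ['draft', 'memo', 'letter']
print(by_title, by_year, newest_first)
```

The same three records yield three different "first" items, depending entirely on the question posed to the data.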

Ch 3: Challenges and Opportunities for Digital Preservation 

With an understanding of digital media and some context on various lineages of preservation, it is now possible to break down what the inherent challenges, opportunities and assumptions of digital preservation are.

We can’t count on long-lived media, interfaces, or formats. Popular digital media of all kinds (disc, disk, and NAND flash wafers) degrade rather quickly: over years, not decades or centuries. Many of these media are relatively complex to read, so the interfaces required to interpret them are not likely to be particularly long-lived. The costs of trying either to repair these media or to fix and repair the interfaces that read them rapidly become prohibitive. As a result, traditional notions of conservation science are, outside of some niche cases, going to be effectively useless for the long-term preservation of digital objects.

Going back to the discussions of preservation lineages, this means that digital preservation is an enterprise that can only focus on the allographic digital object. While all digital information is material, the conservation of that material over the long haul is not broadly practical. Where conservation science is concerned with the chemical and material properties of mediums and artifacts, the science of digital preservation is and will be computer science. With that said, because bitstreams are always originally encoded on tangible media and then created by, acted on, and interpreted by all kinds of human-made layers of software, they end up presenting an extensive range of seemingly artifactual, and not simply informational, qualities. That is, the physical and material affordances of different digital mediums will continue to shape and structure digital content long after it has been transferred and migrated to new mediums.

First 3 Chapters’ Bibliography

  • Archimedes Palimpsest Project. “About the Archimedes Palimpsest.” Accessed February 3, 2017. http://archimedespalimpsest.org/about/.
  • Association for Documentary Editing. “About Documentary Editing.” The Association for Documentary Editing. http://www.documentaryediting.org/wordpress/?page_id=482.
  • Bearman, David. Archival Methods. Archives and Museum Informatics Technical Report, vol. 3, no. 1. Pittsburgh, Pa: Archives & Museum Informatics, 1989.
  • Bird, Graeme D. Multitextuality in the Homeric Iliad: The Witness of the Ptolemaic Papyri. Hellenic Studies 43. Washington, D.C. : Cambridge, Mass: Center for Hellenic Studies ; Distributed by Harvard University Press, 2010.
  • Bogost, Ian. Alien Phenomenology, Or, What It’s like to Be a Thing. Posthumanities 20. Minneapolis: University of Minnesota Press, 2012.
  • Brylawski, Sam, Maya Lerman, Robin Pike, and Kathlin Smith. “ARSC Guide to Audio Preservation.” CLIR Publication. Washington, D.C, 2015. http://cmsimpact.org/wp-content/uploads/2016/08/ARSC-Audio-Preservation.pdf.
  • Chun, Wendy Hui Kyong. Control and Freedom: Power and Paranoia in the Age of Fiber Optics. The MIT Press, 2005.
  • Fino-Radin, Ben. “Rhizome Artbase: Preserving Born Digital Works of Art.” Washington, D.C, July 24-26. http://digitalpreservation.gov/meetings/documents/ndiipp12/DigitalCulture_fino-radin_DP12.pdf.
  • Galloway, Alexander R. Protocol: How Control Exists after Decentralization. The MIT Press, 2006.
  • Gitelman, Lisa. Always Already New: Media. Cambridge, MA: MIT Press, 2006.
  • ———. Paper Knowledge: Toward a Media History of Documents. Durham ; London: Duke University Press Books, 2014.
  • International Council of Museums, Committee for Conservation. “The Conservator-Restorer: A Definition of the Profession,” 1984. http://www.icom-cc.org/47/history-of-icom-cc/definition-of-profession-1984.
  • Kirschenbaum, Matthew. “Software, It’s a Thing.” Medium, July 25, 2014. https://medium.com/@mkirschenbaum/software-its-a-thing-a550448d0ed3.
  • Kirschenbaum, Matthew G. Mechanisms: New Media and the Forensic Imagination. Cambridge, Mass: MIT Press, 2008.
  • Kittler, Friedrich A. Gramophone, Film, Typewriter. Translated by Michael Wutz and Geoffrey Winthrop-Young. Stanford, Calif: Stanford University Press, 1999.
  • Krajewski, Markus. Paper Machines: About Cards & Catalogs, 1548-1929. History and Foundations of Information Science. Cambridge, Mass: MIT Press, 2011.
  • Lee, Christopher A. “Digital Curation as Communication Mediation.” In Handbook of Technical Communication, edited by Alexander Mehler and Laurent Romary, 507–530. Boston, MA: Walter de Gruyter, 2012.
  • Manovich, Lev. “Database as a Genre of New Media,” 1997. http://vv.arts.ucla.edu/AI_Society/manovich.html.
  • ———. Software Takes Command: Extending the Language of New Media. International Texts in Critical Media Aesthetics. New York ; London: Bloomsbury, 2013.
  • ———. The Language of New Media. Cambridge, Mass: MIT Press, 2002.
  • McNeill, Lynne S. Folklore Rules: A Fun, Quick, and Useful Introduction to the Field of Academic Folklore Studies. University Press of Colorado, 2013.
  • Mir, Rebecca, and Trevor Owens. “Modeling Indigenous Peoples: Unpacking Ideology in Sid Meier’s Colonization.” In Playing with the Past: Digital Games and the Simulation of History, 91–106, 2013.
  • Montfort, Nick. “Continuous Paper: MLA,” 2004. http://nickm.com/writing/essays/continuous_paper_mla.html.
  • Montfort, Nick, and Ian Bogost. Racing the Beam: The Atari Video Computer System. Platform Studies. Cambridge, Mass: MIT Press, 2009.
  • Nakamura, Lisa. Digitizing Race: Visual Cultures of the Internet. Electronic Mediations 23. Minneapolis: University of Minnesota Press, 2008.
  • Library of Congress Office of Communications. “Hyperspectral Imaging by Library of Congress Reveals Change Made by Thomas Jefferson in Original Declaration of Independence Draft.” Press Release. Washington, D.C, July 2, 2010. https://www.loc.gov/item/prn-10-161/analysis-reveals-changes-in-declaration-of-independence/2010-07-02/.
  • Owens, Trevor. “Pixelated Commemorations: 4 In Game Monuments and Memorials.” Play the Past, June 18, 2014. http://www.playthepast.org/?p=4811.
  • Reside, Doug. “‘No Day But Today’: A Look at Jonathan Larson’s Word Files.” New York Public Library Blog, April 22, 2011. http://www.nypl.org/blog/2011/04/22/no-day-today-look-jonathan-larsons-word-files.
  • Rinehart, Richard, and Jon Ippolito, eds. Re-Collection: Art, New Media, and Social Memory. Leonardo. Cambridge, Massachusetts: The MIT Press, 2014.
  • Saylor, Nicole. “Computing Culture in the AFC Archive.” Folklife Today, January 8, 2014. https://blogs.loc.gov/folklife/2014/01/computing-culture-in-the-afc-archive/.
  • Sharpless, Rebecca. “The History of Oral History.” In History of Oral History: Foundations and Methodology, edited by Lois E. Myers and Rebecca Sharpless, 9–32. Lanham, MD: AltaMira Press, 2007.
  • Smigel, Libby, Martha Goldstein, and Elizabeth Aldrich. Documenting Dance: A Practical Guide. Dance Heritage Coalition, 2006. http://www.danceheritage.org/DocumentingDance.pdf.
  • Sterne, Jonathan. MP3: The Meaning of a Format. Sign, Storage, Transmission. Durham: Duke University Press, 2012.
  • Thesaurus Linguae Graecae Project. “Thesaurus Linguae Graecae – History.” Accessed February 3, 2017. https://www.tlg.uci.edu/about/history.php.
  • Tomasello, Michael. The Cultural Origins of Human Cognition. Harvard University Press, 2009.
  • Tyrrell, Ian R. Historians in Public: The Practice of American History, 1890-1970. Chicago: University of Chicago Press, 2005. http://www.loc.gov/catdir/toc/ecip058/2005003459.html.
  • Werner, Sarah. “Where Material Book Culture Meets Digital Humanities.” Journal of Digital Humanities 1, no. 3 (2012). http://journalofdigitalhumanities.org/1-3/where-material-book-culture-meets-digital-humanities-by-sarah-werner/.

Theory & Craft of Digital Preservation: My Next Book

Some class notes from Alice Rogers in my digital preservation seminar.

This has been brewing for a while, but it’s now enough of a thing that I can share about it. I am excited to announce that I’m on the hook with Johns Hopkins University Press to produce a short book (30-40k words) called The Theory & Craft of Digital Preservation: An Introduction.

I have about half of the book together in a really rough draft form. Much of my nights and weekends for about the next six months will be spent working up the rest of it and getting the whole thing together.

The genesis of the book came when I was designing my digital preservation seminar and realized that much of the beaten path for talking about digital preservation has more to do with how we got to what we do now than with how it would make sense to explain the issues and topics to folks from scratch. So the course has given me a chance to try out the road-map for the book.

I’ve gotten the OK to share drafts of the chapters as they start to come together. I’ve found that I benefit dramatically from doing my writing in the open where folks can help me refine and sharpen my ideas before they end up fixed in any particular medium.

To that end, I figured I would share most of the book proposal I worked up. In working on drafting, some of this has started to shake out a bit differently, but I thought folks might be interested in a preview. I’m thinking I will start posting a chapter or two a month early-ish in the new year.

Overview of the Book

The historical record is increasingly digital. Over the last half century, under headings of “electronic records management” and “digital preservation,” librarians, archivists, and curators have established practices to ensure that our digital scientific, social and cultural record will be available to scholars and researchers into the future. This book is intended as a point of entry into that theory and practice.

Through years of leading collaborative national digital strategy efforts to ensure long-term access to digital content, I have observed that many experts in digital media and libraries, archives and museums often end up talking past each other as they work toward their mutual goals. All too often, discussions of digital preservation fail to fully state and engage with the nature of digital objects and media, thereby undermining our ability to do this work in a common and coherent fashion.

This failure of understanding is rooted in two fundamental issues. First, preservation itself is not a single area of activity, but has always been historically intertwined with distinct disciplines that have grappled with the affordances of various historically “new” mediums. Second, there are distinct affordances of digital media that require rethinking those diverse perspectives on preservation and conservation. The central contribution of this book is to put the lineages of preservation in dialog with the affordances of digital media as a basis to articulate a theory and craft of digital preservation.

As a guidebook and an introduction, this text is a synthesis of extensive reading, research, writing, and speaking on the subject of digital preservation. It is grounded in my work on digital preservation at the Library of Congress and, before that, on digital humanities projects at the Center for History and New Media at George Mason University. The first section of the book synthesizes work on the history of preservation in a range of areas (archives, manuscripts, recorded sound, etc.) and sets that history in dialog with work in new media studies, platform studies, and media archeology. The later chapters build from this theoretical framework as a basis for an iterative process for the practice of doing digital preservation.

This book serves as both a basic introduction to the issues and practices of digital preservation and a theoretical framework for deliberately and intentionally approaching digital preservation as a field with multiple lineages. The intended audience is current and emerging library, archive, and museum professionals as well as the scholars and researchers who interface with these fields. As such, the book will be useful as assigned reading for graduate courses in digital preservation and digital curation in library science, museum studies, and public history programs. This book is also highly relevant to digital humanities programs and courses, as the work of digital humanists increasingly results in digital platforms, tools, and resources that face significant sustainability challenges and thus require an understanding of digital preservation planning to succeed.

There are a handful of books on digital preservation, but this book is significantly different in two key ways. First, it is intentionally brief. Because of this, it is more accessible and usable by a wide range of stakeholders in digital preservation. This is not an exhaustive work on the subject, but a clear and focused perspective and approach. Second, it treats digital preservation as a craft and anchors it in work in humanities scholarship on media and mediums. Much of the extant work on digital preservation approaches the subject as one that is highly technical, which continues to obfuscate many key issues and assumptions, particularly for humanities scholars interested in understanding digital preservation. While the book has a practical bent, it is not a how-to book that would quickly become outdated. It establishes and offers stages and processes for doing digital preservation, but it is not tied to particular tools, methods, or techniques. Instead, it is anchored in an understanding of the traditions of preservation and the nature of digital objects and media.

Sections of the Book 

Introduction: Getting Beyond Digital Hyperbole

At a summit on digital preservation at the U.S. Library of Congress in the early 2000s, a participant from a technology company proposed, “Why don’t we just hoover it all up and shoot it into space.” The “it” in this case was any and all historically significant digital content. Many participants laughed, but it wasn’t intended as a joke. Many have sought, and continue to seek, similar “moon-shots”: singular technical solutions to the problem of enduring access to digital information.

More than a decade later, we find ourselves amid the same set of stories we have heard for at least thirty years. Among the public, there is a persistent belief that if something is on the Internet, it will be around forever. At the same time, warnings of a potential impending “digital dark age,” where records of the recent past become completely lost or inaccessible, appear regularly in the popular press.

To many, it seems like the world needs someone to design a system that can “solve” the problem of digital preservation. The wisdom of the cohort of digital preservation practitioners in libraries, archives, and museums who have been doing this work for half a century suggests this is an illusory dream not worth chasing. Working to ensure long-term access to digital information is not a problem for a tool to solve. It is a complex field with a significant ethical dimension. It is a vocation.

The purpose of this book is to offer a path for getting beyond the hyperbole and the anxiety of the digital and establish a baseline for practice in this field. To do this, one needs to first unpack what we mean by preservation. It is then critical to establish a basic knowledge of the nature of digital media and digital information. With these in hand, anyone can make significant and practical advances toward mitigating the most pressing risks of digital loss. For more than half a century, librarians, archivists, and curators have been establishing practices and approaches to ensure long-term access to digital information. Building from this work, this book provides both a sound theoretical basis for digital preservation and a well-grounded approach to its practices and craft.

Section One: Historicizing Preservation and Digital Media

Chapter One: Preservation’s Divergent Lineages

Interdisciplinary dialog about digital preservation often breaks down when an individual begins to protest “but that’s not preservation.” Preservation means a lot of different things in different contexts. Each of those contexts has a history. Those histories are tied up in the changing nature of the mediums and objects for which each conception of preservation and conservation was developed. All too often, discussions of digital preservation start by contrasting digital media to analog media. This contrast forces a series of false dichotomies. Understanding a bit about the divergent lineages of preservation helps to establish the range of competing notions at play in defining what is and isn’t preservation.

Building on work in media archeology, this chapter establishes that digital media and digital information should not be understood as a rupture with an analog past. Instead, digital media should be understood as part of a continual process of remediation embedded in the development of a range of new mediums which afford distinct communication and preservation potential. Understanding these contexts and meanings of preservation establishes a vocabulary to articulate what aspects of an object must persist into the future for a given preservation intent.

To this end, this chapter provides an overview of many of these lineages. These include the culture of scribes and the manuscript tradition; the bureaucracy and the development of archival theory for arranging archives and publishing records; the differences between taxidermy and insect collecting in natural history collections and living collections like butterfly gardens and zoos; the development of historic preservation of the built environment; the advent of recorded sound technology and the development of oral history; and the development of photography, microfilming, and preservation reformatting. Each episode and tradition offers a mental model to consider and deploy in different contexts in digital preservation.

The purpose here is not to provide a detailed history of the lineages of preservation and the development of media, but instead to illustrate that many different conceptions of preservation exist and how those conceptions are anchored in different objectives. This overview focuses readers on the distinct conceptions of what matters about an object and on the innate material properties and affordances of different kinds of media as they relate to preservation.

Chapter Two: Understanding Digital Objects

Doing digital preservation requires a foundational understanding of the structure and nature of digital information and media. This chapter works to provide such a background through three related strands of new media studies scholarship. First, all digital information is material. Second, digital information is best understood as existing in and through a nested set of platforms. Third, the database is an essential media form and metaphor for understanding the logic of digital media.

Given that digital information is always physically encoded on digital media, it is critical to recognize that the raw bit stream (the sequence of ones and zeros encoded on the original medium) has a tangible and objective ability to be recorded and copied. This provides an essential first-level basis for digital preservation. It is possible to establish what the entire sequence of bits is on a given medium, or in a given file, and use techniques to create a kind of digital fingerprint for it that can then be used to verify and authenticate perfect copies.

With that noted, those bit streams are animated, rendered, and made usable through nested layers of platforms. In interacting with a digital object, computing devices interact with the structures of file systems, file formats, and various additional layers of software, protocols, and drivers. Drawing on examples from net art, video games, and born-digital drafts of literary works, I explore multiple ways to approach these objects anchored in different layers of their digital platforms. The experience of the performance of an object on a particular screen, like playing a video game or reading a document, can itself obscure many important aspects of digital objects that are much less readily visible, like how the rules of a video game actually function or deleted text in a document that still exists but isn’t rendered on the screen.

As a result of this nested platform nature, the boundaries of digital objects are often completely dependent on what layer one considers to be the most significant for a given purpose. In this context, digital form and format must be understood as existing as a kind of content. Across these platform layers digital objects are always a multiplicity of things. For example, an Atari video game is a tangible object you can hold, a binary sequence of information encoded on that medium identical to all the other copies of that game, source code authored as a creative work, a packaged commodity sold and marketed to an audience, and a signifier of a particular historical moment. Each of these objects can coexist in the platform layers of a tangible object, but depending on which is significant for a particular purpose one should develop a different preservation approach.

Lastly, where the index or the codex can provide a valuable metaphor for the order and structure of a book, new media studies scholarship has suggested that the database is and should be approached as the foundational metaphor for digital media. From this perspective, there is no “first row” in a database, but instead the presentation and sorting of digital information is based on the query posed to the data. Given that libraries and archives have long based their conceptions of order on properties of books and paper, embracing this database logic will have significant implications for making digital material available for the long term.

Chapter Three: Challenges & Opportunities of Digital Preservation

With an understanding of digital media and some context on various lineages of preservation, it is now possible to break down what the inherent challenges, opportunities and assumptions of digital preservation are.

We can’t count on long-lived media, interfaces, or formats. Popular digital media of all kinds (disc, disk, and NAND flash wafers) degrade rather quickly: over years, not decades or centuries. Many of these media are relatively complex to read, so the interfaces required to interpret them are not likely to be particularly long-lived. The costs of trying either to repair these media or to fix and repair the interfaces that read them rapidly become prohibitive. As a result, traditional notions of conservation science are, outside of some niche cases, going to be effectively useless for the long-term preservation of digital objects.

Going back to the discussions of preservation lineages, this means that digital preservation is an enterprise that can only focus on the allographic digital object. While all digital information is material, the conservation of that material over the long haul is not broadly practical. Where conservation science is concerned with the chemical and material properties of mediums and artifacts, the science of digital preservation is and will be computer science. With that said, because bitstreams are always originally encoded on tangible media and then created by, acted on, and interpreted by all kinds of human-made layers of software, they end up presenting an extensive range of seemingly artifactual, and not simply informational, qualities. That is, the physical and material affordances of different digital mediums will continue to shape and structure digital content long after it has been transferred and migrated to new mediums.

Section Two: Doing Digital Preservation

Chapter Four: Articulating preservation intent

What is it about the thing you want to preserve that matters and what do you need to do to make sure it is there in the future? To many, this seems like a simple question. It is not. Too often we take for granted that there is a de facto answer to this question. However, as a result of the nested platform nature of digital information and the fact that most of what we care about is the meaning that can be made from collections of objects, it is critical to be deliberate about how we answer this question in any given situation. This is why digital preservation must be continually grounded in the articulation of preservation intent.

In some cases, someone can clearly articulate this intent at the start of a project. But for most preservation projects it is often best to be purposeful and strategic around the preservation intention. This is particularly critical given that deciding what matters most about some set of material can lead to radically different approaches to preserving and describing it.

Through examples of the diverse types of content that different kinds of cultural heritage organizations are preserving and their intent for doing so, this chapter establishes how to articulate preservation intent and how well-articulated preservation intent makes the resulting collections easier to evaluate and more transparent for future users.

Chapter Five: From Bit Preservation to Digital Preservation

Taking into account the challenges and opportunities of digital preservation, it is important to bracket the work into two different challenges: bit preservation and digital preservation. Bit preservation, ensuring authentic copies of digital objects, is the most pressing problem. Thankfully, it is a relatively straightforward problem for which there are a range of simple solutions. With that said, ensuring those authentic copies are interpretable, comprehensible and usable is far more challenging. Thankfully, this work of digital preservation is a much less time sensitive activity.

Bit preservation is accomplished by managing multiple copies of the digital objects you want to preserve, regularly comparing digital fingerprints for those files to ensure that they are all identical, repairing or replacing copies when they fail those checks, and migrating the copies to newer media and continuing to ensure that the digital fingerprints still match. With more resources, there are better ways to systematize and automate these processes, but with relatively small collections it is still possible to do this and be confident you have authentic copies as long as someone continues to mind and tend to them.
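The routine described above (keep several copies, compare fingerprints, repair the ones that fail) can be sketched in a few lines. This is a minimal illustration under stated assumptions: SHA-256 as the fingerprint, a recorded expected fingerprint for each object, and hypothetical function and file names. It is not a description of any particular repository system.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def fingerprint(path):
    """Return the SHA-256 hex digest of a file (read whole; fine for small files)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def passes_check(path, expected_digest):
    """A copy passes when it exists and its fingerprint matches the recorded one."""
    try:
        return fingerprint(path) == expected_digest
    except FileNotFoundError:
        return False


def audit_and_repair(copies, expected_digest):
    """Replace every failing copy with a passing one; return the repaired paths."""
    status = {path: passes_check(path, expected_digest) for path in copies}
    good = [path for path, ok in status.items() if ok]
    bad = [path for path, ok in status.items() if not ok]
    if not good:
        # Nothing trustworthy is left to copy from; a person has to step in.
        raise RuntimeError("no copy matches the recorded fingerprint")
    for path in bad:
        shutil.copyfile(good[0], path)
    return bad


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        copies = [Path(tmp) / f"copy_{n}.dat" for n in range(3)]
        for path in copies:
            path.write_bytes(b"oral history interview")
        expected = fingerprint(copies[0])  # recorded when the object was acquired
        copies[1].write_bytes(b"oral hist0ry interview")  # simulate a damaged copy
        print(audit_and_repair(copies, expected))  # lists the repaired copy
        print(all(passes_check(p, expected) for p in copies))  # True
```

In practice the copies would live on separate storage systems, ideally in different locations, and the audit would run on a schedule. Migration to newer media is the same operation: copy, then confirm that the fingerprint still matches.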

Digital preservation is much less straightforward. The central challenge of digital preservation is that software runs. The active and performative nature of that running is only possible through a regression of dependencies on different pieces of software that are typically tightly coupled with specific pieces of hardware. Along with this, it is important to think through whether there is enough context for the digital objects for someone in the future to be able to make sense of them. Two primary strategies exist for approaching these issues: emulation and format migration. Both are discussed and a case is made for why in many cases organizations are hedging their bets and pursuing both strategies.

Chapter Six: Arranging and Describing Digital Objects

The story goes that shortly after the Library of Congress signed an agreement with Twitter to begin archiving all of the tweets, a cataloger asked “But who will catalog all those tweets?” The idea of describing billions of objects was dauntingly incomprehensible to those who lacked experience with the nature of digital media. Like most digital objects, tweets come with a massive amount of transactional metadata: timestamps, usernames, unique identifiers, links out to URLs on the web. Like most digital objects, the tweets can largely describe themselves.

The usability of digital information will be largely dependent on how we organize, arrange, and describe it.  Arranging and describing digital objects needs to conceptually shift to embrace the nature of digital media and to recognize a distinct transition which has occurred in terms of computability. Digital media continually generates massive amounts of metadata and because it is computable, it is also increasingly possible to process digital data to derive descriptive information and metadata. As a result, arranging and describing digital content should increasingly be focused on limited amounts of expert intervention in chunking and describing content in aggregate and leaving lower levels of description to the objects themselves.
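To make the idea of computing description from the objects themselves more concrete, here is a small sketch that derives an aggregate, collection-level summary from item-level metadata. The record structure and field names (`id`, `user`, `created`, `urls`) are hypothetical stand-ins for the kind of transactional metadata described above, not the format of any real service.

```python
from collections import Counter
from datetime import datetime

# Hypothetical self-describing records; every field is invented for illustration.
records = [
    {"id": "101", "user": "alice", "created": "2016-11-01T09:30:00",
     "urls": ["http://example.org/a"]},
    {"id": "102", "user": "bob", "created": "2016-10-15T18:05:00",
     "urls": []},
    {"id": "103", "user": "alice", "created": "2016-12-24T07:45:00",
     "urls": ["http://example.org/b", "http://example.org/c"]},
]


def describe_collection(items):
    """Derive a collection-level description from item-level metadata."""
    if not items:
        raise ValueError("cannot describe an empty collection")
    times = sorted(datetime.fromisoformat(item["created"]) for item in items)
    users = Counter(item["user"] for item in items)
    return {
        "item_count": len(items),
        "earliest": times[0].isoformat(),
        "latest": times[-1].isoformat(),
        "distinct_users": len(users),
        "items_with_links": sum(1 for item in items if item["urls"]),
    }


summary = describe_collection(records)
print(summary)
```

Nobody cataloged the three items by hand. The expert work lies in deciding what counts as a collection and which aggregate facts are worth recording about it.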

In terms of arranging digital objects, their database nature means that, unlike folders in a box or books on a shelf, digital media come with a multiplicity of orders. This complicates core archival principles around original order. It also requires thinking through how to chunk content into reasonable and coherent sets of information that are easier for all kinds of current and future users to manipulate and work with.

In this context, it is critical to revisit the levels of description at which librarians, archivists, and curators work, to evaluate in what cases something should be treated as an “item” or a “collection,” and to decide what levels of descriptive work should be employed. Given how much objects are self-describing, it makes much more sense to take up archival practices of describing content at the collection level, explaining the scope of a collection, the context of its acquisition, and how and why that collection was collected and preserved, and to let the lower levels of description be left to the content itself.

Similarly, many digital objects actually index, describe, and annotate other digital objects. For instance, if you take all of the links that appear in articles published in the Drudge Report, the fact that the Drudge Report linked out to those sites tells you something about them. This affords the possibility of starting to think of nearly all digital objects as both data in their own right and metadata that describes other objects. To this end, we must increasingly think of “description” and “the described” as a fuzzy boundary.
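One way to picture objects acting as metadata for other objects is to invert a set of outbound links into a "who points at this?" index. The following is a rough sketch in Python with made-up sample data; the `PAGES` mapping and the `inbound_links` function are hypothetical names used only for illustration, and a real web archive would extract links from the archived pages themselves.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical sample data: each archived page mapped to the links found in it.
PAGES = {
    "https://aggregator.example/2015/01/01": [
        "https://news.example/story-a",
        "https://blog.example/post-1",
    ],
    "https://aggregator.example/2015/01/02": [
        "https://news.example/story-b",
    ],
}


def inbound_links(pages: dict) -> dict:
    """Invert page -> links into target host -> list of pages that cite it."""
    cited_by = defaultdict(list)
    for source, links in pages.items():
        for link in links:
            cited_by[urlparse(link).netloc].append(source)
    return dict(cited_by)


index = inbound_links(PAGES)
for host, sources in sorted(index.items()):
    print(host, "is cited by", len(sources), "archived page(s)")
```

Here the aggregator's pages are data in their own right, and the inverted index treats those same pages as descriptive metadata about the sites they link to.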

Chapter Seven: Divergent and Multimodal Access and Use

When a user in a research library asks to see a book in an obscure language a librarian will generally bring it out and let them look at it. That librarian may have no idea how to make sense of the text, but they know how to provide access to it and it is assumed that the researcher needs to come with the skills to make sense of it. At the most basic level, we can provide this kind of access to any digital objects we are preserving.

The affordances of digital media open up significant potential for access and use of digital content. At the same time, expectations shaped by our experience with commercial software can get in the way: we hold off on letting others access digital content until we can provide a simple way for any user to double click on a digital object and have it “just work.” It is critical for us to get over the assumptions that are embedded in this mentality and embrace the divergent and multimodal nature of access that digital media present us with.

This means digital preservation practitioners need to be OK with just saying, “Here it is, have at it” and also with consistently exploring the potential for new tools and methods for providing access to digital content. Even if you don’t know how to open a given file, there is a range of emerging techniques and approaches that researchers today and in the future will be able to use in working with digital content. In addition, it is important to think through the types of access restrictions or redaction of information that may be necessary.

This means we should be continually exploring ways to make digital content as broadly accessible and usable as individual files, bulk aggregates and a range of other modes. Researchers are increasingly interested in approaching all kinds of digital content as data sets for computational analysis and this requires adopting new ways of thinking about access.

Conclusions: The Theory & Craft of Digital Preservation

Digital preservation is not an exact science. It is a craft in which experts must reflexively deploy and refine their judgment to appraise digital content and implement strategies that make the most sense for minimizing the most pressing risks of loss while working to make it as widely usable and useful as it can be to its respective audiences. At least, that is the case I have sought to make in this book. As Stacey Erdman, digital archivist at Beloit College, has noted, digital preservation is much like a lyric from the song The Have Nots, “This is the game that moves as you play.”

The craft of digital preservation is anchored in the past. It builds off of the records, files, and works of those who came before us and those who designed and set up the systems that enable the creation, transmission and rendering of their work. At the same time, the craft of digital preservation is also the work of a futurist. We must look to the past trends in the ebb and flow of the development of digital media and hedge our bets on how digital technologies of the future will play out.

My former supervisor, Martha Anderson, who worked as the Managing Director of the National Digital Information Infrastructure and Preservation Program at the Library of Congress, liked to describe digital preservation as a relay race. Digital preservation is not about a particular system, or a series of preservation actions. It is about preparing content and collections for hand offs. We cannot predict what future digital mediums and interfaces will be, or how they will work, but we can select materials from today, articulate aspects of them that matter for particular use cases, make perfect copies of them, and then work to hedge our bets on digital technology trends to try and make the next hand off as smooth as possible.

 

 

Student Digital Preservation Consultants Looking for Small Cultural Heritage Organizations

For many, this is where we find ourselves in organizations just starting to work on digital preservation.

I’m working on drafting up the syllabus for my digital preservation graduate seminar for the University of Maryland’s iSchool for this coming fall. I am a firm believer in learning-by-doing. I also think talking about digital preservation in the abstract, outside the very real resource and time constraints of organizations, largely misses the point. As a result, I am planning to have each student work through a series of assignments where they serve as digital preservation consultants to small cultural heritage organizations.

My hope is that this will be a meaningful learning opportunity for the students, as well as a way for them to start building out a portfolio of work that will be relevant to potential future employers. I am also optimistic that this can be a way to provide some help to small cultural heritage organizations that could benefit from having additional help thinking through and developing plans to make the best use of resources to make their digital content more long-lived.

I wanted to share a draft of the series of assignments I am putting together for two reasons:

  • First, to get feedback and input on how to improve the assignment. I’ve posted it as a Google Doc too, so if you have suggestions for it please feel free to write comments or suggestions directly into the doc.
  • Second, pairing students with individuals who are interested in participating in this work is going to be key. I wanted to circulate this document as a means to identify people and organizations interested in working with a student as a digital preservation consultant for their organization.

Requesting a Graduate Student Digital Preservation Consultant

I think the finish line for digital preservation is a little too close to the starting line here. But it gets at the idea 🙂

If you (and your organization) would be interested in having a University of Maryland graduate student in my digital preservation seminar focus their digital preservation consultant project on your organization, please take two minutes to fill in this five-question form. I think this is a great opportunity for organizations for a few different reasons.

Here are some reasons to consider filling in the form for your organization. This project is a chance to:

  1. Solicit assistance thinking through digital preservation issues and planning for your organization.
  2. Provide a meaningful learning experience to someone just getting started in the field
  3. Learn more about digital preservation as the student shares what they are learning through the class

Through the course of the assignments, students will:

  1. Document and review current practices with an organization’s digital content
  2. Draft suggestions for potential next steps to improve management of digital content grounded in the resources an organization has access to
  3. Draft a digital preservation policy for consideration for the organization

On the first day of class (September 1st), I will present the organizations that have filled out the survey to my students. In the first few weeks of class I will help to pair each student with an organization for the semester.

If you are matched up with a student, the idea would be that you would commit to doing an interview or two with them about your organization’s collection and current practices for digital material and that you would review and provide input on several of their assignments (listed below).

I should underscore that it is completely fine for organizations to be literally at square one in terms of digital preservation practices and planning. So many cultural heritage organizations are just getting started with their digital preservation planning, and while it can be a bit intimidating to take some first steps in this space, there are many simple and inexpensive things organizations can be doing to mitigate risks of loss. The assignment will be most valuable for both students and organizations in cases where there is little current work being done in digital preservation. As part of this project, students will be blogging about their work, so you and your organization will need to be OK with them sharing information about the project. This can be a bit intimidating, but by having students work on their public writing skills and inviting a broader audience into discussion about how to do this work in organizations it will help to ensure that the quality of that work is stronger and more useful. Through this public writing process, the results of the work will be more useful to both the student and to your organization.

What follows are details about the design of this assignment. This is also available in the Google Doc if you would like to suggest edits or make comments.

Digital Preservation Consultant Project

Here you can see a student working to synthesize what they have found and drafting a plan.

An academic understanding of the issues in digital preservation is necessary but not sufficient for professional digital preservation work. Digital preservation is fundamentally about making the best use of what are always limited resources to best support the mission of an organization. As such, to really learn how to do digital preservation you need to apply these concepts in the practical realities of an organizational context.

Aside from participating in discussion of the course readings through the course blog, the other course assignments will require you to act as a digital preservation consultant for a cultural heritage organization. For a variety of reasons I suggest this be a small institution. Below are the five assignments you must complete over the course of the semester as part of this project.

  1. Identify Small Cultural Heritage Organization and Establish Partnership (by week 3): For most of the course assignments, you will need to find a small cultural heritage organization that you can work with as a digital preservation consultant. I have identified a list of organizations that are up for participating, but you are free to find other organizations as well. The key requirements here are that 1) they have consented to working with you, 2) they have some set of digital content, but 3) their collections are not so complex that you couldn’t possibly do the project. Example institutions include an independent organization (like a house museum, a community archive or library) or a small department or subset of an institution (say the archives of a student newspaper or radio station, the special collections department at a public library, or the archives in a museum).
    1. Deliverable: The output of this phase is to identify this organization and confirm that you have a commitment from them to participate. We will check in on this in class as we go, but by the date of this assignment you need to have confirmed participation of an organization that meets these requirements and have posted what organization you are working on in a list on the course website. On the site, post the name of the organization, your name (or handle) and two or three sentences about the organization and its digital content.
  2. Institutional Digital Preservation Survey (Draft by week 6 and send to your org, publish with their comments incorporated by week 8): For your organization, interview one or two staff members to get a handle on their digital collections and practices. Draw from the NDSA levels of preservation as an overall framework for conducting your survey. You will want to focus on gathering information about their practices in five key areas.
    1. First, what is the scope of their digital holdings?
    2. Second, how is that digital content currently being managed?
    3. Third, what are the staff at the organization’s perceptions of the state of their digital content (are they concerned about it, do they see it as mission critical or a nice to have, what do they see as their own self efficacy and their organization’s capacity for sustaining their content)?
    4. Fourth, what kinds of digital content would the organization like to be collecting but currently isn’t?
    5. Fifth, what resources, if any, do they have that they could bring to bear on this problem? (If they have some significant potential resources that’s great, but realize that there may well be very meaningful smaller resources that could be brought to bear. For example, could one staff member spend 2-4 hrs a week on digital preservation, could they bring in community volunteers, how much could they spend on things like extra hard drives, etc.?) Throughout all of this, it will be important to understand what the organization’s collecting mission is. You want to begin to probe all the questions above, but you need to be able to map their answers to the NDSA levels.
    6. Deliverable: You will write and publish a post to the course blog (1200-3000 words) in which you present the findings of your survey. The post should first provide context: what is this organization, what are its digital holdings, and what does it want to be collecting? From there, work through presenting an accurate and coherent report of the themes and issues that came through in your interviews. At this point you are primarily interested in accurately representing the state of their work. Do not get into making recommendations. Simply do your best to succinctly and coherently explain what you found about the five areas of questioning discussed above. Before publishing this, you must present it to your org for their feedback to make sure you have their input on how you are describing the state of their work.
  3. Institutional Digital Preservation Next Steps Preservation Plan (Week 10): Now that you have the results of your survey, it is time to take out the NDSA levels of digital preservation and the rest of our course readings and figure out what a practical set of next steps would be for your organization.
    1. Deliverable: Post your next steps plan to the course blog (1200-3000 words). After a brief introduction providing context about the organization and its collections, you should work through reviewing the organization’s current work on digital content using each of the areas of the NDSA levels of digital preservation. Complete by identifying three different levels (low, medium and high resource requirement) of next steps they could take to improve their rating on the NDSA levels of digital preservation. Be creative here. For example, could they upload collection items to the Internet Archive or Wikimedia Commons? Or could they buy an extra hard drive, make copies, and swap it with a backup buddy at another organization in a different region of the country? The point here is to think about how to get them the furthest up some of the levels with the resources at hand. Before publishing this, you should present it to your organization for them to review and provide input.
  4. Draft a Digital Preservation Policy for Your Org (Week 12): Now that you have put in place a set of recommendations, it is important to also draft up a set of digital preservation policies and practices for the organization. If this is to have any impact you are going to need to be able to articulate what the organization’s policies could be going forward.
    1. Deliverable: Drawing on the example digital preservation policies we read in class, draft up a short policy document for your institution tuned to what you have learned from working with them. Draw from the examples for models for aspects of this document. Share it with them for some input and feedback. Then post it to the blog (800-1500 words).
  5. Reflecting on Lessons Learned (Week 13): After doing this work, presenting it, and getting feedback from your organization, you need to think through what worked and didn’t work for the project. Take time to reflect and tease out the lessons you’ve learned about both digital preservation and working with a cultural heritage organization.
    1. Deliverable: Return to each of the documents you created thus far and synthesize 3-5 points about what did or didn’t work or what your takeaway lessons are from this process. Think through what you will do differently the next time you help an organization improve its digital preservation practices. Bring in references to what you’ve learned from readings in the course and from what you have learned from your classmates’ work on their projects (800-1400 words).

All images from Digitalbevaring.dk, published under a Creative Commons Attribution 2.5 Denmark license and created by Jørgen Stamp.

“But That’s Not Preservation!” Notes on Preservation’s Divergent Lineages

I’ve found that interdisciplinary dialog about digital preservation often breaks down when someone protests “but that’s not preservation.”

Preservation means a lot of different things in different contexts. Each of those contexts has its own history. Those histories are tied up in the changing nature of the mediums and objects for which each conception of preservation and conservation was developed. All too often, discussions of digital preservation start by contrasting digital media to analog media. This contrast forces a series of false dichotomies. I’m feeling like better understanding a bit about the divergent lineages of preservation could help to establish the range of competing notions at play in defining what is and isn’t preservation.

I’m curious to start building out some of my understanding of the lineages of different kinds of preservation. So I would love if folks could share any examples of writing in this area that might be helpful. I think a lot of this context looks to be in something like Preserving our Heritage: Perspectives from Antiquity to the Digital Age (which I am still digging into). However, I also think the story is even broader here, and that there is a media archaeology aspect that is missing. That is, my sense is that a series of old new media, like photography, film and recorded sound technologies, have been interacting with ideas about what preservation is or should be for more than a century.

What follows is not so much a coherent final product as it is me openly sharing some of my notes on different strands I see at play in this space.

  • The manuscript tradition: A situation where the allographic nature of a work is primarily what matters, that something is the work if it has the same spelling and where copying is the basis of preservation. In this case, something like the Evolution of Manuscript Traditions could be useful.
  • The history of archival traditions: In this case, something like What is Past is Prologue: A History of Archival Ideas Since 1898, and the Future Paradigm Shift is useful. Also, publishing records in documentary editions vs. arranging and describing records and ideally a bit on the interventions that came with microfilming. That is, while we generally think of archives as holding unique and original records in this space there is a lengthy tradition of documentary edition work focused on publishing records and a history of photographic reproduction of records for both access and preservation purposes.
  • The history of art conservation and restoration: For example, Changing Approaches in Art Conservation: 1925 to the Present. I’ve seen a lot on the history of conservation of things like paintings. However, the history of the development of variable media art works, art installations, and works made of materials that rapidly deteriorate has resulted in very smart thinking about what it is about art works one wants to conserve. In this space, there is Re-collection: Art, New Media, and Social Memory.
  • Preservation of dance and live performance: There are, at this point, long standing traditions in how to preserve and document works of art that produce lived experience. In this space, the Dance Heritage Coalition’s Documenting Dance: A Practical Guide nicely illustrates the continuity that exists between a variety of modes of documentation technologies, from textual notation, to moving image technologies to new digital methods like motion capture.
  • The history of conservation of living creatures: Everything from taxidermy and insect collecting to living collections like butterfly gardens and zoos as well as things like the Svalbard Global Seed Vault. I don’t really have good resources on the history and theory here. Thinking about digging into some history of science journals. In any event, I think there is an interesting story about which techniques are intended for what purposes and what is significant about a living thing that must be preserved toward that particular purpose. That is, when and why do you pin and preserve butterflies as a collection and when and why would you choose to run a butterfly garden. So looking for any ideas folks might have for work in this space.
  • The development of historic preservation of the built environment: I know some good stuff here, like Giving Preservation a History: Histories of Historic Preservation in the United States. In this case, it’s interesting to me that some newer technologies like photogrammetry or 3D point cloud technologies are being explored as ways to “digitize” or create recordings to preserve and document physical spaces. I find historic preservation particularly interesting in that it often focuses on turning back the clock on a particular building to make it appear as it was at a particular moment in time. In this vein, it can involve recreation and fabrication. Similarly, historic preservation connects in interesting ways to reenactment and living history. In this space, I am a huge fan of Abraham Lincoln as Authentic Reproduction: A Critique of Postmodernism which explores fascinating sets of issues around authenticity in the New Salem Historic reconstructed village and outdoor museum in Illinois.
  • The advent of recorded sound technology and the development of oral history: There is some good stuff on recorded sound technology in Gramophone, Film, Typewriter and MP3: The Meaning of a Format, but they aren’t really explicitly about oral history. In contrast, The History of Oral History isn’t so much focused on the role that recorded sound media have played in the history of oral history. The Media Archaeology work points to how our conceptions of “memory” have themselves been shaped by the advent of these new technologies. It was said of Edison’s phonograph “Speech has become, as it were, immortal” or as an article on Memory and the Phonograph from 1880 would “define the brain as an infinitely perfected phonograph”.
  • The development of photography and microfilming and preservation reformatting: There is some good stuff on this in Lisa Gitelman’s Paper Knowledge: Toward a Media History of Documents. In particular, there is discussion of the work of the “Joint Committee on Enlargement, Improvement and Preservation of Data,” a joint effort of the American Council of Learned Societies and the Social Science Research Council, which ended up publishing Robert Binkley’s 1931 Manual on Methods of Reproducing Research Materials. The book is particularly interesting in that it is a cover page over a photo-offset printing of a typewritten manuscript. To this end, the book itself illustrates how changes in the technologies for photo-duplication of documents were affecting access to documents.
  • The history of newspaper conservation: Closely related to the last point, the push to microfilm newsprint based on some of its inherent vices. While Double Fold is over the top, it did prompt some really great reactions, like Don’t Fold Up: Responding to Nicholson Baker’s Double Fold.
  • Scientific data and records of observations: Astronomers draw on records of observations of the motion of celestial objects dating back to the ancient world. Lorraine Daston’s “Sciences of the Archives” research group has produced some fascinating work in this vein. I like how this quote from Daston’s research group captures the continuity that exists in these traditions which bridges analog and digital practices and incorporates other new media like photography. “Since ancient times, cultures dispersed across the globe have launched monumental data-centered projects: the massive collections of astronomical observations in ancient China and Mesopotamia, the great libraries from Alexandria to Google Book Search, the vast networks of scientific surveillance of the world’s oceans and atmosphere, the mapping of every nook and cranny of heaven and earth.” They have a great 2012 paper in Osiris that works through this in more depth.

So in all these contexts, I think a few preliminary points start to emerge that I keep thinking about.

  1. Preservation’s meaning is contextual and tradition dependent: As a concept, preservation has situated meanings in particular traditions and contexts so it’s important to really articulate what one means by the term and what traditions one is drawing on. In this vein, the different traditions have emerged in dialog with the development of media and have their own ideas of what is significant about objects for their use.
  2. Digital vs. Analog Preservation is a false dichotomy: There were already a lot of divergent ideas of what preservation meant in play before digital technology came into play. In this vein, the intervention of digital technology is just one of a series of technological interventions that have disrupted preservation practices and traditions.
  3. New media is older than digital media: Related to the last point, various media/technologies of reproduction (and their affordances) have had significant impacts on the traces of the past that can be created and our ability to preserve them. In this vein, scholarship in Media Archaeology focused on reinterpreting and understanding these old new media is likely of considerable value for unpacking those impacts.

So those are some working thoughts and rough notes. Curious and interested for 1) other resources you think are relevant in some of these areas 2) other ways of slicing and characterizing these points 3) other ideas about what the take aways are.

 

25 Curatememes: A Curation

A few years back, Curatememe set out on a mission to create a space “Where Curators Curate Memes about Curation. Where will the absurdity of our use of the term Curation go next? This Tumblr speculates wildly.” I think it’s now time to declare curation accomplished. Now that curation means whatever, I thought I would curate the best of the best from the tumblr here in a listicle. It seemed appropriate. It will also make it easier for me to find ones I want to use sometimes.

The memes are arranged more or less reverse chronologically, which offers a sense of how they developed as a body of work over time and preserves the experience of reading back into a tumblr.

I Came. I Archived. I Curated

I came I archived I curated

Stop Worrying and Respect des Fonds

stop worrying and respect des fonds

Curation is as Curation Does

curation is as curation does

Be the Change you want to Curate in the World

be the change you want to curate in the world

Eat Pray Love Curate

Travel Trip Eat Pray Love

The Only Thing we have to Curate

the only thing we have to curate is curation itself

 

The Point is to Curate it

The point is to curate it 

There are no facts, only curations

no facts only curations

Say Curation One More Time

say curation one more time

I Can’t Believe I Curated the Whole Thing

I cant believe i curated the whole thing

 Who Curates the Curators

who curates the curators

The Things You Curate End Up Curating You

The things you curate end up curating you

 

Curate the Rainbow

Skittles, fruit flavour sweets

 

One Word… Curation 

one word curation

 

He Who Curates the Spice Curates the Universe

he who curates the spice

Curate Yourselves Winter is Coming

 

 

Wait… That’s Curation

wait thats curation

I Curated this Xhibit

I curated this xhibit

I Curated You I can Deaccession You

I curated You I can deaccession you

 

Frankly my dear, I don’t curate a DAMS

KABEL 1

I love it when curation means whatever

I love it when Curation Means Whatever

This is not Curation

this is not curation

 

Did I Cu-rate That

did I cu-rate that

 

 

 

How Would You Teach A Digital Preservation Grad Seminar?

It is looking like I may end up teaching a graduate seminar on digital preservation for the University of Maryland’s iSchool. There is an existing syllabus, but I will have some flexibility in terms of how I shape and design the course and I am curious what thoughts different folks have for what would be the most effective way to teach a graduate seminar on the subject.

Below are a few of the big picture course design questions I am thinking through and some of my initial thoughts on them. I’m curious for any and all input folks might have.

Organizing Principles: How best to organize a digital preservation course?

  • To what extent should a grad seminar like this be about frameworks and principles vs examples and cases? I’m thinking that I should cover those, but I’m also thinking that too many of those models fail to address the idea that digital preservation is fundamentally about risk mitigation from future loss. That is, it’s less about a process and more about how to make the best use of available resources and identifying the best opportunities to systematically work to further lessen the risk of loss. I also think that the frameworks often get in the way of first grasping a fundamental understanding of the nature and structure of digital information and digital media. So I’m entertaining the idea of getting to the frameworks at the end as a way to understand the issues but working through the core issues first.
  • How would you organize and structure such a course? If I don’t start with the frameworks, I’m thinking it makes sense to start by working through a core understanding of digital information and digital media and work from there into the various issues in the NDSA levels of digital preservation.

Particular Tools & Software: What role should they play in the course? 

  • What approach should I take toward particular tools? On the one hand, it is very pragmatic to leave a course like this understanding how to use particular tools, but at the same time, the tools are always going to be changing and everyone needs to be able to plan for how to swap in and out different tools to meet the underlying objective. In my digital history courses I have required students to each figure out how to use and then teach the class how to use particular tools and software. I like this approach as teaching yourself how to use new software and evaluating it is an important skill in its own right. With that said, some of the digital preservation tools out there are complex enough that I’m not entirely sure this method would do them justice.
  • How much should a course like this require/push students to develop some basic command line literacy? My sense is that many students will not have this, but it is challenging to think through how to do much work in this area without that. With that said, the course isn’t about developing that command line literacy, so I’m not sure how far to delve into this kind of thing.

Kinds of assignments: What would be the most useful for the students? 

  • I’m curious for what folks think would be the most useful kinds of assignments. I’m thinking that given the context of planning for risks and the need to make such plans inside the constraints of an institution that it might make the most sense to have students serve as consultants for small cultural heritage organizations and have them develop plans for options to improve their approaches to ensure long term access to their digital content. So I think many of the assignments might fit around that. With that said, I am curious for any other ideas for how to either improve this idea of a course project or for other kinds of assignments.